
Use Einsum

When to use this page

Use this page to understand me.einsum — the core computation primitive in mechestim.

Common patterns

import mechestim as me

with me.BudgetContext(flop_budget=10**8) as budget:
    A = me.ones((256, 256))
    B = me.ones((256, 256))
    x = me.ones((256,))

    # Matrix-vector multiply: cost = 2 × m × k
    y = me.einsum('ij,j->i', A, x)           # 2 × 256 × 256 = 131,072 FLOPs

    # Matrix multiply: cost = 2 × m × k × n
    C = me.einsum('ij,jk->ik', A, B)         # 2 × 256 × 256 × 256 = 33,554,432 FLOPs

    # Outer product: cost = i × j (no summation, no ×2)
    outer = me.einsum('i,j->ij', x, x)       # 256 × 256 = 65,536 FLOPs

    # Trace: cost = 2 × i
    tr = me.einsum('ii->', A)                # 2 × 256 = 512 FLOPs

    # Batched matmul: cost = 2 × b × m × k × n
    batch = me.ones((4, 256, 256))
    out = me.einsum('bij,bjk->bik', batch, batch)  # 2 × 4 × 256 × 256 × 256 FLOPs

    print(budget.summary())

Cost formula

The cost of an einsum is the sum of per-step costs along the optimal contraction path. Every einsum — even a simple two-operand one — goes through the opt_einsum path optimizer (a symmetry-aware fork of opt_einsum).

For each pairwise step:

cost = (product of all index dimensions) × op_factor

where op_factor = 2 when indices are summed (multiply + add) and op_factor = 1 when no indices are summed (outer product / pure assignment).

For 'ij,jk->ik' with shapes (256, 256) and (256, 256):

- Indices: i=256, j=256, k=256
- j is summed out, so op_factor = 2
- Cost: 2 × 256 × 256 × 256 = 33,554,432

For multi-operand einsums (3+ tensors), mechestim automatically decomposes the contraction into optimal pairwise steps. The total cost is the sum of per-step costs.
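The per-step formula can be sketched in plain Python. This is a hypothetical helper, not part of the mechestim API; it covers a single dense pairwise step and ignores symmetry savings and multi-operand decomposition:

```python
from math import prod

def pairwise_einsum_cost(subscript, shapes):
    """Estimate FLOPs for one dense pairwise einsum step.

    cost = (product of all index dimensions) x op_factor,
    where op_factor is 2 if any index is summed out, else 1.
    """
    inputs, output = subscript.split('->')
    dims = {}
    for spec, shape in zip(inputs.split(','), shapes):
        for idx, size in zip(spec, shape):
            dims[idx] = size
    summed = set(dims) - set(output)
    op_factor = 2 if summed else 1
    return prod(dims.values()) * op_factor

print(pairwise_einsum_cost('ij,jk->ik', [(256, 256), (256, 256)]))  # 33554432
print(pairwise_einsum_cost('i,j->ij', [(256,), (256,)]))            # 65536
print(pairwise_einsum_cost('ii->', [(256, 256)]))                   # 512
```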

me.dot and me.matmul

me.dot(A, B) and me.matmul(A, B) are equivalent to the corresponding einsum and have the same FLOP cost.
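The equivalence mirrors NumPy semantics. A quick sanity check in plain NumPy (not mechestim) that all three spellings compute the same contraction, and therefore share the 2 × m × k × n cost formula:

```python
import numpy as np

A = np.ones((256, 256))
B = np.ones((256, 256))

# dot, matmul, and the 'ij,jk->ik' einsum all perform the same contraction.
assert np.allclose(np.matmul(A, B), np.einsum('ij,jk->ik', A, B))
assert np.allclose(np.dot(A, B), np.einsum('ij,jk->ik', A, B))
print("all equivalent")
```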

Symmetric tensors

There are two separate symmetry declarations — one for inputs, one for outputs:

Input symmetry — wrap with me.as_symmetric() before passing to einsum. The optimizer automatically uses symmetry to choose the best contraction order and charges reduced costs:

import numpy as np

with me.BudgetContext(flop_budget=10**8) as budget:
    S = me.as_symmetric(np.eye(10), symmetric_axes=(0, 1))  # 55 unique elements
    v = me.ones((10,))

    result = me.einsum('ij,j->i', S, v)  # cost reduced by input symmetry
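The "55 unique elements" comment is the triangular count for a symmetric matrix, checked here in plain Python:

```python
def unique_symmetric_elements(n):
    # An n x n symmetric matrix has n(n + 1)/2 independent entries:
    # the diagonal plus one triangle.
    return n * (n + 1) // 2

print(unique_symmetric_elements(10))   # 55
print(unique_symmetric_elements(256))  # 32896 -- roughly half of 65,536
```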

Output symmetry — pass symmetric_axes to einsum() to declare that the result is symmetric. This wraps the output as a SymmetricTensor so downstream operations benefit from reduced costs. It does NOT affect the cost of this einsum — it's a declaration about the result's structure:

import numpy as np

with me.BudgetContext(flop_budget=10**8) as budget:
    X = me.array(np.random.randn(100, 10))

    # X^T X is always symmetric — declare output axes (0, 1) as symmetric
    C = me.einsum('ki,kj->ij', X, X, symmetric_axes=[(0, 1)])

    print(type(C))  # <class 'SymmetricTensor'>
    # C can now be passed to other operations with automatic cost savings
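Why X^T X is always symmetric: (X^T X)_ij = Σ_k X_ki X_kj = (X^T X)_ji, so swapping i and j leaves every entry unchanged. A plain-NumPy check (not mechestim):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))

# The Gram matrix X^T X equals its own transpose.
C = np.einsum('ki,kj->ij', X, X)
assert np.allclose(C, C.T)
print("symmetric")
```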

For the full symmetry guide, see Exploit Symmetry Savings.

Inspecting costs

me.einsum_path() previews the contraction plan without executing or spending any budget:

path, info = me.einsum_path('ijk,ai,bj,ck->abc', T, A, B, C)

print(info)
# Step  Subscript         FLOPs  Dense FLOPs  Symmetry Savings
# ────  ────────────────  ─────  ───────────  ────────────────
# 0     ijk,ai->ajk       ...    ...          ...
# 1     ajk,bj->abk       ...    ...          ...
# 2     abk,ck->abc       ...    ...          ...
# ────  ────────────────  ─────  ───────────  ────────────────
# Total                   ...    ...          ...x speedup

print(f"Naive cost:     {info.naive_cost:,}")
print(f"Optimized cost: {info.optimized_cost:,}")
print(f"Speedup:        {info.speedup:.1f}x")

me.flops.einsum_cost() returns the same cost that einsum() would deduct — one source of truth:

cost = me.flops.einsum_cost('ij,jk->ik', shapes=[(256, 256), (256, 256)])
print(f"Matmul cost: {cost:,}")  # 33,554,432

Custom contraction paths

By default, mechestim finds the optimal contraction order automatically. You can override this by passing an explicit path — a list of int tuples specifying which operand positions to contract at each step:

import mechestim as me

A = me.ones((3, 4))
B = me.ones((4, 5))
C = me.ones((5, 6))

# Plan first, execute later
path, info = me.einsum_path('ij,jk,kl->il', A, B, C)
print(f"Optimal path: {path}")  # e.g. [(0, 1), (0, 1)]

# Execute with the planned path
with me.BudgetContext(flop_budget=10**8) as budget:
    result = me.einsum('ij,jk,kl->il', A, B, C, optimize=path)

You can also specify a completely custom path. Each tuple names the positions (in the current operand list) to contract; the result is appended to the end:

# Force B×C first (positions 1, 2), then A×result (positions 0, 1)
result = me.einsum('ij,jk,kl->il', A, B, C, optimize=[(1, 2), (0, 1)])

# Force A×B first (positions 0, 1), then result×C (positions 0, 1)
result = me.einsum('ij,jk,kl->il', A, B, C, optimize=[(0, 1), (0, 1)])

Different paths may have different FLOP costs. Use me.einsum_path() to compare — it returns the cost without executing or spending budget.
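For the chain above, the two orders do differ. Working the per-step formula (2 × product of the step's index dimensions) by hand:

```python
# Chain 'ij,jk,kl->il' with the shapes above: i=3, j=4, k=5, l=6.
i, j, k, l = 3, 4, 5, 6

# Path [(0, 1), (0, 1)]: A x B (2ijk), then the 3x5 result x C (2ikl).
cost_ab_first = 2 * i * j * k + 2 * i * k * l   # 120 + 180 = 300

# Path [(1, 2), (0, 1)]: B x C (2jkl), then A x the 4x6 result (2ijl).
cost_bc_first = 2 * j * k * l + 2 * i * j * l   # 240 + 144 = 384

print(cost_ab_first, cost_bc_first)  # 300 384 -- A x B first is cheaper here
```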

⚠️ Common pitfalls

Symptom: Unexpectedly high FLOP cost

Fix: Check all index dimensions. A subscript like 'ijkl,jklm->im' multiplies all five dimension sizes together (times op_factor). Use me.flops.einsum_cost() or me.einsum_path() to preview costs before executing.
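To see how fast this grows, suppose every index in 'ijkl,jklm->im' has size 64 (sizes chosen purely for illustration):

```python
from math import prod

# 'ijkl,jklm->im': five distinct indices i, j, k, l, m all multiply into
# the cost, and j, k, l are summed out, so op_factor = 2.
sizes = {'i': 64, 'j': 64, 'k': 64, 'l': 64, 'm': 64}
cost = 2 * prod(sizes.values())
print(f"{cost:,} FLOPs")  # 2,147,483,648 -- over 2 GFLOP from modest 64-sized axes
```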