# mechestim

> NumPy-compatible math primitives with FLOP counting. mechestim is NOT a drop-in NumPy replacement — every arithmetic operation is counted and charged against a FLOP budget.

# Getting Started

## Installation

# Installation

## When to use this page

Use this page when setting up mechestim for the first time.

## Install as a dependency

```bash
uv add git+https://github.com/AIcrowd/mechestim.git
```

## Install for development

```bash
git clone https://github.com/AIcrowd/mechestim.git
cd mechestim
uv sync --all-extras
```

## ✅ Verify installation

```bash
uv run python -c "import mechestim as me; print(me.__version__)"
```

## 🔍 What you'll see

```
0.2.0+np2.1.3
```

The version string includes the pinned NumPy version suffix. If you see a version number, mechestim is installed correctly.

## ⚠️ Common pitfalls

**Symptom:** `ImportError: numpy version mismatch`

**Fix:** mechestim requires NumPy >=2.1.0,<2.2.0. Using `uv` handles this automatically. If you installed manually, check your NumPy version:

```bash
uv run python -c "import numpy; print(numpy.__version__)"
```

## 📎 Related pages

- [Your First Budget](./first-budget.md) — run your first FLOP-counted computation


## Your First Budget

# Your First Budget

## When to use this page

Use this page after installing mechestim to run your first FLOP-counted computation.

## Prerequisites

- [Installation](./installation.md)

## Quickest possible start

You do not need to set up a budget context to start counting FLOPs. mechestim activates a global default context the first time any counted operation runs. The default budget is 1e15 FLOPs (configurable via the `MECHESTIM_DEFAULT_BUDGET` environment variable).

Save this as `first_budget.py`:

```python
import math
import mechestim as me

depth = 10   # number of layers
width = 256  # hidden dimension

# No BudgetContext needed — the global default activates automatically
scale = math.sqrt(2.0 / width)   # Kaiming init scale; free (no FLOPs)
weights = [
    me.array(me.random.randn(width, width) * scale)
    for _ in range(depth)
]
x = me.random.randn(width)

h = x
for W in weights:
    h = me.einsum('ij,j->i', W, h)  # matrix-vector multiply: 2 × 256 × 256 = 131,072 FLOPs
    h = me.maximum(h, 0)             # ReLU activation: counted

result = me.sum(h)                   # reduction: counted

# Print a Rich-formatted summary across all namespaces
me.budget_summary()
```

Run it:

```bash
uv run python first_budget.py
```

## 🔍 What you'll see

```
┌─────────────────────────────────────────────────────────┐
│              mechestim FLOP Budget Summary               │
├──────────────┬───────────────┬────────────┬─────────────┤
│ Namespace    │ Budget        │ Used       │ Remaining   │
├──────────────┼───────────────┼────────────┼─────────────┤
│ (default)    │ 1.00e+15      │ 1,969,152  │ ~1.00e+15   │
└──────────────┴───────────────┴────────────┴─────────────┘

  (default) — by operation
    einsum        1,310,720  ( 66.6%)  [10 calls]
    random.randn    655,616  ( 33.3%)  [11 calls]
    maximum           2,560  (  0.1%)  [10 calls]
    sum                 256  (  0.0%)   [1 call]
```

**Reading the output:**

- **Namespace:** `(default)` is the auto-created global context; named contexts appear here once you add them
- **Used / Remaining:** how much of the budget has been consumed across all calls
- **By operation:** breakdown of costs per operation type and call count
- The 10-layer MLP spends two-thirds of FLOPs on `einsum` (matrix multiplies) and one-third on random number generation; activations (`maximum`) are comparatively cheap

## Explicit context with a namespace

When you want a tighter budget or cleaner grouping in summaries, wrap operations in a `BudgetContext` and give it a namespace:

```python
import math
import mechestim as me

depth = 10
width = 256

with me.BudgetContext(flop_budget=50_000_000, namespace="mlp-forward") as budget:
    scale = math.sqrt(2.0 / width)
    weights = [
        me.array(me.random.randn(width, width) * scale)
        for _ in range(depth)
    ]
    x = me.random.randn(width)

    h = x
    for W in weights:
        h = me.einsum('ij,j->i', W, h)  # matrix-vector multiply: 2 × 256 × 256 = 131,072 FLOPs
        h = me.maximum(h, 0)             # 256 FLOPs each pass

    result = me.sum(h)                   # 256 FLOPs

me.budget_summary()
```

```
┌──────────────────────────────────────────────────────────┐
│              mechestim FLOP Budget Summary                │
├───────────────┬────────────┬───────────┬─────────────────┤
│ Namespace     │ Budget     │ Used      │ Remaining       │
├───────────────┼────────────┼───────────┼─────────────────┤
│ mlp-forward   │ 50,000,000 │ 1,969,152 │ 48,030,848      │
└───────────────┴────────────┴───────────┴─────────────────┘

  mlp-forward — by operation
    einsum        1,310,720  ( 66.6%)  [10 calls]
    random.randn    655,616  ( 33.3%)  [11 calls]
    maximum           2,560  (  0.1%)  [10 calls]
    sum                 256  (  0.0%)   [1 call]
```

## Decorator form

Use `@me.budget` to attach a budget directly to a function. The namespace defaults to the function name if omitted:

```python
import math
import mechestim as me

@me.budget(flop_budget=50_000_000, namespace="mlp-forward")
def run_mlp(depth: int = 10, width: int = 256):
    scale = math.sqrt(2.0 / width)
    weights = [
        me.array(me.random.randn(width, width) * scale)
        for _ in range(depth)
    ]
    x = me.random.randn(width)

    h = x
    for W in weights:
        h = me.einsum('ij,j->i', W, h)
        h = me.maximum(h, 0)

    return me.sum(h)

run_mlp()
me.budget_summary()
```

Each call to `run_mlp()` draws from the same `mlp-forward` namespace budget. Call `me.budget_summary_dict()` to retrieve the summary as a plain dict for programmatic use:

```python
data = me.budget_summary_dict()
# {'flop_budget': ..., 'flops_used': ..., 'flops_remaining': ..., 'operations': {...}}

# For per-namespace breakdown:
data = me.budget_summary_dict(by_namespace=True)
# data["by_namespace"]["mlp-forward"]["flops_used"] -> 1313536
```

## Configuring the global default budget

The global default budget is 1e15 FLOPs (1 quadrillion). You can change this via the `MECHESTIM_DEFAULT_BUDGET` environment variable:

```bash
# Set a smaller default budget (e.g., 1 billion FLOPs)
export MECHESTIM_DEFAULT_BUDGET=1e9
uv run python your_script.py
```

The env var is read once when the global default is first created (on the first counted operation). It accepts any numeric value that Python's `float()` can parse (e.g., `1e9`, `1000000000`, `5e12`).

## ⚠️ Common pitfalls

**Symptom:** `BudgetExhaustedError`

**Fix:** Your operations exceed the budget you set. Increase `flop_budget` on the `BudgetContext` (or decorator), reduce computation, or rely on the global default context which has a 1e15 FLOP ceiling (configurable via `MECHESTIM_DEFAULT_BUDGET`).

**Note on `NoBudgetContextError`:** This error no longer triggers in normal use. The global default context activates automatically on first use, so bare calls outside any `with` block are safe.

## 📎 Related pages

- [Migrate from NumPy](../how-to/migrate-from-numpy.md) — convert existing NumPy code to mechestim
- [Plan Your Budget](../how-to/plan-your-budget.md) — query operation costs before executing


# How-To Guides

## Migrate from NumPy

# Migrate from NumPy

## When to use this page

Use this page when converting existing NumPy code to mechestim.

## Prerequisites

- [Installation](../getting-started/installation.md)
- [Your First Budget](../getting-started/first-budget.md)

## The basics

Change your import and wrap computation in a BudgetContext:

**Before (NumPy):**

```python
import numpy as np

W = np.random.randn(256, 256)
x = np.random.randn(256)
h = np.dot(W, x)
h = np.maximum(h, 0)
```

**After (mechestim) — simplest form:**

```python
import mechestim as me

# No setup needed — global default budget tracks FLOPs automatically
W = me.random.randn(256, 256)
x = me.random.randn(256)
h = me.dot(W, x)
h = me.maximum(h, 0)

me.budget_summary()  # see what you spent
```

**After (mechestim) — with explicit budget control:**

```python
import mechestim as me

with me.BudgetContext(flop_budget=20_000_000) as budget:
    W = me.random.randn(256, 256)
    x = me.random.randn(256)
    h = me.dot(W, x)
    h = me.maximum(h, 0)
```

## What stays the same

- Function signatures match NumPy for supported operations
- Broadcasting rules are identical
- Array indexing, slicing, and assignment work normally

## What changes

| NumPy | mechestim | Notes |
|-------|-----------|-------|
| `import numpy as np` | `import mechestim as me` | Drop-in replacement |
| Call ops anywhere | Works anywhere too | A global default budget auto-activates; use explicit `BudgetContext` for limits and namespacing |
| `np.linalg.svd(A)` | `me.linalg.svd(A, k=10)` | Truncated SVD with explicit `k` |
| Plain `ndarray` only | `SymmetricTensor` available | Wrap with `me.as_symmetric()` for cost savings |
| All NumPy ops available | Most available, 32 blacklisted | I/O and config ops raise `AttributeError` |
| No cost tracking | Automatic FLOP counting | Every counted op deducts from budget |

## Common pitfalls

**Symptom:** `AttributeError` when calling an I/O or config function (e.g., `me.save`, `me.seterr`)

**Fix:** 32 operations are blacklisted because they are I/O, configuration, or datetime functions with no FLOP cost. See [Operation Categories](../concepts/operation-categories.md) for the full list. Use `numpy` directly for these.

**Symptom:** Using `np.linalg.svd` instead of `me.linalg.svd`

**Fix:** If you import NumPy alongside mechestim, make sure to use `me.` for operations you want counted. Operations called through `np.` bypass FLOP counting entirely.

## 📎 Related pages

- [Operation Categories](../concepts/operation-categories.md) — what's supported and what isn't
- [Operation Audit](../reference/operation-audit.md) — full list of all operations


## Use Einsum

# Use Einsum

## When to use this page

Use this page to understand `me.einsum` — the core computation primitive in mechestim.

## Prerequisites

- [Your First Budget](../getting-started/first-budget.md)

## Common patterns

```python
import mechestim as me

with me.BudgetContext(flop_budget=10**8) as budget:
    A = me.ones((256, 256))
    B = me.ones((256, 256))
    x = me.ones((256,))

    # Matrix-vector multiply: cost = 2 × m × k
    y = me.einsum('ij,j->i', A, x)           # 2 × 256 × 256 = 131,072 FLOPs

    # Matrix multiply: cost = 2 × m × k × n
    C = me.einsum('ij,jk->ik', A, B)         # 2 × 256 × 256 × 256 = 33,554,432 FLOPs

    # Outer product: cost = i × j (no summation, no ×2)
    outer = me.einsum('i,j->ij', x, x)       # 256 × 256 = 65,536 FLOPs

    # Trace: cost = 2 × i
    tr = me.einsum('ii->', A)                 # 2 × 256 = 512 FLOPs

    # Batched matmul: cost = 2 × b × m × k × n
    batch = me.ones((4, 256, 256))
    out = me.einsum('bij,bjk->bik', batch, batch)  # 2 × 4 × 256 × 256 × 256 FLOPs

    print(budget.summary())
```

## Cost formula

The cost of an einsum is the sum of per-step costs along the optimal contraction path. Every einsum — even a simple two-operand one — goes through the [opt_einsum path optimizer](../api/opt-einsum.md) (a symmetry-aware fork of [opt_einsum](https://github.com/dgasmith/opt_einsum)).

For each pairwise step:

```
cost = (product of all index dimensions) × op_factor
```

where `op_factor = 2` when indices are summed (multiply + add) and `op_factor = 1` when no indices are summed (outer product / pure assignment).

For `'ij,jk->ik'` with shapes `(256, 256)` and `(256, 256)`:
- Indices: i=256, j=256, k=256
- `j` is summed out, so op_factor = 2
- Cost: 2 x 256 x 256 x 256 = 33,554,432

For multi-operand einsums (3+ tensors), mechestim automatically decomposes the contraction into optimal pairwise steps. The total cost is the sum of per-step costs.

## me.dot and me.matmul

`me.dot(A, B)` and `me.matmul(A, B)` are equivalent to the corresponding einsum and have the same FLOP cost.

## Symmetric tensors

There are two separate symmetry declarations — one for inputs, one for outputs:

**Input symmetry** — wrap with `me.as_symmetric()` before passing to einsum. The optimizer automatically uses symmetry to choose the best contraction order and charges reduced costs:

```python
with me.BudgetContext(flop_budget=10**8) as budget:
    S = me.as_symmetric(np.eye(10), symmetric_axes=(0, 1))  # 55 unique elements
    v = me.ones((10,))

    result = me.einsum('ij,j->i', S, v)  # cost reduced by input symmetry
```

**Output symmetry** — pass `symmetric_axes` to `einsum()` to declare that the result is symmetric. This wraps the output as a `SymmetricTensor` so downstream operations benefit from reduced costs. It does NOT affect the cost of this einsum — it's a declaration about the result's structure:

```python
with me.BudgetContext(flop_budget=10**8) as budget:
    X = me.array(np.random.randn(100, 10))

    # X^T X is always symmetric — declare output axes (0,1) as symmetric
    C = me.einsum('ki,kj->ij', X, X, symmetric_axes=[(0, 1)])

    print(type(C))  # <class 'SymmetricTensor'>
    # C can now be passed to other operations with automatic cost savings
```

For the full symmetry guide, see [Exploit Symmetry Savings](./exploit-symmetry.md).

## Inspecting costs

`me.einsum_path()` previews the contraction plan without executing or spending any budget:

```python
path, info = me.einsum_path('ijk,ai,bj,ck->abc', T, A, B, C)

print(info)
# Step  Subscript         FLOPs  Dense FLOPs  Symmetry Savings
# ────  ────────────────  ─────  ───────────  ────────────────
# 0     ijk,ai->ajk       ...    ...          ...
# 1     ajk,bj->abk       ...    ...          ...
# 2     abk,ck->abc       ...    ...          ...
# ────  ────────────────  ─────  ───────────  ────────────────
# Total                   ...    ...          ...x speedup

print(f"Naive cost:     {info.naive_cost:,}")
print(f"Optimized cost: {info.optimized_cost:,}")
print(f"Speedup:        {info.speedup:.1f}x")
```

`me.flops.einsum_cost()` returns the same cost that `einsum()` would deduct — one source of truth:

```python
cost = me.flops.einsum_cost('ij,jk->ik', shapes=[(256, 256), (256, 256)])
print(f"Matmul cost: {cost:,}")  # 33,554,432
```

## Custom contraction paths

By default mechestim finds the optimal contraction order automatically. You can override this by passing an explicit path — a list of int-tuples specifying which operand positions to contract at each step:

```python
import mechestim as me

A = me.ones((3, 4))
B = me.ones((4, 5))
C = me.ones((5, 6))

# Plan first, execute later
path, info = me.einsum_path('ij,jk,kl->il', A, B, C)
print(f"Optimal path: {path}")  # e.g. [(0, 1), (0, 1)]

# Execute with the planned path
with me.BudgetContext(flop_budget=10**8) as budget:
    result = me.einsum('ij,jk,kl->il', A, B, C, optimize=path)
```

You can also specify a completely custom path. Each tuple names the positions (in the current operand list) to contract; the result is appended to the end:

```python
# Force B×C first (positions 1,2), then A×result (positions 0,1)
result = me.einsum('ij,jk,kl->il', A, B, C, optimize=[(1, 2), (0, 1)])

# Force A×B first (positions 0,1), then result×C (positions 0,1)
result = me.einsum('ij,jk,kl->il', A, B, C, optimize=[(0, 1), (0, 1)])
```

Different paths may have different FLOP costs. Use `me.einsum_path()` to compare — it returns the cost without executing or spending budget.

## ⚠️ Common pitfalls

**Symptom:** Unexpectedly high FLOP cost

**Fix:** Check all index dimensions. A subscript like `'ijkl,jklm->im'` multiplies all five dimension sizes together (times op_factor). Use `me.flops.einsum_cost()` or `me.einsum_path()` to preview costs before executing.

## 📎 Related pages

- [Exploit Symmetry](./exploit-symmetry.md) — full guide to symmetry mechanisms
- [Path Optimizer](../api/opt-einsum.md) — algorithms, symmetry support, and upstream attribution
- [Symmetric Tensors API](../api/symmetric.md) — `SymmetricTensor`, `SymmetryInfo`, `as_symmetric`
- [Plan Your Budget](./plan-your-budget.md) — query costs before executing
- [FLOP Counting Model](../concepts/flop-counting-model.md) — how costs are computed


## Exploit Symmetry

# Exploit Symmetry Savings

## When to use this page

Use this page to reduce FLOP costs when working with symmetric tensors.

## Prerequisites

- [Use Einsum](./use-einsum.md)

## Why symmetry matters

Many computations in mechanistic estimation involve symmetric tensors — covariance matrices, quadratic forms, higher-order moment tensors. When a tensor is symmetric, there is redundancy in the computation that mechestim can exploit to reduce the FLOP cost.

---

## SymmetricTensor — first-class symmetric tensors

`SymmetricTensor` is an `ndarray` subclass that carries symmetry metadata. When you pass a `SymmetricTensor` to any mechestim operation, the cost is automatically reduced based on the number of unique elements.

### Creating a SymmetricTensor

```python
import mechestim as me
import numpy as np

# Wrap an existing symmetric array
data = np.array([[2.0, 1.0], [1.0, 3.0]])
S = me.as_symmetric(data, symmetric_axes=(0, 1))

# symmetric_axes=(0, 1) means axes 0 and 1 are symmetric
# For partial symmetry on a 4-tensor:
# S = me.as_symmetric(data, symmetric_axes=[(0, 1), (2, 3)])
```

`as_symmetric` validates that the data is actually symmetric (within tolerance `atol=1e-6, rtol=1e-5`). If it isn't, `SymmetryError` is raised.

### Checking symmetry

Use `me.is_symmetric()` to check whether data is symmetric without raising an exception:

```python
# Standalone function — works on any ndarray
if me.is_symmetric(result, symmetric_axes=(0, 1)):
    result = me.as_symmetric(result, symmetric_axes=(0, 1))

# Custom tolerance
me.is_symmetric(data, (0, 1), atol=1e-3)

# Multiple groups
me.is_symmetric(data, [(0, 1), (2, 3)])
```

`SymmetricTensor` also has an `.is_symmetric()` method. Called without arguments, it re-checks the carried axes. You can also pass different axes to check:

```python
S = me.as_symmetric(data, symmetric_axes=(0, 1))
S.is_symmetric()                          # True — checks (0, 1)
S.is_symmetric(symmetric_axes=(0, 1, 2))  # check a different set of axes
```

### Automatic cost savings

Once a tensor is wrapped as `SymmetricTensor`, all downstream operations automatically get reduced costs:

```python
with me.BudgetContext(flop_budget=10**8) as budget:
    S = me.as_symmetric(np.eye(10), symmetric_axes=(0, 1))

    # Pointwise: costs 55 FLOPs (10*11/2 unique elements) instead of 100
    result = me.exp(S)
    print(f"exp cost: {budget.flops_used}")  # 55

    # Solve: uses Cholesky cost (n^3/3) instead of LU cost (2n^3/3)
    x = me.linalg.solve(S, np.ones(10))
```

### Symmetry propagation

Symmetry metadata propagates automatically through operations using conservative algebraic rules. No runtime symmetry checks are performed — only index arithmetic on the metadata.

| Operation | Result type | Rule |
|-----------|------------|------|
| `me.exp(S)`, `me.log(S)`, ... | `SymmetricTensor` | Unary pointwise: same groups pass through |
| `me.add(S, T)` (same groups) | `SymmetricTensor` | Binary pointwise: groups are **intersected** |
| `me.add(S, T)` (different groups) | depends | Only groups present in *both* inputs survive |
| `S * scalar` | `SymmetricTensor` | Scalar ops preserve all groups |
| `S.copy()` | `SymmetricTensor` | Metadata preserved |
| `S[0]` (integer index) | `SymmetricTensor` or ndarray | Indexed dim removed from group; remaining dims keep symmetry if 2+ survive |
| `S[0:k]` (slice) | `SymmetricTensor` or ndarray | Resized dim pulled from group; same-size dims keep symmetry |
| `me.sum(S, axis=0)` | `SymmetricTensor` or ndarray | Reduced dim removed from group; remaining dims renumbered |
| `me.sum(S)` (all axes) | plain ndarray | Scalar result — no symmetry |
| `S @ T` | plain ndarray | `AB != (AB)^T` in general |

#### Slicing rules in detail

When you slice a `SymmetricTensor`, symmetry propagates based on what happens to each dimension:

```python
import mechestim as me
import numpy as np

# 4D tensor with full ((0,1,2,3)) symmetry, shape (10,10,10,10)
A = me.as_symmetric(sym_data, symmetric_axes=(0, 1, 2, 3))

A[0]          # shape (10,10,10) — dim 0 removed → ((0,1,2)) symmetry
A[0:5]        # shape (5,10,10,10) — dim 0 resized, pulled from group → ((1,2,3))
A[0:10]       # shape (10,10,10,10) — same size, group intact → ((0,1,2,3))
A[0:5, 0:5]   # shape (5,5,10,10) — conservative: both pulled → ((2,3))
A[arr]        # advanced indexing → plain ndarray (symmetry stripped)
```

#### Reduction rules in detail

```python
A.sum(axis=0)              # ((0,1,2,3)) → ((0,1,2)) on 3D result
A.sum(axis=(0,1))          # ((0,1,2,3)) → ((0,1)) on 2D result
A.sum(axis=0, keepdims=True)  # dim 0 now size 1, pulled → ((1,2,3)) on 4D result
A.sum()                    # scalar → no symmetry
```

#### Binary intersection rules

```python
# Both have ((0,1)) → intersection is ((0,1))
me.add(S1, S2)

# S1 has ((0,1),(2,3)), S2 has ((0,1)) → intersection is ((0,1))
me.add(S1, S2)

# S has ((0,1)), B is plain ndarray → intersection is empty → plain ndarray
me.add(S, B)

# Broadcasting: if a dim is stretched (size 1→n), it's pulled from its group
# before intersection
```

#### Symmetry loss warnings

When an operation causes symmetry to be lost (partially or fully), mechestim emits a `SymmetryLossWarning`. This helps you spot places where you might want to manually re-tag with `as_symmetric()`.

```python
# Warnings are shown once per call site (Python default filter)
S = me.as_symmetric(data, symmetric_axes=(0, 1))
row = S[0]  # SymmetryLossWarning: Symmetry lost along dims (0, 1): ...

# Suppress warnings globally
me.configure(symmetry_warnings=False)

# Or use standard Python warning filters
import warnings
warnings.filterwarnings("ignore", category=me.SymmetryLossWarning)
```

### Symmetry-aware linalg

Several linalg operations use cheaper algorithms when given a `SymmetricTensor`:

| Operation | Cost with symmetric input | Cost without |
|-----------|--------------------------|-------------|
| `me.linalg.solve(S, b)` | `n^3/3 + n*nrhs` (Cholesky) | `2n^3/3 + n^2*nrhs` (LU) |
| `me.linalg.det(S)` | `n^3/3` (Cholesky) | `n^3` (LU) |
| `me.linalg.inv(S)` | `n^3/3 + n^3/2` | `n^3` |

`me.linalg.inv(S)` returns a `SymmetricTensor` (the inverse of a symmetric matrix is symmetric).

### End-to-end example

```python
import mechestim as me
import numpy as np

n, d = 10, 100

with me.BudgetContext(flop_budget=10**8) as budget:
    X = np.random.randn(d, n)

    # Build covariance — einsum returns SymmetricTensor
    C = me.einsum('ki,kj->ij', X, X, symmetric_axes=[(0, 1)])

    # Chain of unary ops — symmetry preserved, each costs n*(n+1)/2
    C_exp = me.exp(C)
    C_log = me.log(C_exp)

    # Solve — uses Cholesky cost automatically
    C_pd = C + me.multiply(
        me.as_symmetric(np.eye(n), symmetric_axes=(0, 1)),
        np.asarray(float(n))
    )
    x = me.linalg.solve(C_pd, np.ones(n))

    print(budget.summary())
```

---

## Symmetry in einsum

When you pass a `SymmetricTensor` to `me.einsum`, the path optimizer handles everything automatically. It uses symmetry to choose the best contraction order and charges reduced costs based on unique elements.

```python
S = me.as_symmetric(data, symmetric_axes=(0, 1))  # 10x10, 55 unique elements
v = np.ones(10)
result = me.einsum('ij,j->i', S, v)  # costs based on unique elements
```

Use `symmetric_axes` when the output of your einsum is symmetric. The result is returned as a `SymmetricTensor` for downstream savings:

```python
# C[i,j] == C[j,i] — declare output axes (0, 1) as symmetric
C = me.einsum('ki,kj->ij', X, X, symmetric_axes=[(0, 1)])
# C is now a SymmetricTensor — downstream ops get automatic savings
```

For higher-order tensors where only some axes are symmetric, declare multiple groups:

```python
# 4-tensor where axes (0,1) and (2,3) are each separately symmetric
result = me.einsum('...', *operands, symmetric_axes=[(0, 1), (2, 3)])
```

---

## Symmetry propagation through contraction paths

When contracting multiple tensors with `me.einsum`, the path optimizer is
fully symmetry-aware. Symmetry influences two things:

1. **Contraction order.** The optimizer uses symmetric costs to choose
   which pair of tensors to contract at each step. A contraction that
   looks sub-optimal under dense cost estimates may become the best
   choice once symmetry savings are factored in.

2. **Symmetry propagation.** Intermediates' symmetry is tracked and
   influences future ordering decisions. After each contraction, the
   result's symmetry is computed (by restricting each symmetric group
   to the indices that survive) and fed back to the optimizer.

The cost formula correctly distinguishes between summed and surviving
indices. For example, given S_3 on {i,j,k} where i is summed out,
only the S_2 subgroup on {j,k} contributes a symmetry reduction --
the summed index i does not.

Symmetry degrades along the contraction chain as free indices are consumed:

```
ijk (S₃) + ai → ajk (S₂ on j,k) → + bj → abk (none) → + ck → abc (none)
```

The early steps benefit from symmetry savings; later steps operate on dense
intermediates.

### Example

```python
import mechestim as me
import numpy as np

n = 100
T_data = np.random.randn(n, n, n)
T_data = (T_data + T_data.transpose(1, 0, 2) + T_data.transpose(2, 1, 0)
          + T_data.transpose(0, 2, 1) + T_data.transpose(1, 2, 0)
          + T_data.transpose(2, 0, 1)) / 6

T = me.as_symmetric(T_data, symmetric_axes=(0, 1, 2))  # S₃ symmetric
A = me.ones((n, n))
B = me.ones((n, n))
C = me.ones((n, n))

path, info = me.einsum_path('ijk,ai,bj,ck->abc', T, A, B, C)
print(info)
```

Output:

```
Step  Subscript         FLOPs  Dense FLOPs  Symmetry Savings
────  ────────────────  ─────  ───────────  ────────────────
0     ijk,ai->ajk       ...    ...          ~3x (S₃ input)
1     ajk,bj->abk       ...    ...          ~2x (S₂ input)
2     abk,ck->abc       ...    ...          1x  (dense)
────  ────────────────  ─────  ───────────  ────────────────
Total                   ...    ...          ...x speedup
```

Each step's `input_symmetries` and `output_symmetry` fields in `StepInfo`
describe exactly which symmetry was present and how it reduced cost. See
[PathInfo / StepInfo API](../api/symmetric.md) for the full dataclass reference.

---

## Common pitfalls

**Symptom:** `SymmetryError: Tensor not symmetric along axes (0, 1): max deviation = 0.5 (tolerance: atol=1e-06, rtol=1e-05)`

**Fix:** The data must be symmetric within tolerance. Check your computation or use a tighter numerical method.

**Symptom:** Expected `SymmetricTensor` output but got plain ndarray

**Fix:** Symmetry propagation is conservative — it never claims symmetry that isn't guaranteed. Operations like matmul (`S @ T`) always strip symmetry. Slicing and reductions strip symmetry only when the affected dims belong to a symmetric group and the operation leaves fewer than 2 dims in that group. If you know the result is symmetric, re-wrap with `me.as_symmetric()`.

**Symptom:** `SymmetryLossWarning` appearing unexpectedly

**Fix:** This warning tells you symmetry metadata was lost. Check whether the result is still symmetric — if so, re-tag with `as_symmetric()`. To suppress: `me.configure(symmetry_warnings=False)` or use `warnings.filterwarnings("ignore", category=me.SymmetryLossWarning)`.

## Related pages

- [Use Einsum](./use-einsum.md) — einsum basics, multi-operand contractions, and path inspection
- [Use Linear Algebra](./use-linalg.md) — linalg operations and costs
- [FLOP Counting Model](../concepts/flop-counting-model.md) — how costs are calculated
- [Symmetric Tensors API](../api/symmetric.md) — `PathInfo`, `StepInfo`, and `SymmetricTensor`


## Use Linear Algebra

# Use Linear Algebra

## When to use this page

Use this page to learn how to use `me.linalg` operations and their FLOP costs.

## Prerequisites

- [Your First Budget](../getting-started/first-budget.md)

## Available operations

### Decompositions

| Operation | Cost | Notes |
|-----------|------|-------|
| `me.linalg.svd(A, k=k)` | $m \cdot n \cdot k$ | Truncated SVD |
| `me.linalg.eig(A)` | $10n^3$ | General eigendecomposition |
| `me.linalg.eigh(A)` | $4n^3/3$ | Symmetric eigendecomposition |
| `me.linalg.cholesky(A)` | $n^3/3$ | Cholesky (symmetric positive definite) |
| `me.linalg.qr(A)` | $2mn^2 - 2n^3/3$ | Householder QR |
| `me.linalg.eigvals(A)` | $10n^3$ | Eigenvalues only |
| `me.linalg.eigvalsh(A)` | $4n^3/3$ | Symmetric eigenvalues only |
| `me.linalg.svdvals(A)` | $m \cdot n \cdot \min(m,n)$ | Singular values only |

### Solvers

| Operation | Cost | Symmetric cost |
|-----------|------|----------------|
| `me.linalg.solve(A, b)` | $2n^3/3 + n^2 \cdot n_{\text{rhs}}$ | $n^3/3 + n \cdot n_{\text{rhs}}$ |
| `me.linalg.inv(A)` | $n^3$ | $n^3/3 + n^3/2$ |
| `me.linalg.lstsq(A, b)` | $m \cdot n \cdot \min(m,n)$ | — |
| `me.linalg.pinv(A)` | $m \cdot n \cdot \min(m,n)$ | — |

When the input is a `SymmetricTensor`, `solve` and `inv` automatically use cheaper Cholesky-based costs. `inv` of a symmetric matrix returns a `SymmetricTensor`.

### Properties

| Operation | Cost | Symmetric cost |
|-----------|------|----------------|
| `me.linalg.det(A)` | $n^3$ | $n^3/3$ |
| `me.linalg.slogdet(A)` | $n^3$ | $n^3/3$ |
| `me.linalg.norm(x)` | depends on ord | — |
| `me.linalg.cond(A)` | $m \cdot n \cdot \min(m,n)$ | — |
| `me.linalg.matrix_rank(A)` | $m \cdot n \cdot \min(m,n)$ | — |
| `me.linalg.trace(A)` | $n$ | — |

### Compound

| Operation | Cost | Notes |
|-----------|------|-------|
| `me.linalg.multi_dot(arrays)` | Optimal chain ordering | Uses `np.linalg.multi_dot` |
| `me.linalg.matrix_power(A, n)` | $n^3 \times \text{exponent}$ | Repeated squaring |

## Symmetric input savings

Pass a `SymmetricTensor` to get automatic cost reductions:

```python
import mechestim as me
import numpy as np

with me.BudgetContext(flop_budget=10**8) as budget:
    A = me.as_symmetric(np.eye(10) * 2.0, symmetric_axes=(0, 1))

    # solve uses Cholesky cost: n^3/3 + n*nrhs = 343
    x = me.linalg.solve(A, np.ones(10))

    # inv returns SymmetricTensor, uses cheaper cost
    A_inv = me.linalg.inv(A)
    print(isinstance(A_inv, me.SymmetricTensor))  # True
```

See [Exploit Symmetry Savings](./exploit-symmetry.md) for full details.

## Query cost before running

```python
cost = me.flops.svd_cost(m=256, n=256, k=10)
print(f"SVD cost: {cost:,}")  # 655,360

cost = me.flops.solve_cost(n=256, nrhs=1, symmetric=True)
print(f"Solve cost (symmetric): {cost:,}")
```

## Common pitfalls

**Symptom:** Using `numpy.linalg.svd` instead of `me.linalg.svd`

**Fix:** Operations called through `numpy` directly bypass FLOP counting. Always use `me.linalg.*`.

## Related pages

- [Exploit Symmetry Savings](./exploit-symmetry.md) — symmetry-aware cost reductions
- [Plan Your Budget](./plan-your-budget.md) — query costs before running
- [Operation Audit](../reference/operation-audit.md) — full list of supported operations


## Use FFT

# Use FFT

## When to use this page

Use this page to learn how to use `me.fft` operations and understand their FLOP costs.

## Prerequisites

- [Your First Budget](../getting-started/first-budget.md)

## Cost model

FFT costs are based on the Cooley-Tukey radix-2 algorithm:

| Transform | Cost Formula | Example (n=1024) |
|-----------|-------------|------------------|
| `fft`, `ifft` | $5n \cdot \lceil\log_2 n\rceil$ | 51,200 |
| `rfft`, `irfft` | $5(n/2) \cdot \lceil\log_2 n\rceil$ | 25,600 |
| `fft2`, `ifft2` | $5N \cdot \lceil\log_2 N\rceil$ where $N = n_1 \cdot n_2$ | varies |
| `fftn`, `ifftn` | $5N \cdot \lceil\log_2 N\rceil$ where $N = \prod_i n_i$ | varies |
| `fftfreq`, `rfftfreq` | 0 (free) | 0 |
| `fftshift`, `ifftshift` | 0 (free) | 0 |

Real-valued transforms (`rfft`, `irfft`, `rfftn`, `irfftn`) cost roughly half of their complex counterparts because they exploit conjugate symmetry.

## Basic usage

```python
import mechestim as me

with me.BudgetContext(flop_budget=1_000_000) as budget:
    # Generate a signal (free)
    signal = me.random.randn(1024)

    # Forward FFT: 5 * 1024 * 10 = 51,200 FLOPs
    spectrum = me.fft.fft(signal)

    # Inverse FFT: same cost
    reconstructed = me.fft.ifft(spectrum)

    # Frequency bins (free)
    freqs = me.fft.fftfreq(1024)

    print(f"Total FFT cost: {budget.flops_used:,}")  # 102,400
```

## Real vs complex transforms

When your input is real-valued (which is common in signal processing), prefer `rfft` over `fft` — it costs half as much:

```python
import mechestim as me

with me.BudgetContext(flop_budget=1_000_000) as budget:
    signal = me.random.randn(1024)

    # Complex FFT: 51,200 FLOPs
    spec_complex = me.fft.fft(signal)

    budget_after_fft = budget.flops_used

    # Real FFT: 25,600 FLOPs
    spec_real = me.fft.rfft(signal)

    rfft_cost = budget.flops_used - budget_after_fft

    print(f"fft cost:  {budget_after_fft:,}")   # 51,200
    print(f"rfft cost: {rfft_cost:,}")           # 25,600
```

The output of `rfft` has shape `(n//2 + 1,)` instead of `(n,)`, since the negative frequencies are redundant for real inputs.

## Multi-dimensional FFT

Use `fft2` for 2-D transforms (e.g., images) and `fftn` for arbitrary dimensions:

```python
import mechestim as me

with me.BudgetContext(flop_budget=10**8) as budget:
    # 2-D image (free to create)
    image = me.random.randn(256, 256)

    # 2-D FFT
    spectrum_2d = me.fft.fft2(image)
    print(f"2D FFT cost: {budget.flops_used:,}")

    # N-D FFT with explicit shape
    volume = me.random.randn(32, 32, 32)
    spectrum_3d = me.fft.fftn(volume)
```

## Windowed FFT pattern

A common signal processing pattern — window the signal before FFT to reduce spectral leakage:

```python
import mechestim as me

with me.BudgetContext(flop_budget=1_000_000) as budget:
    signal = me.random.randn(1024)

    # Window function (counted — hamming costs n FLOPs)
    window = me.hamming(1024)

    # Apply window (counted — multiply costs n FLOPs)
    windowed = me.multiply(signal, window)

    # FFT (counted)
    spectrum = me.fft.rfft(windowed)

    print(budget.summary())
```

## Query costs before running

```python
from mechestim.flops import fft_cost, rfft_cost

# Check cost of a large FFT before committing budget
n = 2**20  # ~1 million points
print(f"Complex FFT: {fft_cost(n):,} FLOPs")   # 104,857,600
print(f"Real FFT:    {rfft_cost(n):,} FLOPs")   # 52,428,800
```

## ⚠️ Common pitfalls

**Symptom:** Using `me.fft.fft` on real data when `me.fft.rfft` would suffice

**Fix:** `rfft` costs half as much. If your input is real-valued, always prefer `rfft`/`irfft` over `fft`/`ifft`.

**Symptom:** Unexpectedly high cost for multi-dimensional FFT

**Fix:** The cost scales as $5 \cdot \prod n_i \cdot \lceil\log_2(\prod n_i)\rceil$. A 256x256 2-D FFT processes 65,536 elements, not 256. Use `fft_cost` to estimate before running.

## 📎 Related pages

- [FFT API Reference](../api/fft.md) — full function signatures and docstrings
- [Plan Your Budget](./plan-your-budget.md) — general cost estimation workflow
- [FLOP Counting Model](../concepts/flop-counting-model.md) — how all costs are computed


## Plan Your Budget

# Plan Your Budget

## When to use this page

Use this page to learn how to query operation costs before running them.

## Prerequisites

- [Your First Budget](../getting-started/first-budget.md)

## Cost query functions

These functions work **outside** a BudgetContext — they compute costs from shapes without executing anything.

```python
import mechestim as me

# Einsum cost
cost = me.flops.einsum_cost('ij,jk->ik', shapes=[(256, 256), (256, 256)])
print(f"Matmul cost: {cost:,}")         # 33,554,432 (2 × 256³)

# SVD cost
cost = me.flops.svd_cost(m=256, n=256, k=10)
print(f"SVD cost: {cost:,}")            # 655,360

# Pointwise cost (unary/binary ops)
cost = me.flops.pointwise_cost(shape=(256, 256))
print(f"Pointwise cost: {cost:,}")      # 65,536

# Reduction cost
cost = me.flops.reduction_cost(input_shape=(256, 256))
print(f"Reduction cost: {cost:,}")      # 65,536
```

## Budget breakdown example

Plan a multi-step computation before executing:

```python
import mechestim as me

# Plan
steps = [
    ("einsum ij,j->i", me.flops.einsum_cost('ij,j->i', shapes=[(256, 256), (256,)])),
    ("ReLU (maximum)", me.flops.pointwise_cost(shape=(256,))),
    ("sum reduction", me.flops.reduction_cost(input_shape=(256,))),
]

total = sum(cost for _, cost in steps)
print(f"{'Operation':<20} {'FLOPs':>12}")
print("-" * 34)
for name, cost in steps:
    print(f"{name:<20} {cost:>12,}")
print("-" * 34)
print(f"{'Total':<20} {total:>12,}")
```

Output:

```
Operation                   FLOPs
----------------------------------
einsum ij,j->i            131,072
ReLU (maximum)                256
sum reduction                 256
----------------------------------
Total                     131,584
```

## Multi-operand einsum planning with `einsum_path`

For multi-operand einsums (3+ operands), `me.einsum_path()` is more
informative than `me.flops.einsum_cost()` because it shows the step-by-step
contraction breakdown with per-step symmetry savings:

```python
import mechestim as me
import numpy as np

T = me.as_symmetric(np.random.randn(50, 50, 50), symmetric_axes=(0, 1, 2))
A = me.ones((50, 50))
B = me.ones((50, 50))
C = me.ones((50, 50))

path, info = me.einsum_path('ijk,ai,bj,ck->abc', T, A, B, C)

print(f"Optimized cost: {info.optimized_cost:,}")
print(f"Naive cost:     {info.naive_cost:,}")
print(f"Speedup:        {info.speedup:.1f}x")
print(f"Largest intermediate: {info.largest_intermediate:,} elements")
print(info)  # full per-step table
```

`me.einsum_path()` has **zero budget cost** — it plans the contraction path
without executing anything. Use it alongside `me.flops.einsum_cost()` for
comprehensive planning.

## Using namespaces to track phases

Use the `namespace` parameter to label different computation phases:

```python
with me.BudgetContext(flop_budget=total, namespace="forward") as budget:
    # forward pass here
    ...

with me.BudgetContext(flop_budget=total, namespace="backward") as budget:
    # backward pass here
    ...

# Session-wide summary across all phases
me.budget_summary()
```

`me.budget_summary_dict(by_namespace=True)` returns a dict with per-namespace breakdowns for programmatic analysis.

## 📎 Related pages

- [Use Einsum](./use-einsum.md) — understand einsum cost formulas and multi-operand paths
- [Debug Budget Overruns](./debug-budget-overruns.md) — diagnose after the fact
- [Symmetric Tensors API](../api/symmetric.md) — `PathInfo` and `StepInfo` reference


## Debug Budget Overruns

# Debug Budget Overruns

## When to use this page

Use this page when you hit a `BudgetExhaustedError` and need to find which operations are using the most FLOPs.

## Prerequisites

- [Your First Budget](../getting-started/first-budget.md)

## Reading the budget summary

Call `budget.summary()` at any point inside a `BudgetContext`, or call `me.budget_summary()` outside a context for a session-wide summary across all namespaces:

```python
import mechestim as me

with me.BudgetContext(flop_budget=10_000_000) as budget:
    A = me.ones((256, 256))
    x = me.ones((256,))

    h = me.einsum('ij,j->i', A, x)
    h = me.exp(h)
    h = me.sum(h)

    print(budget.summary())

# Outside the context — summarises every namespace recorded this session
print(me.budget_summary())
```

The summary shows cost per operation type, sorted by highest cost first.

## Session-wide programmatic analysis

Use `me.budget_summary_dict()` to retrieve aggregated budget data as a dict for automated analysis (e.g. in tests or notebooks):

```python
data = me.budget_summary_dict()
print(f"Budget: {data['flop_budget']:,}")
print(f"Used:   {data['flops_used']:,}")
print(f"Left:   {data['flops_remaining']:,}")
for op_name, op_data in data["operations"].items():
    print(f"  {op_name}: {op_data['flops']:,} ({op_data['calls']} calls)")
```

## Inspecting the operation log

For per-call detail, use `budget.op_log`:

```python
for record in budget.op_log:
    print(f"{record.op_name:<16} ns={record.namespace!r}  cost={record.flop_cost:>12,}  cumulative={record.cumulative:>12,}")
```

Each `OpRecord` contains:

| Field | Description |
|-------|-------------|
| `op_name` | Operation name (e.g., `"einsum"`, `"exp"`) |
| `namespace` | Namespace label from the BudgetContext, or `None` |
| `subscripts` | Einsum subscript string, or `None` |
| `shapes` | Tuple of input shapes |
| `flop_cost` | FLOP cost of this single call |
| `cumulative` | Running total after this call |

## Live budget display

Use `me.budget_live()` as a context manager for a Rich-based live-updating display that refreshes as operations run. This is useful when running long computations and you want to watch budget consumption in real time:

```python
import mechestim as me

with me.budget_live():
    with me.BudgetContext(flop_budget=10**8, namespace="training") as budget:
        for i in range(100):
            W = me.ones((256, 256))
            x = me.ones((256,))
            h = me.einsum('ij,j->i', W, x)
            h = me.exp(h)
            # The live display updates automatically as FLOPs are consumed
```

If Rich is not installed, `budget_live()` falls back to printing a plain-text summary on exit.

## ⚠️ Common pitfalls

**Symptom:** `BudgetExhaustedError` but summary shows budget was nearly full

**Fix:** The budget is checked **before** execution. The failing operation's cost is in the error message — compare it with `budget.flops_remaining`.

## 📎 Related pages

- [Plan Your Budget](./plan-your-budget.md) — predict costs before running
- [Common Errors](../troubleshooting/common-errors.md) — all error types explained


# Concepts

## FLOP Counting Model

# FLOP Counting Model

## When to use this page

Use this page to understand how mechestim counts FLOPs and why it uses analytical counting instead of runtime measurement.

## Why FLOPs instead of wall-clock time

- **Deterministic:** The same code always produces the same FLOP count, regardless of hardware
- **Hardware-independent:** A matmul costs the same FLOPs on a laptop and a server
- **Reproducible:** No variance from CPU scheduling, cache effects, or thermal throttling
- **Composable:** You can sum individual operation costs to predict total cost

## How costs are computed

mechestim computes FLOP costs **analytically from tensor shapes**, not by measuring execution time.

1. You call a counted operation (e.g., `me.einsum('ij,j->i', W, x)`)
2. mechestim computes the cost from the shapes: 2 × 256 × 256 = 131,072 FLOPs
3. The cost is checked against the remaining budget
4. If within budget: the operation executes and the cost is deducted
5. If over budget: `BudgetExhaustedError` is raised, the operation does **not** execute

## Cost formulas by category

| Category | Formula | Example |
|----------|---------|---------|
| **Einsum** | Per-step: product of dims × op_factor | `'ij,jk->ik'` → 2 × 3 × 4 × 5 = 120 |
| **Unary** (exp, log, sqrt, ...) | $\text{numel}(\text{output})$ | shape (256, 256) → 65,536 |
| **Binary** (add, multiply, ...) | $\text{numel}(\text{output})$ | shape (256, 256) → 65,536 |
| **Reduction** (sum, mean, max, ...) | $\text{numel}(\text{input})$ | shape (256, 256) → 65,536 |
| **SVD** | $m \cdot n \cdot k$ | (256, 256, k=10) → 655,360 |
| **Solve** | $2n^3/3 + n^2 \cdot n_{\text{rhs}}$ (LU) | (256, 256) solve → ~11.1M |
| **Dot / Matmul** | Same as einsum | (256, 256) @ (256, 256) → 2 × 256³ |
| **Free ops** | 0 | zeros, reshape, etc. |

### Sorting & search

| Category | Formula | Example |
|----------|---------|---------|
| **Sort / Argsort** | $n \cdot \lceil\log_2 n\rceil$ per slice | shape (4, 8), axis=-1 → 4 × 8 × 3 = 96 |
| **Lexsort** | $k \cdot n \cdot \lceil\log_2 n\rceil$ | 2 keys of length 8 → 2 × 8 × 3 = 48 |
| **Partition** | $n$ per slice | shape (100,), kth=50 → 100 |
| **Searchsorted** | $m \cdot \lceil\log_2 n\rceil$ | 5 queries into 1024 → 5 × 10 = 50 |
| **Unique** | $n \cdot \lceil\log_2 n\rceil$ | 8 elements → 8 × 3 = 24 |
| **Set ops** | $(n+m) \cdot \lceil\log_2(n+m)\rceil$ | 4 + 4 elements → 8 × 3 = 24 |

### Histogram & counting

| Category | Formula | Example |
|----------|---------|---------|
| **Histogram** | $n \cdot \lceil\log_2 \text{bins}\rceil$ | 100 elements, 8 bins → 100 × 3 = 300 |
| **Bincount** | $n$ | 100 elements → 100 |

### Random sampling

| Category | Formula | Example |
|----------|---------|---------|
| **Simple samplers** | $\text{numel}(\text{output})$ | shape (10, 20) → 200 |
| **Shuffle / Permutation** | $n \cdot \lceil\log_2 n\rceil$ | 16 elements → 16 × 4 = 64 |

## Symmetry savings

When a tensor is a `SymmetricTensor`, costs are reduced based on the number of unique elements rather than total elements. For a symmetric $n \times n$ matrix, there are $n(n+1)/2$ unique elements instead of $n^2$.

| Category | Symmetric cost | Standard cost |
|----------|---------------|---------------|
| **Pointwise** (unary/binary) | unique_elements | $\text{numel}(\text{output})$ |
| **Reduction** | unique_elements | $\text{numel}(\text{input})$ |
| **Einsum** (symmetric input) | Scaled by unique/total for surviving index groups | Full product |
| **Solve** | $n^3/3 + n \cdot n_{\text{rhs}}$ (Cholesky) | $2n^3/3 + n^2 \cdot n_{\text{rhs}}$ (LU) |
| **Det / Slogdet** | $n^3/3$ (Cholesky) | $n^3$ (LU) |
| **Inv** | $n^3/3 + n^3/2$ | $n^3$ |

See [Exploit Symmetry Savings](../how-to/exploit-symmetry.md) for usage details.

## Einsum cost model

Every einsum — regardless of the number of operands — is decomposed into
pairwise contraction steps along an optimal path (found via mechestim's
[opt_einsum fork](../api/opt-einsum.md)).
The total cost is the sum of per-step costs:

```
total_cost = sum(step.flop_cost for step in path.steps)
```

For each pairwise step, the cost is:

```
step_cost = (product of all index dimensions) × op_factor
```

where `op_factor = 2` when indices are summed (multiply + add) and
`op_factor = 1` when no indices are summed (outer product).

For a simple two-operand einsum like `'ij,jk->ik'`, there is one step,
so the total cost equals the step cost. For multi-operand einsums (3+
tensors), the optimizer finds the pairwise ordering that minimizes the
total cost.

When symmetric tensors are present, the optimizer is symmetry-aware: it
uses symmetric costs to decide which pair to contract at each step, so the
returned path may differ from the dense-optimal path. Symmetry propagates
through intermediates — if an early contraction produces a symmetric
intermediate, subsequent steps benefit from the reduced element count, and
the optimizer factors this into its ordering decisions.

Use `me.einsum_path()` to inspect the per-step breakdown. See
[Use Einsum](../how-to/use-einsum.md) for examples.

## FLOP multiplier

The `flop_multiplier` parameter in `BudgetContext` scales all costs:

```python
with me.BudgetContext(flop_budget=10**6, flop_multiplier=2.0) as budget:
    # Every operation costs 2× its normal FLOP count
    ...
```

This is useful for experimentation or adjusting the difficulty of a budget constraint.

## Namespaces

The `namespace` parameter in `BudgetContext` is a display-only label for grouping operations in summaries:

```python
with me.BudgetContext(flop_budget=10**6, namespace="training") as budget:
    # Operations are tagged with "training" for display
    ...
```

Namespaces do not affect FLOP counting or budget enforcement — they only appear in `me.budget_summary()` output.

## 📎 Related pages

- [Operation Categories](./operation-categories.md) — which operations are free, counted, or unsupported
- [Plan Your Budget](../how-to/plan-your-budget.md) — query costs before running


## Operation Categories

# Operation Categories

## When to use this page

Use this page to understand which operations cost FLOPs, which are free, and which are unsupported.

## Three categories

Every NumPy function falls into one of three categories in mechestim:

### 🟢 Free operations (0 FLOPs)

Operations that involve no arithmetic computation — just memory allocation, reshaping, or data movement.

**Examples:** `zeros`, `ones`, `full`, `eye`, `arange`, `linspace`, `empty`, `reshape`, `transpose`, `concatenate`, `stack`, `split`, `squeeze`, `expand_dims`, `ravel`, `take`, `where`, `copy`, `astype`, `asarray`

### 🟡 Counted operations (cost > 0)

Operations that perform arithmetic. Cost is computed analytically from tensor shapes.

| Sub-category | Cost formula | Examples |
|-------------|-------------|----------|
| Unary | numel(output) | `exp`, `log`, `sqrt`, `abs`, `sin`, `cos`, `tanh`, `ceil`, `floor` |
| Binary | numel(output) | `add`, `multiply`, `maximum`, `divide`, `power`, `subtract` |
| Reduction | numel(input) | `sum`, `mean`, `max`, `min`, `std`, `var`, `argmax`, `nansum` |
| Einsum | product of all index dims | `me.einsum(...)` |
| Dot/Matmul | equivalent einsum | `me.dot(A, B)`, `A @ B` |
| Linalg | per-operation formula | `me.linalg.solve`, `me.linalg.eigh`, `me.linalg.cholesky` |
| FFT | 5 N log N | `me.fft.fft`, `me.fft.rfft`, `me.fft.fft2` |
| SVD | m × n × k | `me.linalg.svd(A, k=10)` |
| Sort/Search | n log n per slice | `sort`, `argsort`, `unique`, `searchsorted` |
| Random | numel(output) | `me.random.randn`, `me.random.normal`, `me.random.uniform` |

When inputs are `SymmetricTensor`, many operations automatically get reduced costs. See [Exploit Symmetry](../how-to/exploit-symmetry.md).

### 🔴 Blacklisted operations

Operations not relevant to numerical computation. Calling them raises an `AttributeError`. These are I/O, configuration, datetime, and display functions that have no meaningful FLOP cost.

```python
me.save(array, "file.npy")
# AttributeError: 'save' is blacklisted in mechestim (I/O operation).
```

**Blacklisted categories:** I/O (`save`, `load`, `loadtxt`, `savetxt`, `savez`, `genfromtxt`), configuration (`seterr`, `geterr`, `setbufsize`), datetime (`busday_count`, `is_busday`), display (`array2string`, `array_repr`), functional (`apply_along_axis`, `piecewise`, `frompyfunc`).

See [Operation Audit](../reference/operation-audit.md) for the complete list.

## 📎 Related pages

- [Operation Audit](../reference/operation-audit.md) — complete list of every operation and its category
- [FLOP Counting Model](./flop-counting-model.md) — how costs are calculated
- [Migrate from NumPy](../how-to/migrate-from-numpy.md) — what changes when moving from NumPy


## NumPy Compatibility Testing

# NumPy Compatibility Testing

mechestim's goal is to be a drop-in replacement for NumPy: `import mechestim as np` should work for all supported functions. To verify this, we run NumPy's own test suite against mechestim.

## How it works

A pytest conftest at `tests/numpy_compat/conftest.py` monkeypatches numpy functions with their mechestim equivalents at session start. When we point pytest at NumPy's installed test files using `--pyargs`, every test that calls `np.sum(...)`, `np.mean(...)`, etc. actually calls mechestim's version.

```
NumPy test file                conftest.py               mechestim
  calls np.sum(x)  ──────>   np.sum = me.sum   ──────>  me.sum(x)
  asserts result              (monkeypatch)              (FLOP-counted)
```

### Avoiding infinite recursion

mechestim functions internally call numpy (e.g., `me.dot` calls `_np.dot`). Since `_np` IS the numpy module, patching `numpy.dot = me.dot` would cause infinite recursion: `me.dot` → `_np.dot` → `numpy.dot` → `me.dot` → ...

We solve this by **freezing numpy before patching**: the conftest creates a snapshot of the numpy module (and its submodules like `numpy.linalg`, `numpy.fft`), then rebinds every mechestim module's `_np` reference to the frozen copy. Now mechestim's internal calls go to the original numpy functions, while the test suite sees mechestim's versions.

```python
# Simplified flow in conftest.py:
frozen_np = freeze_numpy()           # snapshot of original numpy
rebind_mechestim_np(frozen_np)       # me._np → frozen copy
patch_numpy()                        # np.sum = me.sum, etc.
# Now: test calls np.sum → me.sum → frozen_np.sum (original) ✓
```

## What gets patched

Of mechestim's 482 registered functions, most non-ufunc functions are patched onto numpy during testing. The only categories skipped:

| Category | Count | Why skipped |
|----------|-------|-------------|
| Ufuncs | 101 | mechestim functions are plain callables, not ufuncs -- they lack `.reduce`, `.accumulate`, `.outer`, `.nargs`. Tests check these attributes at collection time. |
| Blacklisted | 32 | Intentionally unsupported |
| `linalg.outer` | 1 | `me.linalg.outer` delegates to `np.outer` (not `np.linalg.outer`), which has different validation behavior |

Everything else -- free ops, counted custom ops (dot, einsum, etc.), submodule functions (linalg, fft), reductions, and special functions -- is patched.

## Test suites

We run 7 NumPy test modules covering core math, ufuncs, numerics, linear algebra, FFT, polynomials, and random:

| Suite | Module | Passed | xfailed |
|-------|--------|--------|---------|
| Core math | `numpy._core.tests.test_umath` | 4,668 | 13 |
| Ufunc infrastructure | `numpy._core.tests.test_ufunc` | 795 | 7 |
| Numeric operations | `numpy._core.tests.test_numeric` | 1,560 | 20 |
| Linear algebra | `numpy.linalg.tests.test_linalg` | 48 | 255 |
| FFT | `numpy.fft.tests.test_pocketfft` | 114 | 34 |
| Polynomials | `numpy.polynomial.tests.test_polynomial` | 36 | 2 |
| Random | `numpy.random.tests.test_random` | 142 | 0 |
| **Total** | | **7,363** | **331** |

All failures are tracked as xfails in `tests/numpy_compat/xfails.py`.

## Running the tests

Tests use `pytest-xdist` for parallel execution across all CPU cores.

```bash
# Run everything (recommended)
make test-numpy-compat

# Run a single suite
uv run pytest tests/numpy_compat/ --pyargs numpy._core.tests.test_umath -n auto -q

# Filter to specific functions
uv run pytest tests/numpy_compat/ --pyargs numpy._core.tests.test_umath -k "sqrt" -n auto -v

# Run without parallelism (for debugging)
uv run pytest tests/numpy_compat/ --pyargs numpy._core.tests.test_umath -v --tb=short
```

The numpy_compat tests are excluded from the default `pytest` run (via `pyproject.toml` `addopts`) to prevent the monkeypatch from contaminating the main test suite. They run as a separate step in CI.

## Known divergences (xfails)

Tests that fail due to known, accepted differences are tracked in `tests/numpy_compat/xfails.py`. Each entry maps a test pattern to a categorized reason:

| Category | Meaning | Examples |
|----------|---------|---------|
| `NOT_IMPLEMENTED` | Function exists but lacks a kwarg or edge case | Missing `out=`, `where=`, `subok=` kwargs |
| `UNSUPPORTED_DTYPE` | mechestim doesn't support this dtype | timedelta, object arrays |
| `UFUNC_INTERNALS` | Test relies on ufunc protocol | `.reduce`, `__array_ufunc__` |
| `BUDGET_SIDE_EFFECT` | Test assumes no global state changes | Budget deduction during assertions |
| `NUMPY_INTERNAL` | Test uses numpy internals | `_umath_tests`, internal type tables |

The linalg suite has the most xfails (255) because mechestim's linalg wrappers don't support stacked/batched arrays, 0-size arrays, or some advanced kwargs that numpy's linalg tests exercise extensively.

### Triaging new failures

1. Run a suite: `uv run pytest tests/numpy_compat/ --pyargs <module> -n auto --tb=line`
2. Categorize each failure
3. If it's a bug we should fix, create an issue
4. If it's an accepted divergence, add it to `xfails.py`

## Why monkeypatching (not subclassing)

We considered alternatives:

- **Array subclass with `__array_ufunc__`**: Would intercept ufunc calls, but mechestim arrays are plain `numpy.ndarray` by design -- no custom tensor class.
- **Running tests with `import mechestim as np`**: NumPy's test files import from `numpy._core`, `numpy.testing`, etc. -- can't redirect all internal imports.
- **Monkeypatching with frozen numpy**: Simple, works with NumPy's existing test infrastructure, tests exactly what users experience (same function signatures), and the frozen-numpy trick prevents infinite recursion.


# Architecture

## Client-Server Model

# Client-Server Model

## When to use this page

Use this page to understand how mechestim's client-server architecture works and why it exists.

## Why client-server?

In competition evaluation, participant code runs in an **isolated container** that cannot import NumPy directly. This prevents participants from bypassing FLOP counting by calling NumPy functions outside mechestim.

The client-server model enforces this isolation:

```mermaid
graph LR
  subgraph participant ["Participant Container"]
    direction TB
    p1["import mechestim as me"]
    p2["No NumPy here!<br/>Only client proxy"]
  end

  subgraph server ["Server Container"]
    direction TB
    s1["mechestim library<br/>(real NumPy)"]
    s2["Budget enforcement<br/>Array storage<br/>FLOP counting"]
  end

  participant -- "ZMQ · IPC / TCP" --> server
  server -- "ZMQ · IPC / TCP" --> participant

  style participant fill:#f8f9fa,stroke:#4051b5,stroke-width:2px,color:#1a1a2e
  style server fill:#f8f9fa,stroke:#4051b5,stroke-width:2px,color:#1a1a2e
  style p1 fill:#fff,stroke:#e5e7eb,color:#1a1a2e
  style p2 fill:#fff,stroke:#e5e7eb,color:#555
  style s1 fill:#fff,stroke:#e5e7eb,color:#1a1a2e
  style s2 fill:#fff,stroke:#e5e7eb,color:#555
```

## How it works

1. **Server** runs the real mechestim library backed by NumPy. It stores all arrays, enforces budgets, and counts FLOPs.

2. **Client** is a drop-in replacement (`import mechestim as me`) that proxies every operation to the server over ZMQ (msgpack-encoded messages).

3. **Arrays stay on the server.** The client holds lightweight `RemoteArray` handles that reference server-side data. When you call `me.einsum(...)`, the client sends the operation and handle IDs to the server, which executes it and returns a new handle.

4. **Budget enforcement happens server-side.** The client cannot manipulate FLOP counts.

## Communication protocol

- **Transport:** ZMQ (REQ/REP pattern)
- **Serialization:** msgpack with binary-safe array payloads
- **Default endpoint:** `ipc:///tmp/mechestim.sock` (configurable via `MECHESTIM_SERVER_URL`)
- **Timeout:** 30 seconds per request

## API compatibility

Code written for the local library works unchanged with the client:

```python
# This code works with BOTH the local library and the client
import mechestim as me

with me.BudgetContext(flop_budget=10**6) as budget:
    x = me.zeros((256,))
    W = me.random.randn(256, 256)
    h = me.einsum('ij,j->i', W, x)
    print(budget.summary())
```

## When to use which

| Use case | Package | Install path |
|----------|---------|-------------|
| Development, testing, research | `mechestim` (local library) | `uv add git+...` or `uv sync` from repo |
| Competition evaluation, sandboxed environments | `mechestim-client` + `mechestim-server` | Docker containers |

## Three packages in this repo

| Package | Location | Description |
|---------|----------|-------------|
| `mechestim` | `src/mechestim/` | Local library — full NumPy backend, direct execution |
| `mechestim-client` | `mechestim-client/` | Client proxy — no NumPy dependency, forwards ops to server |
| `mechestim-server` | `mechestim-server/` | Server — runs real mechestim, manages sessions and arrays |

## 📎 Related pages

- [Running with Docker](./docker.md) — set up client-server locally
- [Contributor Guide](../development/contributing.md) — source-checkout commands for local development
- [Your First Budget](../getting-started/first-budget.md) — getting started with the local library


## Running with Docker

# Running with Docker

## When to use this page

Use this page to run the client-server model locally, either with Docker Compose or manually.

## Prerequisites

- [Client-Server Model](./client-server.md) — understand why the architecture exists
- Docker and Docker Compose installed

## With Docker Compose

The `docker/` directory contains a ready-to-use setup:

```bash
cd docker
docker compose up --build
```

This starts two containers:

| Service | Image | Role |
|---------|-------|------|
| `backend` | `Dockerfile.server` | Runs mechestim server, listens on IPC socket |
| `participant` | `Dockerfile.participant-hardened` | Runs participant code with mechestim-client only |

The containers share an IPC socket volume for communication.

## Without Docker

From a source checkout, start both processes from the repository root so the
server can import the local `src/mechestim` package:

```bash
# Terminal 1: Start the server
PYTHONPATH=src:mechestim-server/src \
  uv run --with pyzmq --with msgpack \
  python -m mechestim_server --url ipc:///tmp/mechestim.sock
```

```bash
# Terminal 2: Run client code
export MECHESTIM_SERVER_URL=ipc:///tmp/mechestim.sock
PYTHONPATH=mechestim-client/src \
  uv run --with pyzmq --with msgpack python your_script.py
```

For TCP (e.g., across machines):

```bash
# Server
PYTHONPATH=src:mechestim-server/src \
  uv run --with pyzmq --with msgpack \
  python -m mechestim_server --url tcp://0.0.0.0:15555

# Client
export MECHESTIM_SERVER_URL=tcp://server-host:15555
PYTHONPATH=mechestim-client/src \
  uv run --with pyzmq --with msgpack python your_script.py
```

If you already have `mechestim-client` and `mechestim-server` installed into
separate environments, the shorter `cd ... && uv run ...` workflow also works.
The commands above are the reproducible source-checkout path.

## ⚠️ Common pitfalls

**Symptom:** `Connection refused` or `timeout`

**Fix:** Ensure the server is running before starting the client. Check that `MECHESTIM_SERVER_URL` matches the server's `--url` argument.

**Symptom:** Port conflict

**Fix:** Change the port in both the server `--url` and client `MECHESTIM_SERVER_URL`.

## 📎 Related pages

- [Client-Server Model](./client-server.md) — architecture overview
- [Contributor Guide](../development/contributing.md) — local repo workflows


# Development

## Contributor Guide

# Contributor Guide

## When to use this page

Use this page when you are working on the mechestim repository itself rather than
only consuming the published API.

## Repository layout

This repository contains three Python packages plus docs and Docker assets:

| Path | Purpose |
|------|---------|
| `src/mechestim/` | Core library backed by NumPy |
| `mechestim-client/src/mechestim/` | Client proxy used in sandboxed participant environments |
| `mechestim-server/src/mechestim_server/` | ZMQ server that executes the real library |
| `tests/` | Core library test suite |
| `mechestim-client/tests/` | Client unit, integration, and adversarial tests |
| `mechestim-server/tests/` | Server unit tests |
| `docs/` | MkDocs source |
| `scripts/generate_api_docs.py` | Generates the API/reference inventory pages |
| `docker/` | Local client-server and hardened evaluation images |

## Initial setup

For normal work on the core package, docs, and root test suite:

```bash
git clone https://github.com/AIcrowd/mechestim.git
cd mechestim
make install
```

`make install` runs `uv sync --all-extras` and configures the local git hooks.

## Which environment to use

The root environment covers the core package, linting, docs, and the main test
suite. The client and server each also have their own `pyproject.toml`.

One important caveat: `mechestim-server` depends on the local `mechestim`
package, which is not resolved from a package index in a fresh source checkout.
For server development, run commands from the repository root with
`PYTHONPATH=src:mechestim-server/src` instead of relying on `cd mechestim-server
&& uv run ...`.

## Common commands

### Core library

```bash
make lint
make test
make test-numpy-compat
make docs-build
make docs-serve
make ci
```

If you prefer direct `uv` commands:

```bash
uv run pytest
uv run mkdocs serve
```

### Client package

The client package is independently installable, so its test suite can run via
its own project file:

```bash
uv run --project mechestim-client pytest mechestim-client/tests
```

Client integration and adversarial tests start a real server subprocess using
the repository root `.venv/bin/python`, so run `make install` first.

### Server package

Run server tests from the repository root so the local core package is on
`PYTHONPATH`:

```bash
PYTHONPATH=src:mechestim-server/src \
  uv run --with pyzmq --with msgpack pytest mechestim-server/tests
```

To launch the server manually from a source checkout:

```bash
PYTHONPATH=src:mechestim-server/src \
  uv run --with pyzmq --with msgpack \
  python -m mechestim_server --url ipc:///tmp/mechestim.sock
```

## Running client and server together without Docker

From a source checkout, use repo-root commands so both packages resolve
correctly:

```bash
# Terminal 1
PYTHONPATH=src:mechestim-server/src \
  uv run --with pyzmq --with msgpack \
  python -m mechestim_server --url ipc:///tmp/mechestim.sock
```

```bash
# Terminal 2
export MECHESTIM_SERVER_URL=ipc:///tmp/mechestim.sock
PYTHONPATH=mechestim-client/src \
  uv run --with pyzmq --with msgpack python your_script.py
```

See [Running with Docker](../architecture/docker.md) if you want the same split
using containers.

## Generated documentation

Do not hand-edit pages that start with:

```md
<!-- Auto-generated by scripts/generate_api_docs.py. Do not edit manually. -->
```

Today that includes:

- `docs/api/fft.md`
- `docs/api/linalg.md`
- `docs/api/polynomial.md`
- `docs/api/random.md`
- `docs/api/window.md`
- `docs/reference/cheat-sheet.md`
- `docs/reference/operation-audit.md`

Instead, update `scripts/generate_api_docs.py`, the relevant source docstrings,
or the operation registry, then regenerate:

```bash
uv run python scripts/generate_api_docs.py
uv run python scripts/generate_api_docs.py --verify
```

## Related pages

- [Running with Docker](../architecture/docker.md) — containerized client-server setup
- [Client-Server Model](../architecture/client-server.md) — architecture overview
- [NumPy Compatibility Testing](../concepts/numpy-compatibility-testing.md) — separate compatibility suite


# API Reference

## Counted Operations

# Counted Operations

Operations that consume the FLOP budget. Every call deducts its analytical cost
from the active budget before execution.

**Cost rules of thumb:**

- **Unary** (exp, log, sqrt, ...): 1 FLOP per output element
- **Binary** (add, multiply, maximum, ...): 1 FLOP per output element (after broadcasting)
- **Reductions** (sum, mean, max, ...): 1 FLOP per input element
- **Einsum**: `product_of_all_index_dims x op_factor` (op_factor=2 for inner products)

See [FLOP Counting Model](../concepts/flop-counting-model.md) for full details.

## API Reference

::: mechestim._einsum.einsum

::: mechestim._einsum.einsum_path

::: mechestim._pointwise

::: mechestim._polynomial

::: mechestim._window

::: mechestim._unwrap

::: mechestim._sorting_ops

::: mechestim._counting_ops


## Free Operations

# Free Operations

Zero-cost operations (0 FLOPs) for tensor creation, reshaping, indexing,
and data movement. These never consume budget.

**Includes:** `array`, `zeros`, `ones`, `eye`, `arange`, `linspace`,
`reshape`, `transpose`, `concatenate`, `stack`, `split`,
`where`, `copy`, `astype`, and 140+ more.

## API Reference

::: mechestim._free_ops


## Symmetric Tensors

# Symmetric Tensors

First-class symmetric tensor support for automatic FLOP cost reductions.

`SymmetricTensor` is an `ndarray` subclass that carries symmetry metadata
through operations. When passed to any mechestim operation, the cost is
automatically reduced based on the number of unique elements.

See [Exploit Symmetry Savings](../how-to/exploit-symmetry.md) for usage patterns.

::: mechestim._symmetric

---

## PathInfo

Contraction path with per-step diagnostics. Returned by `me.einsum_path()`.

::: mechestim.PathInfo

| Field | Type | Description |
|-------|------|-------------|
| `path` | `list[tuple[int, ...]]` | Sequence of contraction index groups |
| `steps` | `list[StepInfo]` | Per-step diagnostics |
| `naive_cost` | `int` | FLOP cost without path optimization |
| `optimized_cost` | `int` | FLOP cost along the optimal path |
| `largest_intermediate` | `int` | Max number of elements in any intermediate tensor |
| `speedup` | `float` | `naive_cost / optimized_cost` |

---

## StepInfo

Per-step contraction info within a `PathInfo`. Each step represents one
pairwise contraction along the optimal path.

::: mechestim.StepInfo

| Field | Type | Description |
|-------|------|-------------|
| `subscript` | `str` | Einsum subscript for this pairwise step (e.g., `'ijk,ai->ajk'`) |
| `flop_cost` | `int` | Symmetry-aware FLOP cost of this step |
| `dense_flop_cost` | `int` | FLOP cost without symmetry savings |
| `symmetry_savings` | `float` | `1 - (flop_cost / dense_flop_cost)` — fraction of cost saved by symmetry |
| `input_symmetries` | `list[IndexSymmetry | None]` | Symmetry of each input to this step |
| `output_symmetry` | `IndexSymmetry | None` | Symmetry of the step's output (propagated to next step) |
| `input_shapes` | `list[tuple[int, ...]]` | Shapes of input operands |
| `output_shape` | `tuple[int, ...]` | Shape of the output tensor |


## Linear Algebra

<!-- Auto-generated by scripts/generate_api_docs.py. Do not edit manually. -->
# Linear Algebra

Operations from `mechestim.linalg`. Cost formulas vary per operation — see each function's docstring.

::: mechestim.linalg._svd

::: mechestim.linalg._decompositions

::: mechestim.linalg._solvers

::: mechestim.linalg._properties

::: mechestim.linalg._compound

::: mechestim.linalg._aliases


## FFT

<!-- Auto-generated by scripts/generate_api_docs.py. Do not edit manually. -->
# FFT

Fast Fourier Transform operations from `mechestim.fft`. All FFT
transforms are counted. Real-valued transforms (`rfft`) cost roughly
half of complex transforms.

## Cost Summary

| Operation | Cost Formula |
|-----------|-------------|
| `fft`, `ifft` | $5n \cdot \lceil\log_2 n\rceil$ |
| `fft2`, `ifft2` | $5N \cdot \lceil\log_2 N\rceil$ where $N = n_1 \cdot n_2$ |
| `fftn`, `ifftn` | $5N \cdot \lceil\log_2 N\rceil$ where $N = \prod_i n_i$ |
| `rfft`, `irfft` | $5(n/2) \cdot \lceil\log_2 n\rceil$ |
| `rfft2`, `irfft2` | $5(N/2) \cdot \lceil\log_2 N\rceil$ where $N = n_1 \cdot n_2$ |
| `rfftn`, `irfftn` | $5(N/2) \cdot \lceil\log_2 N\rceil$ where $N = \prod_i n_i$ |
| `hfft`, `ihfft` | $5n_{\text{out}} \cdot \lceil\log_2 n_{\text{out}}\rceil$ |
| `fftfreq`, `rfftfreq`, `fftshift`, `ifftshift` | $0$ (free) |

## Examples

```python
import mechestim as me

with me.BudgetContext(flop_budget=1_000_000) as budget:
    signal = me.random.randn(1024)    # free
    spectrum = me.fft.fft(signal)     # 5 * 1024 * 10 = 51,200 FLOPs
    freqs = me.fft.fftfreq(1024)      # free
    print(f"FFT cost: {budget.flops_used:,}")  # 51,200
```

## API Reference


::: mechestim.fft._transforms

::: mechestim.fft._free


## Random

<!-- Auto-generated by scripts/generate_api_docs.py. Do not edit manually. -->
# Random

Random number generation from `mechestim.random`. Sampling operations are **counted** — each sample costs `numel(output)` FLOPs, and shuffle/permutation costs `n * ceil(log2(n))` FLOPs. Only configuration helpers (`seed`, `get_state`, `set_state`, `default_rng`) are free (0 FLOPs).

::: mechestim.random


## Polynomial

<!-- Auto-generated by scripts/generate_api_docs.py. Do not edit manually. -->
# Polynomial

Polynomial operations from `mechestim`. These wrap NumPy's polynomial
functions with FLOP counting.

## Cost Summary

| Operation | Cost Formula |
|-----------|-------------|
| `polyval` | $2 \cdot m \cdot \text{deg}$ (Horner's method) |
| `polyadd`, `polysub` | $\max(n_1, n_2)$ |
| `polymul`, `polydiv` | $n_1 \cdot n_2$ |
| `polyfit` | $2m \cdot (\text{deg}+1)^2$ |
| `poly` | $n^2$ |
| `roots` | $10n^3$ (companion matrix eigendecomposition) |
| `polyder`, `polyint` | $n$ |

## Examples

```python
import mechestim as me

with me.BudgetContext(flop_budget=1_000_000) as budget:
    coeffs = me.array([1.0, -3.0, 2.0])  # x^2 - 3x + 2 (free)
    x = me.linspace(0, 5, 100)            # free
    y = me.polyval(coeffs, x)             # 2 * 100 * 2 = 400 FLOPs
    print(f"polyval cost: {budget.flops_used}")  # 400
```

## API Reference


::: mechestim._polynomial


## Window Functions

<!-- Auto-generated by scripts/generate_api_docs.py. Do not edit manually. -->
# Window Functions

Window function wrappers from `mechestim`. These generate window
arrays used in signal processing (e.g., for windowed FFTs).

## Cost Summary

| Operation | Cost Formula | Notes |
|-----------|-------------|-------|
| `bartlett` | $n$ | Linear taper |
| `hamming` | $n$ | One cosine term |
| `hanning` | $n$ | One cosine term |
| `blackman` | $3n$ | Three cosine terms |
| `kaiser` | $3n$ | Bessel function evaluation |

## Examples

```python
import mechestim as me

with me.BudgetContext(flop_budget=1_000_000) as budget:
    win = me.hamming(256)   # 256 FLOPs
    win2 = me.kaiser(256)   # 768 FLOPs (3 * 256)
    print(f"Window cost: {budget.flops_used}")  # 1024
```

## API Reference


::: mechestim._window


## Budget

# Budget

Budget management for FLOP counting. `BudgetContext` is the core context
manager that tracks operation costs and enforces limits.

## Quick example

```python
import mechestim as me

# Explicit budget with namespace
with me.BudgetContext(flop_budget=10**7, namespace="forward") as budget:
    W = me.ones((256, 256))
    x = me.ones((256,))
    h = me.einsum('ij,j->i', W, x)

    print(f"Used: {budget.flops_used:,} / {budget.flop_budget:,}")

# Session-wide summary across all namespaces
me.budget_summary()

# Programmatic access
data = me.budget_summary_dict()
```

## API Reference

::: mechestim._budget.BudgetContext

::: mechestim._budget.OpRecord

::: mechestim._budget.budget

::: mechestim._budget.budget_summary_dict

::: mechestim._budget.budget_reset

::: mechestim._display.render_budget_summary

::: mechestim._display.budget_summary

::: mechestim._display.budget_live


## FLOP Cost Query

# FLOP Cost Query API

Pre-execution cost estimation functions. These are pure functions that compute
FLOP costs from shapes without executing anything or consuming budget.

## Quick example

```python
import mechestim as me

# Einsum cost
cost = me.flops.einsum_cost('ij,jk->ik', shapes=[(256, 256), (256, 256)])
print(f"Matmul: {cost:,} FLOPs")  # 33,554,432

# SVD cost
cost = me.flops.svd_cost(m=256, n=256, k=10)
print(f"SVD: {cost:,} FLOPs")  # 655,360

# Pointwise (unary/binary) cost
cost = me.flops.pointwise_cost(shape=(256, 256))
print(f"Pointwise: {cost:,} FLOPs")  # 65,536

# Reduction cost
cost = me.flops.reduction_cost(input_shape=(1000, 100))
print(f"Reduction: {cost:,} FLOPs")  # 100,000
```

## API Reference

::: mechestim._flops


## Path Optimizer

# Path Optimizer (opt_einsum fork)

mechestim includes a vendored fork of [opt_einsum](https://github.com/dgasmith/opt_einsum) by Daniel G. A. Smith and Johnnie Gray (MIT license). The fork lives at [`src/mechestim/_opt_einsum/`](https://github.com/AIcrowd/mechestim/tree/main/src/mechestim/_opt_einsum) and is used internally by `me.einsum()` and `me.einsum_path()`.

## What is opt_einsum?

[opt_einsum](https://github.com/dgasmith/opt_einsum) is a library for optimizing the contraction order of Einstein summations. Given a multi-operand einsum like `ij,jk,kl->il`, it finds the pairwise contraction sequence that minimizes total FLOPs. For background, see the [opt_einsum documentation](https://dgasmith.github.io/opt_einsum/) and the paper [*Opt_einsum — A Python library for optimizing contraction order*](https://joss.theoj.org/papers/10.21105/joss.00753).

## What our fork adds

This fork extends opt_einsum with **symmetry-aware path finding**. When input tensors have permutation symmetry (e.g., a symmetric matrix where `A[i,j] = A[j,i]`), the fork:

1. **Uses symmetry to choose contraction order.** The path algorithms (greedy, optimal, branch-and-bound) account for symmetry when scoring candidate contractions, preferring orders that exploit symmetric structure.

2. **Propagates symmetry through intermediates.** After each pairwise contraction, the result's symmetry is computed by restricting each input's symmetry groups to the surviving indices. This propagated symmetry influences subsequent ordering decisions.

3. **Reports symmetry-aware costs.** Each step in the contraction path includes both the symmetry-reduced cost and the dense cost, so you can see exactly where symmetry helps.

4. **Classifies symmetric BLAS operations.** Pairwise contractions involving symmetric inputs are labeled with symmetric BLAS types (SYMM, SYMV, SYDT) in addition to standard types (GEMM, DOT, TDOT).

## What was removed from upstream

- Execution logic (`contract`, `_core_contract`) — mechestim handles execution via `numpy.einsum`
- Backend dispatch (JAX, TensorFlow, PyTorch, Theano, CuPy)
- Intermediate caching/sharing layer

The fork is **self-contained** (zero imports from mechestim, depends only on Python stdlib + numpy) and could be contributed back upstream.

## Path algorithms

All algorithms accept an optional `input_symmetries` parameter for symmetry-aware path finding.

| Algorithm | `optimize=` | Used by | Symmetry-aware ordering? |
|-----------|-------------|---------|--------------------------|
| Optimal (brute-force DFS) | `'optimal'` | `'auto'` for 1-4 operands | Yes |
| Branch-and-bound | `'branch-all'`, `'branch-2'` | `'auto'` for 5-14 operands | Yes |
| Greedy | `'greedy'` | `'auto'` for 15+ operands | Yes |
| Dynamic programming | `'dp'` | `'auto-hq'` for 6-16 operands | No (dense ordering, symmetric cost reporting) |
| Random greedy | `'random-greedy'`, `'random-greedy-128'` | `'auto-hq'` for 17+ operands | Yes (via greedy) |
| Auto | `'auto'` (default) | — | Dispatches to above |

## Key types

### `IndexSymmetry`

```python
IndexSymmetry = list[frozenset[str]]
```

The fork's native symmetry representation. Each `frozenset` names einsum index characters that are symmetric under permutation. Example: `[frozenset("ijk")]` means S_3 symmetry on indices i, j, k.

mechestim's `_einsum.py` converts between positional `SymmetryInfo` (used by `SymmetricTensor`) and character-based `IndexSymmetry` at the boundary.

### `PathInfo` and `StepInfo`

See [Symmetric Tensors API](./symmetric.md#pathinfo) for the full dataclass reference.

## Attribution

See the [NOTICE](https://github.com/AIcrowd/mechestim/blob/main/src/mechestim/_opt_einsum/NOTICE) file for a detailed file-by-file changelog of all modifications from upstream.


## Errors

# Errors

Exception and warning classes raised by mechestim. See [Common Errors](../troubleshooting/common-errors.md) for symptoms and fixes.

| Class | Type | When raised |
|-------|------|------------|
| `MechEstimError` | Exception (base) | Base class for all mechestim exceptions |
| `BudgetExhaustedError` | Exception | Operation would exceed remaining FLOP budget |
| `NoBudgetContextError` | Exception | No active budget (rare — global default usually prevents this) |
| `SymmetryError` | Exception | Tensor data is not symmetric within tolerance |
| `MechEstimWarning` | Warning (base) | Base class for all mechestim warnings |
| `SymmetryLossWarning` | Warning | An operation caused loss of symmetry metadata |

## API Reference

::: mechestim.errors


# Reference

## For AI Agents

# For AI Agents

This page is for AI coding assistants (Claude, Cursor, Copilot, etc.) helping
users write mechestim code. It explains what resources are available, how to
access them, and the key things you must know before generating code.

---

## Quick orientation

**mechestim is NOT NumPy.** It wraps a subset of NumPy with analytical FLOP
counting. Every arithmetic operation is charged against a budget. Code that
works with NumPy may fail or behave differently with mechestim:

- All counted operations require an active `BudgetContext`
- 32 operations are blocked entirely (I/O, config, state)
- `sort`, `argsort`, `trace`, `random.*` sampling ops are now **counted** (not free)
- Costs are analytical (from tensor shapes), not measured at runtime

## Machine-readable resources

| Resource | Format | Use case |
|----------|--------|----------|
| [llms.txt](../../llms.txt) | Markdown | Start here. Curated index of all doc pages with one-line descriptions. Under 4K tokens. |
| [llms-full.txt](../../llms-full.txt) | Markdown | Complete docs in one file. Use if your context window is large enough (~115KB). |
| [ops.json](../ops.json) | JSON | Machine-readable manifest of all 482 operations. Query programmatically for name, category, cost formula, status. |
| [FLOP Cost Cheat Sheet](./cheat-sheet.md) | Markdown | Dense reference of every operation's cost. Optimized for agent context windows. |
| [Operation Audit](./operation-audit.md) | Markdown | 7-column searchable table: operation, mechestim ref, NumPy ref, category, cost, status, notes. |

### How to use llms.txt

If you're an agent encountering mechestim for the first time:

1. Fetch `llms.txt` — this gives you the doc map in ~300 words
2. Identify which page answers your question from the section descriptions
3. Fetch that specific page

**URL patterns:** `llms.txt` links to `.md` variants of each page (raw
markdown for agents). Every page is also available as rendered HTML — just
drop the trailing `/index.md` from the URL:

| Agent URL (raw markdown) | Human URL (rendered HTML) |
|--------------------------|---------------------------|
| `.../getting-started/installation/index.md` | `.../getting-started/installation/` |
| `.../how-to/use-einsum/index.md` | `.../how-to/use-einsum/` |
| `.../api/linalg/index.md` | `.../api/linalg/` |

If you have a large context window, fetch `llms-full.txt` instead to get
everything in one request.

### How to use ops.json

`ops.json` contains a JSON object with an `operations` array. Each entry has:

```json
{
  "name": "einsum",
  "module": "numpy",
  "mechestim_ref": "me.einsum",
  "numpy_ref": "np.einsum",
  "category": "counted_custom",
  "cost_formula": "product of all index dims * op_factor",
  "cost_formula_latex": "$\\text{op\\_factor} \\cdot \\prod_i d_i$",
  "free": false,
  "blocked": false,
  "status": "supported",
  "notes": "Supports symmetric tensors via input_symmetries for automatic cost reduction"
}
```

Use this to:

- Check if an operation is supported: filter by `"blocked": false`
- Get the cost formula for a specific operation: look up by `name`
- List all free operations: filter by `"free": true`
- Map between NumPy and mechestim calls: use `numpy_ref` and `mechestim_ref`

## Five rules for generating mechestim code

**1. A global default budget is active automatically — use `BudgetContext` for control.**

A global default budget auto-activates when mechestim is imported, so quick
scripts work without any setup. For precise budget control and namespacing,
use an explicit `BudgetContext`. Both forms are valid:

```python
# Quick work — global default handles budget tracking automatically
result = me.einsum('ij,jk->ik', A, B)

# Recommended for budget control and namespacing
with me.BudgetContext(flop_budget=10**8) as budget:
    result = me.einsum('ij,jk->ik', A, B)

# Decorator form for functions
@me.BudgetContext(flop_budget=10**8)
def my_forward_pass(x):
    return me.einsum('ij,j->i', W, x)
```

**2. Know what's free and what's counted.**

Free (0 FLOPs): `zeros`, `ones`, `array`, `reshape`, `transpose`,
`concatenate`, `linspace`, `where`, `copy`, `random.seed`, `random.get_state`,
`random.set_state`, `random.default_rng`.

Counted: `einsum`, `dot`, `matmul`, `exp`, `log`, `add`, `multiply`, `sum`,
`mean`, all `linalg.*`, all `fft.*`, `sort`, `argsort`, `trace`,
`unique`, set ops (`in1d`, `isin`, etc.), `histogram`, `random.*` sampling.

Blocked: `save`, `load`, `geterr`, `seterr`, and 28 others. These raise
`AttributeError`.

When in doubt, check `ops.json` or the [cheat sheet](./cheat-sheet.md).

**3. Use `me.flops.*` to estimate costs before running.**

```python
cost = me.flops.einsum_cost('ij,jk->ik', shapes=[(256, 256), (256, 256)])
cost = me.flops.svd_cost(m=256, n=256, k=10)
```

These are pure functions — no `BudgetContext` needed.

**4. Use `me.einsum` as the primary computation primitive.**

Most linear algebra can be expressed as einsum. The cost follows the
[opt_einsum](https://github.com/dgasmith/opt_einsum) convention (see
[our fork](../api/opt-einsum.md)): `product_of_all_index_dims * op_factor`,
where `op_factor` is 2 when there is an inner product (summed indices)
and 1 otherwise.
`'ij,jk->ik'` with shapes `(m, k)` and `(k, n)` costs `2 * m * k * n` FLOPs.

**5. Exploit symmetry for cost savings.**

- Use `symmetric_axes` for symmetric outputs:
  `me.einsum('ki,kj->ij', X, X, symmetric_axes=[(0, 1)])`
- Wrap known-symmetric matrices with `me.as_symmetric(data, symmetric_axes=(0, 1))`
  for automatic savings in downstream ops

## Common mistakes agents make

| Mistake | What happens | Fix |
|---------|-------------|-----|
| Using `np.einsum` instead of `me.einsum` | FLOPs not counted, budget not checked | Always use `me.*` for operations you want tracked |
| Skipping `BudgetContext` entirely | No error (global default handles it), but budget is harder to track and namespace | Use an explicit `BudgetContext` for any work you want to measure or label |
| Assuming `sort` is free | Underestimates budget usage | `sort` costs `n*ceil(log2(n))` per slice — check the cheat sheet |
| Using `me.save()` or `me.load()` | `AttributeError` — blocked | Use `numpy` directly for I/O |
| Nesting two explicit `BudgetContext` blocks | `RuntimeError` | Use a single explicit context; nesting with the global default is fine |

## 📎 Related pages

- [FLOP Cost Cheat Sheet](./cheat-sheet.md) — every operation's cost at a glance
- [Operation Audit](./operation-audit.md) — full 482-operation inventory
- [Exploit Symmetry](../how-to/exploit-symmetry.md) — detailed symmetry guide
- [Common Errors](../troubleshooting/common-errors.md) — error messages and fixes


## Operation Audit

<!-- Auto-generated by scripts/generate_api_docs.py. Do not edit manually. -->
# Operation Audit

Complete inventory of every NumPy operation and its mechestim status.
Generated from the operation registry (`_registry.py`).

## Summary

| Category | Count | Cost |
|----------|-------|------|
| free | 138 | 0 FLOPs |
| counted_unary | 73 | $\text{numel}(\text{output})$ |
| counted_binary | 45 | $\text{numel}(\text{output})$ |
| counted_reduction | 37 | $\text{numel}(\text{input})$ |
| counted_custom | 157 | Per-operation formula |
| blacklisted | 32 | Not available |
| **Total** | **482** | |

## All Operations

| Operation | mechestim | NumPy | Category | Cost | Status | Notes |
|-----------|-----------|-------|----------|------|--------|-------|
| `abs` | `me.abs` | `np.abs` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise absolute value; alias for absolute. |
| `absolute` | `me.absolute` | `np.absolute` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise absolute value. |
| `acos` | `me.acos` | `np.acos` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for arccos (NumPy 2.x). |
| `acosh` | `me.acosh` | `np.acosh` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for arccosh (NumPy 2.x). |
| `add` | `me.add` | `np.add` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise addition. |
| `all` | `me.all` | `np.all` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Test whether all array elements are true. |
| `allclose` | `me.allclose` | `np.allclose` | counted_custom | varies | 🟠 supported | Element-wise tolerance check; cost = numel(a). |
| `amax` | `me.amax` | `np.amax` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Maximum value of array (alias for max/numpy.amax). |
| `amin` | `me.amin` | `np.amin` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Minimum value of array (alias for min/numpy.amin). |
| `angle` | `me.angle` | `np.angle` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Return angle of complex argument element-wise. |
| `any` | `me.any` | `np.any` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Test whether any array element is true. |
| `append` | `me.append` | `np.append` | free | $0$ | 🟢 supported | Append values to end of array. |
| `apply_along_axis` | — | `np.apply_along_axis` | blacklisted | N/A | 🔴 blocked | Apply function along axis. Not supported. |
| `apply_over_axes` | — | `np.apply_over_axes` | blacklisted | N/A | 🔴 blocked | Apply function over multiple axes. Not supported. |
| `arange` | `me.arange` | `np.arange` | free | $0$ | 🟢 supported | Return evenly spaced values in given interval. |
| `arccos` | `me.arccos` | `np.arccos` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise inverse cosine. |
| `arccosh` | `me.arccosh` | `np.arccosh` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise inverse hyperbolic cosine. |
| `arcsin` | `me.arcsin` | `np.arcsin` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise inverse sine. |
| `arcsinh` | `me.arcsinh` | `np.arcsinh` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise inverse hyperbolic sine. |
| `arctan` | `me.arctan` | `np.arctan` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise inverse tangent. |
| `arctan2` | `me.arctan2` | `np.arctan2` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise arctan(y/x) considering quadrant. |
| `arctanh` | `me.arctanh` | `np.arctanh` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise inverse hyperbolic tangent. |
| `argmax` | `me.argmax` | `np.argmax` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Index of maximum value. |
| `argmin` | `me.argmin` | `np.argmin` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Index of minimum value. |
| `argpartition` | `me.argpartition` | `np.argpartition` | counted_custom | varies | 🟠 supported | Indirect partition; cost = n per slice. |
| `argsort` | `me.argsort` | `np.argsort` | counted_custom | varies | 🟠 supported | Indirect sort; cost = n*ceil(log2(n)) per slice. |
| `argwhere` | `me.argwhere` | `np.argwhere` | free | $0$ | 🟢 supported | Find indices of non-zero elements. |
| `around` | `me.around` | `np.around` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for round. |
| `array` | `me.array` | `np.array` | free | $0$ | 🟢 supported | Create array from data. |
| `array2string` | — | `np.array2string` | blacklisted | N/A | 🔴 blocked | Return string representation of array. Not supported. |
| `array_equal` | `me.array_equal` | `np.array_equal` | counted_custom | varies | 🟠 supported | Element-wise equality; cost = numel(a). |
| `array_equiv` | `me.array_equiv` | `np.array_equiv` | counted_custom | varies | 🟠 supported | Element-wise equivalence; cost = numel(a). |
| `array_repr` | — | `np.array_repr` | blacklisted | N/A | 🔴 blocked | Return string representation of array. Not supported. |
| `array_split` | `me.array_split` | `np.array_split` | free | $0$ | 🟢 supported | Split array into sub-arrays (possibly unequal). |
| `array_str` | — | `np.array_str` | blacklisted | N/A | 🔴 blocked | Return string representation of data in array. Not supported. |
| `asarray` | `me.asarray` | `np.asarray` | free | $0$ | 🟢 supported | Convert input to array. |
| `asarray_chkfinite` | `me.asarray_chkfinite` | `np.asarray_chkfinite` | free | $0$ | 🟢 supported | Convert to array, raising if NaN or inf. |
| `asin` | `me.asin` | `np.asin` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for arcsin (NumPy 2.x). |
| `asinh` | `me.asinh` | `np.asinh` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for arcsinh (NumPy 2.x). |
| `asmatrix` | — | `np.asmatrix` | blacklisted | N/A | 🔴 blocked | Interpret input as matrix (deprecated). Not supported. |
| `astype` | `me.astype` | `np.astype` | free | $0$ | 🟢 supported | Cast array to specified type. |
| `atan` | `me.atan` | `np.atan` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for arctan (NumPy 2.x). |
| `atan2` | `me.atan2` | `np.atan2` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for arctan2 (NumPy 2.x). |
| `atanh` | `me.atanh` | `np.atanh` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for arctanh (NumPy 2.x). |
| `atleast_1d` | `me.atleast_1d` | `np.atleast_1d` | free | $0$ | 🟢 supported | View inputs as arrays with at least one dimension. |
| `atleast_2d` | `me.atleast_2d` | `np.atleast_2d` | free | $0$ | 🟢 supported | View inputs as arrays with at least two dimensions. |
| `atleast_3d` | `me.atleast_3d` | `np.atleast_3d` | free | $0$ | 🟢 supported | View inputs as arrays with at least three dimensions. |
| `average` | `me.average` | `np.average` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Weighted average of array elements. |
| `bartlett` | `me.bartlett` | `np.bartlett` | counted_custom | $n$ | 🟠 supported | Bartlett window. Cost: n (one linear eval per sample). |
| `base_repr` | `me.base_repr` | `np.base_repr` | free | $0$ | 🟢 supported | Return string representation of number in given base. |
| `binary_repr` | `me.binary_repr` | `np.binary_repr` | free | $0$ | 🟢 supported | Return binary string representation of the input number. |
| `bincount` | `me.bincount` | `np.bincount` | counted_custom | varies | 🟠 supported | Integer counting; cost = numel(x). |
| `bitwise_and` | `me.bitwise_and` | `np.bitwise_and` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise bitwise AND. |
| `bitwise_count` | `me.bitwise_count` | `np.bitwise_count` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Count set bits element-wise (popcount). |
| `bitwise_invert` | `me.bitwise_invert` | `np.bitwise_invert` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise bitwise invert (alias for bitwise_not). |
| `bitwise_left_shift` | `me.bitwise_left_shift` | `np.bitwise_left_shift` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise left bit shift. |
| `bitwise_not` | `me.bitwise_not` | `np.bitwise_not` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise bitwise NOT. |
| `bitwise_or` | `me.bitwise_or` | `np.bitwise_or` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise bitwise OR. |
| `bitwise_right_shift` | `me.bitwise_right_shift` | `np.bitwise_right_shift` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise right bit shift. |
| `bitwise_xor` | `me.bitwise_xor` | `np.bitwise_xor` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise bitwise XOR. |
| `blackman` | `me.blackman` | `np.blackman` | counted_custom | $3n$ | 🟠 supported | Blackman window. Cost: 3*n (three cosine terms per sample). |
| `block` | `me.block` | `np.block` | free | $0$ | 🟢 supported | Assemble ndarray from nested list of blocks. |
| `bmat` | `me.bmat` | `np.bmat` | free | $0$ | 🟢 supported | Build matrix from nested list of matrices. |
| `broadcast_arrays` | `me.broadcast_arrays` | `np.broadcast_arrays` | free | $0$ | 🟢 supported | Broadcast arrays against each other. |
| `broadcast_shapes` | `me.broadcast_shapes` | `np.broadcast_shapes` | free | $0$ | 🟢 supported | Compute broadcast shape from input shapes. |
| `broadcast_to` | `me.broadcast_to` | `np.broadcast_to` | free | $0$ | 🟢 supported | Broadcast array to new shape. |
| `busday_count` | — | `np.busday_count` | blacklisted | N/A | 🔴 blocked | Count valid days between begindate and enddate. Not supported. |
| `busday_offset` | — | `np.busday_offset` | blacklisted | N/A | 🔴 blocked | Apply offset to dates subject to valid day rules. Not supported. |
| `can_cast` | `me.can_cast` | `np.can_cast` | free | $0$ | 🟢 supported | Returns True if cast is safe. |
| `cbrt` | `me.cbrt` | `np.cbrt` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise cube root. |
| `ceil` | `me.ceil` | `np.ceil` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise ceiling. |
| `choose` | `me.choose` | `np.choose` | free | $0$ | 🟢 supported | Construct array from index array and choices. |
| `clip` | `me.clip` | `np.clip` | counted_custom | $\text{numel}(\text{input})$ | 🟠 supported | Clip array to [a_min, a_max] element-wise. |
| `column_stack` | `me.column_stack` | `np.column_stack` | free | $0$ | 🟢 supported | Stack 1-D arrays as columns into 2-D array. |
| `common_type` | `me.common_type` | `np.common_type` | free | $0$ | 🟢 supported | Return scalar type common to all input arrays. |
| `compress` | `me.compress` | `np.compress` | free | $0$ | 🟢 supported | Return selected slices along axis. |
| `concat` | `me.concat` | `np.concat` | free | $0$ | 🟢 supported | Join arrays along axis (NumPy 2.x array API alias for concatenate). |
| `concatenate` | `me.concatenate` | `np.concatenate` | free | $0$ | 🟢 supported | Join arrays along axis. |
| `conj` | `me.conj` | `np.conj` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Complex conjugate element-wise. |
| `conjugate` | `me.conjugate` | `np.conjugate` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Complex conjugate element-wise. |
| `convolve` | `me.convolve` | `np.convolve` | counted_custom | $n \cdot m$ | 🟠 supported | 1-D discrete convolution. |
| `copy` | `me.copy` | `np.copy` | free | $0$ | 🟢 supported | Return array copy. |
| `copysign` | `me.copysign` | `np.copysign` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Copy sign of x2 to magnitude of x1 element-wise. |
| `copyto` | `me.copyto` | `np.copyto` | free | $0$ | 🟢 supported | Copy values from src to dst array. |
| `corrcoef` | `me.corrcoef` | `np.corrcoef` | counted_custom | $n^2 \cdot m$ | 🟠 supported | Pearson correlation coefficients. |
| `correlate` | `me.correlate` | `np.correlate` | counted_custom | $n \cdot m$ | 🟠 supported | 1-D cross-correlation. |
| `cos` | `me.cos` | `np.cos` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise cosine. |
| `cosh` | `me.cosh` | `np.cosh` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise hyperbolic cosine. |
| `count_nonzero` | `me.count_nonzero` | `np.count_nonzero` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Count non-zero elements. |
| `cov` | `me.cov` | `np.cov` | counted_custom | $n^2 \cdot m$ | 🟠 supported | Covariance matrix. |
| `cross` | `me.cross` | `np.cross` | counted_custom | $\text{numel}(\text{output})$ | 🟠 supported | Cross product of two 3-D vectors. |
| `cumprod` | `me.cumprod` | `np.cumprod` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Cumulative product of array elements. |
| `cumsum` | `me.cumsum` | `np.cumsum` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Cumulative sum of array elements. |
| `cumulative_prod` | `me.cumulative_prod` | `np.cumulative_prod` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Cumulative product (NumPy 2.x array API). |
| `cumulative_sum` | `me.cumulative_sum` | `np.cumulative_sum` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Cumulative sum (NumPy 2.x array API). |
| `datetime_as_string` | — | `np.datetime_as_string` | blacklisted | N/A | 🔴 blocked | Convert datetime array to string representation. Not supported. |
| `datetime_data` | — | `np.datetime_data` | blacklisted | N/A | 🔴 blocked | Get information about the step size of datetime dtype. Not supported. |
| `deg2rad` | `me.deg2rad` | `np.deg2rad` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for radians. |
| `degrees` | `me.degrees` | `np.degrees` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Convert radians to degrees element-wise. |
| `delete` | `me.delete` | `np.delete` | free | $0$ | 🟢 supported | Return array with sub-arrays deleted along axis. |
| `diag` | `me.diag` | `np.diag` | free | $0$ | 🟢 supported | Extract diagonal or construct diagonal array. |
| `diag_indices` | `me.diag_indices` | `np.diag_indices` | free | $0$ | 🟢 supported | Return indices to access main diagonal of n-D array. |
| `diag_indices_from` | `me.diag_indices_from` | `np.diag_indices_from` | free | $0$ | 🟢 supported | Return indices to access main diagonal of given array. |
| `diagflat` | `me.diagflat` | `np.diagflat` | free | $0$ | 🟢 supported | Create diagonal array from flattened input. |
| `diagonal` | `me.diagonal` | `np.diagonal` | free | $0$ | 🟢 supported | Return specified diagonals. |
| `diff` | `me.diff` | `np.diff` | counted_custom | $\text{numel}(\text{input})$ | 🟠 supported | n-th discrete difference along axis. |
| `digitize` | `me.digitize` | `np.digitize` | counted_custom | varies | 🟠 supported | Bin search; cost = n*ceil(log2(bins)). |
| `divide` | `me.divide` | `np.divide` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise true division. |
| `divmod` | `me.divmod` | `np.divmod` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise (quotient, remainder) tuple. |
| `dot` | `me.dot` | `np.dot` | counted_custom | $2 \cdot m \cdot k \cdot n$ | 🟠 supported | Dot product; cost = 2*M*N*K for matrix multiply. |
| `dsplit` | `me.dsplit` | `np.dsplit` | free | $0$ | 🟢 supported | Split array into multiple sub-arrays depth-wise. |
| `dstack` | `me.dstack` | `np.dstack` | free | $0$ | 🟢 supported | Stack arrays depth-wise (along third axis). |
| `ediff1d` | `me.ediff1d` | `np.ediff1d` | counted_custom | $\text{numel}(\text{input})$ | 🟠 supported | Differences between consecutive elements. |
| `einsum` | `me.einsum` | `np.einsum` | counted_custom | $\text{op\_factor} \cdot \prod_i d_i$ | 🟠 supported | Generalized Einstein summation. |
| `einsum_path` | `me.einsum_path` | `np.einsum_path` | counted_custom | $0$ | 🟠 supported | Optimize einsum contraction path (no numeric output). |
| `empty` | `me.empty` | `np.empty` | free | $0$ | 🟢 supported | Uninitialized array allocation. |
| `empty_like` | `me.empty_like` | `np.empty_like` | free | $0$ | 🟢 supported | Uninitialized array with same shape/type as input. |
| `equal` | `me.equal` | `np.equal` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise x1 == x2. |
| `exp` | `me.exp` | `np.exp` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise e^x. |
| `exp2` | `me.exp2` | `np.exp2` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise 2^x. |
| `expand_dims` | `me.expand_dims` | `np.expand_dims` | free | $0$ | 🟢 supported | Insert new size-1 axis. |
| `expm1` | `me.expm1` | `np.expm1` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise e^x - 1 (accurate near zero). |
| `extract` | `me.extract` | `np.extract` | free | $0$ | 🟢 supported | Return elements satisfying condition. |
| `eye` | `me.eye` | `np.eye` | free | $0$ | 🟢 supported | Create identity matrix. |
| `fabs` | `me.fabs` | `np.fabs` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise absolute value (always float). |
| `fft.fft` | `me.fft.fft` | `np.fft.fft` | counted_custom | $5n \cdot \lceil\log_2 n\rceil$ | 🟠 supported | 1-D complex FFT. Cost: 5*n*ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.fft2` | `me.fft.fft2` | `np.fft.fft2` | counted_custom | $5N \cdot \lceil\log_2 N\rceil$ | 🟠 supported | 2-D complex FFT. Cost: 5*N*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.fftfreq` | `me.fft.fftfreq` | `np.fft.fftfreq` | free | $0$ | 🟢 supported | FFT sample frequencies. No arithmetic; returns index array. |
| `fft.fftn` | `me.fft.fftn` | `np.fft.fftn` | counted_custom | $5N \cdot \lceil\log_2 N\rceil$ | 🟠 supported | N-D complex FFT. Cost: 5*N*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.fftshift` | `me.fft.fftshift` | `np.fft.fftshift` | free | $0$ | 🟢 supported | Shift zero-frequency component to center. No arithmetic; index reordering only. |
| `fft.hfft` | `me.fft.hfft` | `np.fft.hfft` | counted_custom | $5n \cdot \lceil\log_2 n\rceil$ | 🟠 supported | FFT of Hermitian-symmetric signal. Cost: 5*n_out*ceil(log2(n_out)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.ifft` | `me.fft.ifft` | `np.fft.ifft` | counted_custom | $5n \cdot \lceil\log_2 n\rceil$ | 🟠 supported | Inverse 1-D complex FFT. Cost: 5*n*ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.ifft2` | `me.fft.ifft2` | `np.fft.ifft2` | counted_custom | $5N \cdot \lceil\log_2 N\rceil$ | 🟠 supported | Inverse 2-D complex FFT. Cost: 5*N*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.ifftn` | `me.fft.ifftn` | `np.fft.ifftn` | counted_custom | $5N \cdot \lceil\log_2 N\rceil$ | 🟠 supported | Inverse N-D complex FFT. Cost: 5*N*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.ifftshift` | `me.fft.ifftshift` | `np.fft.ifftshift` | free | $0$ | 🟢 supported | Inverse of fftshift. No arithmetic; index reordering only. |
| `fft.ihfft` | `me.fft.ihfft` | `np.fft.ihfft` | counted_custom | $5n \cdot \lceil\log_2 n\rceil$ | 🟠 supported | Inverse FFT of Hermitian signal. Cost: 5*n*ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.irfft` | `me.fft.irfft` | `np.fft.irfft` | counted_custom | $5(n/2) \cdot \lceil\log_2 n\rceil$ | 🟠 supported | Inverse 1-D real FFT. Cost: 5*(n//2)*ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.irfft2` | `me.fft.irfft2` | `np.fft.irfft2` | counted_custom | $5(N/2) \cdot \lceil\log_2 N\rceil$ | 🟠 supported | Inverse 2-D real FFT. Cost: 5*(N//2)*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.irfftn` | `me.fft.irfftn` | `np.fft.irfftn` | counted_custom | $5(N/2) \cdot \lceil\log_2 N\rceil$ | 🟠 supported | Inverse N-D real FFT. Cost: 5*(N//2)*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.rfft` | `me.fft.rfft` | `np.fft.rfft` | counted_custom | $5(n/2) \cdot \lceil\log_2 n\rceil$ | 🟠 supported | 1-D real FFT. Cost: 5*(n//2)*ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.rfft2` | `me.fft.rfft2` | `np.fft.rfft2` | counted_custom | $5(N/2) \cdot \lceil\log_2 N\rceil$ | 🟠 supported | 2-D real FFT. Cost: 5*(N//2)*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.rfftfreq` | `me.fft.rfftfreq` | `np.fft.rfftfreq` | free | $0$ | 🟢 supported | Real FFT sample frequencies. No arithmetic; returns index array. |
| `fft.rfftn` | `me.fft.rfftn` | `np.fft.rfftn` | counted_custom | $5(N/2) \cdot \lceil\log_2 N\rceil$ | 🟠 supported | N-D real FFT. Cost: 5*(N//2)*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fill_diagonal` | `me.fill_diagonal` | `np.fill_diagonal` | free | $0$ | 🟢 supported | Fill main diagonal of given array. |
| `fix` | `me.fix` | `np.fix` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Round toward zero element-wise (alias for trunc). |
| `flatnonzero` | `me.flatnonzero` | `np.flatnonzero` | free | $0$ | 🟢 supported | Return indices of non-zero elements in flattened array. |
| `flip` | `me.flip` | `np.flip` | free | $0$ | 🟢 supported | Reverse order of elements along axis. |
| `fliplr` | `me.fliplr` | `np.fliplr` | free | $0$ | 🟢 supported | Flip array left-right. |
| `flipud` | `me.flipud` | `np.flipud` | free | $0$ | 🟢 supported | Flip array up-down. |
| `float_power` | `me.float_power` | `np.float_power` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise exponentiation in float64. |
| `floor` | `me.floor` | `np.floor` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise floor. |
| `floor_divide` | `me.floor_divide` | `np.floor_divide` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise floor division. |
| `fmax` | `me.fmax` | `np.fmax` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise maximum ignoring NaN. |
| `fmin` | `me.fmin` | `np.fmin` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise minimum ignoring NaN. |
| `fmod` | `me.fmod` | `np.fmod` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise C-style fmod (remainder toward zero). |
| `format_float_positional` | — | `np.format_float_positional` | blacklisted | N/A | 🔴 blocked | Format floating point scalar as decimal string. Not supported. |
| `format_float_scientific` | — | `np.format_float_scientific` | blacklisted | N/A | 🔴 blocked | Format floating point scalar as scientific notation. Not supported. |
| `frexp` | `me.frexp` | `np.frexp` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Decompose x into mantissa and exponent element-wise. |
| `from_dlpack` | `me.from_dlpack` | `np.from_dlpack` | free | $0$ | 🟢 supported | Create ndarray from DLPack object (zero-copy). |
| `frombuffer` | `me.frombuffer` | `np.frombuffer` | free | $0$ | 🟢 supported | Interpret buffer as 1-D array. |
| `fromfile` | `me.fromfile` | `np.fromfile` | free | $0$ | 🟢 supported | Construct array from binary/text file. |
| `fromfunction` | `me.fromfunction` | `np.fromfunction` | free | $0$ | 🟢 supported | Construct array by executing function over each coordinate. |
| `fromiter` | `me.fromiter` | `np.fromiter` | free | $0$ | 🟢 supported | Create array from an iterable. |
| `frompyfunc` | — | `np.frompyfunc` | blacklisted | N/A | 🔴 blocked | Take arbitrary Python function and return NumPy ufunc. Not supported. |
| `fromregex` | `me.fromregex` | `np.fromregex` | free | $0$ | 🟢 supported | Construct array from text file using regex. |
| `fromstring` | `me.fromstring` | `np.fromstring` | free | $0$ | 🟢 supported | Create 1-D array from string data. |
| `full` | `me.full` | `np.full` | free | $0$ | 🟢 supported | Create array filled with scalar value. |
| `full_like` | `me.full_like` | `np.full_like` | free | $0$ | 🟢 supported | Array filled with scalar, same shape/type as input. |
| `gcd` | `me.gcd` | `np.gcd` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise greatest common divisor. |
| `genfromtxt` | — | `np.genfromtxt` | blacklisted | N/A | 🔴 blocked | Load data from text file with missing values. Not supported. |
| `geomspace` | `me.geomspace` | `np.geomspace` | counted_custom | varies | 🟠 supported | Geometric-spaced generation; cost = num. |
| `get_include` | — | `np.get_include` | blacklisted | N/A | 🔴 blocked | Return directory containing NumPy C header files. Not supported. |
| `getbufsize` | — | `np.getbufsize` | blacklisted | N/A | 🔴 blocked | Return size of buffer used in ufuncs. Not supported. |
| `geterr` | — | `np.geterr` | blacklisted | N/A | 🔴 blocked | Get current way of handling floating-point errors. Not supported. |
| `geterrcall` | — | `np.geterrcall` | blacklisted | N/A | 🔴 blocked | Return current callback function for floating-point errors. Not supported. |
| `gradient` | `me.gradient` | `np.gradient` | counted_custom | $\text{numel}(\text{input})$ | 🟠 supported | Gradient using central differences. |
| `greater` | `me.greater` | `np.greater` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise x1 > x2. |
| `greater_equal` | `me.greater_equal` | `np.greater_equal` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise x1 >= x2. |
| `hamming` | `me.hamming` | `np.hamming` | counted_custom | $n$ | 🟠 supported | Hamming window. Cost: n (one cosine per sample). |
| `hanning` | `me.hanning` | `np.hanning` | counted_custom | $n$ | 🟠 supported | Hanning window. Cost: n (one cosine per sample). |
| `heaviside` | `me.heaviside` | `np.heaviside` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Heaviside step function element-wise. |
| `histogram` | `me.histogram` | `np.histogram` | counted_custom | varies | 🟠 supported | Binning; cost = n*ceil(log2(bins)). |
| `histogram2d` | `me.histogram2d` | `np.histogram2d` | counted_custom | varies | 🟠 supported | 2D binning; cost = n*(ceil(log2(bx))+ceil(log2(by))). |
| `histogram_bin_edges` | `me.histogram_bin_edges` | `np.histogram_bin_edges` | counted_custom | varies | 🟠 supported | Bin edge computation; cost = numel(a). |
| `histogramdd` | `me.histogramdd` | `np.histogramdd` | counted_custom | varies | 🟠 supported | ND binning; cost = n*sum(ceil(log2(b_i))). |
| `hsplit` | `me.hsplit` | `np.hsplit` | free | $0$ | 🟢 supported | Split array into columns. |
| `hstack` | `me.hstack` | `np.hstack` | free | $0$ | 🟢 supported | Stack arrays horizontally. |
| `hypot` | `me.hypot` | `np.hypot` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise Euclidean norm sqrt(x1^2 + x2^2). |
| `i0` | `me.i0` | `np.i0` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Modified Bessel function of order 0, element-wise. |
| `identity` | `me.identity` | `np.identity` | free | $0$ | 🟢 supported | Create square identity matrix. |
| `imag` | `me.imag` | `np.imag` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Return imaginary part of complex array. |
| `in1d` | `me.in1d` | `np.in1d` | counted_custom | varies | 🟠 supported | Set membership; cost = (n+m)*ceil(log2(n+m)). |
| `indices` | `me.indices` | `np.indices` | free | $0$ | 🟢 supported | Return array representing indices of a grid. |
| `inner` | `me.inner` | `np.inner` | counted_custom | $n$ | 🟠 supported | Inner product; cost = 2*N for 1-D, 2*N*M for n-D. |
| `insert` | `me.insert` | `np.insert` | free | $0$ | 🟢 supported | Insert values along axis before given indices. |
| `interp` | `me.interp` | `np.interp` | counted_custom | $n \cdot \log m$ | 🟠 supported | 1-D linear interpolation. |
| `intersect1d` | `me.intersect1d` | `np.intersect1d` | counted_custom | varies | 🟠 supported | Set intersection; cost = (n+m)*ceil(log2(n+m)). |
| `invert` | `me.invert` | `np.invert` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Bitwise NOT element-wise. |
| `is_busday` | — | `np.is_busday` | blacklisted | N/A | 🔴 blocked | Calculates which of given dates are valid days. Not supported. |
| `isclose` | `me.isclose` | `np.isclose` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise approximate equality test. |
| `iscomplex` | `me.iscomplex` | `np.iscomplex` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Test if element is complex element-wise. |
| `iscomplexobj` | `me.iscomplexobj` | `np.iscomplexobj` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Return True if input is a complex type or array. |
| `isdtype` | `me.isdtype` | `np.isdtype` | free | $0$ | 🟢 supported | Return True if array or dtype is of specified kind (NumPy 2.x). |
| `isfinite` | `me.isfinite` | `np.isfinite` | free | $0$ | 🟢 supported | Test for finite values element-wise. |
| `isfortran` | `me.isfortran` | `np.isfortran` | free | $0$ | 🟢 supported | Return True if array is Fortran contiguous. |
| `isin` | `me.isin` | `np.isin` | counted_custom | varies | 🟠 supported | Set membership; cost = (n+m)*ceil(log2(n+m)). |
| `isinf` | `me.isinf` | `np.isinf` | free | $0$ | 🟢 supported | Test for infinity element-wise. |
| `isnan` | `me.isnan` | `np.isnan` | free | $0$ | 🟢 supported | Test for NaN element-wise. |
| `isnat` | `me.isnat` | `np.isnat` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Test for NaT (not-a-time) element-wise. |
| `isneginf` | `me.isneginf` | `np.isneginf` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Test for negative infinity element-wise. |
| `isposinf` | `me.isposinf` | `np.isposinf` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Test for positive infinity element-wise. |
| `isreal` | `me.isreal` | `np.isreal` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Test if element is real (imag == 0) element-wise. |
| `isrealobj` | `me.isrealobj` | `np.isrealobj` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Return True if x is a not complex type or array. |
| `isscalar` | `me.isscalar` | `np.isscalar` | free | $0$ | 🟢 supported | Return True if input is a scalar. |
| `issubdtype` | `me.issubdtype` | `np.issubdtype` | free | $0$ | 🟢 supported | Return True if first argument is lower in type hierarchy. |
| `iterable` | `me.iterable` | `np.iterable` | free | $0$ | 🟢 supported | Return True if object is iterable. |
| `ix_` | `me.ix_` | `np.ix_` | free | $0$ | 🟢 supported | Construct open mesh from multiple sequences. |
| `kaiser` | `me.kaiser` | `np.kaiser` | counted_custom | $3n$ | 🟠 supported | Kaiser window. Cost: 3*n (Bessel function eval per sample). |
| `kron` | `me.kron` | `np.kron` | counted_custom | $m_1 m_2 \cdot n_1 n_2$ | 🟠 supported | Kronecker product; cost proportional to output size. |
| `lcm` | `me.lcm` | `np.lcm` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise least common multiple. |
| `ldexp` | `me.ldexp` | `np.ldexp` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Return x1 * 2**x2 element-wise. |
| `left_shift` | `me.left_shift` | `np.left_shift` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise left bit shift (legacy name). |
| `less` | `me.less` | `np.less` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise x1 < x2. |
| `less_equal` | `me.less_equal` | `np.less_equal` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise x1 <= x2. |
| `lexsort` | `me.lexsort` | `np.lexsort` | counted_custom | varies | 🟠 supported | Multi-key sort; cost = k*n*ceil(log2(n)). |
| `linalg.cholesky` | `me.linalg.cholesky` | `np.linalg.cholesky` | counted_custom | $n^3 / 3$ | 🟠 supported | Cholesky decomposition. Cost: $n^3/3$ (Golub & Van Loan §4.2). |
| `linalg.cond` | `me.linalg.cond` | `np.linalg.cond` | counted_custom | $m \cdot n \cdot \min(m,n)$ | 🟠 supported | Condition number. Cost: m*n*min(m,n) (via SVD). |
| `linalg.cross` | `me.linalg.cross` | `np.linalg.cross` | counted_custom | delegates to `cross` | 🟠 supported | Delegates to `me.cross` which charges `numel(output)` FLOPs. |
| `linalg.det` | `me.linalg.det` | `np.linalg.det` | counted_custom | $n^3$ | 🟠 supported | Determinant. Cost: $n^3$ (LU factorization). |
| `linalg.diagonal` | `me.linalg.diagonal` | `np.linalg.diagonal` | free | $0$ | 🟢 supported | View of diagonal — delegates to mechestim.diagonal. Cost: 0 FLOPs. |
| `linalg.eig` | `me.linalg.eig` | `np.linalg.eig` | counted_custom | $10n^3$ | 🟠 supported | Eigendecomposition. Cost: $10n^3$ (Francis QR, Golub & Van Loan §7.5). |
| `linalg.eigh` | `me.linalg.eigh` | `np.linalg.eigh` | counted_custom | $4n^3 / 3$ | 🟠 supported | Symmetric eigendecomposition. Cost: $(4/3)n^3$ (Golub & Van Loan §8.3). |
| `linalg.eigvals` | `me.linalg.eigvals` | `np.linalg.eigvals` | counted_custom | $10n^3$ | 🟠 supported | Eigenvalues only. Cost: $10n^3$ (same as eig). |
| `linalg.eigvalsh` | `me.linalg.eigvalsh` | `np.linalg.eigvalsh` | counted_custom | $4n^3 / 3$ | 🟠 supported | Symmetric eigenvalues. Cost: $(4/3)n^3$ (same as eigh). |
| `linalg.inv` | `me.linalg.inv` | `np.linalg.inv` | counted_custom | $n^3$ | 🟠 supported | Matrix inverse. Cost: $n^3$ (LU + solve). |
| `linalg.lstsq` | `me.linalg.lstsq` | `np.linalg.lstsq` | counted_custom | $m \cdot n \cdot \min(m,n)$ | 🟠 supported | Least squares. Cost: m*n*min(m,n) (LAPACK gelsd/SVD). |
| `linalg.matmul` | `me.linalg.matmul` | `np.linalg.matmul` | counted_custom | delegates to `matmul` | 🟠 supported | Delegates to `me.matmul` which charges `2*m*k*n` FLOPs. |
| `linalg.matrix_norm` | `me.linalg.matrix_norm` | `np.linalg.matrix_norm` | counted_custom | varies | 🟠 supported | Matrix norm. Cost depends on ord: 2*numel for Frobenius, m*n*min(m,n) for ord=2. |
| `linalg.matrix_power` | `me.linalg.matrix_power` | `np.linalg.matrix_power` | counted_custom | $\lfloor\log_2 k\rfloor \cdot n^3$ | 🟠 supported | Matrix power. Cost: $(\lfloor\log_2 k\rfloor + \text{popcount}(k) - 1) \cdot n^3$ (exponentiation by squaring). |
| `linalg.matrix_rank` | `me.linalg.matrix_rank` | `np.linalg.matrix_rank` | counted_custom | $m \cdot n \cdot \min(m,n)$ | 🟠 supported | Matrix rank. Cost: m*n*min(m,n) (via SVD). |
| `linalg.matrix_transpose` | `me.linalg.matrix_transpose` | `np.linalg.matrix_transpose` | free | $0$ | 🟢 supported | Transpose view — delegates to mechestim.matrix_transpose. Cost: 0 FLOPs. |
| `linalg.multi_dot` | `me.linalg.multi_dot` | `np.linalg.multi_dot` | counted_custom | optimal chain | 🟠 supported | Chain matmul. Cost: sum of optimal chain matmul costs (CLRS §15.2). |
| `linalg.norm` | `me.linalg.norm` | `np.linalg.norm` | counted_custom | varies | 🟠 supported | Norm. Cost depends on ord: numel for L1/inf, 2*numel for Frobenius, m*n*min(m,n) for ord=2. |
| `linalg.outer` | `me.linalg.outer` | `np.linalg.outer` | counted_custom | delegates to `outer` | 🟠 supported | Delegates to `me.outer` which charges `m*n` FLOPs. |
| `linalg.pinv` | `me.linalg.pinv` | `np.linalg.pinv` | counted_custom | $m \cdot n \cdot \min(m,n)$ | 🟠 supported | Pseudoinverse. Cost: m*n*min(m,n) (via SVD). |
| `linalg.qr` | `me.linalg.qr` | `np.linalg.qr` | counted_custom | $2mn^2 - 2n^3/3$ | 🟠 supported | QR decomposition. Cost: $2mn^2 - (2/3)n^3$ (Golub & Van Loan §5.2). |
| `linalg.slogdet` | `me.linalg.slogdet` | `np.linalg.slogdet` | counted_custom | $n^3$ | 🟠 supported | Sign + log determinant. Cost: $n^3$ (LU factorization). |
| `linalg.solve` | `me.linalg.solve` | `np.linalg.solve` | counted_custom | $2n^3/3 + n^2 \cdot n_{\text{rhs}}$ | 🟠 supported | Solve Ax=b. Cost: $2n^3/3$ (LU) + $n^2 \cdot n_{\text{rhs}}$ (back-substitution). |
| `linalg.svd` | `me.linalg.svd` | `np.linalg.svd` | counted_custom | $m \cdot n \cdot k$ | 🟠 supported | Singular value decomposition; cost ~ O(min(m,n)*m*n). |
| `linalg.svdvals` | `me.linalg.svdvals` | `np.linalg.svdvals` | counted_custom | $m \cdot n \cdot \min(m,n)$ | 🟠 supported | Singular values only. Cost: m*n*min(m,n) (Golub-Reinsch). |
| `linalg.tensordot` | `me.linalg.tensordot` | `np.linalg.tensordot` | counted_custom | delegates to `tensordot` | 🟠 supported | Delegates to `me.tensordot` which charges FLOPs based on contraction. |
| `linalg.tensorinv` | `me.linalg.tensorinv` | `np.linalg.tensorinv` | counted_custom | $n^3$ | 🟠 supported | Tensor inverse. Cost: $n^3$ after reshape (delegates to inv). |
| `linalg.tensorsolve` | `me.linalg.tensorsolve` | `np.linalg.tensorsolve` | counted_custom | $n^3$ | 🟠 supported | Tensor solve. Cost: $n^3$ after reshape (delegates to solve). |
| `linalg.trace` | `me.linalg.trace` | `np.linalg.trace` | counted_custom | $n$ | 🟠 supported | Matrix trace. Cost: n (sum of diagonal elements). |
| `linalg.vecdot` | `me.linalg.vecdot` | `np.linalg.vecdot` | counted_custom | delegates to `vecdot` | 🟠 supported | Delegates to `me.vecdot` which charges `2*n` FLOPs. |
| `linalg.vector_norm` | `me.linalg.vector_norm` | `np.linalg.vector_norm` | counted_custom | $n$ or $2n$ | 🟠 supported | Vector norm. Cost: numel (or 2*numel for general p-norm). |
| `linspace` | `me.linspace` | `np.linspace` | free | $0$ | 🟢 supported | Return evenly spaced numbers over interval. |
| `load` | — | `np.load` | blacklisted | N/A | 🔴 blocked | Load arrays from .npy/.npz files. Not supported. |
| `loadtxt` | — | `np.loadtxt` | blacklisted | N/A | 🔴 blocked | Load data from text file. Not supported. |
| `log` | `me.log` | `np.log` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise natural logarithm. |
| `log10` | `me.log10` | `np.log10` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise base-10 logarithm. |
| `log1p` | `me.log1p` | `np.log1p` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise log(1+x) (accurate near zero). |
| `log2` | `me.log2` | `np.log2` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise base-2 logarithm. |
| `logaddexp` | `me.logaddexp` | `np.logaddexp` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | log(exp(x1) + exp(x2)) element-wise. |
| `logaddexp2` | `me.logaddexp2` | `np.logaddexp2` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | log2(2**x1 + 2**x2) element-wise. |
| `logical_and` | `me.logical_and` | `np.logical_and` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise logical AND. |
| `logical_not` | `me.logical_not` | `np.logical_not` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise logical NOT. |
| `logical_or` | `me.logical_or` | `np.logical_or` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise logical OR. |
| `logical_xor` | `me.logical_xor` | `np.logical_xor` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise logical XOR. |
| `logspace` | `me.logspace` | `np.logspace` | counted_custom | varies | 🟠 supported | Log-spaced generation; cost = num. |
| `mask_indices` | `me.mask_indices` | `np.mask_indices` | free | $0$ | 🟢 supported | Return indices of mask for n x n array. |
| `matmul` | `me.matmul` | `np.matmul` | counted_custom | $2 \cdot m \cdot k \cdot n$ | 🟠 supported | Matrix multiplication; cost = 2*M*N*K. |
| `matrix_transpose` | `me.matrix_transpose` | `np.matrix_transpose` | free | $0$ | 🟢 supported | Transpose last two dimensions (NumPy 2.x array API). |
| `max` | `me.max` | `np.max` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Maximum value of array. |
| `maximum` | `me.maximum` | `np.maximum` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise maximum (propagates NaN). |
| `may_share_memory` | `me.may_share_memory` | `np.may_share_memory` | free | $0$ | 🟢 supported | Determine if two arrays might share memory. |
| `mean` | `me.mean` | `np.mean` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Arithmetic mean of array elements. |
| `median` | `me.median` | `np.median` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Median of array elements (sorts internally). |
| `meshgrid` | `me.meshgrid` | `np.meshgrid` | free | $0$ | 🟢 supported | Coordinate matrices from coordinate vectors. |
| `min` | `me.min` | `np.min` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Minimum value of array. |
| `min_scalar_type` | `me.min_scalar_type` | `np.min_scalar_type` | free | $0$ | 🟢 supported | Return the minimum scalar type for a value. |
| `minimum` | `me.minimum` | `np.minimum` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise minimum (propagates NaN). |
| `mintypecode` | `me.mintypecode` | `np.mintypecode` | free | $0$ | 🟢 supported | Return minimum data type character that can satisfy all given types. |
| `mod` | `me.mod` | `np.mod` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise modulo. |
| `modf` | `me.modf` | `np.modf` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Return fractional and integral parts element-wise. |
| `moveaxis` | `me.moveaxis` | `np.moveaxis` | free | $0$ | 🟢 supported | Move axes to new positions. |
| `multiply` | `me.multiply` | `np.multiply` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise multiplication. |
| `nan_to_num` | `me.nan_to_num` | `np.nan_to_num` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Replace NaN/inf with finite numbers element-wise. |
| `nanargmax` | `me.nanargmax` | `np.nanargmax` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Index of maximum ignoring NaNs. |
| `nanargmin` | `me.nanargmin` | `np.nanargmin` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Index of minimum ignoring NaNs. |
| `nancumprod` | `me.nancumprod` | `np.nancumprod` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Cumulative product ignoring NaNs. |
| `nancumsum` | `me.nancumsum` | `np.nancumsum` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Cumulative sum ignoring NaNs. |
| `nanmax` | `me.nanmax` | `np.nanmax` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Maximum ignoring NaNs. |
| `nanmean` | `me.nanmean` | `np.nanmean` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Mean ignoring NaNs. |
| `nanmedian` | `me.nanmedian` | `np.nanmedian` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Median ignoring NaNs. |
| `nanmin` | `me.nanmin` | `np.nanmin` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Minimum ignoring NaNs. |
| `nanpercentile` | `me.nanpercentile` | `np.nanpercentile` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | q-th percentile ignoring NaNs. |
| `nanprod` | `me.nanprod` | `np.nanprod` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Product ignoring NaNs. |
| `nanquantile` | `me.nanquantile` | `np.nanquantile` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | q-th quantile ignoring NaNs. |
| `nanstd` | `me.nanstd` | `np.nanstd` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Standard deviation ignoring NaNs. |
| `nansum` | `me.nansum` | `np.nansum` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Sum ignoring NaNs. |
| `nanvar` | `me.nanvar` | `np.nanvar` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Variance ignoring NaNs. |
| `ndim` | `me.ndim` | `np.ndim` | free | $0$ | 🟢 supported | Return number of dimensions of array. |
| `negative` | `me.negative` | `np.negative` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise negation. |
| `nested_iters` | — | `np.nested_iters` | blacklisted | N/A | 🔴 blocked | Create nested iterators for multi-index broadcasting. Not supported. |
| `nextafter` | `me.nextafter` | `np.nextafter` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Return next float after x1 toward x2 element-wise. |
| `nonzero` | `me.nonzero` | `np.nonzero` | free | $0$ | 🟢 supported | Return indices of non-zero elements. |
| `not_equal` | `me.not_equal` | `np.not_equal` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise x1 != x2. |
| `ones` | `me.ones` | `np.ones` | free | $0$ | 🟢 supported | Create one-filled array. |
| `ones_like` | `me.ones_like` | `np.ones_like` | free | $0$ | 🟢 supported | Array of ones with same shape/type as input. |
| `outer` | `me.outer` | `np.outer` | counted_custom | $m \cdot n$ | 🟠 supported | Outer product of two vectors; cost = M*N. |
| `packbits` | `me.packbits` | `np.packbits` | free | $0$ | 🟢 supported | Pack elements of array into bits. |
| `pad` | `me.pad` | `np.pad` | free | $0$ | 🟢 supported | Pad array. |
| `partition` | `me.partition` | `np.partition` | counted_custom | varies | 🟠 supported | Quickselect; cost = n per slice. |
| `percentile` | `me.percentile` | `np.percentile` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | q-th percentile of array elements. |
| `permute_dims` | `me.permute_dims` | `np.permute_dims` | free | $0$ | 🟢 supported | Permute dimensions (NumPy 2.x array API). |
| `piecewise` | — | `np.piecewise` | blacklisted | N/A | 🔴 blocked | Evaluate piecewise-defined function. Not supported. |
| `place` | `me.place` | `np.place` | free | $0$ | 🟢 supported | Change elements satisfying condition. |
| `poly` | `me.poly` | `np.poly` | counted_custom | $n^2$ | 🟠 supported | Polynomial from roots. Cost: $n^2$ FLOPs. |
| `polyadd` | `me.polyadd` | `np.polyadd` | counted_custom | $\max(n_1, n_2)$ | 🟠 supported | Add two polynomials. Cost: max(n1, n2) FLOPs. |
| `polyder` | `me.polyder` | `np.polyder` | counted_custom | $n$ | 🟠 supported | Differentiate polynomial. Cost: n FLOPs. |
| `polydiv` | `me.polydiv` | `np.polydiv` | counted_custom | $n_1 \cdot n_2$ | 🟠 supported | Divide one polynomial by another. Cost: n1 * n2 FLOPs. |
| `polyfit` | `me.polyfit` | `np.polyfit` | counted_custom | $2m \cdot (\text{deg}+1)^2$ | 🟠 supported | Least squares polynomial fit. Cost: 2 * m * (deg+1)^2 FLOPs. |
| `polyint` | `me.polyint` | `np.polyint` | counted_custom | $n$ | 🟠 supported | Integrate polynomial. Cost: n FLOPs. |
| `polymul` | `me.polymul` | `np.polymul` | counted_custom | $n_1 \cdot n_2$ | 🟠 supported | Multiply polynomials. Cost: n1 * n2 FLOPs. |
| `polysub` | `me.polysub` | `np.polysub` | counted_custom | $\max(n_1, n_2)$ | 🟠 supported | Difference (subtraction) of two polynomials. Cost: max(n1, n2) FLOPs. |
| `polyval` | `me.polyval` | `np.polyval` | counted_custom | $2 \cdot m \cdot \text{deg}$ | 🟠 supported | Evaluate polynomial at given points. Cost: 2 * m * deg FLOPs (Horner's method). |
| `positive` | `me.positive` | `np.positive` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise unary plus (copy with sign preserved). |
| `pow` | `me.pow` | `np.pow` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for power (NumPy 2.x). |
| `power` | `me.power` | `np.power` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise exponentiation x**y. |
| `prod` | `me.prod` | `np.prod` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Product of array elements. |
| `promote_types` | `me.promote_types` | `np.promote_types` | free | $0$ | 🟢 supported | Return smallest type to which both types may be safely cast. |
| `ptp` | `me.ptp` | `np.ptp` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Peak-to-peak (max - min) range of array. |
| `put` | `me.put` | `np.put` | free | $0$ | 🟢 supported | Replace elements at given flat indices. |
| `put_along_axis` | `me.put_along_axis` | `np.put_along_axis` | free | $0$ | 🟢 supported | Put values into destination array using indices. |
| `putmask` | `me.putmask` | `np.putmask` | free | $0$ | 🟢 supported | Change elements of array based on condition and input values. |
| `quantile` | `me.quantile` | `np.quantile` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | q-th quantile of array elements. |
| `rad2deg` | `me.rad2deg` | `np.rad2deg` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Alias for degrees. |
| `radians` | `me.radians` | `np.radians` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Convert degrees to radians element-wise. |
| `random.beta` | `me.random.beta` | `np.random.beta` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.binomial` | `me.random.binomial` | `np.random.binomial` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.bytes` | `me.random.bytes` | `np.random.bytes` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.chisquare` | `me.random.chisquare` | `np.random.chisquare` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.choice` | `me.random.choice` | `np.random.choice` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output) if replace, n*ceil(log2(n)) if not. |
| `random.default_rng` | `me.random.default_rng` | `np.random.default_rng` | free | $0$ | 🟢 supported | Construct a new Generator with default BitGenerator. |
| `random.dirichlet` | `me.random.dirichlet` | `np.random.dirichlet` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.exponential` | `me.random.exponential` | `np.random.exponential` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.f` | `me.random.f` | `np.random.f` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.gamma` | `me.random.gamma` | `np.random.gamma` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.geometric` | `me.random.geometric` | `np.random.geometric` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.get_state` | `me.random.get_state` | `np.random.get_state` | free | $0$ | 🟢 supported | Return tuple representing internal state of generator. |
| `random.gumbel` | `me.random.gumbel` | `np.random.gumbel` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.hypergeometric` | `me.random.hypergeometric` | `np.random.hypergeometric` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.laplace` | `me.random.laplace` | `np.random.laplace` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.logistic` | `me.random.logistic` | `np.random.logistic` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.lognormal` | `me.random.lognormal` | `np.random.lognormal` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.logseries` | `me.random.logseries` | `np.random.logseries` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.multinomial` | `me.random.multinomial` | `np.random.multinomial` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.multivariate_normal` | `me.random.multivariate_normal` | `np.random.multivariate_normal` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.negative_binomial` | `me.random.negative_binomial` | `np.random.negative_binomial` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.noncentral_chisquare` | `me.random.noncentral_chisquare` | `np.random.noncentral_chisquare` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.noncentral_f` | `me.random.noncentral_f` | `np.random.noncentral_f` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.normal` | `me.random.normal` | `np.random.normal` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.pareto` | `me.random.pareto` | `np.random.pareto` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.permutation` | `me.random.permutation` | `np.random.permutation` | counted_custom | varies | 🟠 supported | Shuffle; cost = n*ceil(log2(n)). |
| `random.poisson` | `me.random.poisson` | `np.random.poisson` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.power` | `me.random.power` | `np.random.power` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.rand` | `me.random.rand` | `np.random.rand` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.randint` | `me.random.randint` | `np.random.randint` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.randn` | `me.random.randn` | `np.random.randn` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.random` | `me.random.random` | `np.random.random` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.random_integers` | `me.random.random_integers` | `np.random.random_integers` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.random_sample` | `me.random.random_sample` | `np.random.random_sample` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.ranf` | `me.random.ranf` | `np.random.ranf` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.rayleigh` | `me.random.rayleigh` | `np.random.rayleigh` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.sample` | `me.random.sample` | `np.random.sample` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.seed` | `me.random.seed` | `np.random.seed` | free | $0$ | 🟢 supported | Seed random number generator. |
| `random.set_state` | `me.random.set_state` | `np.random.set_state` | free | $0$ | 🟢 supported | Set internal state of generator. |
| `random.shuffle` | `me.random.shuffle` | `np.random.shuffle` | counted_custom | varies | 🟠 supported | Shuffle; cost = n*ceil(log2(n)). |
| `random.standard_cauchy` | `me.random.standard_cauchy` | `np.random.standard_cauchy` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.standard_exponential` | `me.random.standard_exponential` | `np.random.standard_exponential` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.standard_gamma` | `me.random.standard_gamma` | `np.random.standard_gamma` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.standard_normal` | `me.random.standard_normal` | `np.random.standard_normal` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.standard_t` | `me.random.standard_t` | `np.random.standard_t` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.triangular` | `me.random.triangular` | `np.random.triangular` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.uniform` | `me.random.uniform` | `np.random.uniform` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.vonmises` | `me.random.vonmises` | `np.random.vonmises` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.wald` | `me.random.wald` | `np.random.wald` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.weibull` | `me.random.weibull` | `np.random.weibull` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `random.zipf` | `me.random.zipf` | `np.random.zipf` | counted_custom | varies | 🟠 supported | Sampling; cost = numel(output). |
| `ravel` | `me.ravel` | `np.ravel` | free | $0$ | 🟢 supported | Return contiguous flattened array. |
| `ravel_multi_index` | `me.ravel_multi_index` | `np.ravel_multi_index` | free | $0$ | 🟢 supported | Convert multi-dimensional index to flat index. |
| `real` | `me.real` | `np.real` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Return real part of complex array. |
| `real_if_close` | `me.real_if_close` | `np.real_if_close` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Return real array if imaginary part is negligible. |
| `reciprocal` | `me.reciprocal` | `np.reciprocal` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise 1/x. |
| `remainder` | `me.remainder` | `np.remainder` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise remainder (same as mod). |
| `repeat` | `me.repeat` | `np.repeat` | free | $0$ | 🟢 supported | Repeat elements of an array. |
| `require` | `me.require` | `np.require` | free | $0$ | 🟢 supported | Return array that satisfies requirements. |
| `reshape` | `me.reshape` | `np.reshape` | free | $0$ | 🟢 supported | Reshape array without copying. |
| `resize` | `me.resize` | `np.resize` | free | $0$ | 🟢 supported | Return new array with given shape by repeating. |
| `result_type` | `me.result_type` | `np.result_type` | free | $0$ | 🟢 supported | Return type that results from applying NumPy type promotion. |
| `right_shift` | `me.right_shift` | `np.right_shift` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise right bit shift (legacy name). |
| `rint` | `me.rint` | `np.rint` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Round to nearest integer element-wise. |
| `roll` | `me.roll` | `np.roll` | free | $0$ | 🟢 supported | Roll array elements along axis. |
| `rollaxis` | `me.rollaxis` | `np.rollaxis` | free | $0$ | 🟢 supported | Roll specified axis backwards. |
| `roots` | `me.roots` | `np.roots` | counted_custom | $10n^3$ | 🟠 supported | Return roots of polynomial with given coefficients. Cost: $10n^3$ FLOPs (companion matrix eig). |
| `rot90` | `me.rot90` | `np.rot90` | free | $0$ | 🟢 supported | Rotate array 90 degrees. |
| `round` | `me.round` | `np.round` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Round to given number of decimals element-wise. |
| `row_stack` | `me.row_stack` | `np.row_stack` | free | $0$ | 🟢 supported | Stack arrays vertically (alias for vstack). |
| `save` | — | `np.save` | blacklisted | N/A | 🔴 blocked | Save array to .npy file. Not supported. |
| `savetxt` | — | `np.savetxt` | blacklisted | N/A | 🔴 blocked | Save array to text file. Not supported. |
| `savez` | — | `np.savez` | blacklisted | N/A | 🔴 blocked | Save multiple arrays to .npz file. Not supported. |
| `savez_compressed` | — | `np.savez_compressed` | blacklisted | N/A | 🔴 blocked | Save multiple arrays to compressed .npz file. Not supported. |
| `searchsorted` | `me.searchsorted` | `np.searchsorted` | counted_custom | varies | 🟠 supported | Binary search; cost = m*ceil(log2(n)). |
| `select` | `me.select` | `np.select` | free | $0$ | 🟢 supported | Return array from list of choices based on conditions. |
| `setbufsize` | — | `np.setbufsize` | blacklisted | N/A | 🔴 blocked | Set size of buffer used in ufuncs. Not supported. |
| `setdiff1d` | `me.setdiff1d` | `np.setdiff1d` | counted_custom | varies | 🟠 supported | Set difference; cost = (n+m)*ceil(log2(n+m)). |
| `seterr` | — | `np.seterr` | blacklisted | N/A | 🔴 blocked | Set how floating-point errors are handled. Not supported. |
| `seterrcall` | — | `np.seterrcall` | blacklisted | N/A | 🔴 blocked | Set callback function for floating-point errors. Not supported. |
| `setxor1d` | `me.setxor1d` | `np.setxor1d` | counted_custom | varies | 🟠 supported | Symmetric set difference; cost = (n+m)*ceil(log2(n+m)). |
| `shape` | `me.shape` | `np.shape` | free | $0$ | 🟢 supported | Return shape of array. |
| `shares_memory` | `me.shares_memory` | `np.shares_memory` | free | $0$ | 🟢 supported | Determine if two arrays share memory. |
| `show_config` | — | `np.show_config` | blacklisted | N/A | 🔴 blocked | Show NumPy build configuration. Not supported. |
| `show_runtime` | — | `np.show_runtime` | blacklisted | N/A | 🔴 blocked | Show runtime info. Not supported. |
| `sign` | `me.sign` | `np.sign` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise sign function. |
| `signbit` | `me.signbit` | `np.signbit` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Returns True for elements with negative sign bit. |
| `sin` | `me.sin` | `np.sin` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise sine. |
| `sinc` | `me.sinc` | `np.sinc` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Normalized sinc function element-wise. |
| `sinh` | `me.sinh` | `np.sinh` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise hyperbolic sine. |
| `size` | `me.size` | `np.size` | free | $0$ | 🟢 supported | Return number of elements in array. |
| `sort` | `me.sort` | `np.sort` | counted_custom | varies | 🟠 supported | Comparison sort; cost = n*ceil(log2(n)) per slice. |
| `sort_complex` | `me.sort_complex` | `np.sort_complex` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Sort complex array by real then imaginary part. |
| `spacing` | `me.spacing` | `np.spacing` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Return ULP spacing for each element. |
| `split` | `me.split` | `np.split` | free | $0$ | 🟢 supported | Split array into sub-arrays. |
| `sqrt` | `me.sqrt` | `np.sqrt` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise square root. |
| `square` | `me.square` | `np.square` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise x^2. |
| `squeeze` | `me.squeeze` | `np.squeeze` | free | $0$ | 🟢 supported | Remove size-1 dimensions. |
| `stack` | `me.stack` | `np.stack` | free | $0$ | 🟢 supported | Join arrays along new axis. |
| `std` | `me.std` | `np.std` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Standard deviation; cost_multiplier=2 (two passes). |
| `subtract` | `me.subtract` | `np.subtract` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise subtraction. |
| `sum` | `me.sum` | `np.sum` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Sum of array elements. |
| `swapaxes` | `me.swapaxes` | `np.swapaxes` | free | $0$ | 🟢 supported | Interchange two axes of an array. |
| `take` | `me.take` | `np.take` | free | $0$ | 🟢 supported | Take elements from array along axis. |
| `take_along_axis` | `me.take_along_axis` | `np.take_along_axis` | free | $0$ | 🟢 supported | Take values from input array by matching 1-D index. |
| `tan` | `me.tan` | `np.tan` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise tangent. |
| `tanh` | `me.tanh` | `np.tanh` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise hyperbolic tangent. |
| `tensordot` | `me.tensordot` | `np.tensordot` | counted_custom | $\prod_i d_i$ | 🟠 supported | Tensor dot product along specified axes. |
| `tile` | `me.tile` | `np.tile` | free | $0$ | 🟢 supported | Repeat array by tiling. |
| `trace` | `me.trace` | `np.trace` | counted_custom | varies | 🟠 supported | Diagonal sum; cost = min(n,m). |
| `transpose` | `me.transpose` | `np.transpose` | free | $0$ | 🟢 supported | Permute array dimensions. |
| `trapezoid` | `me.trapezoid` | `np.trapezoid` | counted_custom | $\text{numel}(\text{input})$ | 🟠 supported | Integrate using the trapezoidal rule. |
| `trapz` | `me.trapz` | `np.trapz` | counted_custom | $\text{numel}(\text{input})$ | 🟠 supported | Alias for trapezoid (deprecated). |
| `tri` | `me.tri` | `np.tri` | free | $0$ | 🟢 supported | Array with ones at and below given diagonal. |
| `tril` | `me.tril` | `np.tril` | free | $0$ | 🟢 supported | Lower triangle of array. |
| `tril_indices` | `me.tril_indices` | `np.tril_indices` | free | $0$ | 🟢 supported | Return lower-triangle indices for n x n array. |
| `tril_indices_from` | `me.tril_indices_from` | `np.tril_indices_from` | free | $0$ | 🟢 supported | Return lower-triangle indices for given array. |
| `trim_zeros` | `me.trim_zeros` | `np.trim_zeros` | free | $0$ | 🟢 supported | Trim leading/trailing zeros from 1-D array. |
| `triu` | `me.triu` | `np.triu` | free | $0$ | 🟢 supported | Upper triangle of array. |
| `triu_indices` | `me.triu_indices` | `np.triu_indices` | free | $0$ | 🟢 supported | Return upper-triangle indices for n x n array. |
| `triu_indices_from` | `me.triu_indices_from` | `np.triu_indices_from` | free | $0$ | 🟢 supported | Return upper-triangle indices for given array. |
| `true_divide` | `me.true_divide` | `np.true_divide` | counted_binary | $\text{numel}(\text{output})$ | 🟡 supported | Element-wise true division (explicit). |
| `trunc` | `me.trunc` | `np.trunc` | counted_unary | $\text{numel}(\text{output})$ | 🟡 supported | Truncate toward zero element-wise. |
| `typename` | `me.typename` | `np.typename` | free | $0$ | 🟢 supported | Return description of given data type code. |
| `union1d` | `me.union1d` | `np.union1d` | counted_custom | varies | 🟠 supported | Set union; cost = (n+m)*ceil(log2(n+m)). |
| `unique` | `me.unique` | `np.unique` | counted_custom | varies | 🟠 supported | Sort-based unique; cost = n*ceil(log2(n)). |
| `unique_all` | `me.unique_all` | `np.unique_all` | counted_custom | varies | 🟠 supported | Sort-based unique; cost = n*ceil(log2(n)). |
| `unique_counts` | `me.unique_counts` | `np.unique_counts` | counted_custom | varies | 🟠 supported | Sort-based unique; cost = n*ceil(log2(n)). |
| `unique_inverse` | `me.unique_inverse` | `np.unique_inverse` | counted_custom | varies | 🟠 supported | Sort-based unique; cost = n*ceil(log2(n)). |
| `unique_values` | `me.unique_values` | `np.unique_values` | counted_custom | varies | 🟠 supported | Sort-based unique; cost = n*ceil(log2(n)). |
| `unpackbits` | `me.unpackbits` | `np.unpackbits` | free | $0$ | 🟢 supported | Unpack elements of array into bits. |
| `unravel_index` | `me.unravel_index` | `np.unravel_index` | free | $0$ | 🟢 supported | Convert flat index to multi-dimensional index. |
| `unstack` | `me.unstack` | `np.unstack` | free | $0$ | 🟢 supported | Unstack array along axis into tuple of arrays (NumPy 2.x). |
| `unwrap` | `me.unwrap` | `np.unwrap` | counted_custom | $\text{numel}(\text{input})$ | 🟠 supported | Phase unwrap. Cost: $\text{numel}(\text{input})$ (diff + conditional adjustment). |
| `vander` | `me.vander` | `np.vander` | counted_custom | varies | 🟠 supported | Vandermonde matrix; cost = len(x)*(N-1). |
| `var` | `me.var` | `np.var` | counted_reduction | $\text{numel}(\text{input})$ | 🟡 supported | Variance; cost_multiplier=2 (two passes). |
| `vdot` | `me.vdot` | `np.vdot` | counted_custom | $n$ | 🟠 supported | Dot product with conjugation; cost = 2*N. |
| `vecdot` | `me.vecdot` | `np.vecdot` | counted_binary | $n$ | 🟡 supported | Vector dot product along last axis. |
| `vsplit` | `me.vsplit` | `np.vsplit` | free | $0$ | 🟢 supported | Split array into rows. |
| `vstack` | `me.vstack` | `np.vstack` | free | $0$ | 🟢 supported | Stack arrays vertically. |
| `where` | `me.where` | `np.where` | free | $0$ | 🟢 supported | Select elements based on condition. |
| `zeros` | `me.zeros` | `np.zeros` | free | $0$ | 🟢 supported | Create zero-filled array. |
| `zeros_like` | `me.zeros_like` | `np.zeros_like` | free | $0$ | 🟢 supported | Array of zeros with same shape/type as input. |


## FLOP Cost Cheat Sheet

<!-- Auto-generated by scripts/generate_api_docs.py. Do not edit manually. -->
# FLOP Cost Cheat Sheet

> **mechestim is NOT NumPy.** All computation requires a `BudgetContext`.
> Some operations cost FLOPs, some are free, and 32 are blocked entirely.

## Key Constraints

- All counted operations require an active `BudgetContext`
- Budget is checked *before* execution — exceeding it raises `BudgetExhaustedError`
- 32 operations are blocked (I/O, config, state functions)
- `sort`, `argsort`, `trace`, and random sampling are **counted** with analytical FLOP costs

## Cost by Category

| Category | Count | Cost Formula |
|----------|-------|-------------|
| Free | 138 | $0$ |
| Unary | 73 | $\text{numel}(\text{output})$ |
| Binary | 45 | $\text{numel}(\text{output})$ |
| Reduction | 37 | $\text{numel}(\text{input})$ |
| Custom | 157 | varies |
| Blocked | 32 | N/A |

## Custom Operation Costs

| Operation | Cost Formula | Notes |
|-----------|-------------|-------|
| `allclose` | varies | Element-wise tolerance check; cost = numel(a). |
| `argpartition` | varies | Indirect partition; cost = n per slice. |
| `argsort` | varies | Indirect sort; cost = n*ceil(log2(n)) per slice. |
| `array_equal` | varies | Element-wise equality; cost = numel(a). |
| `array_equiv` | varies | Element-wise equivalence; cost = numel(a). |
| `bartlett` | $n$ | Bartlett window. Cost: n (one linear eval per sample). |
| `bincount` | varies | Integer counting; cost = numel(x). |
| `blackman` | $3n$ | Blackman window. Cost: 3*n (three cosine terms per sample). |
| `clip` | $\text{numel}(\text{input})$ | Clip array to [a_min, a_max] element-wise. |
| `convolve` | $n \cdot m$ | 1-D discrete convolution. |
| `corrcoef` | $n^2 \cdot m$ | Pearson correlation coefficients. |
| `correlate` | $n \cdot m$ | 1-D cross-correlation. |
| `cov` | $n^2 \cdot m$ | Covariance matrix. |
| `cross` | $\text{numel}(\text{output})$ | Cross product of two 3-D vectors. |
| `diff` | $\text{numel}(\text{input})$ | n-th discrete difference along axis. |
| `digitize` | varies | Bin search; cost = n*ceil(log2(bins)). |
| `dot` | $2 \cdot m \cdot k \cdot n$ | Dot product; cost = 2*M*N*K for matrix multiply. |
| `ediff1d` | $\text{numel}(\text{input})$ | Differences between consecutive elements. |
| `einsum` | $\text{op\_factor} \cdot \prod_i d_i$ | Generalized Einstein summation. |
| `einsum_path` | $0$ | Optimize einsum contraction path (no numeric output). |
| `fft.fft` | $5n \cdot \lceil\log_2 n\rceil$ | 1-D complex FFT. Cost: 5*n*ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.fft2` | $5N \cdot \lceil\log_2 N\rceil$ | 2-D complex FFT. Cost: 5*N*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.fftn` | $5N \cdot \lceil\log_2 N\rceil$ | N-D complex FFT. Cost: 5*N*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.hfft` | $5n \cdot \lceil\log_2 n\rceil$ | FFT of Hermitian-symmetric signal. Cost: 5*n_out*ceil(log2(n_out)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.ifft` | $5n \cdot \lceil\log_2 n\rceil$ | Inverse 1-D complex FFT. Cost: 5*n*ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.ifft2` | $5N \cdot \lceil\log_2 N\rceil$ | Inverse 2-D complex FFT. Cost: 5*N*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.ifftn` | $5N \cdot \lceil\log_2 N\rceil$ | Inverse N-D complex FFT. Cost: 5*N*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.ihfft` | $5n \cdot \lceil\log_2 n\rceil$ | Inverse FFT of Hermitian signal. Cost: 5*n*ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.irfft` | $5(n/2) \cdot \lceil\log_2 n\rceil$ | Inverse 1-D real FFT. Cost: 5*(n//2)*ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.irfft2` | $5(N/2) \cdot \lceil\log_2 N\rceil$ | Inverse 2-D real FFT. Cost: 5*(N//2)*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.irfftn` | $5(N/2) \cdot \lceil\log_2 N\rceil$ | Inverse N-D real FFT. Cost: 5*(N//2)*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.rfft` | $5(n/2) \cdot \lceil\log_2 n\rceil$ | 1-D real FFT. Cost: 5*(n//2)*ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.rfft2` | $5(N/2) \cdot \lceil\log_2 N\rceil$ | 2-D real FFT. Cost: 5*(N//2)*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `fft.rfftn` | $5(N/2) \cdot \lceil\log_2 N\rceil$ | N-D real FFT. Cost: 5*(N//2)*ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
| `geomspace` | varies | Geometric-spaced generation; cost = num. |
| `gradient` | $\text{numel}(\text{input})$ | Gradient using central differences. |
| `hamming` | $n$ | Hamming window. Cost: n (one cosine per sample). |
| `hanning` | $n$ | Hanning window. Cost: n (one cosine per sample). |
| `histogram` | varies | Binning; cost = n*ceil(log2(bins)). |
| `histogram2d` | varies | 2D binning; cost = n*(ceil(log2(bx))+ceil(log2(by))). |
| `histogram_bin_edges` | varies | Bin edge computation; cost = numel(a). |
| `histogramdd` | varies | ND binning; cost = n*sum(ceil(log2(b_i))). |
| `in1d` | varies | Set membership; cost = (n+m)*ceil(log2(n+m)). |
| `inner` | $n$ | Inner product; cost = 2*N for 1-D, 2*N*M for n-D. |
| `interp` | $n \cdot \log m$ | 1-D linear interpolation. |
| `intersect1d` | varies | Set intersection; cost = (n+m)*ceil(log2(n+m)). |
| `isin` | varies | Set membership; cost = (n+m)*ceil(log2(n+m)). |
| `kaiser` | $3n$ | Kaiser window. Cost: 3*n (Bessel function eval per sample). |
| `kron` | $m_1 m_2 \cdot n_1 n_2$ | Kronecker product; cost proportional to output size. |
| `lexsort` | varies | Multi-key sort; cost = k*n*ceil(log2(n)). |
| `linalg.cholesky` | $n^3 / 3$ | Cholesky decomposition. Cost: $n^3/3$ (Golub & Van Loan §4.2). |
| `linalg.cond` | $m \cdot n \cdot \min(m,n)$ | Condition number. Cost: m*n*min(m,n) (via SVD). |
| `linalg.cross` | delegates to `cross` | Delegates to `me.cross` which charges `numel(output)` FLOPs. |
| `linalg.det` | $n^3$ | Determinant. Cost: $n^3$ (LU factorization). |
| `linalg.eig` | $10n^3$ | Eigendecomposition. Cost: $10n^3$ (Francis QR, Golub & Van Loan §7.5). |
| `linalg.eigh` | $4n^3 / 3$ | Symmetric eigendecomposition. Cost: $(4/3)n^3$ (Golub & Van Loan §8.3). |
| `linalg.eigvals` | $10n^3$ | Eigenvalues only. Cost: $10n^3$ (same as eig). |
| `linalg.eigvalsh` | $4n^3 / 3$ | Symmetric eigenvalues. Cost: $(4/3)n^3$ (same as eigh). |
| `linalg.inv` | $n^3$ | Matrix inverse. Cost: $n^3$ (LU + solve). |
| `linalg.lstsq` | $m \cdot n \cdot \min(m,n)$ | Least squares. Cost: m*n*min(m,n) (LAPACK gelsd/SVD). |
| `linalg.matmul` | delegates to `matmul` | Delegates to `me.matmul` which charges `2*m*k*n` FLOPs. |
| `linalg.matrix_norm` | varies | Matrix norm. Cost depends on ord: 2*numel for Frobenius, m*n*min(m,n) for ord=2. |
| `linalg.matrix_power` | $\lfloor\log_2 k\rfloor \cdot n^3$ | Matrix power. Cost: $(\lfloor\log_2 k\rfloor + \text{popcount}(k) - 1) \cdot n^3$ (exponentiation by squaring). |
| `linalg.matrix_rank` | $m \cdot n \cdot \min(m,n)$ | Matrix rank. Cost: m*n*min(m,n) (via SVD). |
| `linalg.multi_dot` | optimal chain | Chain matmul. Cost: sum of optimal chain matmul costs (CLRS §15.2). |
| `linalg.norm` | varies | Norm. Cost depends on ord: numel for L1/inf, 2*numel for Frobenius, m*n*min(m,n) for ord=2. |
| `linalg.outer` | delegates to `outer` | Delegates to `me.outer` which charges `m*n` FLOPs. |
| `linalg.pinv` | $m \cdot n \cdot \min(m,n)$ | Pseudoinverse. Cost: m*n*min(m,n) (via SVD). |
| `linalg.qr` | $2mn^2 - 2n^3/3$ | QR decomposition. Cost: $2mn^2 - (2/3)n^3$ (Golub & Van Loan §5.2). |
| `linalg.slogdet` | $n^3$ | Sign + log determinant. Cost: $n^3$ (LU factorization). |
| `linalg.solve` | $2n^3/3 + n^2 \cdot n_{\text{rhs}}$ | Solve Ax=b. Cost: $2n^3/3$ (LU) + $n^2 \cdot n_{\text{rhs}}$ (back-substitution). |
| `linalg.svd` | $m \cdot n \cdot k$ | Singular value decomposition; cost ~ O(min(m,n)*m*n). |
| `linalg.svdvals` | $m \cdot n \cdot \min(m,n)$ | Singular values only. Cost: m*n*min(m,n) (Golub-Reinsch). |
| `linalg.tensordot` | delegates to `tensordot` | Delegates to `me.tensordot` which charges FLOPs based on contraction. |
| `linalg.tensorinv` | $n^3$ | Tensor inverse. Cost: $n^3$ after reshape (delegates to inv). |
| `linalg.tensorsolve` | $n^3$ | Tensor solve. Cost: $n^3$ after reshape (delegates to solve). |
| `linalg.trace` | $n$ | Matrix trace. Cost: n (sum of diagonal elements). |
| `linalg.vecdot` | delegates to `vecdot` | Delegates to `me.vecdot` which charges `2*n` FLOPs. |
| `linalg.vector_norm` | $n$ or $2n$ | Vector norm. Cost: numel (or 2*numel for general p-norm). |
| `logspace` | varies | Log-spaced generation; cost = num. |
| `matmul` | $2 \cdot m \cdot k \cdot n$ | Matrix multiplication; cost = 2*M*N*K. |
| `outer` | $m \cdot n$ | Outer product of two vectors; cost = M*N. |
| `partition` | varies | Quickselect; cost = n per slice. |
| `poly` | $n^2$ | Polynomial from roots. Cost: $n^2$ FLOPs. |
| `polyadd` | $\max(n_1, n_2)$ | Add two polynomials. Cost: max(n1, n2) FLOPs. |
| `polyder` | $n$ | Differentiate polynomial. Cost: n FLOPs. |
| `polydiv` | $n_1 \cdot n_2$ | Divide one polynomial by another. Cost: n1 * n2 FLOPs. |
| `polyfit` | $2m \cdot (\text{deg}+1)^2$ | Least squares polynomial fit. Cost: 2 * m * (deg+1)^2 FLOPs. |
| `polyint` | $n$ | Integrate polynomial. Cost: n FLOPs. |
| `polymul` | $n_1 \cdot n_2$ | Multiply polynomials. Cost: n1 * n2 FLOPs. |
| `polysub` | $\max(n_1, n_2)$ | Difference (subtraction) of two polynomials. Cost: max(n1, n2) FLOPs. |
| `polyval` | $2 \cdot m \cdot \text{deg}$ | Evaluate polynomial at given points. Cost: 2 * m * deg FLOPs (Horner's method). |
| `random.beta` | varies | Sampling; cost = numel(output). |
| `random.binomial` | varies | Sampling; cost = numel(output). |
| `random.bytes` | varies | Sampling; cost = numel(output). |
| `random.chisquare` | varies | Sampling; cost = numel(output). |
| `random.choice` | varies | Sampling; cost = numel(output) if replace, n*ceil(log2(n)) if not. |
| `random.dirichlet` | varies | Sampling; cost = numel(output). |
| `random.exponential` | varies | Sampling; cost = numel(output). |
| `random.f` | varies | Sampling; cost = numel(output). |
| `random.gamma` | varies | Sampling; cost = numel(output). |
| `random.geometric` | varies | Sampling; cost = numel(output). |
| `random.gumbel` | varies | Sampling; cost = numel(output). |
| `random.hypergeometric` | varies | Sampling; cost = numel(output). |
| `random.laplace` | varies | Sampling; cost = numel(output). |
| `random.logistic` | varies | Sampling; cost = numel(output). |
| `random.lognormal` | varies | Sampling; cost = numel(output). |
| `random.logseries` | varies | Sampling; cost = numel(output). |
| `random.multinomial` | varies | Sampling; cost = numel(output). |
| `random.multivariate_normal` | varies | Sampling; cost = numel(output). |
| `random.negative_binomial` | varies | Sampling; cost = numel(output). |
| `random.noncentral_chisquare` | varies | Sampling; cost = numel(output). |
| `random.noncentral_f` | varies | Sampling; cost = numel(output). |
| `random.normal` | varies | Sampling; cost = numel(output). |
| `random.pareto` | varies | Sampling; cost = numel(output). |
| `random.permutation` | varies | Shuffle; cost = n*ceil(log2(n)). |
| `random.poisson` | varies | Sampling; cost = numel(output). |
| `random.power` | varies | Sampling; cost = numel(output). |
| `random.rand` | varies | Sampling; cost = numel(output). |
| `random.randint` | varies | Sampling; cost = numel(output). |
| `random.randn` | varies | Sampling; cost = numel(output). |
| `random.random` | varies | Sampling; cost = numel(output). |
| `random.random_integers` | varies | Sampling; cost = numel(output). |
| `random.random_sample` | varies | Sampling; cost = numel(output). |
| `random.ranf` | varies | Sampling; cost = numel(output). |
| `random.rayleigh` | varies | Sampling; cost = numel(output). |
| `random.sample` | varies | Sampling; cost = numel(output). |
| `random.shuffle` | varies | Shuffle; cost = n*ceil(log2(n)). |
| `random.standard_cauchy` | varies | Sampling; cost = numel(output). |
| `random.standard_exponential` | varies | Sampling; cost = numel(output). |
| `random.standard_gamma` | varies | Sampling; cost = numel(output). |
| `random.standard_normal` | varies | Sampling; cost = numel(output). |
| `random.standard_t` | varies | Sampling; cost = numel(output). |
| `random.triangular` | varies | Sampling; cost = numel(output). |
| `random.uniform` | varies | Sampling; cost = numel(output). |
| `random.vonmises` | varies | Sampling; cost = numel(output). |
| `random.wald` | varies | Sampling; cost = numel(output). |
| `random.weibull` | varies | Sampling; cost = numel(output). |
| `random.zipf` | varies | Sampling; cost = numel(output). |
| `roots` | $10n^3$ | Return roots of polynomial with given coefficients. Cost: $10n^3$ FLOPs (companion matrix eig). |
| `searchsorted` | varies | Binary search; cost = m*ceil(log2(n)). |
| `setdiff1d` | varies | Set difference; cost = (n+m)*ceil(log2(n+m)). |
| `setxor1d` | varies | Symmetric set difference; cost = (n+m)*ceil(log2(n+m)). |
| `sort` | varies | Comparison sort; cost = n*ceil(log2(n)) per slice. |
| `tensordot` | $\prod_i d_i$ | Tensor dot product along specified axes. |
| `trace` | varies | Diagonal sum; cost = min(n,m). |
| `trapezoid` | $\text{numel}(\text{input})$ | Integrate using the trapezoidal rule. |
| `trapz` | $\text{numel}(\text{input})$ | Alias for trapezoid (deprecated). |
| `union1d` | varies | Set union; cost = (n+m)*ceil(log2(n+m)). |
| `unique` | varies | Sort-based unique; cost = n*ceil(log2(n)). |
| `unique_all` | varies | Sort-based unique; cost = n*ceil(log2(n)). |
| `unique_counts` | varies | Sort-based unique; cost = n*ceil(log2(n)). |
| `unique_inverse` | varies | Sort-based unique; cost = n*ceil(log2(n)). |
| `unique_values` | varies | Sort-based unique; cost = n*ceil(log2(n)). |
| `unwrap` | $\text{numel}(\text{input})$ | Phase unwrap. Cost: $\text{numel}(\text{input})$ (diff + conditional adjustment). |
| `vander` | varies | Vandermonde matrix; cost = len(x)*(N-1). |
| `vdot` | $n$ | Dot product with conjugation; cost = 2*N. |

## Free Operations (complete list)

`append`, `arange`, `argwhere`, `array`, `array_split`, `asarray`, `asarray_chkfinite`, `astype`
`atleast_1d`, `atleast_2d`, `atleast_3d`, `base_repr`, `binary_repr`, `block`, `bmat`, `broadcast_arrays`
`broadcast_shapes`, `broadcast_to`, `can_cast`, `choose`, `column_stack`, `common_type`, `compress`, `concat`
`concatenate`, `copy`, `copyto`, `delete`, `diag`, `diag_indices`, `diag_indices_from`, `diagflat`
`diagonal`, `dsplit`, `dstack`, `empty`, `empty_like`, `expand_dims`, `extract`, `eye`
`fft.fftfreq`, `fft.fftshift`, `fft.ifftshift`, `fft.rfftfreq`, `fill_diagonal`, `flatnonzero`, `flip`, `fliplr`
`flipud`, `from_dlpack`, `frombuffer`, `fromfile`, `fromfunction`, `fromiter`, `fromregex`, `fromstring`
`full`, `full_like`, `hsplit`, `hstack`, `identity`, `indices`, `insert`, `isdtype`
`isfinite`, `isfortran`, `isinf`, `isnan`, `isscalar`, `issubdtype`, `iterable`, `ix_`
`linalg.diagonal`, `linalg.matrix_transpose`, `linspace`, `mask_indices`, `matrix_transpose`, `may_share_memory`, `meshgrid`, `min_scalar_type`
`mintypecode`, `moveaxis`, `ndim`, `nonzero`, `ones`, `ones_like`, `packbits`, `pad`
`permute_dims`, `place`, `promote_types`, `put`, `put_along_axis`, `putmask`, `random.default_rng`, `random.get_state`
`random.seed`, `random.set_state`, `ravel`, `ravel_multi_index`, `repeat`, `require`, `reshape`, `resize`
`result_type`, `roll`, `rollaxis`, `rot90`, `row_stack`, `select`, `shape`, `shares_memory`
`size`, `split`, `squeeze`, `stack`, `swapaxes`, `take`, `take_along_axis`, `tile`
`transpose`, `tri`, `tril`, `tril_indices`, `tril_indices_from`, `trim_zeros`, `triu`, `triu_indices`
`triu_indices_from`, `typename`, `unpackbits`, `unravel_index`, `unstack`, `vsplit`, `vstack`, `where`
`zeros`, `zeros_like`

## Blocked Operations (complete list)

`apply_along_axis`, `apply_over_axes`, `array2string`, `array_repr`, `array_str`, `asmatrix`, `busday_count`, `busday_offset`
`datetime_as_string`, `datetime_data`, `format_float_positional`, `format_float_scientific`, `frompyfunc`, `genfromtxt`, `get_include`, `getbufsize`
`geterr`, `geterrcall`, `is_busday`, `load`, `loadtxt`, `nested_iters`, `piecewise`, `save`
`savetxt`, `savez`, `savez_compressed`, `setbufsize`, `seterr`, `seterrcall`, `show_config`, `show_runtime`


# Troubleshooting

## Troubleshooting

# Common Errors

## When to use this page

Use this page when you encounter an error from mechestim and need to understand what went wrong.

---

## BudgetExhaustedError

**Symptom:**

```
mechestim.errors.BudgetExhaustedError: einsum would cost 16,777,216 FLOPs but only 1,000,000 remain
```

**Why:** The operation you called would exceed the remaining FLOP budget. The operation did **not** execute.

**Fix:** Increase `flop_budget` in your `BudgetContext`, or reduce the cost of your computation. Use `budget.summary()` to see which operations are consuming the most FLOPs.

---

## NoBudgetContextError

**Symptom:**

```
mechestim.errors.NoBudgetContextError: No active BudgetContext. Wrap your code in `with mechestim.BudgetContext(...):`
```

**Why:** A counted operation (like `me.einsum`, `me.exp`, etc.) was called with no active budget session.

**In the core library:** This error is unlikely in normal use. mechestim automatically activates a global default budget, so operations run freely without any explicit setup. If you do see this error in the core library, it may indicate the global default was somehow torn down — restarting your session should resolve it.

**In the client-server model:** The server requires an open session. If your code runs on the server without a `BudgetContext`, this error will fire. Fix it by wrapping your computation:

```python
with me.BudgetContext(flop_budget=10_000_000) as budget:
    # your code here
```

---

## AttributeError: module 'mechestim' has no attribute '...'

**Symptom:**

```
AttributeError: module 'mechestim' has no attribute 'save'. mechestim does not support this operation.
```

**Why:** The NumPy function you're trying to use is blocked by mechestim (I/O, config, and system-level functions are not part of the competition API).

**Fix:** Check [Operation Categories](../concepts/operation-categories.md) for supported operations, or see the [Operation Audit](../reference/operation-audit.md) for the complete list.

---

## RuntimeError: Cannot nest BudgetContexts

**Symptom:**

```
RuntimeError: Cannot nest BudgetContexts
```

**Why:** You opened a `BudgetContext` inside another explicit `BudgetContext`. This error can only arise when two explicit contexts overlap.

**Note:** If the global default budget is active (the normal case in the core library), opening an explicit `BudgetContext` is fine — it temporarily replaces the global default for the duration of the `with` block, then restores it on exit. This is not nesting and does not raise an error.

**Fix:** If you see this error, you have two explicit `BudgetContext` managers open at the same time. Restructure your code so only one explicit context is active at a time, or rely on the global default for the outer scope.

---

## SymmetryError

**Symptom:**

```
mechestim.errors.SymmetryError: Tensor not symmetric along axes (0, 1): max deviation = 0.5
```

**Why:** You called `me.as_symmetric()` to declare that a tensor is symmetric along certain axes, but the actual data does not satisfy the symmetry within tolerance.

**Fix:** Verify that the tensor truly has the claimed symmetry (e.g., `A[i,j] == A[j,i]`). If it's approximately symmetric, you may need to symmetrise it first: `A = (A + A.T) / 2`.

---

## 📎 Related pages

- [Debug Budget Overruns](../how-to/debug-budget-overruns.md) — diagnose which operations are expensive
- [Error Reference (API)](../api/errors.md) — full error class documentation


# Changelog

## Changelog

# Changelog

## 0.2.0 (2026-04-03)

Second release with unified einsum cost model, NumPy compatibility testing, and expanded operation coverage.

### New features

- **Unified einsum cost model** — all einsum-like operations (einsum, dot, matmul, tensordot) now share a single cost model based on opt_einsum's contraction path optimizer
- **Symmetry-aware path finding** — the opt_einsum path optimizer now factors symmetry savings into contraction ordering decisions, producing different (cheaper) paths for symmetric inputs
- **NumPy compatibility test harness** — run NumPy's own test suite against mechestim via monkeypatching; 7,300+ tests passing across 7 NumPy test modules
- **Polynomial operations** — `polyval`, `polyfit`, `polymul`, `polydiv`, `polyadd`, `polysub`, `poly`, `roots`, `polyder`, `polyint` with analytical FLOP costs
- **Window functions** — `bartlett`, `hamming`, `hanning`, `blackman`, `kaiser` with per-function cost formulas
- **FFT module** — `fft`, `ifft`, `rfft`, `irfft`, `fft2`, `ifft2`, `fftn`, `ifftn`, `rfftn`, `irfftn` and free helpers (`fftfreq`, `rfftfreq`, `fftshift`, `ifftshift`)
- **Client-server architecture** — `mechestim-client` and `mechestim-server` packages for sandboxed competition evaluation over ZMQ
- **Global default budget** — a 1e15 FLOP budget auto-activates on first use, so explicit `BudgetContext` is no longer required for quick scripts
- **`MECHESTIM_DEFAULT_BUDGET` env var** — configure the global default budget amount
- **`budget_live()`** — Rich-based live-updating budget display context manager
- **`einsum_path()`** — inspect contraction plans with per-step symmetry savings without spending budget
- **90%+ test coverage gate** enforced in CI

### Breaking changes

- Einsum cost formula now uses `product_of_all_index_dims × op_factor` (op_factor=2 for inner products, 1 for outer products), matching opt_einsum convention. Previously used a different formula.
- `me.dot` and `me.matmul` costs are now computed via the einsum cost model instead of separate formulas.

### Bug fixes

- Accept scalars and array-likes in all mechestim functions
- Fix symmetry-aware greedy algorithm to actually use symmetry in path selection
- Fix `contract_path` cost reporting for output indices
- Correctly handle `symmetric_dims` propagation through multi-step contraction paths

### Documentation

- Comprehensive how-to guides for einsum, symmetry, linalg, budget planning, and debugging
- Architecture docs for client-server model and Docker deployment
- AI agent guide with `llms.txt`, `ops.json`, and cheat sheet
- NumPy compatibility testing methodology docs

## 0.1.0 (2026-04-01)

Initial release for warm-up round.

- Einsum with symmetry detection and FLOP counting
- Pointwise operations (exp, log, add, multiply, etc.)
- Reductions (sum, mean, max, etc.)
- SVD with truncated top-k
- Free tensor creation and manipulation ops
- Budget enforcement via BudgetContext
- FLOP cost query API
- NumPy-compatible API (`import mechestim as me`)