FLOP Cost Cheat Sheet
mechestim is NOT NumPy. All computation requires a
BudgetContext. Some operations cost FLOPs, some are free, and 32 are blocked entirely.
Key Constraints
- All counted operations require an active
BudgetContext - Budget is checked before execution — exceeding it raises
BudgetExhaustedError - 32 operations are blocked (I/O, config, state functions)
sort,argsort,trace, and random sampling are counted with analytical FLOP costs
Cost by Category
| Category | Count | Cost Formula |
|---|---|---|
| Free | 138 | \(0\) |
| Unary | 73 | \(\text{numel}(\text{output})\) |
| Binary | 45 | \(\text{numel}(\text{output})\) |
| Reduction | 37 | \(\text{numel}(\text{input})\) |
| Custom | 157 | varies |
| Blocked | 32 | N/A |
Custom Operation Costs
| Operation | Cost Formula | Notes |
|---|---|---|
allclose |
varies | Element-wise tolerance check; cost = numel(a). |
argpartition |
varies | Indirect partition; cost = n per slice. |
argsort |
varies | Indirect sort; cost = n*ceil(log2(n)) per slice. |
array_equal |
varies | Element-wise equality; cost = numel(a). |
array_equiv |
varies | Element-wise equivalence; cost = numel(a). |
bartlett |
\(n\) | Bartlett window. Cost: n (one linear eval per sample). |
bincount |
varies | Integer counting; cost = numel(x). |
blackman |
\(3n\) | Blackman window. Cost: 3*n (three cosine terms per sample). |
clip |
\(\text{numel}(\text{input})\) | Clip array to [a_min, a_max] element-wise. |
convolve |
\(n \cdot m\) | 1-D discrete convolution. |
corrcoef |
\(n^2 \cdot m\) | Pearson correlation coefficients. |
correlate |
\(n \cdot m\) | 1-D cross-correlation. |
cov |
\(n^2 \cdot m\) | Covariance matrix. |
cross |
\(\text{numel}(\text{output})\) | Cross product of two 3-D vectors. |
diff |
\(\text{numel}(\text{input})\) | n-th discrete difference along axis. |
digitize |
varies | Bin search; cost = n*ceil(log2(bins)). |
dot |
\(2 \cdot m \cdot k \cdot n\) | Dot product; cost = 2MN*K for matrix multiply. |
ediff1d |
\(\text{numel}(\text{input})\) | Differences between consecutive elements. |
einsum |
\(\text{op\_factor} \cdot \prod_i d_i\) | Generalized Einstein summation. |
einsum_path |
\(0\) | Optimize einsum contraction path (no numeric output). |
fft.fft |
\(5n \cdot \lceil\log_2 n\rceil\) | 1-D complex FFT. Cost: 5nceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.fft2 |
\(5N \cdot \lceil\log_2 N\rceil\) | 2-D complex FFT. Cost: 5Nceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.fftn |
\(5N \cdot \lceil\log_2 N\rceil\) | N-D complex FFT. Cost: 5Nceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.hfft |
\(5n \cdot \lceil\log_2 n\rceil\) | FFT of Hermitian-symmetric signal. Cost: 5n_outceil(log2(n_out)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.ifft |
\(5n \cdot \lceil\log_2 n\rceil\) | Inverse 1-D complex FFT. Cost: 5nceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.ifft2 |
\(5N \cdot \lceil\log_2 N\rceil\) | Inverse 2-D complex FFT. Cost: 5Nceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.ifftn |
\(5N \cdot \lceil\log_2 N\rceil\) | Inverse N-D complex FFT. Cost: 5Nceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.ihfft |
\(5n \cdot \lceil\log_2 n\rceil\) | Inverse FFT of Hermitian signal. Cost: 5nceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.irfft |
\(5(n/2) \cdot \lceil\log_2 n\rceil\) | Inverse 1-D real FFT. Cost: 5(n//2)ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.irfft2 |
\(5(N/2) \cdot \lceil\log_2 N\rceil\) | Inverse 2-D real FFT. Cost: 5(N//2)ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.irfftn |
\(5(N/2) \cdot \lceil\log_2 N\rceil\) | Inverse N-D real FFT. Cost: 5(N//2)ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.rfft |
\(5(n/2) \cdot \lceil\log_2 n\rceil\) | 1-D real FFT. Cost: 5(n//2)ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.rfft2 |
\(5(N/2) \cdot \lceil\log_2 N\rceil\) | 2-D real FFT. Cost: 5(N//2)ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
fft.rfftn |
\(5(N/2) \cdot \lceil\log_2 N\rceil\) | N-D real FFT. Cost: 5(N//2)ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4). |
geomspace |
varies | Geometric-spaced generation; cost = num. |
gradient |
\(\text{numel}(\text{input})\) | Gradient using central differences. |
hamming |
\(n\) | Hamming window. Cost: n (one cosine per sample). |
hanning |
\(n\) | Hanning window. Cost: n (one cosine per sample). |
histogram |
varies | Binning; cost = n*ceil(log2(bins)). |
histogram2d |
varies | 2D binning; cost = n*(ceil(log2(bx))+ceil(log2(by))). |
histogram_bin_edges |
varies | Bin edge computation; cost = numel(a). |
histogramdd |
varies | ND binning; cost = n*sum(ceil(log2(b_i))). |
in1d |
varies | Set membership; cost = (n+m)*ceil(log2(n+m)). |
inner |
\(n\) | Inner product; cost = 2N for 1-D, 2N*M for n-D. |
interp |
\(n \cdot \log m\) | 1-D linear interpolation. |
intersect1d |
varies | Set intersection; cost = (n+m)*ceil(log2(n+m)). |
isin |
varies | Set membership; cost = (n+m)*ceil(log2(n+m)). |
kaiser |
\(3n\) | Kaiser window. Cost: 3*n (Bessel function eval per sample). |
kron |
\(m_1 m_2 \cdot n_1 n_2\) | Kronecker product; cost proportional to output size. |
lexsort |
varies | Multi-key sort; cost = knceil(log2(n)). |
linalg.cholesky |
\(n^3 / 3\) | Cholesky decomposition. Cost: \(n^3/3\) (Golub & Van Loan §4.2). |
linalg.cond |
\(m \cdot n \cdot \min(m,n)\) | Condition number. Cost: mnmin(m,n) (via SVD). |
linalg.cross |
delegates to cross |
Delegates to me.cross which charges numel(output) FLOPs. |
linalg.det |
\(n^3\) | Determinant. Cost: \(n^3\) (LU factorization). |
linalg.eig |
\(10n^3\) | Eigendecomposition. Cost: \(10n^3\) (Francis QR, Golub & Van Loan §7.5). |
linalg.eigh |
\(4n^3 / 3\) | Symmetric eigendecomposition. Cost: \((4/3)n^3\) (Golub & Van Loan §8.3). |
linalg.eigvals |
\(10n^3\) | Eigenvalues only. Cost: \(10n^3\) (same as eig). |
linalg.eigvalsh |
\(4n^3 / 3\) | Symmetric eigenvalues. Cost: \((4/3)n^3\) (same as eigh). |
linalg.inv |
\(n^3\) | Matrix inverse. Cost: \(n^3\) (LU + solve). |
linalg.lstsq |
\(m \cdot n \cdot \min(m,n)\) | Least squares. Cost: mnmin(m,n) (LAPACK gelsd/SVD). |
linalg.matmul |
delegates to matmul |
Delegates to me.matmul which charges 2*m*k*n FLOPs. |
linalg.matrix_norm |
varies | Matrix norm. Cost depends on ord: 2numel for Frobenius, mn*min(m,n) for ord=2. |
linalg.matrix_power |
\(\lfloor\log_2 k\rfloor \cdot n^3\) | Matrix power. Cost: \((\lfloor\log_2 k\rfloor + \text{popcount}(k) - 1) \cdot n^3\) (exponentiation by squaring). |
linalg.matrix_rank |
\(m \cdot n \cdot \min(m,n)\) | Matrix rank. Cost: mnmin(m,n) (via SVD). |
linalg.multi_dot |
optimal chain | Chain matmul. Cost: sum of optimal chain matmul costs (CLRS §15.2). |
linalg.norm |
varies | Norm. Cost depends on ord: numel for L1/inf, 2numel for Frobenius, mn*min(m,n) for ord=2. |
linalg.outer |
delegates to outer |
Delegates to me.outer which charges m*n FLOPs. |
linalg.pinv |
\(m \cdot n \cdot \min(m,n)\) | Pseudoinverse. Cost: mnmin(m,n) (via SVD). |
linalg.qr |
\(2mn^2 - 2n^3/3\) | QR decomposition. Cost: \(2mn^2 - (2/3)n^3\) (Golub & Van Loan §5.2). |
linalg.slogdet |
\(n^3\) | Sign + log determinant. Cost: \(n^3\) (LU factorization). |
linalg.solve |
\(2n^3/3 + n^2 \cdot n_{\text{rhs}}\) | Solve Ax=b. Cost: \(2n^3/3\) (LU) + \(n^2 \cdot n_{\text{rhs}}\) (back-substitution). |
linalg.svd |
\(m \cdot n \cdot k\) | Singular value decomposition; cost ~ O(min(m,n)mn). |
linalg.svdvals |
\(m \cdot n \cdot \min(m,n)\) | Singular values only. Cost: mnmin(m,n) (Golub-Reinsch). |
linalg.tensordot |
delegates to tensordot |
Delegates to me.tensordot which charges FLOPs based on contraction. |
linalg.tensorinv |
\(n^3\) | Tensor inverse. Cost: \(n^3\) after reshape (delegates to inv). |
linalg.tensorsolve |
\(n^3\) | Tensor solve. Cost: \(n^3\) after reshape (delegates to solve). |
linalg.trace |
\(n\) | Matrix trace. Cost: n (sum of diagonal elements). |
linalg.vecdot |
delegates to vecdot |
Delegates to me.vecdot which charges 2*n FLOPs. |
linalg.vector_norm |
\(n\) or \(2n\) | Vector norm. Cost: numel (or 2*numel for general p-norm). |
logspace |
varies | Log-spaced generation; cost = num. |
matmul |
\(2 \cdot m \cdot k \cdot n\) | Matrix multiplication; cost = 2MN*K. |
outer |
\(m \cdot n\) | Outer product of two vectors; cost = M*N. |
partition |
varies | Quickselect; cost = n per slice. |
poly |
\(n^2\) | Polynomial from roots. Cost: \(n^2\) FLOPs. |
polyadd |
\(\max(n_1, n_2)\) | Add two polynomials. Cost: max(n1, n2) FLOPs. |
polyder |
\(n\) | Differentiate polynomial. Cost: n FLOPs. |
polydiv |
\(n_1 \cdot n_2\) | Divide one polynomial by another. Cost: n1 * n2 FLOPs. |
polyfit |
\(2m \cdot (\text{deg}+1)^2\) | Least squares polynomial fit. Cost: 2 * m * (deg+1)^2 FLOPs. |
polyint |
\(n\) | Integrate polynomial. Cost: n FLOPs. |
polymul |
\(n_1 \cdot n_2\) | Multiply polynomials. Cost: n1 * n2 FLOPs. |
polysub |
\(\max(n_1, n_2)\) | Difference (subtraction) of two polynomials. Cost: max(n1, n2) FLOPs. |
polyval |
\(2 \cdot m \cdot \text{deg}\) | Evaluate polynomial at given points. Cost: 2 * m * deg FLOPs (Horner's method). |
random.beta |
varies | Sampling; cost = numel(output). |
random.binomial |
varies | Sampling; cost = numel(output). |
random.bytes |
varies | Sampling; cost = numel(output). |
random.chisquare |
varies | Sampling; cost = numel(output). |
random.choice |
varies | Sampling; cost = numel(output) if replace, n*ceil(log2(n)) if not. |
random.dirichlet |
varies | Sampling; cost = numel(output). |
random.exponential |
varies | Sampling; cost = numel(output). |
random.f |
varies | Sampling; cost = numel(output). |
random.gamma |
varies | Sampling; cost = numel(output). |
random.geometric |
varies | Sampling; cost = numel(output). |
random.gumbel |
varies | Sampling; cost = numel(output). |
random.hypergeometric |
varies | Sampling; cost = numel(output). |
random.laplace |
varies | Sampling; cost = numel(output). |
random.logistic |
varies | Sampling; cost = numel(output). |
random.lognormal |
varies | Sampling; cost = numel(output). |
random.logseries |
varies | Sampling; cost = numel(output). |
random.multinomial |
varies | Sampling; cost = numel(output). |
random.multivariate_normal |
varies | Sampling; cost = numel(output). |
random.negative_binomial |
varies | Sampling; cost = numel(output). |
random.noncentral_chisquare |
varies | Sampling; cost = numel(output). |
random.noncentral_f |
varies | Sampling; cost = numel(output). |
random.normal |
varies | Sampling; cost = numel(output). |
random.pareto |
varies | Sampling; cost = numel(output). |
random.permutation |
varies | Shuffle; cost = n*ceil(log2(n)). |
random.poisson |
varies | Sampling; cost = numel(output). |
random.power |
varies | Sampling; cost = numel(output). |
random.rand |
varies | Sampling; cost = numel(output). |
random.randint |
varies | Sampling; cost = numel(output). |
random.randn |
varies | Sampling; cost = numel(output). |
random.random |
varies | Sampling; cost = numel(output). |
random.random_integers |
varies | Sampling; cost = numel(output). |
random.random_sample |
varies | Sampling; cost = numel(output). |
random.ranf |
varies | Sampling; cost = numel(output). |
random.rayleigh |
varies | Sampling; cost = numel(output). |
random.sample |
varies | Sampling; cost = numel(output). |
random.shuffle |
varies | Shuffle; cost = n*ceil(log2(n)). |
random.standard_cauchy |
varies | Sampling; cost = numel(output). |
random.standard_exponential |
varies | Sampling; cost = numel(output). |
random.standard_gamma |
varies | Sampling; cost = numel(output). |
random.standard_normal |
varies | Sampling; cost = numel(output). |
random.standard_t |
varies | Sampling; cost = numel(output). |
random.triangular |
varies | Sampling; cost = numel(output). |
random.uniform |
varies | Sampling; cost = numel(output). |
random.vonmises |
varies | Sampling; cost = numel(output). |
random.wald |
varies | Sampling; cost = numel(output). |
random.weibull |
varies | Sampling; cost = numel(output). |
random.zipf |
varies | Sampling; cost = numel(output). |
roots |
\(10n^3\) | Return roots of polynomial with given coefficients. Cost: \(10n^3\) FLOPs (companion matrix eig). |
searchsorted |
varies | Binary search; cost = m*ceil(log2(n)). |
setdiff1d |
varies | Set difference; cost = (n+m)*ceil(log2(n+m)). |
setxor1d |
varies | Symmetric set difference; cost = (n+m)*ceil(log2(n+m)). |
sort |
varies | Comparison sort; cost = n*ceil(log2(n)) per slice. |
tensordot |
\(\prod_i d_i\) | Tensor dot product along specified axes. |
trace |
varies | Diagonal sum; cost = min(n,m). |
trapezoid |
\(\text{numel}(\text{input})\) | Integrate using the trapezoidal rule. |
trapz |
\(\text{numel}(\text{input})\) | Alias for trapezoid (deprecated). |
union1d |
varies | Set union; cost = (n+m)*ceil(log2(n+m)). |
unique |
varies | Sort-based unique; cost = n*ceil(log2(n)). |
unique_all |
varies | Sort-based unique; cost = n*ceil(log2(n)). |
unique_counts |
varies | Sort-based unique; cost = n*ceil(log2(n)). |
unique_inverse |
varies | Sort-based unique; cost = n*ceil(log2(n)). |
unique_values |
varies | Sort-based unique; cost = n*ceil(log2(n)). |
unwrap |
\(\text{numel}(\text{input})\) | Phase unwrap. Cost: \(\text{numel}(\text{input})\) (diff + conditional adjustment). |
vander |
varies | Vandermonde matrix; cost = len(x)*(N-1). |
vdot |
\(n\) | Dot product with conjugation; cost = 2*N. |
Free Operations (complete list)
append, arange, argwhere, array, array_split, asarray, asarray_chkfinite, astype
atleast_1d, atleast_2d, atleast_3d, base_repr, binary_repr, block, bmat, broadcast_arrays
broadcast_shapes, broadcast_to, can_cast, choose, column_stack, common_type, compress, concat
concatenate, copy, copyto, delete, diag, diag_indices, diag_indices_from, diagflat
diagonal, dsplit, dstack, empty, empty_like, expand_dims, extract, eye
fft.fftfreq, fft.fftshift, fft.ifftshift, fft.rfftfreq, fill_diagonal, flatnonzero, flip, fliplr
flipud, from_dlpack, frombuffer, fromfile, fromfunction, fromiter, fromregex, fromstring
full, full_like, hsplit, hstack, identity, indices, insert, isdtype
isfinite, isfortran, isinf, isnan, isscalar, issubdtype, iterable, ix_
linalg.diagonal, linalg.matrix_transpose, linspace, mask_indices, matrix_transpose, may_share_memory, meshgrid, min_scalar_type
mintypecode, moveaxis, ndim, nonzero, ones, ones_like, packbits, pad
permute_dims, place, promote_types, put, put_along_axis, putmask, random.default_rng, random.get_state
random.seed, random.set_state, ravel, ravel_multi_index, repeat, require, reshape, resize
result_type, roll, rollaxis, rot90, row_stack, select, shape, shares_memory
size, split, squeeze, stack, swapaxes, take, take_along_axis, tile
transpose, tri, tril, tril_indices, tril_indices_from, trim_zeros, triu, triu_indices
triu_indices_from, typename, unpackbits, unravel_index, unstack, vsplit, vstack, where
zeros, zeros_like
Blocked Operations (complete list)
apply_along_axis, apply_over_axes, array2string, array_repr, array_str, asmatrix, busday_count, busday_offset
datetime_as_string, datetime_data, format_float_positional, format_float_scientific, frompyfunc, genfromtxt, get_include, getbufsize
geterr, geterrcall, is_busday, load, loadtxt, nested_iters, piecewise, save
savetxt, savez, savez_compressed, setbufsize, seterr, seterrcall, show_config, show_runtime