Skip to content

FLOP Cost Cheat Sheet

mechestim is NOT NumPy. All computation requires a BudgetContext. Some operations cost FLOPs, some are free, and 32 are blocked entirely.

Key Constraints

  • All counted operations require an active BudgetContext
  • Budget is checked before execution — exceeding it raises BudgetExhaustedError
  • 32 operations are blocked (I/O, config, state functions)
  • sort, argsort, trace, and random sampling are counted with analytical FLOP costs

Cost by Category

Category Count Cost Formula
Free 138 \(0\)
Unary 73 \(\text{numel}(\text{output})\)
Binary 45 \(\text{numel}(\text{output})\)
Reduction 37 \(\text{numel}(\text{input})\)
Custom 157 varies
Blocked 32 N/A

Custom Operation Costs

Operation Cost Formula Notes
allclose varies Element-wise tolerance check; cost = numel(a).
argpartition varies Indirect partition; cost = n per slice.
argsort varies Indirect sort; cost = n*ceil(log2(n)) per slice.
array_equal varies Element-wise equality; cost = numel(a).
array_equiv varies Element-wise equivalence; cost = numel(a).
bartlett \(n\) Bartlett window. Cost: n (one linear eval per sample).
bincount varies Integer counting; cost = numel(x).
blackman \(3n\) Blackman window. Cost: 3*n (three cosine terms per sample).
clip \(\text{numel}(\text{input})\) Clip array to [a_min, a_max] element-wise.
convolve \(n \cdot m\) 1-D discrete convolution.
corrcoef \(n^2 \cdot m\) Pearson correlation coefficients.
correlate \(n \cdot m\) 1-D cross-correlation.
cov \(n^2 \cdot m\) Covariance matrix.
cross \(\text{numel}(\text{output})\) Cross product of two 3-D vectors.
diff \(\text{numel}(\text{input})\) n-th discrete difference along axis.
digitize varies Bin search; cost = n*ceil(log2(bins)).
dot \(2 \cdot m \cdot k \cdot n\) Dot product; cost = 2MN*K for matrix multiply.
ediff1d \(\text{numel}(\text{input})\) Differences between consecutive elements.
einsum \(\text{op\_factor} \cdot \prod_i d_i\) Generalized Einstein summation.
einsum_path \(0\) Optimize einsum contraction path (no numeric output).
fft.fft \(5n \cdot \lceil\log_2 n\rceil\) 1-D complex FFT. Cost: 5nceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.fft2 \(5N \cdot \lceil\log_2 N\rceil\) 2-D complex FFT. Cost: 5Nceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.fftn \(5N \cdot \lceil\log_2 N\rceil\) N-D complex FFT. Cost: 5Nceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.hfft \(5n \cdot \lceil\log_2 n\rceil\) FFT of Hermitian-symmetric signal. Cost: 5n_outceil(log2(n_out)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.ifft \(5n \cdot \lceil\log_2 n\rceil\) Inverse 1-D complex FFT. Cost: 5nceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.ifft2 \(5N \cdot \lceil\log_2 N\rceil\) Inverse 2-D complex FFT. Cost: 5Nceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.ifftn \(5N \cdot \lceil\log_2 N\rceil\) Inverse N-D complex FFT. Cost: 5Nceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.ihfft \(5n \cdot \lceil\log_2 n\rceil\) Inverse FFT of Hermitian signal. Cost: 5nceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.irfft \(5(n/2) \cdot \lceil\log_2 n\rceil\) Inverse 1-D real FFT. Cost: 5(n//2)ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.irfft2 \(5(N/2) \cdot \lceil\log_2 N\rceil\) Inverse 2-D real FFT. Cost: 5(N//2)ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.irfftn \(5(N/2) \cdot \lceil\log_2 N\rceil\) Inverse N-D real FFT. Cost: 5(N//2)ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.rfft \(5(n/2) \cdot \lceil\log_2 n\rceil\) 1-D real FFT. Cost: 5(n//2)ceil(log2(n)) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.rfft2 \(5(N/2) \cdot \lceil\log_2 N\rceil\) 2-D real FFT. Cost: 5(N//2)ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
fft.rfftn \(5(N/2) \cdot \lceil\log_2 N\rceil\) N-D real FFT. Cost: 5(N//2)ceil(log2(N)), N=prod(s) (Cooley-Tukey radix-2; Van Loan 1992 §1.4).
geomspace varies Geometric-spaced generation; cost = num.
gradient \(\text{numel}(\text{input})\) Gradient using central differences.
hamming \(n\) Hamming window. Cost: n (one cosine per sample).
hanning \(n\) Hanning window. Cost: n (one cosine per sample).
histogram varies Binning; cost = n*ceil(log2(bins)).
histogram2d varies 2D binning; cost = n*(ceil(log2(bx))+ceil(log2(by))).
histogram_bin_edges varies Bin edge computation; cost = numel(a).
histogramdd varies ND binning; cost = n*sum(ceil(log2(b_i))).
in1d varies Set membership; cost = (n+m)*ceil(log2(n+m)).
inner \(n\) Inner product; cost = 2N for 1-D, 2N*M for n-D.
interp \(n \cdot \log m\) 1-D linear interpolation.
intersect1d varies Set intersection; cost = (n+m)*ceil(log2(n+m)).
isin varies Set membership; cost = (n+m)*ceil(log2(n+m)).
kaiser \(3n\) Kaiser window. Cost: 3*n (Bessel function eval per sample).
kron \(m_1 m_2 \cdot n_1 n_2\) Kronecker product; cost proportional to output size.
lexsort varies Multi-key sort; cost = knceil(log2(n)).
linalg.cholesky \(n^3 / 3\) Cholesky decomposition. Cost: \(n^3/3\) (Golub & Van Loan §4.2).
linalg.cond \(m \cdot n \cdot \min(m,n)\) Condition number. Cost: mnmin(m,n) (via SVD).
linalg.cross delegates to cross Delegates to me.cross which charges numel(output) FLOPs.
linalg.det \(n^3\) Determinant. Cost: \(n^3\) (LU factorization).
linalg.eig \(10n^3\) Eigendecomposition. Cost: \(10n^3\) (Francis QR, Golub & Van Loan §7.5).
linalg.eigh \(4n^3 / 3\) Symmetric eigendecomposition. Cost: \((4/3)n^3\) (Golub & Van Loan §8.3).
linalg.eigvals \(10n^3\) Eigenvalues only. Cost: \(10n^3\) (same as eig).
linalg.eigvalsh \(4n^3 / 3\) Symmetric eigenvalues. Cost: \((4/3)n^3\) (same as eigh).
linalg.inv \(n^3\) Matrix inverse. Cost: \(n^3\) (LU + solve).
linalg.lstsq \(m \cdot n \cdot \min(m,n)\) Least squares. Cost: mnmin(m,n) (LAPACK gelsd/SVD).
linalg.matmul delegates to matmul Delegates to me.matmul which charges 2*m*k*n FLOPs.
linalg.matrix_norm varies Matrix norm. Cost depends on ord: 2numel for Frobenius, mn*min(m,n) for ord=2.
linalg.matrix_power \(\lfloor\log_2 k\rfloor \cdot n^3\) Matrix power. Cost: \((\lfloor\log_2 k\rfloor + \text{popcount}(k) - 1) \cdot n^3\) (exponentiation by squaring).
linalg.matrix_rank \(m \cdot n \cdot \min(m,n)\) Matrix rank. Cost: mnmin(m,n) (via SVD).
linalg.multi_dot optimal chain Chain matmul. Cost: sum of optimal chain matmul costs (CLRS §15.2).
linalg.norm varies Norm. Cost depends on ord: numel for L1/inf, 2numel for Frobenius, mn*min(m,n) for ord=2.
linalg.outer delegates to outer Delegates to me.outer which charges m*n FLOPs.
linalg.pinv \(m \cdot n \cdot \min(m,n)\) Pseudoinverse. Cost: mnmin(m,n) (via SVD).
linalg.qr \(2mn^2 - 2n^3/3\) QR decomposition. Cost: \(2mn^2 - (2/3)n^3\) (Golub & Van Loan §5.2).
linalg.slogdet \(n^3\) Sign + log determinant. Cost: \(n^3\) (LU factorization).
linalg.solve \(2n^3/3 + n^2 \cdot n_{\text{rhs}}\) Solve Ax=b. Cost: \(2n^3/3\) (LU) + \(n^2 \cdot n_{\text{rhs}}\) (back-substitution).
linalg.svd \(m \cdot n \cdot k\) Singular value decomposition; cost ~ O(min(m,n)mn).
linalg.svdvals \(m \cdot n \cdot \min(m,n)\) Singular values only. Cost: mnmin(m,n) (Golub-Reinsch).
linalg.tensordot delegates to tensordot Delegates to me.tensordot which charges FLOPs based on contraction.
linalg.tensorinv \(n^3\) Tensor inverse. Cost: \(n^3\) after reshape (delegates to inv).
linalg.tensorsolve \(n^3\) Tensor solve. Cost: \(n^3\) after reshape (delegates to solve).
linalg.trace \(n\) Matrix trace. Cost: n (sum of diagonal elements).
linalg.vecdot delegates to vecdot Delegates to me.vecdot which charges 2*n FLOPs.
linalg.vector_norm \(n\) or \(2n\) Vector norm. Cost: numel (or 2*numel for general p-norm).
logspace varies Log-spaced generation; cost = num.
matmul \(2 \cdot m \cdot k \cdot n\) Matrix multiplication; cost = 2MN*K.
outer \(m \cdot n\) Outer product of two vectors; cost = M*N.
partition varies Quickselect; cost = n per slice.
poly \(n^2\) Polynomial from roots. Cost: \(n^2\) FLOPs.
polyadd \(\max(n_1, n_2)\) Add two polynomials. Cost: max(n1, n2) FLOPs.
polyder \(n\) Differentiate polynomial. Cost: n FLOPs.
polydiv \(n_1 \cdot n_2\) Divide one polynomial by another. Cost: n1 * n2 FLOPs.
polyfit \(2m \cdot (\text{deg}+1)^2\) Least squares polynomial fit. Cost: 2 * m * (deg+1)^2 FLOPs.
polyint \(n\) Integrate polynomial. Cost: n FLOPs.
polymul \(n_1 \cdot n_2\) Multiply polynomials. Cost: n1 * n2 FLOPs.
polysub \(\max(n_1, n_2)\) Difference (subtraction) of two polynomials. Cost: max(n1, n2) FLOPs.
polyval \(2 \cdot m \cdot \text{deg}\) Evaluate polynomial at given points. Cost: 2 * m * deg FLOPs (Horner's method).
random.beta varies Sampling; cost = numel(output).
random.binomial varies Sampling; cost = numel(output).
random.bytes varies Sampling; cost = numel(output).
random.chisquare varies Sampling; cost = numel(output).
random.choice varies Sampling; cost = numel(output) if replace, n*ceil(log2(n)) if not.
random.dirichlet varies Sampling; cost = numel(output).
random.exponential varies Sampling; cost = numel(output).
random.f varies Sampling; cost = numel(output).
random.gamma varies Sampling; cost = numel(output).
random.geometric varies Sampling; cost = numel(output).
random.gumbel varies Sampling; cost = numel(output).
random.hypergeometric varies Sampling; cost = numel(output).
random.laplace varies Sampling; cost = numel(output).
random.logistic varies Sampling; cost = numel(output).
random.lognormal varies Sampling; cost = numel(output).
random.logseries varies Sampling; cost = numel(output).
random.multinomial varies Sampling; cost = numel(output).
random.multivariate_normal varies Sampling; cost = numel(output).
random.negative_binomial varies Sampling; cost = numel(output).
random.noncentral_chisquare varies Sampling; cost = numel(output).
random.noncentral_f varies Sampling; cost = numel(output).
random.normal varies Sampling; cost = numel(output).
random.pareto varies Sampling; cost = numel(output).
random.permutation varies Shuffle; cost = n*ceil(log2(n)).
random.poisson varies Sampling; cost = numel(output).
random.power varies Sampling; cost = numel(output).
random.rand varies Sampling; cost = numel(output).
random.randint varies Sampling; cost = numel(output).
random.randn varies Sampling; cost = numel(output).
random.random varies Sampling; cost = numel(output).
random.random_integers varies Sampling; cost = numel(output).
random.random_sample varies Sampling; cost = numel(output).
random.ranf varies Sampling; cost = numel(output).
random.rayleigh varies Sampling; cost = numel(output).
random.sample varies Sampling; cost = numel(output).
random.shuffle varies Shuffle; cost = n*ceil(log2(n)).
random.standard_cauchy varies Sampling; cost = numel(output).
random.standard_exponential varies Sampling; cost = numel(output).
random.standard_gamma varies Sampling; cost = numel(output).
random.standard_normal varies Sampling; cost = numel(output).
random.standard_t varies Sampling; cost = numel(output).
random.triangular varies Sampling; cost = numel(output).
random.uniform varies Sampling; cost = numel(output).
random.vonmises varies Sampling; cost = numel(output).
random.wald varies Sampling; cost = numel(output).
random.weibull varies Sampling; cost = numel(output).
random.zipf varies Sampling; cost = numel(output).
roots \(10n^3\) Return roots of polynomial with given coefficients. Cost: \(10n^3\) FLOPs (companion matrix eig).
searchsorted varies Binary search; cost = m*ceil(log2(n)).
setdiff1d varies Set difference; cost = (n+m)*ceil(log2(n+m)).
setxor1d varies Symmetric set difference; cost = (n+m)*ceil(log2(n+m)).
sort varies Comparison sort; cost = n*ceil(log2(n)) per slice.
tensordot \(\prod_i d_i\) Tensor dot product along specified axes.
trace varies Diagonal sum; cost = min(n,m).
trapezoid \(\text{numel}(\text{input})\) Integrate using the trapezoidal rule.
trapz \(\text{numel}(\text{input})\) Alias for trapezoid (deprecated).
union1d varies Set union; cost = (n+m)*ceil(log2(n+m)).
unique varies Sort-based unique; cost = n*ceil(log2(n)).
unique_all varies Sort-based unique; cost = n*ceil(log2(n)).
unique_counts varies Sort-based unique; cost = n*ceil(log2(n)).
unique_inverse varies Sort-based unique; cost = n*ceil(log2(n)).
unique_values varies Sort-based unique; cost = n*ceil(log2(n)).
unwrap \(\text{numel}(\text{input})\) Phase unwrap. Cost: \(\text{numel}(\text{input})\) (diff + conditional adjustment).
vander varies Vandermonde matrix; cost = len(x)*(N-1).
vdot \(n\) Dot product with conjugation; cost = 2*N.

Free Operations (complete list)

append, arange, argwhere, array, array_split, asarray, asarray_chkfinite, astype atleast_1d, atleast_2d, atleast_3d, base_repr, binary_repr, block, bmat, broadcast_arrays broadcast_shapes, broadcast_to, can_cast, choose, column_stack, common_type, compress, concat concatenate, copy, copyto, delete, diag, diag_indices, diag_indices_from, diagflat diagonal, dsplit, dstack, empty, empty_like, expand_dims, extract, eye fft.fftfreq, fft.fftshift, fft.ifftshift, fft.rfftfreq, fill_diagonal, flatnonzero, flip, fliplr flipud, from_dlpack, frombuffer, fromfile, fromfunction, fromiter, fromregex, fromstring full, full_like, hsplit, hstack, identity, indices, insert, isdtype isfinite, isfortran, isinf, isnan, isscalar, issubdtype, iterable, ix_ linalg.diagonal, linalg.matrix_transpose, linspace, mask_indices, matrix_transpose, may_share_memory, meshgrid, min_scalar_type mintypecode, moveaxis, ndim, nonzero, ones, ones_like, packbits, pad permute_dims, place, promote_types, put, put_along_axis, putmask, random.default_rng, random.get_state random.seed, random.set_state, ravel, ravel_multi_index, repeat, require, reshape, resize result_type, roll, rollaxis, rot90, row_stack, select, shape, shares_memory size, split, squeeze, stack, swapaxes, take, take_along_axis, tile transpose, tri, tril, tril_indices, tril_indices_from, trim_zeros, triu, triu_indices triu_indices_from, typename, unpackbits, unravel_index, unstack, vsplit, vstack, where zeros, zeros_like

Blocked Operations (complete list)

apply_along_axis, apply_over_axes, array2string, array_repr, array_str, asmatrix, busday_count, busday_offset datetime_as_string, datetime_data, format_float_positional, format_float_scientific, frompyfunc, genfromtxt, get_include, getbufsize geterr, geterrcall, is_busday, load, loadtxt, nested_iters, piecewise, save savetxt, savez, savez_compressed, setbufsize, seterr, seterrcall, show_config, show_runtime