Changelog
Unreleased
Fixed
`fnp.random.default_rng()` and `fnp.random.RandomState()` now properly count FLOPs. Sampler methods on the returned objects (e.g. `rng.standard_normal()`, `rs.randn()`) deduct FLOPs from the active budget and return `FlopscopeArray` instead of raw `numpy.ndarray`. Previously these silently bypassed FLOP accounting, a real risk for the ARC Whitebox Estimation Challenge, since submissions could burn arbitrary compute without deducting a single FLOP. Closes flopscope#18.
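The accounting pattern described above can be sketched in plain Python. This is a minimal illustration, not flopscope's implementation: `Budget`, `CountedRandom`, and the one-FLOP-per-sample cost are hypothetical, and the standard-library `random` module stands in for NumPy's generators (so the sketch returns a list rather than a `FlopscopeArray`).

```python
import random


class Budget:
    """Hypothetical stand-in for a FLOP budget (not flopscope's BudgetContext)."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def deduct(self, flops):
        # Refuse the charge if it would exceed the limit.
        if self.used + flops > self.limit:
            raise RuntimeError(f"budget exhausted: {self.used + flops} > {self.limit}")
        self.used += flops


class CountedRandom:
    """Hypothetical sampler wrapper that charges the budget before drawing."""

    def __init__(self, budget, seed=None, flops_per_sample=1):
        self._budget = budget
        self._rng = random.Random(seed)
        # Assumed cost model: a flat per-sample charge.
        self._flops_per_sample = flops_per_sample

    def standard_normal(self, size):
        # Charge first, so an over-budget draw never runs.
        self._budget.deduct(size * self._flops_per_sample)
        return [self._rng.gauss(0.0, 1.0) for _ in range(size)]


budget = Budget(limit=100)
rng = CountedRandom(budget, seed=0)
samples = rng.standard_normal(10)
print(len(samples), budget.used)
```

Charging before sampling means a draw that would exceed the budget fails without doing any work, which is the property the fix restores for generator methods.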
Changed
`fnp.random.__getattr__` no longer silently forwards unknown attributes to `numpy.random`. Bit-generator classes (`BitGenerator`, `MT19937`, `PCG64`, `PCG64DXSM`, `Philox`, `SFC64`) pass through unchanged. Anything else now raises `AttributeError` with a pointer to `default_rng()`. Use `numpy.random` directly if you need an unwrapped or unsupported function. Module-level samplers (`fnp.random.randn`, `normal`, `uniform`, …) are unchanged: same semantics as numpy, no warnings.
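A strict attribute lookup of this kind might look like the sketch below. It assumes an allowlist approach; `backend`, `strict_getattr`, and the error wording are hypothetical, and a `SimpleNamespace` stands in for `numpy.random` so the example runs without NumPy. In a real module the function would be named `__getattr__` (PEP 562).

```python
import types

# Stand-in for numpy.random; the attribute values are placeholders.
backend = types.SimpleNamespace(
    PCG64=object(),
    Philox=object(),
    some_unwrapped_sampler=lambda: 0.0,
)

# Names taken from the changelog entry above.
_BIT_GENERATORS = frozenset(
    {"BitGenerator", "MT19937", "PCG64", "PCG64DXSM", "Philox", "SFC64"}
)


def strict_getattr(name):
    """Forward only allowlisted bit-generator classes; reject everything else."""
    if name in _BIT_GENERATORS and hasattr(backend, name):
        return getattr(backend, name)
    raise AttributeError(
        f"random has no counted attribute {name!r}; "
        "use default_rng() for counted sampling, "
        "or call numpy.random directly for unwrapped functions"
    )


print(strict_getattr("PCG64") is backend.PCG64)
```

Failing loudly on unknown names keeps an uncounted sampler from slipping through just because the backend happens to define it.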
Notes
- Downstream repos (`whestbench`, `whest-starterkit`) can now drop the `fnp.asarray(rng.uniform(...).astype(...))` workaround around `default_rng`/`RandomState` sampler outputs; the wrap is no longer needed.
Added
- Method-level registry entries for `random.Generator.<method>` and `random.RandomState.<method>` (~94 entries with categories `counted_random_method` / `free_random_method` and a `cost_formula` field). `scripts/numpy_audit.py` now drift-checks the new slice on every numpy version bump, so future numpy releases that add a new sampler method will fail the audit until the maintainer adds a registry entry.
- NumPy `__array_ufunc__` (NEP 13) and `__array_function__` (NEP 18) protocols on `FlopscopeArray`. Calls like `np.add(flopscope, x)`, `np.add.reduce(a)`, `np.add.outer(a, b)`, `np.divmod(a, b)`, `np.modf(a)`, `np.frexp(a)`, `np.add.at(a, idx, val)`, `np.add.reduceat(a, idx)`, plus ~108 function-form callables (`np.sort`, `np.transpose`, `np.linalg.solve`, …) now route through flopscope's FLOP-counted wrappers automatically. Closes #58 (ndarray methods bypassing tracking), #38 (in-place dunders on `SymmetricTensor` rebinding instead of mutating), and #62 (no-symmetry `SymmetricTensor` type ambiguity). See the new "Edge cases when SymmetricTensors meet NumPy protocols" section for the corner-case rules.
- 25 ndarray method overrides on `FlopscopeArray`. `a.sum()`, `a.dot(b)`, `a.argsort()`, `a.compress()`, `a.trace()`, `a.round()`, `a.clip()`, etc. now produce the same FLOP count as `fnp.sum(a)`, `fnp.dot(a, b)`, etc.
- In-place dunder rewrites with symmetry-corruption guards. `A_sym += B` mutates `A_sym` in place when the result preserves `A_sym`'s declared symmetry, and raises `ValueError` when it would weaken or destroy it (instead of silently rebinding to a new array, the pre-#67 behaviour). Covers `__iadd__`, `__isub__`, `__imul__`, `__itruediv__`, `__ifloordiv__`, `__imod__`, `__ipow__`, `__iand__`, `__ior__`, `__ixor__`, `__ilshift__`, `__irshift__`, `__imatmul__`. In-place `sort`/`partition` similarly refuse on `SymmetricTensor`.
- Multi-output ufuncs `np.divmod` / `np.frexp` / `np.modf` now route through flopscope with full `out=(o1, o2)` support, including partial allocation (`out=(o1, None)`). Both outputs preserve any symmetry the input had.
- Symmetry-aware cost adjustment for `ufunc.outer` and `tensordot`: the placeholder model charges `dense_cost × unique_output_elements / dense_output_elements` to reflect the savings a symmetry-aware implementation could realise. Above `SymmetryGroup` degree 12, the adjustment is skipped (Burnside enumeration on `S_n` for `n > 12` is infeasible) and a new `CostFallbackWarning` fires once per `(op_name, degree)` pair per process. Suppress via `flops.configure(symmetry_warnings=False)` (shares the flag with `SymmetryLossWarning`).
- `CostFallbackWarning` added to both the core library and the client package. Subclass of `FlopscopeWarning`.
- Wall-clock time limits. `BudgetContext` now accepts `wall_time_limit_s` to set a wall-clock deadline. When exceeded, `TimeExhaustedError` is raised at the next operation boundary with diagnostic info (operation name, elapsed time, limit). The deadline is checked both before and after each numpy call (cooperative enforcement). The entry banner shows the time limit when set.
- Timing attribution. Every operation now records its backend duration in `flopscope_backend_duration_s` and its attributed flopscope overhead in `flopscope_overhead_duration_s`. The budget summary (both plain-text and Rich) shows `wall_time_s`, `flopscope_backend_time_s`, `flopscope_overhead_time_s`, and `residual_wall_time_s`. Use `budget.summary()` or `flops.budget_summary()` to see the timing data.
- `TimeExhaustedError` added to both the core library and the client package.
- Einsum path caching. Contraction paths are now cached in a module-level LRU cache (default 4096 entries). Repeated `fnp.einsum()` calls with the same subscripts, shapes, optimizer, and symmetry structure reuse the cached path instead of recomputing it. New public API: `fnp.clear_einsum_cache()`, `fnp.einsum_cache_info()`, and `flops.configure(einsum_path_cache_size=N)`.
- Multi-version NumPy support. flopscope now supports NumPy 2.0, 2.1, and 2.2 (`>=2.0.0,<2.3.0`). Default install resolves to NumPy 2.2. Functions not available in older NumPy versions raise `UnsupportedFunctionError` with an actionable message at call time (not import time).
- `matvec` and `vecmat`: new FLOP-counted wrappers for NumPy 2.2's matrix-vector and vector-matrix product ufuncs. Cost = `output_size * contracted_axis` (weight 1.0).
- `UnsupportedFunctionError`: new exception for calling functions that require a newer NumPy version than what's installed.
- CI NumPy version matrix: tests now run against NumPy 2.0, 2.1, and 2.2.
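The cooperative wall-clock enforcement described in the list above (a deadline checked before and after each call) can be sketched as follows. This is only an illustration under stated assumptions: `Deadline`, `TimeExhausted`, and the injectable `clock` argument are hypothetical names, not flopscope's `BudgetContext` or `TimeExhaustedError`.

```python
import time


class TimeExhausted(RuntimeError):
    """Hypothetical stand-in for a time-limit error."""


class Deadline:
    def __init__(self, limit_s, clock=time.monotonic):
        self.limit_s = limit_s
        self._clock = clock  # injectable so the sketch can be tested deterministically
        self._start = clock()

    def check(self, op_name):
        elapsed = self._clock() - self._start
        if elapsed > self.limit_s:
            raise TimeExhausted(
                f"{op_name}: elapsed {elapsed:.3f}s exceeds limit {self.limit_s:.3f}s"
            )

    def run(self, op_name, fn, *args):
        # Cooperative enforcement: the call itself is never interrupted,
        # the deadline is only inspected at operation boundaries.
        self.check(op_name)
        result = fn(*args)
        self.check(op_name)
        return result


deadline = Deadline(limit_s=60.0)
total = deadline.run("add", lambda a, b: a + b, 1, 2)
print(total)
```

Because the check is cooperative, a single long-running call can overshoot the limit; the error is raised at the next boundary rather than mid-operation.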
Changed
- Renamed package from `mechestim` to `flopscope` to reflect the new challenge name "ARC Whitebox Estimation Challenge". The import convention changes from `import mechestim as me` to `import flopscope as we`.
- Symmetric BLAS classification restored. Pairwise contractions with symmetric inputs now correctly report `SYMM`, `SYMV`, or `SYDT` BLAS types instead of the generic `GEMM`, `GEMV`, `DOT`. This was disabled during the subgraph-symmetry refactor because per-input symmetry wasn't being looked up; now each step's inputs are queried via `symmetry_oracle.sym(ssa_to_subset[ssa_id])` before calling `can_blas`.
- Symmetry detection rewritten: the induced-symmetry mechanism is replaced by a subset-keyed subgraph symmetry oracle (`SubgraphSymmetryOracle`). The oracle analyses the bipartite structure of the einsum expression, evaluates symmetry lazily per operand subset, and caches results. This correctly handles intermediates (not just the top-level contraction) and eliminates over-eager per-step propagation.
- Every optimizer is symmetry-aware: the `symmetry_oracle` kwarg is plumbed through `_PATH_OPTIONS` so that optimal, branch-*, greedy, random-greedy, and dynamic-programming algorithms all receive symmetry information and use the exact `unique/dense` ratio for scoring. DP uses a subset-keyed ratio cache (`get_ratio(s, legs)`) co-located with the existing `bitmap_to_subset` closure inside `DynamicProgramming.__call__`, amortizing the int↔str label translation across all `_dp_compare_*` helper calls for a given subset. Previously only greedy received symmetry info in some code paths.
- Silent fallback deleted: the previous code silently fell back to dense costs when detection produced no result. The oracle now enforces that symmetry information is consumed. Enforcement is verified by `tests/test_no_silent_symmetry_drop.py`.
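The "lazy per operand subset, cached" behaviour mentioned in the list above follows a common memoisation pattern, sketched here. Everything in the block is hypothetical (`LazySubsetCache`, `toy_symmetry`, and the toy rule it applies); it shows only the subset-keyed caching idea, not how `SubgraphSymmetryOracle` actually derives symmetry.

```python
class LazySubsetCache:
    """Hypothetical subset-keyed cache: compute on first request, reuse afterwards."""

    def __init__(self, compute):
        self._compute = compute
        self._cache = {}
        self.misses = 0

    def sym(self, subset):
        # frozenset makes the key order-independent and hashable.
        key = frozenset(subset)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._compute(key)
        return self._cache[key]


def toy_symmetry(subset):
    # Placeholder rule for illustration only; a real oracle would
    # analyse the structure of the einsum expression instead.
    return "symmetric" if len(subset) > 1 else None


oracle = LazySubsetCache(toy_symmetry)
first = oracle.sym({0, 1})
second = oracle.sym([1, 0])  # same subset, different order: served from cache
single = oracle.sym({2})
print(first, second, single, oracle.misses)
```

Keying on the operand subset rather than on the contraction step means an intermediate reached along different paths is analysed only once.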
Removed
- `symmetric_flop_count`'s `input_symmetries` parameter (high-level API)
- `propagate_symmetry` and related helpers
- `_detect_induced_output_symmetry` and related helpers
- `induced_output_symmetry` kwarg on `contract_path`
Fixed
- `bitmap_to_subset` in DP now correctly handles operand renumbering. Previously, when `_dp_parse_out_single_term_ops` removed or renumbered operands before the DP loop (e.g., on `einsum('i,ab,cd->abcd', v, X, X)` where `v` has a unique index that reduces to a scalar), the bitmap-to-subset mapping would point at the wrong original operand positions, causing the oracle to return symmetry for an unrelated intermediate. This bug was latent under the conservative 2× heuristic and only surfaces with exact ratio scoring.
- Heterogeneous block dimensions in `unique_elements`: the stars-and-bars block-cardinality calculation assumed all axes within a block had the same dimension. For rectangular block-symmetric tensors (e.g. `einsum('ab,cd->abcd', X, X)` with `X` of shape `(3, 4)`), it computed `n**s = 3**2 = 9` instead of the correct product `3*4 = 12`, silently underestimating the unique-element count by up to ~8× on rank-3 cases. `block_card` is now computed as `prod(size_dict[c] for c in blocks[0])`, which reduces to the old formula for per-index groups and gives the correct product for block groups with differing axis sizes.
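To make the block-cardinality fix above concrete, here is a small worked sketch. The `block_card` expression is the one quoted in the entry; the surrounding function and the stars-and-bars count `C(block_card + k - 1, k)` for `k` interchangeable blocks are assumptions about how the count is assembled, not the library's code.

```python
from math import comb, prod


def unique_elements(blocks, size_dict):
    """Sketch: unique entries of a tensor symmetric under swapping k equal blocks."""
    k = len(blocks)
    # Cardinality of one block: the product of its axis sizes,
    # so rectangular blocks such as (3, 4) are handled correctly.
    block_card = prod(size_dict[c] for c in blocks[0])
    # Stars and bars: multisets of size k drawn from block_card values.
    return comb(block_card + k - 1, k)


sizes = {"a": 3, "b": 4, "c": 3, "d": 4}
fixed = unique_elements([("a", "b"), ("c", "d")], sizes)

# The old formula used n**s with n = 3 and s = 2 axes per block.
old_block_card = 3 ** 2
old = comb(old_block_card + 2 - 1, 2)

print(fixed, old)
```

Under these assumptions the rectangular example yields 78 unique elements out of 144 dense ones, whereas the old block cardinality of 9 would have given 45.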
Added
- Enriched `PathInfo` display: `fnp.einsum_path().format_table(verbose=False)` (called by `__str__`) now shows an `Optimizer:` header line resolving `optimize='auto'`/`'auto-hq'` to the inner choice that actually ran (e.g. `optimal`, `dynamic_programming`, `random_greedy_128`), a `contract` column giving the path-supplied contraction tuple, and a `unique/dense` column showing the bare element counts that the symmetry savings derive from. Call `format_table(verbose=True)` for an indented detail row per step showing the merged operand subset, the intermediate's output shape, and the running cumulative cost; this is the most useful view when debugging why a particular step's savings are what they are.
- New `PathInfo.optimizer_used: str` field and new `StepInfo` fields `path_indices: tuple[int, ...]` and `merged_subset: frozenset[int] | None`. The `merged_subset` field is the exact key `SubgraphSymmetryOracle.sym(...)` uses for its lookups, making the symmetry column directly attributable to the oracle's view of each intermediate.
0.2.0 (2026-04-03)
Second release with unified einsum cost model, NumPy compatibility testing, and expanded operation coverage.
New features
- Unified einsum cost model: all einsum-like operations (einsum, dot, matmul, tensordot) now share a single cost model based on opt_einsum's contraction path optimizer
- Symmetry-aware path finding: the opt_einsum path optimizer now factors symmetry savings into contraction ordering decisions, producing different (cheaper) paths for symmetric inputs
- NumPy compatibility test harness: run NumPy's own test suite against flopscope via monkeypatching; 7,300+ tests passing across 7 NumPy test modules
- Polynomial operations: `polyval`, `polyfit`, `polymul`, `polydiv`, `polyadd`, `polysub`, `poly`, `roots`, `polyder`, `polyint` with analytical FLOP costs
- Window functions: `bartlett`, `hamming`, `hanning`, `blackman`, `kaiser` with per-function cost formulas
- FFT module: `fft`, `ifft`, `rfft`, `irfft`, `fft2`, `ifft2`, `fftn`, `ifftn`, `rfftn`, `irfftn` and free helpers (`fftfreq`, `rfftfreq`, `fftshift`, `ifftshift`)
- Client-server architecture: `flopscope-client` and `flopscope-server` packages for sandboxed competition evaluation over ZMQ
- Global default budget: a 1e15 FLOP budget auto-activates on first use, so explicit `BudgetContext` is no longer required for quick scripts
- `FLOPSCOPE_DEFAULT_BUDGET` env var: configure the global default budget amount
- `budget_live()`: Rich-based live-updating budget display context manager
- `einsum_path()`: inspect contraction plans with per-step symmetry savings without spending budget
- 90%+ test coverage gate enforced in CI
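As an illustration of the global default budget and its environment override listed above, the sketch below resolves a budget from `FLOPSCOPE_DEFAULT_BUDGET` with a 1e15 fallback. The helper name `resolve_default_budget` and the accepted value format (anything `float()` can parse, such as `2e9`) are assumptions; the changelog does not specify how the library parses the variable.

```python
import os

DEFAULT_BUDGET_FLOPS = 10**15  # the 1e15 default named in the release notes


def resolve_default_budget(environ=os.environ):
    """Hypothetical helper: read the override, else fall back to the default."""
    raw = environ.get("FLOPSCOPE_DEFAULT_BUDGET")
    if raw is None or not raw.strip():
        return DEFAULT_BUDGET_FLOPS
    # Assumed format: plain or scientific notation, truncated to whole FLOPs.
    return int(float(raw))


print(resolve_default_budget({}))
print(resolve_default_budget({"FLOPSCOPE_DEFAULT_BUDGET": "2e9"}))
```

Passing the environment mapping as an argument keeps the helper easy to test without mutating `os.environ`.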
Breaking changes
- Einsum cost formula now uses `product_of_all_index_dims × op_factor` (`op_factor=2` for inner products, 1 for outer products), matching opt_einsum convention. Previously used a different formula.
- `fnp.dot` and `fnp.matmul` costs are now computed via the einsum cost model instead of separate formulas.
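The cost formula above can be worked through for a single pairwise contraction. The sketch assumes that "inner product" means at least one index is summed away (absent from the output); `pairwise_einsum_cost` is a hypothetical helper for explicit `->` subscripts, not part of flopscope or opt_einsum.

```python
from math import prod


def pairwise_einsum_cost(subscripts, size_dict):
    """Sketch: product of all index dimensions times op_factor."""
    inputs, output = subscripts.split("->")
    all_indices = set(inputs.replace(",", ""))
    contracted = all_indices - set(output)
    # Assumed rule: a multiply plus an add when something is summed,
    # a single multiply for a pure outer product.
    op_factor = 2 if contracted else 1
    return prod(size_dict[i] for i in all_indices) * op_factor


matmul_cost = pairwise_einsum_cost("ij,jk->ik", {"i": 2, "j": 3, "k": 4})
outer_cost = pairwise_einsum_cost("i,j->ij", {"i": 2, "j": 3})
print(matmul_cost, outer_cost)
```

For the 2×3 by 3×4 matrix product this gives 2·3·4·2 = 48, and for the outer product 2·3·1 = 6.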
Bug fixes
- Accept scalars and array-likes in all flopscope functions
- Fix symmetry-aware greedy algorithm to actually use symmetry in path selection
- Fix `contract_path` cost reporting for output indices
- Correctly handle `symmetric_dims` propagation through multi-step contraction paths
Documentation
- Comprehensive how-to guides for einsum, symmetry, linalg, budget planning, and debugging
- Architecture docs for client-server model and Docker deployment
- AI agent guide with `llms.txt`, `ops.json`, and cheat sheet
- NumPy compatibility testing methodology docs
0.1.0 (2026-04-01)
Initial release for warm-up round.
- Einsum with symmetry detection and FLOP counting
- Pointwise operations (exp, log, add, multiply, etc.)
- Reductions (sum, mean, max, etc.)
- SVD with truncated top-k
- Free tensor creation and manipulation ops
- Budget enforcement via BudgetContext
- FLOP cost query API
- NumPy-compatible API (`import flopscope as we`)