Competition Guide
Everything you need to compete within a FLOP budget.
You will learn:
- How to set budget limits with BudgetContext
- How to use the @flops.budget decorator form
- How wall-time limits work via wall_time_limit_s
- How to read budget summaries
- Common competition pitfalls and tips
Setting a FLOP budget
Every competition submission runs inside a FLOP budget. Use BudgetContext to declare how many FLOPs your code is allowed to spend:
import flopscope as flops
import flopscope.numpy as fnp
with flops.BudgetContext(flop_budget=50_000_000, namespace="solver") as budget:
    A = fnp.ones((256, 256))
    x = fnp.ones((256,))
    h = fnp.einsum('ij,j->i', A, x)
    h = fnp.exp(h)
    result = fnp.sum(h)
If your code exceeds the budget, Flopscope raises BudgetExhaustedError before the offending operation executes. The error message includes the cost of the failed operation and the remaining budget.
The namespace parameter sets the root namespace prefix for that budget context. Nested fnp.namespace(...) scopes extend it with dotted segments, but they do not create child budgets or split the FLOP limit into separate pools.
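The check-before-execute behaviour described above can be pictured with a small pure-Python model. This is only a sketch: `ToyBudget` and `ToyBudgetExhausted` are hypothetical names invented for illustration, not flopscope classes, and the real implementation may differ.

```python
class ToyBudgetExhausted(Exception):
    """Hypothetical stand-in for BudgetExhaustedError (not the flopscope class)."""


class ToyBudget:
    """Illustrative model of a single FLOP pool; not flopscope's implementation."""

    def __init__(self, flop_budget):
        self.flop_budget = flop_budget
        self.flops_used = 0

    def charge(self, op_name, cost, fn):
        remaining = self.flop_budget - self.flops_used
        # Check first, so an over-budget operation never runs.
        if cost > remaining:
            raise ToyBudgetExhausted(
                f"{op_name} needs {cost:,} FLOPs but only {remaining:,} remain"
            )
        self.flops_used += cost
        return fn()


budget = ToyBudget(flop_budget=70_000)
executed = []

# A 256x256 matrix-vector product costs 256 * 256 = 65,536 FLOPs and fits.
budget.charge("einsum", 256 * 256, lambda: executed.append("einsum"))

message = ""
try:
    # A second product needs another 65,536 FLOPs, but only 4,464 remain.
    budget.charge("einsum", 256 * 256, lambda: executed.append("einsum again"))
except ToyBudgetExhausted as err:
    message = str(err)
    print(message)
```

In this model the second operation is rejected without running, so `executed` still holds a single entry and the spend stays at 65,536.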
Decorator form
For cleaner code, use @flops.budget to attach a budget directly to a function:
import flopscope as flops
import flopscope.numpy as fnp
@flops.budget(flop_budget=50_000_000, namespace="forward-pass")
def forward(W, x):
    h = fnp.einsum('ij,j->i', W, x)
    h = fnp.maximum(h, 0)
    return fnp.sum(h)
result = forward(W, x)  # W and x are arrays you have already created
flops.budget_summary()
Each call to the decorated function runs inside the same BudgetContext; the namespace is only the root prefix used for attribution, not a separate budget pool. Repeated calls reuse that context and keep accumulating on the same budget and operation log.
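Because repeated calls share one budget, it helps to plan for the total number of calls rather than a single one. The sketch below models that accumulation in plain Python. `toy_budget` is a hypothetical decorator written for this example, and the per-call cost (m*n for the product, one FLOP per element for the maximum and the sum) is an assumption based on the costs quoted elsewhere in this guide.

```python
import functools


def toy_budget(flop_budget):
    """Hypothetical decorator: one shared counter per decorated function."""
    def decorate(fn):
        state = {"flop_budget": flop_budget, "flops_used": 0, "calls": 0}

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result, cost = fn(*args, **kwargs)
            # Every call adds to the same counter; nothing is reset between calls.
            state["flops_used"] += cost
            state["calls"] += 1
            return result

        wrapper.state = state
        return wrapper
    return decorate


@toy_budget(flop_budget=50_000_000)
def forward(m, n):
    # Assumed cost model: m*n for the product, m for maximum, m for sum.
    cost = m * n + m + m
    return "ok", cost


for _ in range(3):
    forward(256, 256)

remaining = forward.state["flop_budget"] - forward.state["flops_used"]
print(f"{forward.state['calls']} calls, {forward.state['flops_used']:,} used, {remaining:,} remaining")
```

Three calls at 66,048 FLOPs each consume 198,144 FLOPs from the single pool in this model.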
Wall-time limits
In addition to FLOP budgets, competitions may enforce a wall-clock time limit via wall_time_limit_s. This prevents solutions from stalling on operations that are analytically cheap but slow in practice:
with flops.BudgetContext(flop_budget=10**9, wall_time_limit_s=60.0) as budget:
    # Must finish within 60 seconds AND within 1 billion FLOPs
    ...
If the wall-clock time is exceeded, Flopscope raises TimeExhaustedError. The timer starts when the context is entered and is checked before and after each counted operation.
What wall_time_limit_s does and does not do:
- It is a BudgetContext setting, so you configure it in the same place you set flop_budget.
- It measures total wall-clock time for the active context, not FLOPs.
- It is checked cooperatively before and after counted NumPy calls, so overshoot is bounded by the duration of one NumPy call.
- It is a clean diagnostic limit inside flopscope. Hard process/container kills still belong to the outer execution environment.
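The cooperative check can be sketched with a fake clock to show why the overshoot is bounded by one call. Everything here (`FakeClock`, `ToyTimeExhausted`, `run_with_time_limit`) is hypothetical and only models the behaviour described in the list above; it is not flopscope's timer.

```python
class ToyTimeExhausted(Exception):
    """Hypothetical stand-in for TimeExhaustedError."""


class FakeClock:
    """Deterministic clock so the example does not depend on real timing."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


def run_with_time_limit(op_durations, wall_time_limit_s, clock):
    """Run ops in order, checking the deadline only before and after each op."""
    start = clock.now
    completed = 0
    for duration in op_durations:
        if clock.now - start > wall_time_limit_s:
            raise ToyTimeExhausted(f"limit hit after {completed} ops")
        clock.advance(duration)  # the op itself cannot be interrupted
        completed += 1
        if clock.now - start > wall_time_limit_s:
            raise ToyTimeExhausted(f"limit hit after {completed} ops")
    return completed


clock = FakeClock()
durations = [20.0, 30.0, 25.0, 5.0]
try:
    run_with_time_limit(durations, wall_time_limit_s=60.0, clock=clock)
    stopped = False
except ToyTimeExhausted:
    stopped = True

overshoot = clock.now - 60.0
print(stopped, clock.now, overshoot)
```

The third operation starts at 50 seconds, which is within the limit, and finishes at 75 seconds. The error is raised only after it returns, so the overshoot of 15 seconds is smaller than that one call's 25-second duration, and the fourth operation never starts.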
Reading the budget summary
Call budget.summary() when you want the current context's summary, or flops.budget_summary() for the accumulated session/global view. Both stay flat by default; use by_namespace=True only when you want a namespace breakdown:
print(budget.summary()) # context summary, flat by default
print(budget.summary(by_namespace=True)) # context summary with namespaces
flops.budget_summary() # session/global summary
flops.budget_summary(by_namespace=True) # session/global summary with namespaces
Use these forms for different questions:
- budget.summary() answers "what did this one explicit context spend?"
- flops.budget_summary() answers "what has this process/session spent overall?"
- budget.summary_dict(...) and flops.budget_summary_dict(...) return the same information as structured data instead of formatted text.
The block below shows print(budget.summary(by_namespace=True)) for the solver context:
flopscope FLOP Budget Summary [solver]
==================================
Total budget: 50,000,000
Used: 66,048 (0.1%)
Remaining: 49,933,952 (99.9%)
By namespace:
solver 66,048 (100.0%) [3 calls] Backend 0.000s Overhead 0.000s
By operation:
einsum 65,536 ( 99.2%) [1 call]
exp 256 ( 0.4%) [1 call]
sum 256 ( 0.4%) [1 call]
Total Wall Time: ...s
Flopscope Backend: ...s ( ...%)
Flopscope Overhead: ...s ( ...%)
Residual Wall Time: ...s ( ...%)
By operation (time):
einsum ...s ( ...%) [1 call]
sum ...s ( ...%) [1 call]
exp ...s ( ...%) [1 call]
Key things to look for:
- Budget / Used / Remaining: the top rows show the explicit competition budget, current spend, and remaining headroom
- By namespace: solver is the root prefix, and nested scopes show up as dotted paths like solver.precompute. Use budget.summary(by_namespace=True) for the current context or flops.budget_summary(by_namespace=True) for the accumulated session/global view
- By operation: this toy pass is dominated by the single einsum; exp and sum are tiny by comparison
- Wall / backend / flopscope overhead / residual time: wall time is total elapsed time for the context. Flopscope backend time is spent inside the underlying NumPy / BLAS / LAPACK calls being counted. Flopscope overhead is spent in flopscope's own dispatch code (wrapper preambles, FLOP cost computation, view-casts, post-call wrapping, maybe_check_nan_inf when opted in). Residual wall time is the measured remainder outside backend calls and flopscope overhead (user Python between ops, sleeps, GC pauses). The decomposition is exact: wall_time_s = flopscope_backend_time_s + flopscope_overhead_time_s + residual_wall_time_s
- Flat default: the default summary stays flat unless you opt into by_namespace=True
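The FLOP figures in the sample summary can be reproduced by hand, which is a useful sanity check when reading your own summaries. The sketch below assumes the cost model implied by that output: m*n for the 'ij,j->i' einsum, and one FLOP per element for exp and sum on a length-256 vector.

```python
m, n = 256, 256
costs = {"einsum": m * n, "exp": m, "sum": m}

flop_budget = 50_000_000
used = sum(costs.values())
remaining = flop_budget - used

# Share of total spend per operation, formatted like the summary rows.
shares = {name: f"{cost / used:.1%}" for name, cost in costs.items()}

print(f"Used:      {used:,} ({used / flop_budget:.1%})")
print(f"Remaining: {remaining:,} ({remaining / flop_budget:.1%})")
for name, cost in costs.items():
    print(f"{name:<8}{cost:>8,} ({shares[name]})")
```

The totals match the summary: 66,048 FLOPs used, 49,933,952 remaining, with einsum at 99.2% of the spend.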
For programmatic access, use flops.budget_summary_dict():
data = flops.budget_summary_dict()
print(f"Used: {data['flops_used']:,} / {data['flop_budget']:,}")
# Per-namespace breakdown:
data = flops.budget_summary_dict(by_namespace=True)
print(data["by_namespace"]["solver"]["flops_used"])
Quick tips for competition
Check costs before committing budget. Use cost query functions to estimate before executing:
cost = flops.einsum_cost('ij,jk->ik', shapes=[(256, 256), (256, 256)])
print(f"This matmul will cost {cost:,} FLOPs") # 16,777,216
Use namespaces for phases. Split your solution into named phases (e.g., "init", "solve", "refine") so the budget summary shows exactly where FLOPs are spent.
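To see why per-phase attribution is useful, here is a toy ledger in plain Python that records spend under dotted names while drawing from a single pool. `ToyLedger` is a hypothetical class for illustration only; it does not use or mirror flopscope's namespace API, and the phase costs are made-up examples.

```python
from collections import defaultdict


class ToyLedger:
    """Illustrative only: one FLOP pool, with spend attributed by dotted namespace."""

    def __init__(self, flop_budget, root):
        self.flop_budget = flop_budget
        self.root = root
        self.by_namespace = defaultdict(int)

    def charge(self, cost, phase=None):
        name = self.root if phase is None else f"{self.root}.{phase}"
        self.by_namespace[name] += cost

    @property
    def flops_used(self):
        # All phases draw from the same pool; names only label the spend.
        return sum(self.by_namespace.values())


ledger = ToyLedger(flop_budget=50_000_000, root="solver")
ledger.charge(256 * 256, phase="init")         # 65,536
ledger.charge(256 * 256 * 256, phase="solve")  # 16,777,216
ledger.charge(256, phase="refine")

ranked = sorted(ledger.by_namespace.items(), key=lambda kv: -kv[1])
for name, cost in ranked:
    print(f"{name:<16}{cost:>12,} ({cost / ledger.flops_used:.1%})")
print(f"remaining: {ledger.flop_budget - ledger.flops_used:,}")
```

With the breakdown in hand, it is obvious that the "solve" phase is the one worth optimizing.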
Exploit symmetry for savings. If your tensors are symmetric, wrapping them with flops.as_symmetric() can halve pointwise costs and significantly reduce einsum costs. See Symmetry Savings for details.
Prefer cheaper operations. A matrix-vector product via fnp.einsum('ij,j->i', A, x) costs m*n FLOPs, while a full matrix-matrix multiply costs m*n*k. Avoid computing more than you need.
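As a worked example of the m*n versus m*n*k difference, consider applying two 256x256 matrices to a vector. The helper functions below are hypothetical and simply encode the cost formulas quoted above; they are not flopscope's cost query functions.

```python
def matvec_cost(m, n):
    """FLOPs for 'ij,j->i' with a (m, n) matrix, under the m*n model."""
    return m * n


def matmul_cost(m, n, k):
    """FLOPs for 'ij,jk->ik' with shapes (m, n) and (n, k), under the m*n*k model."""
    return m * n * k


size = 256

# (A @ B) @ x: one full matrix-matrix product, then one matrix-vector product.
product_first = matmul_cost(size, size, size) + matvec_cost(size, size)

# A @ (B @ x): two matrix-vector products, same mathematical result.
vector_first = matvec_cost(size, size) + matvec_cost(size, size)

print(f"(A @ B) @ x: {product_first:,} FLOPs")
print(f"A @ (B @ x): {vector_first:,} FLOPs")
print(f"ratio: {product_first / vector_first:.1f}x")
```

Under this model, reordering the evaluation cuts the cost from 16,842,752 to 131,072 FLOPs without changing the result.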
Watch out for hidden costs. Operations like fnp.array() and fnp.concatenate() are not free -- they charge numel(output) FLOPs, and fnp.where() charges numel(condition). Check the cost of any operation you are unsure about.
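One place these charges add up is building an array row by row. The sketch below applies the numel(output) rule stated above to compare repeated concatenation with a single concatenation. The `numel` helper and the shapes are illustrative assumptions, and no flopscope calls are made.

```python
import math


def numel(shape):
    """Number of elements in an array of the given shape."""
    return math.prod(shape)


rows, width = 100, 64

# Appending one row at a time: each concatenate is charged for its whole
# output, which grows from 1 row up to 100 rows.
incremental = sum(numel((r, width)) for r in range(1, rows + 1))

# Collecting the rows first and concatenating once: charged for one output.
one_shot = numel((rows, width))

print(f"row-by-row concatenate: {incremental:,} FLOPs")
print(f"single concatenate:     {one_shot:,} FLOPs")
```

Under this rule the row-by-row loop is charged 323,200 FLOPs against 6,400 for the single call, so the gap grows quadratically with the number of rows.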
When things go wrong
If you hit BudgetExhaustedError, see Budget Planning & Debugging for a systematic approach to diagnosing overruns and reducing costs.