Competition Guide

Everything you need to compete within a FLOP budget.

You will learn:

How to set budget limits with BudgetContext
How to use the @flops.budget decorator form
How wall-time limits work via wall_time_limit_s
How to read budget summaries
Common competition pitfalls and tips

Setting a FLOP budget

Every competition submission runs inside a FLOP budget. Use BudgetContext to declare how many FLOPs your code is allowed to spend:

import flopscope as flops
import flopscope.numpy as fnp

with flops.BudgetContext(flop_budget=50_000_000, namespace="solver") as budget:
    A = fnp.ones((256, 256))
    x = fnp.ones((256,))
    h = fnp.einsum('ij,j->i', A, x)
    h = fnp.exp(h)
    result = fnp.sum(h)

If your code exceeds the budget, Flopscope raises BudgetExhaustedError before the offending operation executes. The error message includes the cost of the failed operation and the remaining budget.
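
The "check first, then execute" behaviour can be pictured with a small toy model. This is a sketch only: ToyBudget and ToyBudgetExhausted are hypothetical names invented for illustration, and nothing here is flopscope's actual implementation.

```python
# Toy model only: NOT flopscope's implementation, just the idea of
# rejecting an operation before it runs when it would overshoot the budget.

class ToyBudgetExhausted(Exception):
    """Hypothetical stand-in for BudgetExhaustedError."""


class ToyBudget:
    def __init__(self, flop_budget):
        self.flop_budget = flop_budget
        self.flops_used = 0

    def charge(self, cost):
        remaining = self.flop_budget - self.flops_used
        if cost > remaining:
            # Refuse before any work is done; report cost and headroom.
            raise ToyBudgetExhausted(
                f"operation costs {cost:,} FLOPs, only {remaining:,} remaining"
            )
        self.flops_used += cost


toy = ToyBudget(flop_budget=70_000)
toy.charge(256 * 256)        # first matrix-vector product fits
try:
    toy.charge(256 * 256)    # a second one would overshoot, so it is rejected
except ToyBudgetExhausted as err:
    message = str(err)

print(message)
print(toy.flops_used)        # spend is unchanged by the rejected operation
```

Because the rejected operation never runs, the recorded spend stays at the cost of the operations that completed.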

The namespace parameter sets the root namespace prefix for that budget context. Nested fnp.namespace(...) scopes extend it with dotted segments, but they do not create child budgets or split the FLOP limit into separate pools.

Decorator form

For cleaner code, use @flops.budget to attach a budget directly to a function:

import flopscope as flops
import flopscope.numpy as fnp

@flops.budget(flop_budget=50_000_000, namespace="forward-pass")
def forward(W, x):
    h = fnp.einsum('ij,j->i', W, x)
    h = fnp.maximum(h, 0)
    return fnp.sum(h)

W = fnp.ones((256, 256))
x = fnp.ones((256,))
result = forward(W, x)
flops.budget_summary()

Every call to the decorated function runs inside the same BudgetContext, so repeated calls keep accumulating on one budget and one operation log. The namespace is only the root prefix used for attribution, not a separate budget pool.
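
Since repeated calls draw on one shared budget, it can help to estimate how many calls fit before the budget runs out. The arithmetic below is a rough sketch. It assumes that fnp.maximum is charged one FLOP per output element (as exp is in the summary later in this guide) and that the einsum costs m*n; confirm both with the cost query functions before relying on them.

```python
# Assumption: einsum costs n*n here, and maximum and sum each cost n.
n = 256
flop_budget = 50_000_000

cost_per_call = n * n + n + n            # einsum + maximum + sum
calls_that_fit = flop_budget // cost_per_call
left_over = flop_budget - calls_that_fit * cost_per_call

print(f"Cost per forward() call: {cost_per_call:,} FLOPs")
print(f"Calls that fit in the shared budget: {calls_that_fit}")
print(f"Headroom left after those calls: {left_over:,} FLOPs")
```

Under these assumptions the next call after calls_that_fit would exceed the shared budget.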

Wall-time limits

In addition to FLOP budgets, competitions may enforce a wall-clock time limit via wall_time_limit_s. This prevents solutions from stalling on operations that are analytically cheap but slow in practice:

import flopscope as flops

with flops.BudgetContext(flop_budget=10**9, wall_time_limit_s=60.0) as budget:
    # Must finish within 60 seconds AND within 1 billion FLOPs
    ...

If the wall-clock time is exceeded, Flopscope raises TimeExhaustedError. The timer starts when the context is entered and is checked before and after each counted operation.

What wall_time_limit_s does and does not do:

It is a BudgetContext setting, so you configure it in the same place you set flop_budget.
It measures total wall-clock time for the active context, not FLOPs.
It is checked cooperatively before and after counted NumPy calls, so overshoot is bounded by the duration of one NumPy call.
It is a clean diagnostic limit inside flopscope. Hard process/container kills still belong to the outer execution environment.
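
To see why a cooperative check bounds the overshoot by one call, consider the toy loop below. It is a sketch under stated assumptions: run_with_deadline, ToyTimeExhausted, and the fake clock are hypothetical helpers written for this example, not part of flopscope.

```python
# Toy model of a cooperative deadline: the limit is only inspected
# before and after each operation, never in the middle of one.

class ToyTimeExhausted(Exception):
    """Hypothetical stand-in for TimeExhaustedError."""

    def __init__(self, elapsed_s, completed):
        super().__init__(f"{elapsed_s:.1f}s elapsed after {completed} ops")
        self.elapsed_s = elapsed_s
        self.completed = completed


def run_with_deadline(ops, wall_time_limit_s, clock):
    start = clock()
    completed = 0
    for op in ops:
        if clock() - start > wall_time_limit_s:    # check before the call
            raise ToyTimeExhausted(clock() - start, completed)
        op()
        completed += 1
        if clock() - start > wall_time_limit_s:    # check after the call
            raise ToyTimeExhausted(clock() - start, completed)
    return completed


# A fake clock keeps the example deterministic.
fake_now = [0.0]

def fake_clock():
    return fake_now[0]

def slow_op():
    fake_now[0] += 0.5    # pretend each counted call takes 0.5 s

limit_s = 1.2
try:
    run_with_deadline([slow_op] * 5, limit_s, clock=fake_clock)
except ToyTimeExhausted as err:
    completed = err.completed
    overshoot_s = err.elapsed_s - limit_s

print(f"Stopped after {completed} ops, overshoot {overshoot_s:.1f}s")
```

The third call starts inside the limit and finishes past it, so the overshoot never exceeds the duration of that single call.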

Reading the budget summary

Call budget.summary() when you want the current context's summary, or flops.budget_summary() for the accumulated session/global view. Both stay flat by default; use by_namespace=True only when you want a namespace breakdown:

print(budget.summary())                    # context summary, flat by default
print(budget.summary(by_namespace=True))   # context summary with namespaces
flops.budget_summary()                     # session/global summary
flops.budget_summary(by_namespace=True)    # session/global summary with namespaces

Use these forms for different questions:

budget.summary() answers "what did this one explicit context spend?"
flops.budget_summary() answers "what has this process/session spent overall?"
budget.summary_dict(...) and flops.budget_summary_dict(...) return the same information as structured data instead of formatted text.

The block below shows print(budget.summary(by_namespace=True)) for the solver context:

flopscope FLOP Budget Summary [solver]
==================================
  Total budget:              50,000,000
  Used:                          66,048  (0.1%)
  Remaining:                 49,933,952  (99.9%)

  By namespace:
    solver                         66,048  (100.0%)  [3 calls]  Backend 0.000s  Overhead 0.000s

  By operation:
    einsum                     65,536  ( 99.2%)  [1 call]
    exp                           256  (  0.4%)  [1 call]
    sum                           256  (  0.4%)  [1 call]

  Total Wall Time:       ...s
  Flopscope Backend:     ...s  ( ...%)
  Flopscope Overhead:    ...s  ( ...%)
  Residual Wall Time:    ...s  ( ...%)

  By operation (time):
    einsum                 ...s  ( ...%)  [1 call]
    sum                    ...s  ( ...%)  [1 call]
    exp                    ...s  ( ...%)  [1 call]
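
The FLOP figures in this summary can be reproduced by hand, which is a useful sanity check when reading your own summaries. The sketch below assumes the per-operation costs implied by the output: m*n for the matrix-vector einsum and one FLOP per element for exp and sum.

```python
# Reproduce the "Used", "Remaining", and per-operation percentages
# from the solver summary, assuming the costs implied by its output.
m, n = 256, 256
flop_budget = 50_000_000

costs = {"einsum": m * n, "exp": m, "sum": m}
used = sum(costs.values())
remaining = flop_budget - used

print(f"Used:      {used:,}  ({used / flop_budget:.1%})")
print(f"Remaining: {remaining:,}  ({remaining / flop_budget:.1%})")
for name, cost in costs.items():
    print(f"{name:<8}{cost:>8,}  ({cost / used:.1%})")
```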

Key things to look for:

Total budget / Used / Remaining: the top rows show the explicit competition budget, current spend, and remaining headroom
By namespace: solver is the root prefix, and nested scopes show up as dotted paths like solver.precompute. Use budget.summary(by_namespace=True) for the current context or flops.budget_summary(by_namespace=True) for the accumulated session/global view
By operation: this toy pass is dominated by the single einsum; exp and sum are tiny by comparison
Wall / backend / overhead / residual time: Total Wall Time is the elapsed time for the context. Flopscope Backend is time spent inside the underlying NumPy / BLAS / LAPACK calls being counted. Flopscope Overhead is time spent in flopscope's own dispatch code (wrapper preambles, FLOP cost computation, view-casts, post-call wrapping, and maybe_check_nan_inf when opted in). Residual Wall Time is the measured remainder outside backend calls and flopscope overhead (user Python between ops, sleeps, GC pauses). The decomposition is exact: wall_time_s = flopscope_backend_time_s + flopscope_overhead_time_s + residual_wall_time_s
Flat default: the default summary stays flat unless you opt into by_namespace=True
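
The wall-time decomposition in the list above can be checked with a few lines of arithmetic. The timings below are hypothetical values chosen only for illustration, and the dictionary keys simply reuse the names from the identity; whether summary_dict() exposes them under exactly these keys is an assumption.

```python
import math

# Hypothetical timings in seconds, for illustration only.
timings = {
    "wall_time_s": 2.0,
    "flopscope_backend_time_s": 1.25,
    "flopscope_overhead_time_s": 0.25,
}

# Residual is whatever wall time is left outside backend calls and overhead.
residual_wall_time_s = (
    timings["wall_time_s"]
    - timings["flopscope_backend_time_s"]
    - timings["flopscope_overhead_time_s"]
)

wall = timings["wall_time_s"]
shares = {
    "backend": timings["flopscope_backend_time_s"] / wall,
    "overhead": timings["flopscope_overhead_time_s"] / wall,
    "residual": residual_wall_time_s / wall,
}

# The three shares account for all of the wall time.
assert math.isclose(sum(shares.values()), 1.0)
for name, share in shares.items():
    print(f"{name:<9}{share:.1%}")
```

A large residual share points at time spent in your own Python between counted operations rather than in the operations themselves.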

For programmatic access, use flops.budget_summary_dict():

data = flops.budget_summary_dict()
print(f"Used: {data['flops_used']:,} / {data['flop_budget']:,}")

# Per-namespace breakdown:
data = flops.budget_summary_dict(by_namespace=True)
print(data["by_namespace"]["solver"]["flops_used"])

Quick tips for competition

Check costs before committing budget. Use the cost query functions to estimate what an operation will cost before you execute it:

cost = flops.einsum_cost('ij,jk->ik', shapes=[(256, 256), (256, 256)])
print(f"This matmul will cost {cost:,} FLOPs")  # 16,777,216

Use namespaces for phases. Split your solution into named phases (e.g., "init", "solve", "refine") so the budget summary shows exactly where FLOPs are spent.

Exploit symmetry for savings. If your tensors are symmetric, wrapping them with flops.as_symmetric() can halve pointwise costs and significantly reduce einsum costs. See Symmetry Savings for details.

Prefer cheaper operations. A matrix-vector product via fnp.einsum('ij,j->i', A, x) costs m*n FLOPs, while a full matrix-matrix multiply costs m*n*k. Avoid computing more than you need.
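
The gap between the two is easy to quantify with the cost formulas just stated. The helper functions below are hypothetical and written only for this comparison; they are not flopscope APIs.

```python
# Compare matrix-vector and matrix-matrix costs using the formulas above.

def matvec_cost(m, n):
    return m * n

def matmul_cost(m, n, k):
    return m * n * k

flop_budget = 50_000_000
mv = matvec_cost(256, 256)
mm = matmul_cost(256, 256, 256)

print(f"matvec: {mv:,} FLOPs, fits {flop_budget // mv} times in the budget")
print(f"matmul: {mm:,} FLOPs, fits {flop_budget // mm} times in the budget")
print(f"matmul is {mm // mv}x more expensive")
```

At this size a 50,000,000 FLOP budget covers hundreds of matrix-vector products but only a couple of full matrix multiplies.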

Watch out for hidden costs. Operations like fnp.array() and fnp.concatenate() are not free: they charge numel(output) FLOPs, and fnp.where() charges numel(condition). Check the cost of any operation you are unsure about.
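
As a rough illustration of how these charges add up, the sketch below applies the numel rules stated above to a few example shapes. The numel helper and the shapes are assumptions made for this example; use the cost query functions for authoritative numbers.

```python
import math

def numel(shape):
    """Number of elements in an array of the given shape."""
    return math.prod(shape)

# fnp.array on a (256, 256) input: charged numel(output).
array_cost = numel((256, 256))

# fnp.concatenate of two (256, 256) arrays along axis 0: output is (512, 256).
concat_cost = numel((512, 256))

# fnp.where with a (256, 256) condition: charged numel(condition).
where_cost = numel((256, 256))

total = array_cost + concat_cost + where_cost
print(f"array: {array_cost:,}  concatenate: {concat_cost:,}  where: {where_cost:,}")
print(f"Total bookkeeping cost: {total:,} FLOPs")
```

Under these rules the three bookkeeping calls together cost four times as much as the 65,536 FLOP matrix-vector product from the solver example.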

When things go wrong

If you hit BudgetExhaustedError, see Budget Planning & Debugging for a systematic approach to diagnosing overruns and reducing costs.