Tutorial — The 5-stage ladder

Sourced from whest-starterkit @ aaa3882.

The tutorial trail. Each stage is a single command on the same estimator.py, with the harness adding one more level of formality at each step. Read top-to-bottom: Stage 4's subprocess isolation catches bugs that Stage 3 hides, so skipping ahead is rarely worth it.

| Stage | Command | What it adds | Doc |
|---|---|---|---|
| 1 | `uv run python estimator.py` | The math. Iterate locally with flopscope and local_engine.py; no whest CLI required. | stage-1-standalone.md |
| 2 | `uv run whest validate --estimator estimator.py` | Contract correctness: class resolved, optional setup() runs, shape, finite values. | stage-2-validate.md |
| 3 | `uv run whest run --estimator estimator.py --dataset hf://aicrowd/arc-whestbench-public-2026 --split mini --runner local` | Real scoring against the public Mini split (100 MLPs), in-process (so pdb works). | stage-3-run-local.md |
| 4 | `uv run whest run --estimator estimator.py --dataset hf://aicrowd/arc-whestbench-public-2026 --split mini --runner subprocess` | Subprocess isolation: catches state-bleed between MLPs, dirty imports, RNG re-use. | stage-4-run-subprocess.md |
| 5 | `uv run whest package --estimator estimator.py --output submission.tar.gz` | Package the submission tarball for AIcrowd. | stage-5-package.md |
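The five commands above can also be chained in a small wrapper script, so that a failure at one stage stops the climb before the next. The following is a minimal sketch, not part of the starter kit: `run_stage`, `run_ladder`, and the `ESTIMATOR`, `DATASET`, and `DRY_RUN` variables are hypothetical names introduced here, and the script assumes each command exits non-zero on failure. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch: climb the ladder in order and stop at the first failing stage.
# ESTIMATOR, DATASET, DRY_RUN, run_stage and run_ladder are illustrative
# names for this example; they are not part of the whest CLI.
set -eu

ESTIMATOR="${ESTIMATOR:-estimator.py}"
DATASET="${DATASET:-hf://aicrowd/arc-whestbench-public-2026}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run_stage() {
  stage="$1"
  shift
  echo "[stage ${stage}] $*"
  if [ "${DRY_RUN}" = "0" ]; then
    # Assumption: a failing stage exits non-zero, so `set -e` aborts here.
    "$@"
  fi
}

run_ladder() {
  run_stage 1 uv run python "${ESTIMATOR}"
  run_stage 2 uv run whest validate --estimator "${ESTIMATOR}"
  run_stage 3 uv run whest run --estimator "${ESTIMATOR}" --dataset "${DATASET}" --split mini --runner local
  run_stage 4 uv run whest run --estimator "${ESTIMATOR}" --dataset "${DATASET}" --split mini --runner subprocess
  run_stage 5 uv run whest package --estimator "${ESTIMATOR}" --output submission.tar.gz
}

run_ladder
```

Running the script as-is lists the five stages in order; running it with `DRY_RUN=0` executes them one after another. Keeping the stages in one function mirrors the advice to read the ladder top-to-bottom rather than jumping straight to Stage 4 or 5.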

Each stage doc carries an "Expected outcome" callout so you know what success looks like before climbing — and a "Ladder" strip at the top so you always know where you are.

➡️ Where to look next
