Pre-Submission Checklist
Sourced from whest-starterkit @ aaa3882.
🎯 When to use this page
The minute before you click "submit" on AIcrowd. Run through these checks; each one maps to a single command or a one-line confirmation.
Correctness
- `uv run whest validate --estimator estimator.py` ends with a green `Status: success` panel. (Catches: wrong shape, non-finite values, broken `setup()`.)
- `uv run whest run --estimator estimator.py --runner local --seed 42 --n-mlps 3` produces an `adjusted_final_layer_score` you recognize.
- `uv run whest run --estimator estimator.py --runner subprocess --seed 42 --n-mlps 3` produces a score within ~1% of the local-runner score above. (Catches: shared global state, RNG re-seed differences, and imports that fail in clean processes; see FAQ.)
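The "within ~1%" comparison between the two runners can be made mechanical. The following is a minimal sketch, not part of `whest`: the function name `scores_agree` is hypothetical, and it reads "within ~1%" as a relative tolerance of 0.01 against the larger of the two scores, which is an assumption about what the checklist intends.

```python
import math


def scores_agree(local_score: float, subprocess_score: float, rel_tol: float = 0.01) -> bool:
    """Return True if both scores are finite and differ by at most rel_tol (relative).

    Hypothetical helper: copy the two adjusted_final_layer_score values
    from the local and subprocess run reports and pass them in.
    """
    if not (math.isfinite(local_score) and math.isfinite(subprocess_score)):
        return False  # a NaN or inf score is never "close enough"
    # math.isclose compares |a - b| against rel_tol * max(|a|, |b|)
    return math.isclose(local_score, subprocess_score, rel_tol=rel_tol)
```

A `False` here suggests the estimator behaves differently in a clean process, which is the class of bug the subprocess run is meant to surface.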
Budget hygiene
- In the run report, `per_mlp[i].budget_exhausted` is `false` for every MLP. Any `true` means that MLP scored against zeros.
- `per_mlp[i].time_exhausted` and `residual_wall_time_exhausted` are also `false` (only relevant if you set `--wall-time-limit` or `--residual-wall-time-limit`).
- `flops_used` is comfortably under `flop_budget`, which leaves headroom for the harder MLPs in the grader suite.
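If the run report is available as parsed JSON, the three checks above can be scripted. This is a sketch under several assumptions that the checklist does not confirm: that the report is a dict with a `per_mlp` list, that all three `*_exhausted` flags and the `flops_used` / `flop_budget` fields live on each per-MLP entry, and that "comfortably under" means at most 90% of the budget. The helper name `budget_problems` and the `HEADROOM` constant are hypothetical.

```python
# Assumption: "comfortably under" is read as <= 90% of flop_budget.
HEADROOM = 0.9

# Flag names come from the checklist; their placement per MLP is assumed.
FLAGS = ("budget_exhausted", "time_exhausted", "residual_wall_time_exhausted")


def budget_problems(report: dict) -> list[str]:
    """Return human-readable problems found in a parsed run report."""
    problems = []
    for i, mlp in enumerate(report.get("per_mlp", [])):
        for flag in FLAGS:
            if mlp.get(flag):
                problems.append(f"per_mlp[{i}].{flag} is true")
        used, budget = mlp.get("flops_used"), mlp.get("flop_budget")
        # Only check headroom when both numbers are present and budget is non-zero.
        if used is not None and budget and used > HEADROOM * budget:
            problems.append(f"per_mlp[{i}] uses {used} of {budget} FLOPs")
    return problems
```

An empty list means none of the assumed fields raised a concern; adjust the field lookups to match the report layout you actually see.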
Reproducibility
- `requirements.txt` lists every non-flopscope, non-whestbench import your estimator pulls in. That covers `scipy`, `numpy`-only utilities, and so on: anything you `import`. Test with `uv pip install --target /tmp/probe -r requirements.txt && rm -rf /tmp/probe` to confirm every name resolves.
- No filesystem reads from outside `SetupContext.scratch_dir`. The grader can't see your laptop.
- No network calls in `setup()` or `predict()`. The grader has no outbound network.
- No time-based seeds (`time.time()`, `os.urandom`, …) and no participant-chosen seeds. If your estimator uses randomness inside `predict()`, seed it from `mlp.seed`: `fnp.random.default_rng(mlp.seed)`. If your estimator uses randomness inside `setup()` (e.g. a fixed random projection basis), seed it from `ctx.seed`: `fnp.random.default_rng(ctx.seed)`. Custom seeds at either site may disqualify your submission from prize eligibility; see Estimator Contract: Reproducibility. Do not call `fnp.random.seed(...)`; use `default_rng(...)` for an isolated `Generator`.
Sanity
- `predict()` returns the post-ReLU mean for every layer, shape `(mlp.depth, mlp.width)`. Off-by-one (returning depth+1 or depth-1 layers) is the most common silent bug.
- If you ship a `setup()`: it's idempotent and stays under the ~5s `setup_timeout_s`. Heavy precompute belongs in `SetupContext.scratch_dir`.
- No `print()` left in `predict()`. The grader runs many MLPs; stdout flooding is a reliable way to lose `residual_wall_time_s`.
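The shape rule in the first item is easy to assert during local development. The following sketch is not part of the starter kit; `check_layer_means_shape` is a hypothetical helper that takes the `.shape` of whatever `predict()` returned, together with `mlp.depth` and `mlp.width`, and singles out the off-by-one case described above.

```python
def check_layer_means_shape(shape, depth: int, width: int):
    """Return None if shape == (depth, width), else a short diagnostic string."""
    shape = tuple(shape)
    expected = (depth, width)
    if shape == expected:
        return None
    # Right width but one layer too many or too few: the common silent bug.
    if len(shape) == 2 and shape[1] == width and abs(shape[0] - depth) == 1:
        return f"off-by-one in layers: got {shape[0]}, expected {depth}"
    return f"wrong shape: got {shape}, expected {expected}"
```

A call such as `check_layer_means_shape(pred.shape, mlp.depth, mlp.width)` in a local test keeps the check out of the submitted `predict()` itself, so it adds no stdout or FLOP overhead at grading time.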
Final command
Once every box above is checked, ship it (run `whest login` first if you haven't):

    uv run whest submit --estimator estimator.py --watch

`whest submit` packages, uploads, and creates the submission in one step.
Prefer to inspect the artifact first? Build it with `uv run whest package --estimator estimator.py --output submission.tar.gz`, check `tar tf submission.tar.gz` (it should contain `estimator.py` and `manifest.json`), then `uv run whest submit submission.tar.gz`.
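The `tar tf` inspection can also be done programmatically with Python's standard `tarfile` module. This is a sketch with a hypothetical helper name, `missing_members`. It assumes the two files sit at the archive root, possibly with a leading `./`, which the checklist does not state explicitly.

```python
import tarfile
from pathlib import PurePosixPath

# File names come from the checklist; root-level placement is an assumption.
REQUIRED = {"estimator.py", "manifest.json"}


def missing_members(archive_path: str) -> set[str]:
    """Return the required file names that are absent from the archive."""
    with tarfile.open(archive_path, "r:*") as tar:  # "r:*" auto-detects compression
        # PurePosixPath normalizes "./estimator.py" to "estimator.py"
        present = {str(PurePosixPath(name)) for name in tar.getnames()}
    return REQUIRED - present
```

An empty set from `missing_members("submission.tar.gz")` means both expected files were found at the archive root.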