whestbench.

Concepts — Why this challenge exists

Sourced from whest-starterkit @ aaa3882.


Background reading. These three docs explain the problem framing, the scoring metric, and how ground truth is generated. Helpful before you start tuning; essential before debating leaderboard outcomes.

| Doc | What it covers |
| --- | --- |
| problem-setup.md | The MLP architecture, He initialization, and the research framing: why "competing with sampling" is the milestone this challenge targets. Includes a "Further reading" pointer to the relevant ARC posts and papers. |
| scoring-model.md | How the leaderboard score is computed: ASCII pipeline diagram, explicit equation block for adjusted_final_layer_score and all_layers_mse, behavior when the FLOP budget is exceeded, and a calibration table from the bundled examples. |
| ground-truth.md | How the evaluator generates the reference values you're scored against: Monte-Carlo sampling, sample counts, and the inherent noise floor. |
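
For intuition about the "inherent noise floor" mentioned for ground-truth.md, here is a minimal sketch of a Monte-Carlo estimate on a toy network. Everything specific in it is an assumption made for illustration: the ReLU activations, the standard-normal inputs, the layer widths, the sample counts, and the helper names `he_init_mlp`, `forward`, and `mc_estimate` are hypothetical and not taken from the evaluator. The actual procedure is the one described in ground-truth.md.

```python
import numpy as np

def he_init_mlp(widths, rng):
    """Weight matrices for a bias-free MLP, drawn as N(0, 2 / fan_in) (He initialization)."""
    return [
        rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        for fan_in, fan_out in zip(widths[:-1], widths[1:])
    ]

def forward(weights, x):
    """Apply ReLU after every layer (an assumption) and return final-layer activations."""
    for w in weights:
        x = np.maximum(x @ w, 0.0)
    return x

def mc_estimate(weights, n_samples, rng):
    """Monte-Carlo mean of each output neuron, plus the standard error of that mean."""
    # Inputs are assumed standard normal; the real input distribution may differ.
    x = rng.standard_normal((n_samples, weights[0].shape[0]))
    out = forward(weights, x)
    mean = out.mean(axis=0)
    stderr = out.std(axis=0, ddof=1) / np.sqrt(n_samples)
    return mean, stderr

rng = np.random.default_rng(0)
weights = he_init_mlp([16, 16, 16, 4], rng)  # toy widths, chosen arbitrarily

for n in (100, 10_000):
    mean, stderr = mc_estimate(weights, n, rng)
    print(f"n={n:>6}  average standard error per output: {stderr.mean():.4f}")
```

The standard error shrinks roughly as one over the square root of the sample count, so a reference value built from finitely many samples always carries some residual noise. An estimator cannot be meaningfully distinguished from the reference below that level.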

Read in order if you want the full picture. At minimum, skim scoring-model.md — it's what drives every number you'll obsess over.

➡️ Where to look next
