Changelog
Release history and notable changes.
v0.9.2 (2026-06-01)
Fix
- bump to track the flopscope 0.4.2 fix for fnp.random.default_rng() over the client/server grader boundary; the flopscope>=0.4.1 floor auto-resolves to 0.4.2 once published (AIcrowd/flopscope#109)
v0.9.1 (2026-05-31)
Fix
- cli: whest submit --watch reaches terminal grading state (#74)
v0.9.0 (2026-05-29)
Feat
- cli: add whest login + whest submit (hop-A AIcrowd submission)
- add config-aware dataset authoring (#72)
- prepared-arrow: friendly upfront notice + CLI preflight sizing (#69)
v0.8.0 (2026-05-27)
Feat
- ux2: prepared-Arrow fast path on HF for multi-split datasets (#67)
Fix
- prepared-arrow: handle multi-shard parquet splits (#68)
v0.7.0 (2026-05-27)
Feat
- ux1: per-split configs + split-aware load + early default_split resolution (#66)
- metadata: optional default_split + CLI fallback for multi-split datasets
v0.6.0 (2026-05-27)
Feat
- add whest version command and version metadata in JSON
- cli: validate/init/smoke-test/profile-simulation adopt unified copy
- cli: package gets a bytes progress bar
- cli: doctor wraps probes in a status spinner + bookends
- cli: merge gets spinner + before/after copy
- cli: download surfaces preflight summary + progress + completion
- cli: upload gets a real progress bar + before/after copy
- cli: bake gets phased progress bars + before/after copy
- cli: rename dataset push/pull/inspect to upload/download/info + deprecation
- cli: --streaming end-to-end with prominent cache-trade-off warning
- cli: add --streaming flag to whest run
- cli: use metadata-based n_mlps clamp when ds is streaming
- scoring: make_contest_from_dataset supports IterableDataset
- cli: wrap hf:// dataset load with hf_download progress UI
- hf_progress: add hf_upload context manager
- hf_progress: add hf_download context manager with three modes
- hf_progress: add RichHFTqdm that forwards into active Rich Progress
- hf_progress: add hf_preflight() with cache detection
- hf_progress: add HFPreflight dataclass
- ui: add status spinner context manager + finalize ui.py
- ui: add progress_count context manager
- ui: add progress_bytes context manager
- ui: add say.* message helpers (intent/step/ok/warn/hint)
- ui: add format_throughput helper
- ui: add format_duration helper
- ui: add format_bytes helper
- template: emit configs: block in YAML for explicit split ordering
- package: record tool and runtime versions in submission manifest
Fix
- avoid duplicate JSON output in validate command
- keep final_layer_mse in narrow score subtitle
- guard profile-simulation JSON payload type for metadata wrapper
- cli: cache-hit download says "Loaded from cache" not "Downloaded"
- cli: drop stray comma in cache-miss download ok line
- hf_progress: bail preflight when revision cannot be resolved
- hf_progress: drop unused empty top-level upload task
- hf_progress: raise on nested hf_download/hf_upload
- hf_progress: subclass HF tqdm and guard disabled bars
- ui: match HF Hub env-var truthy semantics in _progress_disabled
- ui: roll over format_bytes at the next-unit boundary
- dataset_io: use attr-set for configs to satisfy Pyright
Refactor
- ui: cache the default Console as a module-level singleton
- ui: inherit handles from ProgressHandle Protocol nominally
v0.5.1 (2026-05-27)
Feat
- template: mini+full quick-start snippet leads with split="mini"
- template: recognise mini+full split pair in dataset card
Fix
- template: restore print(ds[0]['mlp_name']) smoke-test in generic quickstart fallback
- template: scope companion-disclaimer to public+holdout, fix whitespace + spelling
- test: import datasets.config submodule explicitly for pyright
- dataset_io: scope merge_datasets HF cache to tempdir by default
v0.5.0 (2026-05-27)
Feat
- load_dataset: add streaming=True support (closes #55)
- readme: per-split MLP counts + tighter Compute/Reproducibility wording
- readme: companion_repo template var + collapse hardware_fingerprints
Fix
- lint: silence intentional type-violation in mlp_at streaming test
- lint: narrow load_dataset return type via Literal[streaming] overloads
- lint: narrow set element types before sort in fingerprint collapse
v0.4.0 (2026-05-26)
Added
- seed_protocol 3.0 (whestbench_explicit_per_mlp_seeds): each MLP's seed is an independent input rather than a derivation from a single root. Each mlp_seed value in the parquet column is the canonical input seed. Within-MLP three-stream derivation (weight/sample/estimator) is preserved via SeedSequence(mlp_seed).spawn(3).
- whest dataset bake --mlp-seeds FILE (JSON array of N ints) for explicit per-MLP seeds. Omitting both --mlp-seeds and --seed auto-generates via secrets.randbits(63).
- create_dataset(mlp_seeds=[...]) / create_dataset_torch(mlp_seeds=[...]).
- MLP.from_row(row, *, seed_protocol_version=...): protocol-aware estimator-seed derivation.
- Frozen fixture tests/fixtures/single_split_v3_protocol/ for schema-drift regression.
- Multi-split dataset support: dataset directories can now contain multiple Parquet files in data/, one per split, described by an optional splits: sub-dict in metadata.json. Backward-compatible: single-split datasets are unchanged.
- whest dataset combine-splits INPUT_DIR... --output OUTPUT_DIR CLI subcommand for assembling multi-split datasets from N complete single-split inputs.
- whestbench.combine_split_datasets() Python helper (re-exported from whestbench).
- whest dataset bake --split <name> now accepts arbitrary split names matching [a-z][a-z0-9]*(-[a-z0-9]+)* (previously restricted to public/holdout).
- whest dataset pull --split <name> and whest run --dataset ... --split <name> for selecting one split from multi-split datasets.
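As an illustration of the split-name rule quoted above, here is a minimal sketch that checks candidate names against the pattern [a-z][a-z0-9]*(-[a-z0-9]+)*. The helper name is_valid_split_name is hypothetical, and anchoring the pattern to the whole string is an assumption; the changelog does not say how whest applies the pattern internally.

```python
import re

# Pattern quoted in the changelog entry for `whest dataset bake --split <name>`.
SPLIT_NAME_RE = re.compile(r"[a-z][a-z0-9]*(-[a-z0-9]+)*")


def is_valid_split_name(name: str) -> bool:
    """Hypothetical helper: True if the entire name matches the pattern.

    Assumption: the pattern must cover the whole string (fullmatch),
    not just a prefix of it.
    """
    return SPLIT_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["public", "holdout", "mini", "full-v2", "Full", "2nd", "a--b", "trailing-"]:
        print(f"{candidate!r}: {is_valid_split_name(candidate)}")
```

Under this reading, lowercase names with single-hyphen-separated segments pass, while names that start with a digit or an uppercase letter, contain a doubled hyphen, or end with a hyphen are rejected.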
Changed
- create_dataset(seed=...) / create_dataset_torch(seed=...) and whest dataset bake --seed N now reject with a migration hint pointing at --mlp-seeds.
- Parquet mlp_seed column semantics: under 3.0, the column stores the input seed (was: derived estimator seed under 2.0). MLP.seed (participant-facing) is unchanged across protocols; under 3.0 it is derived locally from the input.
- whest dataset inspect now recognises multi-split datasets and prints a per-split summary, plus the seed_protocol: <name> (version <version>) line for all datasets.
- whestbench.load_dataset() returns Dataset | DatasetDict based on the dataset shape; explicit split= always returns Dataset.
- whestbench.metadata() accepts a DatasetDict and an optional split= filter that projects to single-split-shaped metadata.
- The dataset-card template gains a multi-split branch with leaderboard-specific wording when splits are {public, holdout}; the single-split public branch's wording is updated to point at the new evaluation repo.
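For anyone migrating from --seed to --mlp-seeds, the sketch below shows one way to produce a seeds file in the documented shape (a JSON array of N ints). It draws each seed with secrets.randbits(63), the same call the changelog names for auto-generation. The helper name write_mlp_seeds and the file name mlp_seeds.json are hypothetical, and this is not the tool's own implementation.

```python
import json
import secrets
import tempfile
from pathlib import Path


def write_mlp_seeds(path: Path, n_mlps: int) -> list[int]:
    """Hypothetical helper: write n_mlps independent 63-bit seeds as a JSON array."""
    # secrets.randbits(63) returns a non-negative int below 2**63.
    seeds = [secrets.randbits(63) for _ in range(n_mlps)]
    path.write_text(json.dumps(seeds))
    return seeds


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        seeds_file = Path(tmp) / "mlp_seeds.json"  # illustrative file name
        seeds = write_mlp_seeds(seeds_file, n_mlps=4)
        # Round-trip check: the file holds exactly the seeds we generated.
        assert json.loads(seeds_file.read_text()) == seeds
        print(f"wrote {len(seeds)} seeds to {seeds_file.name}")
```

The resulting file would be passed as the FILE argument of whest dataset bake --mlp-seeds. Keeping the file under version control is one way to make a bake reproducible, since each seed is an independent input under seed_protocol 3.0.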
Compatibility
- whestbench.load_dataset reads both seed_protocol 2.0 and 3.0 datasets indefinitely. Existing published datasets (e.g. aicrowd/arc-whestbench-2026-smoke-test) continue to work unchanged.
- New bakes only write 3.0.
- schema_version stays at "3.0". The protocol discriminator is seed_protocol.{name,version}.
- The splits: field is purely additive.
- Old whestbench reading new multi-split datasets fails loudly with a missing-n_mlps error; upgrade whestbench to read multi-split.
v0.3.0 (2026-05-25)
BREAKING
- Dataset format migrated from .npz to HF Parquet+sidecar (schema 2.4 → 3.0). Datasets are now directories with data/<split>-NNNNN.parquet, metadata.json, and README.md. The whest create-dataset command is replaced by whest dataset bake. The DatasetBundle dataclass is removed; internal consumers operate on datasets.Dataset directly.
- Public estimator interface unchanged. Estimators still receive MLP instances via predict(mlp: MLP).
NEW
- whestbench.load_dataset(path_or_repo, revision=..., split=..., token=...) loads from local directories OR HF Hub.
- whestbench.iter_mlps(ds), whestbench.mlp_at(ds, i), whestbench.metadata(ds).
- whestbench.publish_dataset(local_dir, repo_id=..., tag=..., ...) for HF Hub uploads.
- whestbench.merge_datasets(input_dirs, output_dir=...): concatenate partial bakes.
- whest dataset {bake, push, pull, merge, inspect} CLI subcommands.
- Parallel bake via --slice K/N or --mlp-range START-END flags; merge with whest dataset merge.
- whest run --dataset now accepts HF Hub repos: hf://owner/repo@v1 (inline revision) or owner/repo --revision v1.
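To make the inline-revision form hf://owner/repo@v1 concrete, here is a minimal sketch of how such a reference could be split into a repo id and an optional revision. The names DatasetRef and parse_hf_ref are hypothetical, and the validation rules are assumptions; the changelog does not describe how whest itself parses these references.

```python
from typing import NamedTuple, Optional


class DatasetRef(NamedTuple):
    """Hypothetical container for a parsed Hub reference."""
    repo_id: str
    revision: Optional[str]


def parse_hf_ref(ref: str) -> DatasetRef:
    """Split 'hf://owner/repo[@revision]' into repo id and optional revision."""
    prefix = "hf://"
    if not ref.startswith(prefix):
        raise ValueError(f"expected a reference starting with {prefix!r}: {ref!r}")
    body = ref[len(prefix):]
    # Everything after the first '@' is treated as the revision (assumption).
    repo_id, _, revision = body.partition("@")
    owner, _, name = repo_id.partition("/")
    if not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/repo' in {ref!r}")
    # An empty revision (no '@', or a trailing '@') is reported as None.
    return DatasetRef(repo_id=repo_id, revision=revision or None)


if __name__ == "__main__":
    print(parse_hf_ref("hf://owner/repo@v1"))
    print(parse_hf_ref("hf://owner/repo"))
```

A reference without an inline revision comes back with revision=None, which corresponds to the second documented form where the revision is supplied separately through --revision.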
MIGRATION
- Legacy .npz datasets cannot be loaded by 0.3.0. Re-bake with whest dataset bake at the same --seed to reproduce.
- See dataset-format for the schema 3.0 specification.