
load_dataset

Load a whestbench dataset from a local directory or HF Hub repo.


load_dataset(path_or_repo: Path | str, *, revision: Optional[str] = None, split: Optional[str] = None, token: Optional[str] = None, streaming: bool = False) -> Dataset | DatasetDict | IterableDataset | IterableDatasetDict

Returns a ``Dataset`` for single-split datasets, and also whenever
``split=`` is provided (for single- or multi-split datasets). Returns a
``DatasetDict`` for multi-split datasets when ``split=`` is not provided.

When ``streaming=True`` (HF Hub only), returns the equivalent streaming
types: ``IterableDataset`` (single-split or split-selected) or
``IterableDatasetDict`` (multi-split, no ``split=``). The metadata
side-channel works identically for both materialised and streaming
returns. ``iter_mlps()`` accepts both; ``mlp_at()`` requires a
materialised dataset and raises ``TypeError`` on streaming inputs.
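
The return-type rules above can be summarised as a small decision function. This is only a sketch of the documented behaviour, not the library's implementation: ``expected_return_type`` and its ``num_splits`` / ``is_local`` parameters are hypothetical names introduced here for illustration.

```python
from typing import Optional


def expected_return_type(
    num_splits: int,
    split: Optional[str] = None,
    streaming: bool = False,
    is_local: bool = False,
) -> str:
    """Return the name of the type the docstring says load_dataset yields.

    Hypothetical helper: it mirrors the documented rules only.
    """
    # Documented: streaming is supported for HF Hub repos only.
    if streaming and is_local:
        raise ValueError("streaming=True is only supported for HF Hub repos")

    # A single object comes back for single-split datasets, or whenever
    # a split was selected explicitly.
    single = num_splits == 1 or split is not None

    if streaming:
        return "IterableDataset" if single else "IterableDatasetDict"
    return "Dataset" if single else "DatasetDict"
```

For instance, a multi-split dataset loaded without ``split=`` maps to ``DatasetDict``, while the same dataset with ``split="public"`` maps to ``Dataset``.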

Note: streaming datasets cannot currently be used with ``whest run
--dataset`` because the scoring path uses random-access indexing.

Metadata is attached via a weakref side-channel (accessible through
``whestbench.metadata()``):
- ``Dataset`` / ``IterableDataset`` (single-split or split-selected): the
  single-split-shaped metadata.
- ``DatasetDict`` / ``IterableDatasetDict`` (multi-split, no ``split=``): the
  full multi-split metadata with the ``splits:`` dict.
- Each member inside a ``DatasetDict`` / ``IterableDatasetDict``: the merged
  single-split-shaped metadata for that split.

Args:
    path_or_repo: Local directory path, or HF Hub repo id (e.g.
        "aicrowd/arc-whestbench-2026").
    revision: HF Hub git tag or commit SHA. Ignored for local paths.
    split: Optional split name. For multi-split datasets, omit it to get
        a ``DatasetDict`` of all splits; for single-split datasets it
        defaults to "public" (preserves prior behavior).
    token: HF Hub auth token. Falls back to the HF auth cache if omitted.
    streaming: If True, return HF streaming types (``IterableDataset`` /
        ``IterableDatasetDict``) instead of materialised ones. Only
        supported for HF Hub repos; raises ``ValueError`` for local
        paths.

Raises:
    InvalidDatasetError: missing/malformed/partial metadata, or unknown
        split name for a multi-split dataset.
    ValueError: ``streaming=True`` combined with a local path.
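
A call using the keyword arguments documented above might look like the following sketch. It assumes ``load_dataset`` is importable from the top-level ``whestbench`` package, which this page does not confirm; ``load_pinned_split`` is a hypothetical wrapper introduced only for illustration.

```python
from typing import Any, Optional


def load_pinned_split(
    repo_id: str,
    revision: Optional[str] = None,
    split: Optional[str] = None,
) -> Any:
    """Load one split of a Hub dataset, optionally pinned to a revision.

    Hypothetical wrapper; only the keyword names come from the
    documented signature.
    """
    # Assumption: the function lives at the package top level.
    # Imported lazily so this sketch can be defined without the package.
    from whestbench import load_dataset

    # revision and split are keyword-only in the documented signature.
    return load_dataset(repo_id, revision=revision, split=split)
```

For example, ``load_pinned_split("aicrowd/arc-whestbench-2026", split="public")`` would request the "public" split of the repo id used in the Args section, provided that split exists there.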