whestbench.
API

metadata

Return the metadata.json contents attached to a Dataset or DatasetDict (or their iterable variants).

function

metadata(ds_or_dsd: Dataset | DatasetDict | IterableDataset | IterableDatasetDict, *, split: str | None = None) -> Dict[str, Any]

For a DatasetDict / IterableDatasetDict (multi-split, no split= at load
time): returns the full multi-split metadata dict (with `splits:`). Pass
`split="X"` to get a single-split-shaped merged dict for split X.

For a Dataset / IterableDataset (single-split or
`load_dataset(..., split=X)`): returns the single-split-shaped metadata.
`split=` is rejected with a TypeError, because the split is fixed at load
time and cannot be re-selected here.

Raises:
    InvalidDatasetError: if no metadata is attached, or if `split=` names
        a split that's not in the DatasetDict-like.
    TypeError: if `split=` is passed for a single-Dataset-like.
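
The dispatch rules above can be illustrated with a small self-contained sketch. It does not call whestbench: `FakeDataset`, `FakeDatasetDict`, the `_meta` attribute, `metadata_sketch`, and the local `InvalidDatasetError` class are hypothetical stand-ins. The merge rule for `split="X"` (shared top-level keys plus that split's own entry) and the order in which the checks run are assumptions, since the docstring does not spell them out.

```python
from typing import Any, Dict, Optional


class InvalidDatasetError(Exception):
    """Local stand-in for the library's error type (hypothetical definition)."""


class FakeDataset:
    """Stand-in for a single-split Dataset / IterableDataset."""

    def __init__(self, meta: Optional[Dict[str, Any]] = None):
        self._meta = meta  # hypothetical storage; the real attachment point is unknown


class FakeDatasetDict(dict):
    """Stand-in for a multi-split DatasetDict / IterableDatasetDict."""

    def __init__(self, splits: Dict[str, FakeDataset],
                 meta: Optional[Dict[str, Any]] = None):
        super().__init__(splits)
        self._meta = meta


def metadata_sketch(obj: Any, *, split: Optional[str] = None) -> Dict[str, Any]:
    """Mimic the documented contract of metadata() on the stand-in types."""
    meta = getattr(obj, "_meta", None)
    if meta is None:
        raise InvalidDatasetError("no metadata attached")

    if isinstance(obj, FakeDatasetDict):
        if split is None:
            return dict(meta)  # full multi-split dict, including "splits"
        if split not in meta.get("splits", {}):
            raise InvalidDatasetError(f"unknown split: {split!r}")
        # Assumed merge rule: top-level keys, overlaid with the split's entry.
        merged = {k: v for k, v in meta.items() if k != "splits"}
        merged.update(meta["splits"][split])
        return merged

    # Single-split object: the split was fixed at load time.
    if split is not None:
        raise TypeError("split= is not accepted for a single-split dataset")
    return dict(meta)


full = {"name": "demo", "splits": {"train": {"rows": 3}, "test": {"rows": 1}}}
dsd = FakeDatasetDict({"train": FakeDataset(), "test": FakeDataset()}, meta=full)

print(metadata_sketch(dsd))                 # full multi-split dict
print(metadata_sketch(dsd, split="train"))  # single-split-shaped dict
```

In this sketch, passing `split=` together with a `FakeDataset` raises TypeError, and naming a split that is missing from `splits` raises the stand-in InvalidDatasetError, mirroring the Raises section.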