`whest run`

Run local evaluation for an estimator.

whest run [options]
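As an illustration of the synopsis above, here is a minimal sketch that assembles one possible invocation from Python. The flag names are taken from the options table on this page; the estimator path and the specific values (`estimator.py`, `3`, `0`) are hypothetical placeholders, and the assumption that `--seed` and `--n-mlps` take integer arguments is inferred from their descriptions.

```python
import shlex

# Hypothetical values for illustration; only the flag names come from this page.
estimator_path = "estimator.py"

cmd = [
    "whest", "run",
    "--estimator", estimator_path,  # path to the estimator file
    "--n-mlps", "3",                # small run, handy for quick debugging
    "--seed", "0",                  # assumed to accept an integer
    "--json",                       # alias for --format json
]

# Render the argument list as a single shell-safe command line.
command_line = shlex.join(cmd)
print(command_line)

# To actually execute it (requires whest to be installed and on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when paths contain spaces.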

| Option | Default | Description |
| --- | --- | --- |
| `--estimator` | | Path to `estimator.py` (see https://github.com/AIcrowd/whest-starterkit for starter files). |
| `--class` | | Estimator class name to load from the estimator file (auto-detected if omitted). |
| `--runner` | `'local'` | Execution backend. `local` and `inprocess` run in-process; `subprocess` and `server` run in an isolated subprocess. |
| `--n-mlps` | | Number of MLPs to evaluate. Default: 10 when `--dataset` is not provided; otherwise the full dataset size. Clamped to the dataset size when `--dataset` is set and `--n-mlps` exceeds it. |
| `--detail` | `'raw'` | Report verbosity: `raw` for a concise summary or `full` for expanded per-MLP detail. |
| `--profile` | | Collect and display per-MLP FLOP/budget profiling breakdowns in the report. |
| `--show-diagnostic-plots` | | Include diagnostic plot panes in the rendered (non-JSON) report. |
| `--format` | | Output format: `rich`, `plain`, or `json`. |
| `--json` | | Alias for `--format json`. |
| `--dataset` | | Path to a baked dataset directory, or `hf://owner/repo[@revision]` for a dataset on the Hugging Face (HF) Hub. |
| `--streaming` | | Stream the dataset from HF instead of downloading it. Iteration-only (no random access). Data is NOT cached, so subsequent runs re-fetch it. Useful for small `--n-mlps` debugging runs. See docs/guides/datasets.md#streaming-mode. |
| `--revision` | | HF Hub revision (tag or commit SHA) for `--dataset`. |
| `--split` | | For multi-split datasets, the split to evaluate. Required when the dataset is multi-split; optional when single-split (defaults to the only split). |
| `--flop-budget` | | Effective compute budget per MLP, in FLOPs. Caps `C_m = F_m + lambda*R_m` (analytical FLOPs plus charged residual wall time). Always honored; any `flop_budget` stored in the `--dataset` metadata is ignored. Default: `68_000_000_000` (6.8e10). |
| `--lambda-flops-per-second` | | Residual wall-time penalty rate `lambda` in `C_m = F_m + lambda*R_m` (FLOP-equivalents per second of residual wall time). Default: `1e11`. |
| `--n-samples` | | Ground truth samples per MLP (default: `width*width*256`). Lower values speed up generation at the cost of noisier scores. |
| `--debug` | | Show full Python tracebacks for errors instead of condensed messages. |
| `--fail-fast` | | Stop on the first estimator error and let the raw Python traceback propagate (combine with `--debug` to show it). |
| `--wall-time-limit` | `60.0` | Wall-clock time limit per predict call, in seconds. |
| `--residual-wall-time-limit` | | Time limit for non-flopscope operations per predict call (default: unlimited). |
| `--seed` | | Random seed for the run. Without `--dataset`, seeds both MLP generation and estimator setup. With `--dataset`, MLP seeds come from the dataset; this flag seeds estimator setup only. Default: omitted (`ctx.seed` defaults to 0; `run_config.seed` is null in the JSON output). |
| `--max-threads` | | Limit BLAS to at most N CPU threads. |
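To make the `--flop-budget` and `--lambda-flops-per-second` descriptions concrete, the following sketch works through the formula `C_m = F_m + lambda*R_m` with the documented defaults. The function names are hypothetical, and treating "within budget" as `C_m <= budget` is an assumption; this page does not say whether the comparison is strict or what the tool does when the cap is exceeded.

```python
# Defaults as documented for --flop-budget and --lambda-flops-per-second.
DEFAULT_FLOP_BUDGET = 68_000_000_000  # 6.8e10 FLOPs per MLP
DEFAULT_LAMBDA = 1e11                 # FLOP-equivalents per second of residual wall time


def effective_compute(analytical_flops, residual_seconds, lam=DEFAULT_LAMBDA):
    """Return C_m = F_m + lambda * R_m (hypothetical helper, not whest's API)."""
    return analytical_flops + lam * residual_seconds


def within_budget(analytical_flops, residual_seconds,
                  flop_budget=DEFAULT_FLOP_BUDGET, lam=DEFAULT_LAMBDA):
    """Assumed check: the effective compute must not exceed the budget."""
    return effective_compute(analytical_flops, residual_seconds, lam) <= flop_budget


# 4e10 analytical FLOPs plus 0.25 s of residual wall time:
# C_m = 4e10 + 1e11 * 0.25 = 6.5e10, which fits under 6.8e10.
fits = within_budget(4e10, 0.25)

# The same FLOPs with 0.5 s of residual wall time:
# C_m = 4e10 + 1e11 * 0.5 = 9e10, which exceeds 6.8e10.
too_slow = within_budget(4e10, 0.5)

print(fits, too_slow)
```

With the default rate, each second of residual wall time is charged as 1e11 FLOPs, so residual time alone exhausts the default budget after 0.68 seconds.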
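The defaulting and clamping rules described for `--n-mlps` can be sketched as a small function. This is an illustration of the documented behavior, not the tool's actual code; `resolve_n_mlps` and its parameters are hypothetical names, with `dataset_size=None` standing in for "no `--dataset` given".

```python
from typing import Optional

DEFAULT_N_MLPS = 10  # documented default when --dataset is not provided


def resolve_n_mlps(requested: Optional[int], dataset_size: Optional[int]) -> int:
    """Hypothetical helper mirroring the --n-mlps description."""
    if dataset_size is None:
        # No dataset: use the requested count, or fall back to 10.
        return DEFAULT_N_MLPS if requested is None else requested
    if requested is None:
        # Dataset given, no explicit count: evaluate the full dataset.
        return dataset_size
    # Dataset given with an explicit count: clamp to the dataset size.
    return min(requested, dataset_size)


print(resolve_n_mlps(None, None))  # no flags at all
print(resolve_n_mlps(None, 250))   # --dataset only
print(resolve_n_mlps(500, 250))    # --n-mlps larger than the dataset
```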
