sample_layer_statistics

function · source

sample_layer_statistics(mlp: MLP, n_samples: int, rng: Optional[fnp.random.Generator] = None, *, progress: Optional[Callable[[Dict[str, Any]], None]] = None) -> Tuple[fnp.ndarray, fnp.ndarray, float]

Estimate per-layer activation statistics via chunked Monte Carlo sampling.

Feeds ``n_samples`` random Gaussian inputs through the MLP in memory-bounded
chunks and computes empirical statistics of the activations at each layer.
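
The chunked accumulation described above can be sketched as follows. This is
a minimal illustration, not the library's implementation: the toy network (a
list of square weight matrices with ReLU activations), the helper name
``chunked_layer_stats`` and the ``chunk_size`` parameter are all assumptions
made for the example.

```python
import numpy as np


def chunked_layer_stats(weights, n_samples, rng=None, chunk_size=1024):
    """Streaming per-layer means and final-layer variance for a toy ReLU MLP.

    ASSUMPTIONS (not from the documented API): `weights` is a list of
    (width, width) matrices, the activation is ReLU, and `chunk_size`
    bounds how many samples are held in memory at once.
    """
    rng = np.random.default_rng() if rng is None else rng
    depth, width = len(weights), weights[0].shape[0]

    sums = np.zeros((depth, width), dtype=np.float64)   # running sum per neuron
    final_sq_sum = np.zeros(width, dtype=np.float64)    # running sum of squares, last layer

    done = 0
    while done < n_samples:
        m = min(chunk_size, n_samples - done)            # last chunk may be smaller
        x = rng.standard_normal((m, width))              # N(0, 1) inputs for this chunk
        for layer, w in enumerate(weights):
            x = np.maximum(x @ w, 0.0)                   # linear map followed by ReLU
            sums[layer] += x.sum(axis=0)
        final_sq_sum += (x * x).sum(axis=0)
        done += m

    all_layer_means = sums / n_samples
    final_mean = all_layer_means[-1]
    # Per-neuron variance as E[x^2] - E[x]^2, then averaged over neurons.
    per_neuron_var = final_sq_sum / n_samples - final_mean ** 2
    avg_variance = float(per_neuron_var.mean())
    return (all_layer_means.astype(np.float32),
            final_mean.astype(np.float32),
            avg_variance)
```

Only running sums are kept between chunks, so peak memory depends on
``chunk_size`` rather than on ``n_samples``.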

The returned values are used in two places:

* **Scoring** (``scoring.py``): ``final_mean`` and ``avg_variance``
  normalise the ``sampling_mse`` metric so that networks with
  naturally high variance are not unfairly penalised.
* **Dataset generation** (``dataset.py``): ``all_layer_means`` captures
  the ground-truth activation profile that estimators try to predict.
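
The exact formula used in ``scoring.py`` is not given here. As a hedged
sketch of what variance normalisation could look like, the hypothetical
helper below divides the raw mean squared error against ``final_mean`` by
``avg_variance``; the function name, the ``eps`` guard and the formula itself
are assumptions for illustration.

```python
import numpy as np


def normalised_mse(estimate, final_mean, avg_variance, eps=1e-12):
    """Hypothetical normalisation: raw MSE divided by the variance baseline.

    ASSUMPTION: this is one plausible form, not the documented metric.
    """
    estimate = np.asarray(estimate, dtype=np.float64)
    final_mean = np.asarray(final_mean, dtype=np.float64)
    raw_mse = np.mean((estimate - final_mean) ** 2)
    # Guard against division by zero for degenerate (constant) networks.
    return float(raw_mse / max(avg_variance, eps))
```

Under this form, rescaling all activations by a constant ``c`` multiplies the
raw MSE and the variance by ``c ** 2`` alike, so the score is unchanged.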

Args:
    mlp: The MLP network to evaluate.
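    n_samples: How many i.i.d. input vectors with N(0, 1) entries to
        draw.  Larger values give more precise estimates (the Monte Carlo
        standard error of each mean shrinks roughly as
        ``1 / sqrt(n_samples)``) at the cost of compute time.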
    rng: Optional NumPy-compatible random generator used to sample the
        inputs.  If ``None``, a new generator is created once per call.
    progress: Optional keyword-only callback invoked once per processed
        chunk with a single ``Dict[str, Any]`` argument.
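
As a sketch of how the two optional arguments might be prepared: the snippet
below builds a seeded generator and a callback that records whatever
dictionary it receives. It assumes that a standard ``numpy.random.Generator``
is acceptable as ``rng``. The keys of the progress dictionary are not
documented here, so the callback does not rely on any of them, and
``make_progress_recorder`` is a hypothetical helper.

```python
from typing import Any, Callable, Dict, List, Tuple

import numpy as np


def make_progress_recorder() -> Tuple[Callable[[Dict[str, Any]], None],
                                      List[Dict[str, Any]]]:
    """Return a callback plus the list it appends each payload to."""
    records: List[Dict[str, Any]] = []

    def on_progress(info: Dict[str, Any]) -> None:
        records.append(dict(info))  # copy, in case the caller reuses the dict

    return on_progress, records


# Two generators built from the same seed produce identical draws, which is
# what makes a seeded `rng` useful for reproducible runs.
rng_a = np.random.default_rng(1234)
rng_b = np.random.default_rng(1234)
same_draws = np.array_equal(rng_a.standard_normal(4), rng_b.standard_normal(4))

on_progress, records = make_progress_recorder()
on_progress({"example_key": 1})  # stand-in payload; the real keys are unknown
```

These objects would then be passed as
``sample_layer_statistics(mlp, n_samples, rng=rng_a, progress=on_progress)``.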

Returns:
    all_layer_means: ``(depth, width)`` float32 array holding the mean
        activation of every neuron at every layer, averaged over all
        samples.
    final_mean: ``(width,)`` float32 array holding the mean activation at
        the last layer (equal to ``all_layer_means[-1]``).
    avg_variance: Scalar ``float`` giving the per-neuron activation
        variance at the final layer, averaged over neurons; used as a
        normalisation baseline for ``sampling_mse``.