flopscope.numpy.random.zipf

fnp.random.zipf(a, size=None)

Draw samples from a Zipf distribution.

Adapted from NumPy docs np.random.zipf

Area: random

Type: custom

NumPy Ref: np.random.zipf

Cost

16× per-operation

Flopscope Context

Sampling; cost = numel(output).

Samples are drawn from a Zipf distribution with specified parameter a > 1.

The Zipf distribution (also known as the zeta distribution) is a discrete probability distribution that satisfies Zipf's law: the frequency of an item is inversely proportional to its rank in a frequency table.

Note.

New code should use the zipf method of a Generator instance instead; see the NumPy random quick start.
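As a minimal sketch of what the note recommends, the snippet below uses NumPy's own Generator API (`np.random.default_rng`). Whether flopscope exposes an equivalent Generator wrapper is not stated here, so treat the plain NumPy import as an assumption made for illustration.

```python
import numpy as np

# NumPy's Generator API; a flopscope equivalent is assumed, not confirmed.
rng = np.random.default_rng(seed=12345)

a = 4.0                            # distribution parameter, must be > 1
samples = rng.zipf(a, size=1000)   # 1000 independent draws

print(samples.shape)               # shape of the output array
print(samples.min())               # Zipf samples are integers >= 1
```

Seeding the generator makes the draws reproducible without touching global random state.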

Parameters

a:float or array_like of floats: Distribution parameter. Must be greater than 1.
size:int or tuple of ints, optional: Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, flops.array(a).size samples are drawn.
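The three cases for size can be sketched as follows. The example calls NumPy's `np.random.zipf` directly, on the assumption that the flopscope wrapper follows the same shape rules; the variable names are illustrative only.

```python
import numpy as np

# Scalar a, size=None: a single value is returned.
scalar_draw = np.random.zipf(2.0)

# Scalar a, size=(m, n, k): m * n * k samples are drawn.
grid = np.random.zipf(2.0, size=(2, 3, 4))

# Array-like a, size=None: one sample per element of a.
per_param = np.random.zipf([1.5, 2.0, 3.0])

print(np.ndim(scalar_draw))    # number of dimensions of the scalar draw
print(grid.shape, grid.size)   # output shape and element count
print(per_param.shape)         # follows the shape of a
```

Since the documented cost is numel(output), the element count printed for `grid` is the quantity that the cost scales with.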

Returns

out:ndarray or scalar: Drawn samples from the parameterized Zipf distribution.

Notes

The probability mass function (PMF) for the Zipf distribution is

$$p(k) = \frac{k^{-a}}{\zeta(a)},$$

for integers $k \geq 1$, where $\zeta$ is the Riemann zeta function.
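To make the formula concrete, here is a small standard-library sketch that evaluates the PMF for a = 4, where the closed form $\zeta(4) = \pi^4 / 90$ avoids a SciPy dependency. The helper `zipf_pmf` is a hypothetical name introduced for this illustration, not part of flopscope or NumPy.

```python
import math

def zipf_pmf(k, a, zeta_a):
    """Evaluate p(k) = k**(-a) / zeta(a) for an integer k >= 1."""
    return k ** (-a) / zeta_a

a = 4.0
zeta_4 = math.pi ** 4 / 90    # closed form of the Riemann zeta function at 4

# Probabilities for k = 1 .. 1000; the remaining tail mass is negligible for a = 4.
probs = [zipf_pmf(k, a, zeta_4) for k in range(1, 1001)]

total = sum(probs)            # should be very close to 1
ratio = probs[0] / probs[1]   # p(1) / p(2) equals 2**a

print(total)
print(ratio)
```

For other values of a, `scipy.special.zeta(a)` can supply the normalizing constant instead of the closed form.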

It is named for the American linguist George Kingsley Zipf, who noted that the frequency of any word in a sample of a language is inversely proportional to its rank in the frequency table.

References

[1] Zipf, G. K., "Selected Studies of the Principle of Relative Frequency in Language," Cambridge, MA: Harvard Univ. Press, 1932.

Examples

Draw samples from the distribution:

>>> a = 4.0
>>> n = 20000
>>> s = flops.random.zipf(a, n)

Display the histogram of the samples, along with the expected histogram based on the probability mass function:

>>> import matplotlib.pyplot as plt
>>> from scipy.special import zeta  # doctest: +SKIP

bincount provides a fast histogram for small integers.

>>> count = flops.bincount(s)
>>> k = flops.arange(1, s.max() + 1)

>>> plt.bar(k, count[1:], alpha=0.5, label='sample count')
>>> plt.plot(k, n*(k**-a)/zeta(a), 'k.-', alpha=0.5,
... label='expected count')   # doctest: +SKIP
>>> plt.semilogy()
>>> plt.grid(alpha=0.4)
>>> plt.legend()
>>> plt.title(f'Zipf sample, a={a}, size={n}')
>>> plt.show()
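When a plot is not convenient, the same comparison can be made numerically. The sketch below uses plain NumPy with a seeded Generator (an assumption, since the flopscope equivalents are not shown here) and compares observed frequencies for the first few values of k with the PMF for a = 4.

```python
import math
import numpy as np

rng = np.random.default_rng(seed=0)   # seeded for reproducibility
a = 4.0
n = 20000
s = rng.zipf(a, size=n)

# Count occurrences of each integer; minlength guarantees indices 0..5 exist.
counts = np.bincount(s, minlength=6)

k = np.arange(1, 6)
observed = counts[1:6] / n                    # empirical frequencies for k = 1..5
expected = k ** (-a) / (math.pi ** 4 / 90)    # PMF with zeta(4) = pi**4 / 90

for ki, obs, exp in zip(k, observed, expected):
    print(ki, round(float(obs), 4), round(float(exp), 4))
```

With 20000 samples the observed and expected columns should agree to roughly two decimal places for small k; agreement degrades in the tail, where counts are small.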