
flopscope.numpy.random.RandomState.zipf

fnp.random.RandomState.zipf(self, a, size=None)

Draw samples from a Zipf distribution.

Adapted from NumPy docs np.random.RandomState.zipf

Area: random
Type: counted
Cost: $\text{numel}(\text{output})$
Flopscope Context

Legacy Zipf sampler; cost = numel(output).

Samples are drawn from a Zipf distribution with specified parameter a > 1.

The Zipf distribution (also known as the zeta distribution) is a discrete probability distribution that satisfies Zipf's law: the frequency of an item is inversely proportional to its rank in a frequency table.

Note.

New code should use the zipf method of a Generator instance instead; see the NumPy random quick start.
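As a rough illustration of the note above, the following sketch uses plain NumPy rather than flopscope. Whether flopscope exposes a counted Generator in the same way is an assumption not confirmed by this page. The seed value is arbitrary.

```python
import numpy as np

# Plain NumPy, not flopscope: create a Generator with a fixed (arbitrary) seed.
rng = np.random.default_rng(12345)

# Draw a 2 x 3 array of Zipf samples with distribution parameter a = 4.0.
samples = rng.zipf(4.0, size=(2, 3))

print(samples.shape)       # (2, 3)
print(samples.min() >= 1)  # True: Zipf samples are positive integers
```

The call signature mirrors the legacy method (a, then size), so migrating usually amounts to swapping the object the method is called on.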

Parameters

a:float or array_like of floats

Distribution parameter. Must be greater than 1.

size:int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, flops.array(a).size samples are drawn.
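The size rules above, together with the stated cost of numel(output), can be sketched in pure Python. The helper name zipf_output_numel and its a_shape argument are hypothetical and are not part of flopscope; the sketch only reproduces the counting rule described on this page.

```python
import math

def zipf_output_numel(a_shape=(), size=None):
    """Hypothetical helper: number of samples drawn, following the size rules.

    a_shape is the shape of the parameter a (an empty tuple for a scalar).
    """
    if size is None:
        # No explicit size: one sample per element of a (1 for a scalar).
        return math.prod(a_shape)
    if isinstance(size, int):
        # An integer size requests that many samples.
        return size
    # A tuple such as (m, n, k) requests m * n * k samples.
    return math.prod(size)

print(zipf_output_numel(size=(2, 3, 4)))  # 24
print(zipf_output_numel(size=20000))      # 20000
print(zipf_output_numel(a_shape=(5,)))    # 5
print(zipf_output_numel())                # 1
```

Under the cost model stated on this page, each of these values would also be the counted cost of the corresponding call.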

Returns

out:ndarray or scalar

Drawn samples from the parameterized Zipf distribution.

Notes

The probability mass function (PMF) for the Zipf distribution is

$$p(k) = \frac{k^{-a}}{\zeta(a)},$$

for integers $k \geq 1$, where $\zeta$ is the Riemann zeta function.
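The PMF can be evaluated directly with the standard library. The sketch below is illustrative only: the function names zeta and zipf_pmf are hypothetical, and the zeta approximation (a direct sum plus a two-term Euler-Maclaurin tail estimate) is a choice made for this example, not the method used by the sampler.

```python
import math

def zeta(a, terms=10_000):
    """Approximate the Riemann zeta function for a > 1.

    Sums the terms k = 1 .. terms - 1 directly, then adds a two-term
    Euler-Maclaurin estimate of the remaining tail.
    """
    if a <= 1:
        raise ValueError("a must be greater than 1")
    head = sum(k ** -a for k in range(1, terms))
    tail = terms ** (1 - a) / (a - 1) + 0.5 * terms ** -a
    return head + tail

def zipf_pmf(k, a):
    """Probability that a Zipf(a) variable equals the integer k."""
    if k < 1:
        return 0.0
    return k ** -a / zeta(a)

a = 4.0
probs = [zipf_pmf(k, a) for k in range(1, 6)]
# Probabilities fall off as k^(-a); p(1) is roughly 0.924 for a = 4.
print(probs[0] > probs[1] > probs[2])  # True
```

For a = 4 the closed form $\zeta(4) = \pi^4 / 90$ gives a convenient check on the approximation.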

It is named for the American linguist George Kingsley Zipf, who noted that the frequency of any word in a sample of a language is inversely proportional to its rank in the frequency table.

References

[1] Zipf, G. K., "Selected Studies of the Principle of Relative Frequency in Language," Cambridge, MA: Harvard Univ. Press, 1932.

Examples

Draw samples from the distribution:

>>> a = 4.0
>>> n = 20000
>>> s = flops.random.zipf(a, n)

Display the histogram of the samples, along with the expected histogram based on the probability mass function:

>>> import matplotlib.pyplot as plt
>>> from scipy.special import zeta  # doctest: +SKIP

bincount provides a fast histogram for small integers.

>>> count = flops.bincount(s)
>>> k = flops.arange(1, s.max() + 1)
>>> plt.bar(k, count[1:], alpha=0.5, label='sample count')
>>> plt.plot(k, n*(k**-a)/zeta(a), 'k.-', alpha=0.5,
...          label='expected count')   # doctest: +SKIP
>>> plt.semilogy()
>>> plt.grid(alpha=0.4)
>>> plt.legend()
>>> plt.title(f'Zipf sample, a={a}, size={n}')
>>> plt.show()