10 NumPy Random Numbers

In this lesson, we will learn how to use NumPy to generate random numbers and random samples. While base Python provides basic randomization tools through the random module, NumPy extends these capabilities with greater flexibility, efficiency, and access to a wide range of statistical distributions.

import numpy as np
import matplotlib.pyplot as plt

1 Random Sampling

The numpy.random.choice() function allows us to randomly select elements from a given set. It is commonly used for simulation, bootstrapping, and random assignment in experiments.

Syntax

np.random.choice(a, size=None, replace=True, p=None)

Parameters:

  • a: An array, list, or range from which samples will be drawn.
  • size: The number of elements to draw. If None (default), only one value is returned.
  • replace: Whether sampling is done with replacement (True, default) or without replacement (False).
  • p: An array of probabilities, one for each element of a, which must sum to 1. If omitted, all elements are equally likely.

Example: Rolling a Die

# A single roll of a fair die
die_roll = np.random.choice(range(1, 7))
print(die_roll)
5
  • We can also draw multiple samples:
eight_rolls = np.random.choice(range(1, 7), size=8)
print(eight_rolls)
[4 6 3 3 3 4 4 3]
  • If we set size=1, the function returns an array containing a single value. If we leave size=None, it returns a single value (here, an integer) rather than an array:
one_roll = np.random.choice(range(1, 7), size=1)
print(one_roll)
[3]
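
The difference between the two return types can be checked directly. The short sketch below is illustrative only; the variable names scalar_roll and array_roll are ours and do not appear elsewhere in the lesson.

```python
import numpy as np

# size=None (the default): a single NumPy integer, not an array
scalar_roll = np.random.choice(range(1, 7))

# size=1: a one-element array
array_roll = np.random.choice(range(1, 7), size=1)

print(isinstance(scalar_roll, np.integer))  # True
print(array_roll.shape)                     # (1,)

# Index into the array to get the value itself as a plain Python int
print(int(array_roll[0]))
```

This matters when the result is passed on to code that expects a number rather than an array.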

Example: Loaded Die

  • By default, all outcomes are equally likely. To assign different probabilities, we can use the p parameter:
loaded_die = np.random.choice(range(1, 7), size=10, 
                              p=[0.5, 0.2, 0.1, 0.1, 0.05, 0.05])
print(loaded_die)
[1 4 1 1 4 2 2 2 1 2]
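
With only 10 rolls it is hard to see whether the draws follow the probabilities we specified. One way to check, sketched below, is to draw a much larger sample and compare the observed share of each face with its target probability. The seed value and the sample size of 100,000 are arbitrary choices made for this illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, used only so the run is repeatable

faces = np.arange(1, 7)
probs = [0.5, 0.2, 0.1, 0.1, 0.05, 0.05]

rolls = np.random.choice(faces, size=100_000, p=probs)

# Share of rolls showing each face, in the order 1..6
observed = np.array([np.mean(rolls == face) for face in faces])

for face, target, freq in zip(faces, probs, observed):
    print(f'face {face}: target {target:.2f}, observed {freq:.3f}')
```

The observed shares should land close to the targets, and they get closer as the sample size grows.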

Sampling from Non-Numeric Data

  • np.random.choice() can also sample from non-numeric arrays or lists:
my_sample = np.random.choice(['A', 'B', 'C'], size=10)
print(my_sample)
['A' 'A' 'C' 'C' 'C' 'B' 'C' 'C' 'B' 'C']

Sampling Without Replacement

  • Setting replace=False ensures that each element is selected at most once:
names = ['Anna', 'Beth', 'Chad', 'Drew', 'Emma', 
         'Fred', 'Gary', 'Hana', 'Iris', 'Jake']

group = np.random.choice(names, size=5, replace=False)
print(group)
['Chad' 'Jake' 'Beth' 'Fred' 'Gary']

This is useful for creating random groups or assigning participants to treatment conditions.
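
As a sketch of random assignment, the example below draws half of the names for a treatment group and places everyone else in a control group. The even 5/5 split and the variable names treatment and control are assumptions made for this illustration.

```python
import numpy as np

names = ['Anna', 'Beth', 'Chad', 'Drew', 'Emma',
         'Fred', 'Gary', 'Hana', 'Iris', 'Jake']

# Draw 5 distinct participants for the treatment group
treatment = np.random.choice(names, size=5, replace=False)

# Everyone who was not drawn goes to the control group
drawn = set(treatment.tolist())
control = [name for name in names if name not in drawn]

print('Treatment:', treatment.tolist())
print('Control:  ', control)
```

Because the draw is made without replacement, no participant can appear in both groups.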

2 Sampling from Statistical Distributions

NumPy provides many functions to generate random numbers from well-known probability distributions. Here we will focus on three important ones:

  • Uniform distribution
  • Normal distribution
  • Gamma distribution

The Uniform Distribution

A random variable that follows a uniform distribution on an interval [a, b] is equally likely to fall in any two subintervals of [a, b] that have the same length.

We can generate such random numbers using np.random.uniform(), which draws values from the half-open interval [low, high):

unif_sample = np.random.uniform(low=6, high=10, size=10000)
print(unif_sample)
[6.39219371 8.87599648 8.69689461 ... 8.98985336 7.63727258 7.22501507]
plt.hist(unif_sample, bins=np.arange(5.5, 11, 0.5), 
         density=True, edgecolor='black')
plt.xlabel('Sampled Value')
plt.ylabel('Density')
plt.title('Uniform(6, 10) Distribution')
plt.show()
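
Besides plotting, a few summary numbers give a quick check that the sample behaves like a Uniform(6, 10) variable: the values should stay inside the interval, the mean should be near (6 + 10) / 2 = 8, and intervals of equal length should hold about the same share of the sample. This is a rough sketch; the names share_6_7 and share_9_10 are ours.

```python
import numpy as np

unif_sample = np.random.uniform(low=6, high=10, size=10000)

print('min: ', unif_sample.min())   # not below 6
print('max: ', unif_sample.max())   # stays below 10 (the upper bound is excluded)
print('mean:', unif_sample.mean())  # close to 8

# Two intervals of length 1 should each hold about 1/4 of the sample
share_6_7 = np.mean((unif_sample >= 6) & (unif_sample < 7))
share_9_10 = np.mean((unif_sample >= 9) & (unif_sample < 10))
print('share in [6, 7): ', share_6_7)
print('share in [9, 10):', share_9_10)
```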

The Normal Distribution

A normal (Gaussian) distribution is characterized by two parameters:

  • Mean (μ): Determines the center of the distribution.
  • Standard deviation (σ): Controls the spread of the data.

We can sample from a normal distribution using np.random.normal():

norm_sample = np.random.normal(loc=10, scale=3, size=10000)
print(norm_sample)
[ 7.33218279 10.6069436  11.56781312 ...  7.75608678 11.25494874
  3.03328935]
plt.hist(norm_sample, bins=np.arange(0, 20, 0.5), 
         density=True, edgecolor='black')
plt.xlabel('Sampled Value')
plt.ylabel('Density')
plt.title('Normal(μ=10, σ=3) Distribution')
plt.show()

Estimating Probabilities Using Simulation

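We can use large random samples to estimate probabilities for a normal random variable. Comparing an array with a number produces a Boolean array, and the mean of a Boolean array is the fraction of its values that are True, so np.mean(X < 10) estimates Prob[X < 10]: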

X = np.random.normal(loc=10, scale=2, size=1000000)

print('Prob[X < 10] =', np.mean(X < 10))
print('Prob[X < 12] =', np.mean(X < 12))
print('Prob[8 < X < 12] =', np.mean((X > 8) & (X < 12)))
print('Prob[6 < X < 14] =', np.mean((X > 6) & (X < 14)))
print('Prob[4 < X < 16] =', np.mean((X > 4) & (X < 16)))
Prob[X < 10] = 0.500255
Prob[X < 12] = 0.841841
Prob[8 < X < 12] = 0.682532
Prob[6 < X < 14] = 0.954546
Prob[4 < X < 16] = 0.997355
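
For a normal variable these probabilities can also be computed exactly, which gives a reference point for the simulated estimates. The sketch below uses only math.erf from the Python standard library; the helper name normal_cdf is ours and is not a NumPy function.

```python
import math

def normal_cdf(x, mu, sigma):
    """Exact Prob[X < x] for X ~ Normal(mu, sigma), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 10, 2

print('Prob[X < 10] =', normal_cdf(10, mu, sigma))
print('Prob[X < 12] =', normal_cdf(12, mu, sigma))

# Interval probabilities are differences of two CDF values
for lo, hi in [(8, 12), (6, 14), (4, 16)]:
    prob = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
    print(f'Prob[{lo} < X < {hi}] =', prob)
```

The exact values are about 0.5, 0.8413, 0.6827, 0.9545, and 0.9973, so the simulated estimates above agree with them to roughly three decimal places.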

The Gamma Distribution

A random variable that follows a Gamma distribution takes only positive values and is often right-skewed. It is commonly used to model waiting times, such as the time until a machine fails or until the next earthquake occurs.

We can sample from a Gamma distribution using np.random.gamma():

np.random.seed(137)  # Setting the seed for reproducibility
gamma_sample = np.random.gamma(shape=3, scale=10, size=10000)
print(gamma_sample)
[48.73927222 48.12681794 20.54500295 ... 12.91214078 33.07870526
 40.04233959]
plt.hist(gamma_sample, bins=np.arange(0, 150, 5), 
         density=True, edgecolor='black')
plt.xlabel('Sampled Value')
plt.ylabel('Density')
plt.title('Gamma(shape=3, scale=10) Distribution')
plt.show()
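
The sample can also be compared with what theory predicts for a Gamma distribution: the mean is shape × scale (here 3 × 10 = 30) and the variance is shape × scale² (here 3 × 100 = 300). The sketch below repeats the draw with the same seed and prints the corresponding sample statistics.

```python
import numpy as np

np.random.seed(137)  # same seed as above, for reproducibility
gamma_sample = np.random.gamma(shape=3, scale=10, size=10000)

print('sample mean:    ', gamma_sample.mean())  # theory: 30
print('sample variance:', gamma_sample.var())   # theory: 300
print('smallest value: ', gamma_sample.min())   # always positive

# Right skew: the long right tail pulls the mean above the median
print('mean > median:', gamma_sample.mean() > np.median(gamma_sample))
```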

Other Distributions

Methods of a NumPy random number generator object (created with np.random.default_rng())

Method            Description
uniform           Draw samples from a uniform distribution
integers          Draw random integers from a given low-to-high range
standard_normal   Draw samples from a standard normal distribution
binomial          Draw samples from a binomial distribution
normal            Draw samples from a normal (Gaussian) distribution
beta              Draw samples from a beta distribution
chisquare         Draw samples from a chi-square distribution
gamma             Draw samples from a gamma distribution
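
The method names in this table match NumPy's Generator interface, which is obtained from np.random.default_rng() rather than called as np.random.<name>(). The sketch below shows a few of them in use; the seed and the parameter values are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed for repeatable output

# integers: the upper bound is excluded by default, so this draws from 1..6
die_rolls = rng.integers(low=1, high=7, size=8)

z = rng.standard_normal(size=5)            # mean 0, standard deviation 1
heads = rng.binomial(n=10, p=0.5, size=5)  # heads in 10 fair coin flips
betas = rng.beta(a=2, b=5, size=5)         # values between 0 and 1
chi2 = rng.chisquare(df=4, size=5)         # positive values

print(die_rolls)
print(z)
print(heads)
print(betas)
print(chi2)
```

Each generator object carries its own state, so seeding one generator does not affect other random draws in the program.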