Generating Random Numbers and Statistical Distributions

Do computers think? Can a deterministic machine generate random numbers? How do computers produce numbers that appear random?

Computers do not generate truly random numbers. In fact, the numbers produced by a computer come from deterministic algorithms: anyone who knows the algorithm and its current state can calculate the next `random' number in the sequence. A random variable is a function that maps events in the sample space to numerical values in \(\mathfrak{R}\). A statistical sample from a distribution is a sequence of generated random numbers belonging to the co-domain of the random variable that follows that distribution.

On this page, we focus on Monte Carlo methods, which are based on `random numbers'. Quasi-Monte Carlo and Sequential Monte Carlo are alternatives to ordinary Monte Carlo: they generate their samples in a way that, under certain conditions, reduces the error and improves the accuracy of the estimate. A very brief introduction to Quasi-Monte Carlo methods is given below.


Simulating random numbers from a uniform distribution

Pseudo-random numbers

Let's start with the uniform distribution and generate a sequence of numbers that follow it. The density function of a random variable that follows a uniform distribution with parameters \(a\) and \(b\) is given by $$ f(x):= \begin{cases} \frac{1}{b-a}&a < x < b;\\ 0&\mathrm{otherwise}. \end{cases} $$
D. H. Lehmer (1948) proposed a simple linear congruential generator (LCG) as the base source for generating random numbers (J. E. Gentle, 2005).

Let's generate a sequence of random integers from the set \(S=\{1,2,3,4,\dots, m\}\). We choose the first element in the sequence, \(x_0\), called the `seed'. Given a mapping \(f\) on \(S\), we define \(x_{i+1}=f(x_i)\) and \(x_{i+2}=f(f(x_i))\), and in the general case \(f^{[n]}(x_i):=\underbrace{f(f(\dots f(x_i)))}_{n\text{ times}}\). We say that \(x_0\) has order \(k\) if and only if \(k\) is the smallest positive integer such that \(f^{[k]}(x_{0})=x_0\). When the order of \(x_0\) is \(k=m\), the sequence of \(x_i\)'s is uniformly distributed over the integers \([1,m]\). \(S\) is called a cyclic set if the order of \(x_0\) is equal to the number of integers in the set, \(m\); we can then generate all elements in \(S\) using \(x_0\) as the starting number of the sequence.

Example
Given \(S=\{1,2,3,4,5\}\) and \(x_0=3\), find a mapping to generate a sequence of uniformly distributed integers in the set \(S\).

Solution
Let \(f(1)=3, ~f(2)=5, ~f(3)=4, ~f(4)=2\) and \(f(5)=1\). The sequence \(x_i\) takes the form $$ \{3,4,2,5,1, ~~3,4,2,5,1, ~~ 3,4,2,5,1,~ \dots\}.$$ Every element in the sequence is repeated every 5 terms, so the terms are not independent. Note that \(\mathrm{P}(x_i=y)=\frac{1}{5}\) for any \(y\in S\).
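To make the cycling concrete, here is a minimal Python sketch of this mapping; the dictionary `f` and the seed are taken directly from the example above.

```python
# The cyclic mapping f on S = {1,...,5} from the example above.
f = {1: 3, 2: 5, 3: 4, 4: 2, 5: 1}

x = 3  # seed x_0 = 3
sequence = []
for _ in range(15):
    sequence.append(x)
    x = f[x]

print(sequence)  # [3, 4, 2, 5, 1, 3, 4, 2, 5, 1, 3, 4, 2, 5, 1]
```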


We can generate such a sequence using an LCG with \(0\leq x_i < m\), by defining $$ x_{i+1}=f(x_i)=(a~x_i+c )~\mathrm{mod}~ m.$$ For \(c=0\) and a suitable \(a \in \mathbb{N} \setminus \{0\}\), the sequence maps each integer to a distinct integer in \([1,m-1]\) (zero must be excluded, since \(f(0)=0\) when \(c=0\)). For \(c\neq 0\), we generate integers in \([0,m-1]\).
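As an illustrative sketch (not a production-quality generator), an LCG takes only a few lines of Python; the parameters below are the ones from the earlier example, chosen for readability rather than statistical quality.

```python
def lcg(seed, a, c, m, n):
    """Generate n values of the linear congruential sequence
    x_{i+1} = (a * x_i + c) mod m."""
    x = seed
    out = []
    for _ in range(n):
        x = (a * x + c) % m
        out.append(x)
    return out

# m = 11, c = 0, a = 2: a full-period multiplicative generator on {1,...,10}.
print(lcg(seed=3, a=2, c=0, m=11, n=12))
# [6, 1, 2, 4, 8, 5, 10, 9, 7, 3, 6, 1]  -- the cycle has period 10
```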


Theorem. Hull-Dobell theorem.

The sequence \(x_i\) defined above is cyclic, with every seed \(x_0\) having order \(m\) (i.e. the generator attains its full period \(m\)), if and only if
  • \(c\) and \(m\) are coprime, i.e. \(\gcd(c,m)=1\);
  • \(a-1 \equiv 0~(\mathrm{mod}~ p)\) for every prime factor \(p\) of \(m\);
  • \(a-1 \equiv 0~(\mathrm{mod}~ 4)\) if \(4\) is a factor of \(m\).


Example Use the Hull-Dobell theorem to generate each of the following cyclic sequences:
  1. a cyclic sequence of integers between 1 and 10;
  2. a cyclic sequence of integers between 0 and 98;
  3. a cyclic sequence of integers between 1 and 100.

Solution.
  1. Set \(m=11\), \(c=0\) and \(a=2\). (For \(c=0\) the Hull-Dobell conditions do not apply directly; instead, a full period of \(m-1\) requires \(a\) to be a primitive root modulo the prime \(m\), and \(2\) is a primitive root modulo \(11\).) The generated sequence is then \( 5 , 10 , 9 , 7 , 3 , 6 , 1 , 2 , 4 , 8 , 5 , 10, \dots\), which has period \(10\) and never visits \(0\).
  2. Set \(m=99\), \(c=5\) and \(a=11 \times 3+1=34\). Here \(\gcd(5,99)=1\) and \(a-1=33\) is divisible by both prime factors \(3\) and \(11\) of \(99\).
  3. Set \(m=100\), \(c=3\) and \(a=2 \times 5 \times 2 +1=21\), so that \(a-1=20\) is divisible by \(2\), \(5\) and \(4\). We then add 1 to each number in the sequence to transform the interval from \([0,99]\) to \([1,100]\) as required. (The full periods in parts 2 and 3 are verified in the sketch below.)
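The full periods claimed in parts 2 and 3 can be checked numerically. The sketch below uses `period`, a brute-force cycle counter written for this example (our own helper, not a library function).

```python
from math import gcd

def period(seed, a, c, m):
    """Length of the cycle reached from `seed` under x -> (a*x + c) mod m."""
    seen = {}
    x = seed
    i = 0
    while x not in seen:
        seen[x] = i
        x = (a * x + c) % m
        i += 1
    return i - seen[x]

# Hull-Dobell guarantees full period m in both cases.
print(gcd(5, 99), period(0, 34, 5, 99))    # 1 99
print(gcd(3, 100), period(0, 21, 3, 100))  # 1 100
```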


Standard uniformly distributed numbers in [0,1]

We set $$ u_i= \begin{cases} \dfrac{x_{i}-1}{m-2}& c = 0;\\ \dfrac{x_{i}}{m-1}& c \neq 0, \end{cases} $$ which is a sequence of uniformly distributed numbers over the unit interval \([0,1]\). We say that \(u\) is a random variable that follows a standard uniform distribution (i.e. \(u\sim U[0,1]\)).
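As a brief illustration, assuming the `lcg` helper sketched earlier, the two normalizations look like this in Python:

```python
# c = 0: values lie in [1, m-1], so u_i = (x_i - 1) / (m - 2).
xs = lcg(seed=3, a=2, c=0, m=11, n=10)
us = [(x - 1) / (11 - 2) for x in xs]

# c != 0: values lie in [0, m-1], so u_i = x_i / (m - 1).
ys = lcg(seed=0, a=21, c=3, m=100, n=10)
vs = [y / (100 - 1) for y in ys]
```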

In order to obtain a sequence seeded randomly, one can choose the seed \(x_0\) to be a number known to the computer that describes the current time, or replace it by the last generated number in the sequence. If we provide a new seed every time we generate a number \(x_i\), then \(x_{i+1}\) does not depend directly on \(x_i\). Note that the possible values of \(x_i\) belong to the interval \([0,m-1]\), not to \([1,m-1]\) as in the previous example. $$ x_i= [\mathrm{time}](i) ~\mathrm{mod}~ m,\qquad u_i=\frac{x_i}{m-1}. $$

Quasi-random numbers

In this section, one example is given of generating a low-discrepancy sequence of numbers. A low-discrepancy sequence is a sequence of numbers that spreads evenly over the domain of integration as the size of the sample increases. The numbers in the sequence are called quasi-random numbers, and the Monte Carlo technique that uses them is called the Quasi-Monte Carlo technique. It is a variance reduction technique: each number \(x_i\) depends on the previously generated numbers in the sequence, which reduces the variance of the estimator.

Example \(\alpha\)-sequences

Let \(p\) be a prime number and define $$ \begin{array}{cll} &x_k=k\sqrt{p} &: \forall k \in \mathbb{N},~ 0\leq k\leq m;\\ &y_k=x_k-\lfloor x_k \rfloor &: \forall k \in \mathbb{N},~ 0\leq k\leq m, \end{array} $$
where \(y_k\) is the fractional part of \(x_k\), and \(\lfloor x_k \rfloor\) denotes the floor of \(x_k\), that is, its integer part. Then \(\{y_0,y_1,\dots,y_m\}\) are uniformly distributed numbers over \([0,1)\). If \(p\) and \(q\) are two distinct prime numbers, then the generated pairs \((y_k,z_k)\) corresponding to \((p,q)\) are uniformly distributed over the unit square \(I^2[0,1]\), and \(y_k\) and \(z_k\) are independent (the proof is omitted). In this way, one can generate \(s\) independent uniform distributions over \(I^s[0,1]\) by choosing \(s\) distinct prime numbers. Note that the sequence does not repeat itself (it is not cyclic).
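A minimal Python sketch of such an \(\alpha\)-sequence follows; the function name `alpha_sequence` is ours, chosen for this example.

```python
import math

def alpha_sequence(p, n):
    """First n terms of the alpha-sequence y_k = frac(k * sqrt(p))."""
    root = math.sqrt(p)
    return [(k * root) % 1.0 for k in range(n)]  # % 1.0 keeps the fractional part

# Pairs built from two distinct primes spread evenly over the unit square.
ys = alpha_sequence(2, 1000)
zs = alpha_sequence(3, 1000)
points = list(zip(ys, zs))
```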


Simulating random numbers from specific distributions
We have learnt how to generate numbers that follow a uniform distribution. In this section, we will learn how to use such numbers to generate other statistical distributions.

Box-Muller algorithm
Let \(U\) and \(V\) be two independent random variables that are uniformly distributed over [0,1]. Then the two random variables \(Y\) and \(Z\) defined by $$ \begin{array}{c} \mathrm{Y}=\sqrt{-2\mathrm{log}(U)}~ \mathrm{cos}(2\pi V)\\ \mathbf{Z}=\sqrt{-2\mathrm{log}(U)}~ \mathrm{sin}(2\pi V) \end{array} $$ are independent and each follows a standard normal distribution \(N(0,1)\) with a mean of \(0\) and a standard deviation of \(1\). Knowing that \(\mathbf{Z} \sim N(0,1)\), we can generate any normally distributed random variable \(\mathrm{X}\) with mean \(\mu\) and variance \(\sigma^2\) (i.e. \(\mathrm{X}\sim N(\mu,\sigma^2)\)) by defining $$ \mathrm{X}=\sigma\mathbf{Z}+\mu. $$
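A direct Python sketch of the Box-Muller transform follows; we use \(1-U\) in place of \(U\) only to avoid \(\log(0)\), since `random.random()` can return exactly 0.

```python
import math
import random

def box_muller():
    """One pair of independent N(0,1) draws via the Box-Muller transform."""
    u = 1.0 - random.random()   # force u into (0, 1] so that log(u) is defined
    v = random.random()
    r = math.sqrt(-2.0 * math.log(u))
    return r * math.cos(2.0 * math.pi * v), r * math.sin(2.0 * math.pi * v)

# X ~ N(mu, sigma^2) is obtained from a standard normal Z as X = sigma*Z + mu.
z, _ = box_muller()
x = 2.0 * z + 1.0   # example: mu = 1, sigma = 2
```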
Central Limit Theorem (CLT)
Another method to generate a standard normal distribution is to apply the CLT to a sample of independent and identically distributed (iid) random variables \(\mathrm{Y}_1,\mathrm{Y}_2,\dots,\mathrm{Y}_m\). When \(m\) is large enough, and if each \(\mathrm{Y}_i\) has mean \(\mu\) and standard deviation \(\sigma\), we get, approximately, $$ \mathrm{X}=\sum_{i=1}^{m}\mathrm{Y}_i \sim N\left(m ~\mathrm{E}(\mathrm{Y}_i),m ~\mathrm{Var}(\mathrm{Y}_i)\right).$$ Take the example of standard uniform iid variables \(U_1, U_2, \dots, U_m\), each with mean \(\frac{1}{2}\) and variance \(\frac{1}{12}\). Define the new random variable $$ \mathbf{Z}=\frac{\sum_{i=1}^{m}U_i -\frac{m}{2}}{\sqrt{\frac{m}{12}}}. $$ Then \(\mathbf{Z} \sim N(0,1)\) as \(m \rightarrow \infty\). Note that the CLT method of generating normal variables is time-consuming and not very efficient.
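A sketch of the CLT method with the traditional choice \(m=12\), for which the denominator \(\sqrt{m/12}\) equals 1:

```python
import random

def clt_normal(m=12):
    """Approximate N(0,1) draw by summing m standard uniforms (CLT)."""
    s = sum(random.random() for _ in range(m))
    return (s - m / 2.0) / (m / 12.0) ** 0.5

# With m = 12 the normalization simplifies to s - 6.
sample = [clt_normal() for _ in range(10000)]
```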


The inverse cumulative distribution function method
Theorem
Consider a random variable \(\mathrm{X}\) whose cumulative distribution function (CDF), $$ P_{\mathrm{X}}(x)=Pr(\mathrm{X}\leq x), $$ is a monotonic non-decreasing function of \(x\).
Similarly, define the CDF of a random variable \(\mathrm{Y}\) as \(P_{\mathrm{Y}}(y)=Pr(\mathrm{Y}\leq y)\); both \(P_{\mathrm{Y}}(y)\) and \(P_{\mathrm{X}}(x)\) take probability values in the interval \([0,1]\). If \(P_{\mathrm{Y}}(y)\) is continuous, then for every value \(x \in \mathfrak{R}\) there exists a value \(y \in \mathfrak{R}\) such that \(P_{\mathrm{X}}(x)=P_{\mathrm{Y}}(y)\). Now let \(\mathrm{Y}\sim U[0,1]\), so that \(P_{\mathrm{X}}(x)=P_{\mathrm{Y}}(y)=y\), which gives \(x=P_{\mathrm{X}}^{-1}(y)\). This is exact when \(P_{\mathrm{X}}\) is strictly increasing; when \(P_{\mathrm{X}}\) is merely non-decreasing, we define \(P_{\mathrm{X}}^{-1}\) as $$P_{\mathrm{X}}^{-1}(y)=\inf \{x:P_{\mathrm{X}}(x) \geq y\}.$$

Example
Let \(\mathrm{X} \sim Exp(\lambda)\) (exponential distribution) with density function \(f(x)=\lambda e^{-\lambda x}\) for \(x\in[0,\infty)\). Integrating \(f(s)\) over \([0,x]\) gives the cumulative distribution function of \(\mathrm{X}\), \(P_{\mathrm{X}}(x)=1- e^{-\lambda x}\). Now let \(U\) be a random variable uniformly distributed over \([0,1]\). Then the random variable $$\mathrm{X}=-\frac{1}{\lambda}\log(1-U)=-\frac{1}{\lambda}\log(U)$$ follows an exponential distribution with parameter \(\lambda\) (the second equality holds in distribution, since \(1-U\sim U[0,1]\)).
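In Python, the inverse-CDF recipe for the exponential distribution is essentially a one-liner; a sketch:

```python
import math
import random

def exponential(lam):
    """Draw from Exp(lam) by inverting the CDF P(x) = 1 - exp(-lam * x)."""
    u = random.random()
    return -math.log(1.0 - u) / lam   # 1 - u avoids log(0), since u is in [0, 1)

sample = [exponential(lam=0.5) for _ in range(10000)]
print(sum(sample) / len(sample))      # should be close to 1/lam = 2
```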

Example
Let \(\mathrm{X} \sim \text{Poisson}(\lambda)\) with intensity \(\lambda\) and probability mass function \(f(k)=\frac{\lambda^k e^{-\lambda}}{k!}\) for \(k=0,1,\dots\). We calculate \(P_{\mathrm{X}}(k)=Pr(\mathrm{X}\leq k)\) for \(k=0,1,\dots\), then generate \(\mathrm{Y} \sim U[0,1]\) and take \(\mathrm{X}=k\) if \(P_{\mathrm{X}}(k-1)<\mathrm{Y} \leq P_{\mathrm{X}}(k)\) (with \(P_{\mathrm{X}}(-1):=0\)).
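A sketch of this search in Python; the running term `pmf` uses the recursion \(f(k)=f(k-1)\,\lambda/k\) to avoid recomputing factorials:

```python
import math
import random

def poisson(lam):
    """Draw from Poisson(lam): find the smallest k with Y <= P_X(k)."""
    y = random.random()
    k = 0
    pmf = math.exp(-lam)   # f(0)
    cdf = pmf              # P_X(0)
    while y > cdf:
        k += 1
        pmf *= lam / k     # f(k) = f(k-1) * lam / k
        cdf += pmf
    return k

sample = [poisson(lam=1.0) for _ in range(10000)]
```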

Exercise Given that \(\mathrm{X} \sim U[0,1]\) and that a sample of size 9 has been drawn from the distribution of \(\mathrm{X}\), with values \(\{0.11,0.22,0.33,0.44,0.55,0.66,0.77,0.88,0.99\}\), generate a sample of a Poisson distribution with intensity \(\lambda=1\) from this sample.


Accept-Reject method
There are some distributions that are difficult to generate by the inverse cumulative distribution function method; the Accept-Reject method can be an alternative. This method is based on the Monte Carlo technique. Suppose we want to generate random numbers that belong to the interval \([a,b]\) and follow a statistical distribution with density function \(f(x)\). We take the rectangle bounded by the interval \([a,b]\) on the x-axis and \([0,M]\) on the y-axis, where \(M\) is the maximum of the density function, so that \(M \geq f(x) ~\forall x\in [a,b]\), and plot the probability density function \(f(x)\) of the random variable \(\mathrm{X}\) inside this rectangle. We generate a pair of random numbers from two independent distributions, \(\mathrm{Z}\sim U[0,1]\) and \(\mathrm{Y}\sim g\), where \(g\) is a density function chosen such that we know how to generate a sample from its distribution. We then accept \(\mathrm{X}=\mathrm{Y}\) if \(\mathrm{Z} \leq \frac{1}{c} \frac{f(\mathrm{Y})}{g(\mathrm{Y})}\), with an acceptance rate of \(\frac{1}{c}\), where \(c\) satisfies \(\frac{f(x)}{g(x)} \leq c ~\forall x \in[a,b]\).

Why does the Accept-Reject method work this way? $$ \begin{align*} P \left (\mathrm{Y} \leq x ~\middle|~ \mathrm{Z} \leq \frac{1}{c} \frac{f(\mathrm{Y})}{g(\mathrm{Y})} \right) &= \frac{P \left(\mathrm{Y} \leq x \cap \mathrm{Z} \leq \frac{1}{c} \frac{f(\mathrm{Y})}{g(\mathrm{Y})} \right) }{P \left(\mathrm{Z} \leq \frac{1}{c} \frac{f(\mathrm{Y})}{g(\mathrm{Y})} \right) }\\ &= \frac{\int_{-\infty}^x \int_0^{\frac{f(y)}{c~g(y)}} \mathrm{d}z~ g(y)\,\mathrm{d}y}{\int_{-\infty}^{\infty} \int_0^{\frac{f(y)}{c~g(y)}} \mathrm{d}z~ g(y)\,\mathrm{d}y}\\ &= \frac{\frac{1}{c} \int_{-\infty}^x f(y)\,\mathrm{d}y}{\frac{1}{c}\int_{-\infty}^{\infty} f(y)\,\mathrm{d}y}\\ &= \int_{-\infty}^x f(y)\,\mathrm{d}y~=~P_{\mathrm{X}}(x). \end{align*} $$ The simplest application sets \(\mathrm{Y}\sim U[a,b]\), uniformly distributed with density function \(g(y)=\frac{1}{b-a}\) for all \(y \in [a,b]\). We can then set \(c=\frac{M}{g(y)}=M(b-a)\).
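A Python sketch of this simplest case, a standard normal truncated to \([a,b]\) with a uniform proposal; with \(c=M(b-a)\) the acceptance test \(\mathrm{Z} \leq \frac{f(\mathrm{Y})}{c\,g(\mathrm{Y})}\) reduces to \(\mathrm{Z} \leq f(\mathrm{Y})/M\):

```python
import math
import random

def accept_reject_normal(a=-3.0, b=3.0):
    """N(0,1) truncated to [a, b] via Accept-Reject with a uniform proposal."""
    M = 1.0 / math.sqrt(2.0 * math.pi)   # maximum of the standard normal density
    while True:
        y = random.uniform(a, b)         # proposal Y ~ U[a, b], g(y) = 1/(b - a)
        z = random.random()              # Z ~ U[0, 1]
        f_y = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
        if z <= f_y / M:                 # equivalent to Z <= f(Y)/(c*g(Y))
            return y

sample = [accept_reject_normal() for _ in range(10000)]
```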

Exercise
Given two sequences of independent random numbers, \(u_i\) and \(v_i\), each generated from a uniform distribution over \([0,1]\), write pseudo-code that uses the Accept-Reject method to generate a sequence of normally distributed random numbers with zero mean and a standard deviation of 1, for values between \(-3\) and \(3\), with an acceptance rate of \(\frac{1}{3}\).


Multivariate normal distribution
Consider a random vector of \(s\) random variables \(X=[X_1, X_2, \dots, X_s]^T\) with mean vector \(\mu=[\mu_1, \mu_2, \dots, \mu_s]^T\), so that the random variable \(X_i\) has mean \(\mu_i\). Introduce the symmetric covariance matrix for \(s\) random variables $$ \begin{align*} \Sigma= \left( \begin{array}{ccc} \sigma_1^2& \cdots & \sigma_{1,s} \\ \vdots & \ddots & \vdots\\ \sigma_{s,1}& \cdots & \sigma_s^2 \\ \end{array} \right),~ \end{align*} $$ where \(\sigma_i\) is the standard deviation of \(X_i\), and \(\sigma_{i,j}=\sigma_i \sigma_j \rho_{i,j}\) is the covariance between \(X_i\) and \(X_j\) with correlation coefficient \(\rho_{i,j}\). We then say that the random vector \(X\) follows a normal distribution with mean vector \(\mu\) and covariance matrix \(\Sigma\), and we use the notation \(X \sim N(\mu,\Sigma)\). Assume that \(Z=[Z_1, Z_2, \dots, Z_s]^T\) is a random vector consisting of \(s\) independent normally distributed variables, each with a mean of 0 and a standard deviation of 1, so that \(Z\sim N(0_s,I_s)\), where \(0_s\) is a zero vector of size \(s\) and \(I_s\) is the \(s \times s\) identity matrix. We can then generate \(X\) from \(Z\) by defining \(X\) as $$ X=\mu+A Z, $$ where \(A\) is any \(s \times s\) matrix such that \(A A^T=\Sigma\). The matrix \(A\) is not unique, and there are several ways to find one; the Cholesky factorization is one of the most commonly used methods, and yields \(A\) as a lower triangular matrix.

Example Generate a random vector \(Y=[Y_1,Y_2]\) such that \(Y_1\) and \(Y_2\) are correlated with a correlation coefficient \(\rho\). Assume that \(Y_1\) has a mean of \(\mu_1\) and a standard deviation of \(\sigma_1\), while \(Y_2\) has a mean of \(\mu_2\) and a standard deviation of \(\sigma_2\). We then have$$ \mu= \left[ \begin{array}{cc} \mu_1 \\ \mu_2 \end{array} \right],~~ \Sigma= \left[ \begin{array}{cc} \sigma_1^2 & \sigma_1 \sigma_2 \rho \\ \sigma_1 \sigma_2 \rho & \sigma_2^2 \end{array} \right].~ $$
Let$$ A= \left[ \begin{array}{cc} a_1 & 0 \\ a_3 & a_4 \end{array} \right], ~~ \text{such that} ~~ AA^T= \left[ \begin{array}{cc} a_1^2 & a_1 a_3 \\ a_1 a_3 & a_3^2+a_4^2 \end{array} \right]. $$ From the comparison between \(AA^T\) and \(\Sigma\), we get$$ A= \left[ \begin{array}{cc} \sigma_1 & 0 \\ \rho \sigma_2 & \sqrt{1-\rho^2}~\sigma_2 \end{array} \right]. $$
We can then write $$ Y_1=\mu_1+\sigma_1 Z_1,\\ Y_2=\mu_2+\rho \sigma_2 Z_1+ \sqrt{1-\rho^2}~\sigma_2 Z_2, $$ where \(Z_1 \sim N(0,1)\) and \(Z_2 \sim N(0,1)\) are independent random variables.
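A Python sketch of this bivariate case, using `random.gauss` for the independent standard normals \(Z_1\) and \(Z_2\) (any standard normal generator, such as the Box-Muller sketch above, would do):

```python
import math
import random

def correlated_normal_pair(mu1, sigma1, mu2, sigma2, rho):
    """Draw (Y1, Y2) with the given means, standard deviations and
    correlation, using the Cholesky factor A derived above."""
    z1 = random.gauss(0.0, 1.0)
    z2 = random.gauss(0.0, 1.0)
    y1 = mu1 + sigma1 * z1
    y2 = mu2 + rho * sigma2 * z1 + math.sqrt(1.0 - rho * rho) * sigma2 * z2
    return y1, y2

pairs = [correlated_normal_pair(0.0, 1.0, 0.0, 1.0, rho=0.8)
         for _ in range(10000)]
```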



References
  1. Casella G. (2008), Monte Carlo Statistical Methods.
  2. Glasserman P. (2004), Monte Carlo Methods in Financial Engineering, Springer.
  3. Gentle J.E. (2005), Random Number Generation and Monte Carlo Methods, 2nd edition, Springer.
  4. Hull T.E. and Dobell A.R. (1962), Random Number Generators, SIAM Review, 4(3), 230-254.
  5. Jiang X. and Birge J.R. (2004), Comparisons of alternative quasi-Monte Carlo sequences for American option pricing, working paper.
  6. Niederreiter H. (1992), Random Number Generation and Quasi-Monte Carlo Methods, SIAM.
