There is no way to produce true random numbers on a computer. Once a program is written, the numbers it will produce can be deduced (see example), so how could they be random? The best we can hope to do is to write programs which produce sequence of numbers having many of the same properties as random numbers. Such numbers are commonly called pseudo-random numbers: they are not really random, but they can be useful as an approximations to random numbers.
It's easy to see that approximate the property "each number is equally likely to occur" in a long sequence is not enough. For example, each number from the range [1, 100] appears exactly once in the sequence {1, 2, 3, ..., 99, 100}, but that sequence is unlikely to be useful as an approximation to a random sequence.In fact, in a random sequence of length 100 of numbers in the range [1, 100], it is likely that a few numbers will appear more than once and a few will not appear at all. If this doesn't happen in a sequence of pseudo-random numbers, then there is something wrong with the random number generator.
a[0] = seed;
for(i=1;i<N;i++)
a[i] = (a[i-1]*b + c) % M;
All the generated numbers are integers between 0 and M-1.
The algorithm looks very simple, but this simplicity is misleading because the devil lies in the details of choosing the algorithm's parameters. Here is a partial list of recommendations based on the results of sophisticated mathematical analysis (see Knuth, volume 2 for details):
class Random { public: Random() { a = seed = 1234567; }; int next(); private: int seed; int a; enum { M=100000000, b=31415821, c=1 }; }; int Random::next() { a = (a*b + c) % M; return a; }This simple implementation has several problems. The first of them is possible overflow. Although all our parameters are less than the largest integer that can be represented, but the first operation a*b+c can cause an overflow. However, since the part of the product that causes overflow is not relevant to our computation (remember, we are interested in only the last 8 digits), we can use the following trick that helps us to avoid overflow.
Assume that we need to compute the product of two integer numbers p and q modulo M. We will write p as
p = 104*p1 + p0and q as
q = 104*q1 + q0Then their product can be written as
p*q = (104*p1+p0) * (104*q1+q0)
= 108*p1*q1 + 104*(p1*q0+p0*q1) + p0*q0
Now, since we only want the last 8 digits of the result, we can ignore the first term. Thus,
(p*q) mod 108 = (104*(p1*q0+p0*q1)+p0*q0) mod 108To prevent the result of computation 104*(p1*q0+p0*q1) from overflow, we can compute this
((p1*q0+p0*q1) mod 104 )104instead.
class Random { public: Random() { seed = 1234567; }; int next(); private: int seed; int a; union { M1=10000, M=M1*M1, b=31415821, c=1 }; int mult(int p, int q); }; int Random::mult(int p, int q) { int p1 = p/M1, p0 = p%M1, q1 = q/M1, q0 = q%M1; return (((p0*q1+p1*q0)%M1)*M1 + p0*q0) % M; } int Random::next() { a = (mult(a, b) + c) % M; return a; }
Now, if we run the following code to generate 20 random numbers,
Random r; for(i=0;i<20;i++) cout<<r.next()<<endl;we will receive
35884508, 80001069, 63512650, 43635651, 1034472, 87181513, 6917174, 209855, 67115956, 59939877, 46594018, 29158779, 81642560, 50941761, 45000782, 12172023, 95775884, 27860765, 6163066, 78267187There is some obvious non-randomness in these numbers: for example, the last digits cycle through the digits 0 - 9. In general, the digits on the right are not random, a fact that is the source of a common and serious mistake in the use of random number generator of this type. Typically, in applications one wants to random numbers in the range [0, r), where r is application dependent and cannot conveniently be built in to the random number generator.
The following is a naive attempt that results in a bad program for generating random numbers is a small range:
int Random::next(int r) { return next() % r; }For example, the following code, which is trying to retrieve 20 random numbers in the range [0, 9]
Random r; for(i=0;i<20;i++) cout<<r.next(10)<<endl;will result in
8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7This problem is easily fixed by using the digits on the left instead. To retrieve the desired digits on the left we need to compute (a*r)/M, but, again, overflow must be circumvented,a s in the following implementation:
int Random::next(int r) { return ((next()/M1) * r) / M1; }If we use this implementation of the next(10) method, then the same code will produce
3, 8, 6, 4, 0, 8, 0, 0, 6, 5, 4, 2, 8, 5, 4, 1, 9, 2, 0, 7
The final version of the implementation of our random generator can be found on the example page.
The idea of the χ2 test is to check whether or not the numbers produced are spread out reasonably. If we generate N positive numbers less than r, then we'd expect to get about N/r numbers of each value. But - and this is the essence of the matter - the frequencies of occurrence of all the values should not be exactly the same. It turns out that calculating whether or not a sequence of numbers is distributed as well as a random sequence is not difficult. The "χ2 statistic" may be expressed mathematically as
| χ2 = | Σi=0 r-1 (fi-N/r)2 |
| N/r |