Specific Problems with Other RNGs
There are many random number generators already out there. On this page, we will look at a some of the most popular and well known. These RNGs have many good qualities, but we will focus mostly on their flaws.
Mersenne Twister
This RNG is one of the most widely used and highly recommended RNGs. It is provided in C++11 as mt19937 and mt19937_64, and is also the default random number generator for Python. (See the Mersenne Twister Wikipedia page for more details about the points listed below.)
Positive Qualities
Produces 32-bit or 64-bit numbers (thus usable as source of random bits)
Passes most statistical tests
Neutral Qualities
Inordinately huge period of 219937 - 1
623-dimensionally equidistributed
Period can be partitioned to emulate multiple streams
Negative Qualities
Fails some statistical tests, with as few as 45,000 numbers.
Predictable — after 624 outputs, we can completely predict its output.
Generator state occupies 2504 bytes of RAM — in contrast, an extremely usable generator with a huger-than-anyone-can-ever-use period can fit in 8 bytes of RAM.
Not particularly fast.
Not particularly space efficient. The generator uses 20000 bits to store its internal state (20032 bits on 64-bit machines), but has a period of only 219937, a factor of 263 (or 295) fewer than an ideal generator of the same size.
Uneven in its output; the generator can get into “bad states” that are slow to recover from.
Seedings that only differ slightly take a long time to diverge from each other; seeding must be done carefully to avoid bad states.
While jump-ahead is possible, algorithms to do so are slow to compute (i.e., require several seconds) and rarely provided by implementations.
Minstd and Unix rand
Minstd and rand (some implementations) are both examples of linear congruential generators with a non-power-of-two modulus. (See the Minstd Wikipedia page for more details.)
Minstd is provided as minstd and minstd0 in C++11, and is also currently the RNG that will be selected by LLVM's libc++ and GCC's libstdc++ if you select default_random_engine.
Positive Qualities
Uses a very small amount of memory
Efficient jump-ahead is possible
Negative Qualities
Short period
Predictable — after producing just one random number, we can completely predict its output.
Not especially good for producing 32-bit numbers (or a stream of random bits)
Only produces each number once
Fails many statistical tests
-
Minstd is fairly slow (whereas speed of
randvaries across implementations)Using a non-power-of-2 modulus require a division operation which can be slow to implement.
Typical implementations do not provide multiple streams, although it would be possible to do so
Typical implementations do not provide jump-ahead, although it would be possible to do so
Arc4Random
This RNG is provided on OS X and BSD systems as arc4random, which is based on the trade-secret RC4 algorithm—see the RC4 Wikipedia page for more details. (Recently, on OpenBSD systems, a generator based on the ChaCha20 stream cipher, discussed below, is provided under the name arc4random.)
Positive Qualities
Cryptographically secure (not predictable, widely used as a stream cipher).
Passes TestU01's empirical statistical tests (but see below)
Produces 32-bit numbers (thus usable a stream of random bits)
Neutral Qualities
Inordinately huge expected period of approximately 21699
Different initializations can be expected to provide different random streams
Negative Qualities
-
Statistically mediocre
Although the standard variant does pass TestU01's
BigCrushbattery, that isn't much of an achievement given 2064 bits of internal state—a simple LCG passes with 88 bits of internal state! If we reduce the number of S-boxes from 256 to 64, requiring 396 bits of internal state, it still passes, but if we reduce the number to 32, which is 170 bits of state, it fails badly.
-
Has been mathematically shown to be nonuniform.
In fact, even though the test is not included in TestU01's suite, tests exist that can distinguish the output of
arc4randomfrom a true random sequence.
Unlike some generators with a very large period, does not provide k-dimensional equidistribution.
-
Period varies based on seed.
In fact, in FreeBSD prior to the 7.1 release (late 2008), OS X prior to OS X Lion, and iOS prior to iOS 5, the implementation of
arc4randomdid not take steps to avoid very short periods—more than 1 time in 75,000, the period of the generator could be tiny, producing only 16320 integers before repeating.
Very slow.
-
The BSD implementations (which are the only widely used ones) have several additional issues
No facility for a user-provided seed, preventing programs from getting reproducible results
Periodically “stirs” the generator using kernel-provided entropy; this code must be removed if reproducible results desired (and is removed from the C++ adaptation I created for testing purposes)
Has been repeatedly “fixed” to address bugs and security issues.
OpenBSD was sufficiently dissatisfied with it that they replaced it with a generator based on the ChaCha20 stream cipher.
ChaCha20
This RNG is provided on OpenBSD systems as a replacement for arc4random (and somewhat confusingly is provided under the name arc4random for backwards compatibility). It is based on Daniel J. Bernstein's ChaCha20 stream cipher, which is a variant of his Salsa20 cipher.
Positive Qualities
Cryptographically secure (not predictable, has been scrutinized by cryptographic community).
Supports different variants (e.g., ChaCha8) which try less hard to be secure, but run faster by reducing the number of rounds.
Passes TestU01's empirical statistical tests (but see below)
Produces 32-bit numbers (thus usable a stream of random bits)
The ChaCha20 stream cipher is claimed be about 3× faster than the Arc4 stream cipher (but see below).
The ChaCha20 stream cipher is seekable, allowing jump-ahead (but this feature isn't provided by the OpenBSD implementation).
-
About 70% faster than
arc4randomwhen used as a general-purpose 32-bit random number generatorAssuming they're are compared on an equal footing (with periodic “stirring” code removed)
In the case of OpenBSD, their
arc4randomimplementation “stirs” (i.e., requests external entropy from the kernel) far more frequently than their ChaCha20 implementation, resulting in a 4.5× speed improvement on that platform.
Neutral Qualities
Designed for uses as a stream cipher, thereby produces blocks of random numbers. Although the block-a-time approach can allow some optimizations, it can waste space when as a typical one-at-a-time general-purpose RNG.
Negative Qualities
Slow, compared to general purpose generators (although it is 70% faster than
arc4random, it is an order of magnitude slower than high performing general-purpose RNGs).OpenBSD's implementation uses 1104 bytes to store the generator state (about 4× the space for
arc4random, because it computes sixteen 64-byte blocks at a time). More positively, the C++ implementation by Orson Peters only requires 104 bytes because it only stores a single 64-byte block.Has a much smaller period than might be inferred from the size of the generator state.
Fewer rounds result in poor statistical performance; ChaCha2 fails statistical tests badly, and ChaCha4 passes TestU01 but sophisticated mathematical analysis has shown it to exhibit some bias. ChaCha8 (and higher) are believed to be good. Nevertheless, ChaCha needs to go to more work to achieve satisfactory statistical quality than many other generators.
ChaCha20, being newer, has received less scrutiny from the cryptographic community than Arc4.
Unlike some generators with a very large period, it does not provide k-dimensional equidistribution.
-
Unlike C++ implementation by Orson Peters, OpenBSD's implementation (which is quite commonly suggested as a “good” implementation) has several additional issues:
No facility for a user-provided seed, preventing programs from getting reproducible results
Periodically “stirs” the generator using kernel-provided entropy; this code must be removed if reproducible results desired (in testing its speed, I deleted this code)
Provided as a single, global RNG.
Multithreaded programs must share the global RNG, and contend for the spin lock that protects it.
Unix drand48, Java's java.util.Random
These generators are examples of linear congruential generators with a power-of-two modulus. (See the Linear Congruential Generator Wikipedia page for more details.)
Positive Qualities
Fast (assuming multiplication is fast)
Uses a small amount of memory
Efficient jump-ahead is possible
Good for producing 32-bit numbers (or a stream of random bits)
Negative Qualities
Fairly short period
Predictable (although generating 48 bits and dropping the low 16 bits makes it very slightly harder to predict than
rand)Fails very many statistical tests
Typical implementations do not provide multiple streams, although it would be possible to do so
Typical implementations do not provide jump-ahead, although it would be possible to do so
Unix random
Unix random is an example of a linear-feedback shift register RNG. (See the Linear Feedback Shift Register Wikipedia page for more details.)
Positive Qualities
Fast
Good for producing 32-bit numbers (or a stream of random bits)
Negative Qualities
Fairly short period
Predictable — after producing 16 random numbers, we can completely predict its output
Fails very many statistical tests
Typical implementations do not provide jump-ahead, although it would be possible to do so (the implementation would be fairly complex, however)
Technically not uniform — zero will be produced once less often than every other output
XorShift (32-bit and 64-bit)
These examples of XorShift generators, kind of generalized linear feedback shift register. (See the XorShift Wikipedia page for more details.)
Positive Qualities
Fast (even if multiplication isn't fast)
Uses a small amount of memory
Good for producing 32-bit numbers (or a stream of random bits)
Negative Qualities
Fairly short period (32-bit version only)
Predictable. After producing just one random number, we can completely predict its output.
Fails many statistical tests
Technically not uniform — zero will be produced once less often than every other output
32-bit XorShift should usually not be used to produce 32-bit numbers, because it only produces each number once, and never produces zero.
Does not provide multiple streams
Typical implementations do not provide jump-ahead, although it would be possible to do so (the implementation would be fairly complex, however)
RanQ and XorShift* 64/32
These generators use a generalization of a linear feedback shift register design, coupled with a multiplicative step for the output function. RanQ and XorShift* 64/32 are the same generator with 64 bits of state, but XorShift* 64/32 returns only 32 bits of output, whereas RanQ outputs all 64 bits. (See the XorShift Wikipedia page for more details.)
Positive Qualities
Fast (if multiplication is fast)
Uses a small amount of memory
Good for producing 32-bit numbers (or a stream of random bits)
Easily passes empirical statistical tests (XorShift* 64/32 only)
Neutral Qualities
Period of 264 is enough for many uses, but in some contexts, it would be considered consider “too small”
Negative Qualities
Technically not uniform — zero will be produced once less often than every other output
Predictable — after producing just one random number, we can completely predict its output (RanQ only).
RanQ claims to be a 64-bit generator but its low-order 32 bits fail statistical tests.
Does not provide multiple streams
Typical implementations do not provide jump-ahead, although it would be possible to do so (the implementation would be fairly complex, however)
PCG Family
This page is supposed to be about other RNGs besides the PCG family, but we may as well show how the PCG family compares.
Positive Qualities
Fast (if multiplication is fast)
Uses a small amount of memory (although the extended generators allow it to use an arbitrary amount of additional memory to extend the period)
Good for producing b-bit numbers for any b (or a stream of random bits)
Easily passes empirical statistical tests (and offers better statistical performance than any of the above generators)
pcg32offers a 264 period and 263 distinct random streams, but arbitrarily large periods are possible (e.g., the library providespcg32_k16384which has a period of 2524352, which is vastly larger than the computational capacity of the universe)Uniform
Extended generation scheme provides k-dimensional equidistribution for arbitrary k
Challenging to predict
Offers jump ahead and distance-between-states
Neutral Qualities
Can perform party tricks, like creating a generator which will produce exactly 3,141,592,653,589,793,238,463 random numbers, and then suddenly output a Zip file containing Hamlet.
Negative Qualities
New
Although it is less trivial to predict than many mainstream generators, that does not mean it should be considered crypographically secure. Exactly how hard it is to break different members of the PCG family is unknown at this time.