Random Number Code Size

In an ideal world, a good random number generator ought not to require a lot of code code to implement or execute.

A Simple Example

Let's consider this example of code that prints out eight random numbers.

#include "pcg_random.hpp"
#include <cstdio>

int main()
{
    pcg32 rng;

    for (int i = 0; i < 8; ++i)
        printf("%08x\n", rng());
}

And let's see how much code we'd produce with PCG and if we swapped pcg32 for a few other generators. (The code was compiled with GCC 4.8 using -O2 -fno-asynchronous-unwind-tables -fno-exceptions -fno-align-loops.)

Generator Init Instrs Init Bytes Loop Instrs Loop Bytes
PCG 64/32 (XSH-RR) 4 35 15 52
Minstd (LCG) 3 16 17 61
XorShift* 64/32 3 25 17 60
Ran 6 55 33 115
MT 19937 51 197 49 174

We can see here that the PCG family is comparable in code size to other generators known for their compact code, and actually has fewer instructions inside the loop. The Mersenne Twister and Ran (from Numerical Recipes) are much worse for code size—their implementations are much more complex, and it shows in the generated code (the same also applies arc4random).

The Actual Code

FWIW, here's the code that was generated by GCC for the above code:

_main:
## Function prologue:
        pushq   %r13
        pushq   %r12
        pushq   %rbp
        pushq   %rbx
        subq    $8, %rsp
## Initialize RNG and loop constants
        movabsq $5573589319906701683, %rcx
        movl    $8, %ebp
        movabsq $6364136223846793005, %r13
        movabsq $1442695040888963407, %r12
        .align 4
## Loop
L2:
        movq    %rcx, %rbx
        movq    %rcx, %rsi
        xorl    %eax, %eax
        imulq   %r13, %rbx
        shrq    $18, %rsi
        leaq    LC0(%rip), %rdi   ## string "%08x\n"
        xorq    %rcx, %rsi
        shrq    $59, %rcx
        shrq    $27, %rsi
        rorl   %cl, %esi
        call    _printf
        addq    %r12, %rbx
        movq    %rbx, %rcx
        subl    $1, %ebp
        jne     L2
## Function Epilogue
        addq    $8, %rsp
        xorl    %eax, %eax
        popq    %rbx
        popq    %rbp
        popq    %r12
        popq    %r13
        ret

Interestingly, Clang produces quite different code:

_main:
## Function prologue:
        pushq   %rbp
        movq    %rsp, %rbp
        pushq   %rbx
        pushq   %rax
## Load string constant:
        leaq    L_.str(%rip), %rbx   ## string "%08x\n"
## Directly print out eight numbers
        movl    $676697322, %esi
        xorl    %eax, %eax
        movq    %rbx, %rdi
        callq   _printf
        movl    $420258633, %esi
        xorl    %eax, %eax
        movq    %rbx, %rdi
        callq   _printf
        movl    $-876335118, %esi
        xorl    %eax, %eax
        movq    %rbx, %rdi
        callq   _printf
        movl    $-699367085, %esi
        xorl    %eax, %eax
        movq    %rbx, %rdi
        callq   _printf
        movl    $-1029176017, %esi
        xorl    %eax, %eax
        movq    %rbx, %rdi
        callq   _printf
        movl    $257272927, %esi
        xorl    %eax, %eax
        movq    %rbx, %rdi
        callq   _printf
        movl    $-687915470, %esi
        xorl    %eax, %eax
        movq    %rbx, %rdi
        callq   _printf
        movl    $1330014364, %esi
        xorl    %eax, %eax
        movq    %rbx, %rdi
        callq   _printf
        xorl    %eax, %eax
## Function Epilogue
        addq    $8, %rsp
        popq    %rbx
        popq    %rbp
        retq

Because the generator is initialized with a fixed seed, and because the steps PCG uses to generate random numbers are simple, Clang's optimizer runs the generator at compile time and produces the output constants directly.