pythonrandomlcg

Linear congruential generator - how to choose seeds and statistical tests


I need to do a linear congruential generator that will successfully pass the selected statistical tests.

My question is: how to choose numbers for the generator properly and which statistical tests should I choose?

I thought about:

  1. Chi-Square Frequency Test for Uniformity

    • Collect 10,000 numbers per generation method

    • Sub-divide[0.1) into 10 equal subdivisions

  2. Kolmogorov-Smirnov Test for uniformity

    • Since K-S Test works better with a smaller set of numbers, you may use the first 100 out fo the 10,000 that you generated for the Chi-Square Frequency Test

Here is the code example:

def seedLCG(initVal):
    global rand
    rand = initVal

def lcg():
    a = 1664525
    c = 1013904223
    m = 2**32
    global rand
    rand = (a*rand + c) % m
    return rand

seedLCG(1)

for i in range(1000):
    print (lcg())

when it comes to choosing seeds, I was thinking about nanoseconds, but I have no idea how to implement it and will it make sense at all? The idea is to show that the selected seeds were chosen randomly and not so much from the cap


Solution

  • Wrt how to choose numbers for the generator properly, in Wiki page there is a description of Hull–Dobell Theorem which tells you how to pick a and c to have full period generator. You got your numbers from Numerical Recipes, and as far as I could tell you'll get full period [0...232) generator. Or you could look at Figure of Merit from this paper, there are (a,c) pairs for any size of period desirable.

    Concerning tests, take a look at paper @pjs provided.

    when it comes to choosing seeds, I was thinking about nanoseconds, but I have no idea how to implement it and will it make sense at all? The idea is to show that the selected seeds were chosen randomly and not so much from the cap. This is, I think, is not a good idea because you cannot guarantee seeds you picked from time/ceil/... won't overlap. LCG is basically bijective [0...232)<->[0...232) mapping, and it is relatively easy to overlap streams of random numbers so your result(s) are correlated.

    Instead I would propose to use another nice property of the LCG - logarithmic skip ahead (and back). So for simulating on N cores you could just pick single seed and run on 1st code, same seed but skip(N/232) for second core, seed and skip(N/232 * 2) and so on and so forth.

    Code for LCG with explicit state and skip is below, Win10 x64, Python 3.7 Anaconda

    import numpy as np
    
    class LCG(object):
    
        UZERO: np.uint32 = np.uint32(0)
        UONE : np.uint32 = np.uint32(1)
    
        def __init__(self, seed: np.uint32, a: np.uint32, c: np.uint32) -> None:
            self._seed: np.uint32 = np.uint32(seed)
            self._a   : np.uint32 = np.uint32(a)
            self._c   : np.uint32 = np.uint32(c)
    
        def next(self) -> np.uint32:
            self._seed = self._a * self._seed + self._c
            return self._seed
    
        def seed(self) -> np.uint32:
            return self._seed
    
        def set_seed(self, seed: np.uint32) -> np.uint32:
            self._seed = seed
    
        def skip(self, ns: np.int32) -> None:
            """
            Signed argument - skip forward as well as backward
    
            The algorithm here to determine the parameters used to skip ahead is
            described in the paper F. Brown, "Random Number Generation with Arbitrary Stride,"
            Trans. Am. Nucl. Soc. (Nov. 1994). This algorithm is able to skip ahead in
            O(log2(N)) operations instead of O(N). It computes parameters
            A and C which can then be used to find x_N = A*x_0 + C mod 2^M.
            """
    
            nskip: np.uint32 = np.uint32(ns)
    
            a: np.uint32 = self._a
            c: np.uint32 = self._c
    
            a_next: np.uint32 = LCG.UONE
            c_next: np.uint32 = LCG.UZERO
    
            while nskip > LCG.UZERO:
                if (nskip & LCG.UONE) != LCG.UZERO:
                    a_next = a_next * a
                    c_next = c_next * a + c
    
                c = (a + LCG.UONE) * c
                a = a * a
    
                nskip = nskip >> LCG.UONE
    
            self._seed = a_next * self._seed + c_next
    
    
    #%%
    np.seterr(over='ignore')
    
    a = np.uint32(1664525)
    c = np.uint32(1013904223)
    seed = np.uint32(1)
    
    rng = LCG(seed, a, c)
    
    print(rng.next())
    print(rng.next())
    print(rng.next())
    
    rng.skip(-3) # back by 3
    print(rng.next())
    print(rng.next())
    print(rng.next())
    
    rng.skip(-3) # back by 3
    rng.skip(2) # forward by 2
    print(rng.next())
    

    UPDATE

    Generating 10k numbers

    np.seterr(over='ignore')
    
    a = np.uint32(1664525)
    c = np.uint32(1013904223)
    seed = np.uint32(1)
    
    rng = LCG(seed, a, c)
    q = [rng.next() for _ in range(0, 10000)]