
Non-repetitive random numbers in NumPy


How can I generate non-repetitive random numbers in numpy?

list = np.random.random_integers(20,size=(10))

Solution

  • numpy.random.Generator.choice offers a replace argument to sample without replacement:

    from numpy.random import default_rng
    
    rng = default_rng()
    numbers = rng.choice(20, size=10, replace=False)
    

    If you're on a pre-1.17 NumPy, without the Generator API, you can use random.sample() from the standard library:

    import random
    print(random.sample(range(20), 10))
    

    You can also use numpy.random.shuffle() and slicing, but this will be less efficient:

    import numpy

    a = numpy.arange(20)
    numpy.random.shuffle(a)  # shuffles the array in place
    print(a[:10])
    

    There's also a replace argument in the legacy numpy.random.choice function, but this argument was implemented inefficiently and then left inefficient due to random number stream stability guarantees, so its use isn't recommended. (It basically does the shuffle-and-slice thing internally.)

    Some timings:

    import timeit
    print("when output size/k is large, np.random.default_rng().choice() is far far quicker, even when including time taken to create np.random.default_rng()")
    print(1, timeit.timeit("rng.choice(a=10**5, size=10**4, replace=False, shuffle=False)", setup="import numpy as np; rng=np.random.default_rng()", number=10**3)) #0.16003450006246567
    print(2, timeit.timeit("np.random.default_rng().choice(a=10**5, size=10**4, replace=False, shuffle=False)", setup="import numpy as np", number=10**3)) #0.19915290002245456
    
    print(3, timeit.timeit("random.sample( population=range(10**5), k=10**4)", setup="import random", number=10**3))   #5.115292700007558
    
    print("when output size/k is very small, random.sample() is quicker")
    print(4, timeit.timeit("rng.choice(a=10**5, size=10**1, replace=False, shuffle=False)", setup="import numpy as np; rng=np.random.default_rng()", number=10**3))  #0.01609779999125749
    print(5, timeit.timeit("random.sample( population=range(10**5), k=10**1)", setup="import random", number=10**3))  #0.008387799956835806
    

    So numpy.random.Generator.choice is usually what you want, except when the output size/k is very small, where random.sample() is quicker.
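
To make that rule of thumb concrete, here is a minimal sketch of a helper that picks between the two approaches based on the output size. The function name `sample_unique` and the cutoff `SMALL_K_THRESHOLD` are hypothetical, and the value 100 is only a placeholder, not a measured crossover point; you would want to time both branches on your own machine before settling on a number. It assumes NumPy 1.17 or later for the Generator API.

```python
import random

import numpy as np

# Hypothetical cutoff (an assumption for illustration, not a benchmark result):
# below this many draws we assume random.sample() has less overhead.
SMALL_K_THRESHOLD = 100


def sample_unique(n, k, rng=None):
    """Return k distinct integers drawn from range(n) as a NumPy array."""
    if k > n:
        raise ValueError("cannot draw more unique values than the population holds")
    if k < SMALL_K_THRESHOLD:
        # Small output: use the standard library.
        # Note this branch draws from the global `random` state, not from `rng`.
        return np.array(random.sample(range(n), k), dtype=np.int64)
    if rng is None:
        rng = np.random.default_rng()
    # Larger output: Generator.choice samples without replacement.
    # shuffle=False skips the final shuffle of the result, as in the timings above.
    return rng.choice(n, size=k, replace=False, shuffle=False)


small = sample_unique(10**5, 10)      # takes the random.sample() branch
large = sample_unique(10**5, 10**4)   # takes the Generator.choice branch
```

Both branches return values without repeats; the only difference is which generator does the work. If you need reproducible results, keep in mind that seeding `rng` only affects the second branch in this sketch.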