
How to create a DataFrame of random integers with Pandas?


I know that randn, as in the code below, gives me a DataFrame of the shape I want, but its elements come from a normal distribution. What if I want random integers instead?

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(100, 4), columns=list('ABCD'))

randint takes a range, but I don't see how to make it return an array the way randn does. How do I fill the DataFrame with random integers from a given range?


Solution

  • numpy.random.randint accepts a third argument, size, which specifies the shape of the output array. You can use it to create your DataFrame:

    df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
    

    Here, np.random.randint(0,100,size=(100, 4)) creates an array of shape (100, 4) whose elements are random integers drawn from the half-open interval [0, 100), so 100 itself never appears.


    Demo:

    import numpy as np
    import pandas as pd
    df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
    print(df)
    

    which produces output like the following (the values differ on every run):

         A   B   C   D
    0   45  88  44  92
    1   62  34   2  86
    2   85  65  11  31
    3   74  43  42  56
    4   90  38  34  93
    5    0  94  45  10
    6   58  23  23  60
    ..  ..  ..  ..  ..
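
If you need the same table on every run, or want the upper bound included, one option is NumPy's newer Generator interface instead of np.random.randint. This is a minimal sketch, not part of the original answer: it assumes NumPy 1.17 or later, and the seed value 42 is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# A seeded generator yields the same sequence on every run (seed chosen arbitrarily).
rng = np.random.default_rng(seed=42)

# integers() samples from [low, high) by default;
# endpoint=True makes the interval closed, i.e. [0, 100] here.
values = rng.integers(low=0, high=100, size=(100, 4), endpoint=True)

df = pd.DataFrame(values, columns=list("ABCD"))
print(df.head())
```

With endpoint=False (the default) this draws from the same [0, 100) range as the randint call above.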