python list random

Create an N-length list by uniformly (in frequency) selecting items from a separate list in Python


SETUP

I have a list, days, and a value, N:

days = ['Monday','Tuesday','Wednesday','Thursday','Friday']
N = 52

WHAT I AM TRYING TO DO

I am trying to create a list, selections, of length N in which each value from days appears equally often (a remainder is fine when N is not divisible by the number of days). I would then like the order of this list to be shuffled.

EXAMPLE OUTPUT

NOTE HOW THE ORDER IS SHUFFLED, BUT THE DISTRIBUTION OF VALUES IS UNIFORM

selections

['Wednesday','Friday','Monday',...'Tuesday','Thursday','Monday']

import collections
counter = collections.Counter(selections)
counter
Counter({'Monday': 11, 'Tuesday': 10, 'Wednesday': 11, 'Thursday': 10, 'Friday': 10})

WHAT I HAVE TRIED

I have code to randomly select N values from days

from random import choice, seed

seed(1)

days = ['Monday','Tuesday','Wednesday','Thursday','Friday']
N = 52

selections = [choice(days) for x in range(N)]

But the resulting counts aren't equal:

import collections
counter = collections.Counter(selections)
counter

Counter({'Tuesday': 9,
         'Friday': 8,
         'Monday': 14,
         'Wednesday': 7,
         'Thursday': 14})

How can I adjust this code, or what different method would create a list of length N with a uniform distribution of values from days, in a random order?


EDIT: I obviously seem to have phrased this question poorly. I am looking for a list of length N with a uniform distribution of values from days, but in a shuffled order (which is what I meant by random). So I suppose what I am looking for is how to take an equal number of each value from days, N values in total, and then just shuffle that list. Again, I want an equal amount of each value from days making up a list of length N. I need a uniform distribution for a list of exactly length 52, just as the example output shows.


Solution

  • The code you have is correct. You are seeing expected noise around the mean.

    Note that for higher N, the relative noise decreases, as expected. For example, this is what you get for N = 10000000:

    Counter({'Tuesday': 2000695, 'Thursday': 2000615, 'Wednesday': 2000096, 'Monday': 1999526, 'Friday': 1999068})
    

    If you need equal or approximately equal (deterministic, rather than random) numbers of each element in random order, try a combination of itertools.cycle, itertools.islice and random.shuffle like so:

    
    import random
    import collections
    import itertools
    
    random.seed(1)
    
    days = ['Monday','Tuesday','Wednesday','Thursday','Friday']
    N = 52
    
    # If `N` is not divisible by `len(days)`, some days must appear one more
    # time than the others. Shuffling `days` first (in place) makes the choice
    # of those `N % len(days)` days random:
    random.shuffle(days)
    
    selections = list(itertools.islice(itertools.cycle(days), N))
    random.shuffle(selections)
    print(selections)
    
    counter = collections.Counter(selections)
    print(counter)
    

    Output:

    ['Friday', 'Friday', 'Wednesday', ...,  'Thursday']
    Counter({'Tuesday': 11, 'Monday': 11, 'Friday': 10, 'Wednesday': 10, 'Thursday': 10})
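
The same idea can also be written without itertools. The following is only a sketch of an alternative, not part of the original answer: it computes the per-item count with divmod, picks the items that receive one extra occurrence with random.sample, and then shuffles. The function name balanced_shuffled_sample and its rng parameter are hypothetical names chosen for illustration, and the sketch assumes items is non-empty.

```python
import collections
import random


def balanced_shuffled_sample(items, n, rng=random):
    """Return n values from items, counts differing by at most one, shuffled.

    Hypothetical helper for illustration; assumes items is non-empty.
    """
    base, extra = divmod(n, len(items))
    # Every item appears `base` times.
    result = [item for item in items for _ in range(base)]
    # `extra` distinct items, chosen at random, appear one more time.
    result.extend(rng.sample(list(items), extra))
    # Shuffle the order of the combined list in place.
    rng.shuffle(result)
    return result


days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
selections = balanced_shuffled_sample(days, 52)
counter = collections.Counter(selections)
print(len(selections))           # 52
print(sorted(counter.values()))  # [10, 10, 10, 11, 11]
```

Unlike the version above, this sketch leaves days in its original order, since it never shuffles the input list itself. Which two days end up with 11 occurrences varies from run to run unless a seed is set.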