pythonpython-3.xsamplingbinary-string

Uniformly randomly sample binary strings of length N when N is large


I wish to generate a random binary string of length N such that each of the possible 2^N strings is uniformly randomly chosen. Note that choosing 1 or 0 with equal probability to build a string doesn't work because with high probability the string contains equal numbers of 0s and 1s, and so such strings are generated with higher probability. Another way is to generate the list of all 2^N strings and then choose one of them. However, this quickly becomes impractical when N is even 30. I need to work with N = 500. How can I achieve this? If python has an in-built function for such thing, it would be even better.

Edit Clearly I posed a wrong question; apologies. What I wanted was a uniform distribution over the number of 1s in the string. So the string with just two 1s should be as probable as the string with all 1s. I could achieve this though.


Solution

  • You've misunderstood how the probabilities work. Picking each bit uniformly at random produces exactly the desired distribution.

    It's true that this will tend to generate strings with roughly equal numbers of 0s and 1s, but that's exactly what it should do, because most possible bitstrings have close to equal numbers of 0s and 1s. Each individual possible bitstring still has a 1/2^N probability of being selected.

    (That doesn't mean you should implement this by manually picking bits one at a time with random.choice, though. That'd be slow. Something like '{:0{}b}'.format(random.getrandbits(N), N) would be faster.)