pythonalgorithmrandomdistribution

How to generate uniformly distributed subintervals of an interval?


I have a non-empty integer interval [a; b). I want to generate a random non-empty integer subinterval [c; d) (where a <= c and d <= b). The [c; d) interval must be uniformly distributed in the sense that every point in [a; b) must be equally likely to end up in [c; d).

I tried generating uniformly distributed c from [a; b - 1), and then uniformly distributed d from [c + 1; b), like this:

a = -100
b = 100

N = 10000
cs = np.random.randint(a, b - 1, N)
ds = np.random.randint(cs + 1, b)

But when measuring how often each point ends up being sampled, the the distribution is clearly non-unifrom:

import numpy as np
import matplotlib.pyplot as plt

hist = np.zeros(b - a, int)
for c, d in zip(cs, ds):
    hist[c - a:d - a] += 1

plt.plot(np.arange(a, b), hist)
plt.show()

histogram

How do I do this correctly?


Solution

  • If you divide the range [a,b) into N sections, and then select one of those sections at random, then the chance of selecting each point is uniform -- 1/N. It doesn't matter how you divide the range.

    Here's a solution along those lines that uses a pretty uniform selection of division points.

    from random import randint
    
    a, b, N = -100, 100, 1000000
    
    intervals = []
    while len(intervals) < N:
        # divide the range into 3 intervals [a,x), [x,y), and [y,b)
        # the distributions of the division points don't change the histogram
        # we're careful here to make sure none of them are empty
        p1 = randint(a+1,b-2)
        p2 = randint(a+1,b-2)
        x = min(p1,p2)
        y = max(p1,p2)+1
    
        # select one at random
        intervals.append([(a,x),(x,y),(y,b)][randint(0,2)])