python random probability probability-distribution numpy-random

Generating a probability distribution P(y) from another probability distribution P(x) such that highest probability in P(x) is least likely in P(y)


So the problem at hand is that I have some values in a dictionary with counters, let's say

counts = {"cats": 0, "dogs": 0, "lions": 0}

I want to randomly select the keys from this dictionary and increment the counters as I select the particular keys.

But as I select keys and increment their counters, I want keys with lower counter values to be more likely to be selected than keys with higher counter values.

I have implemented this idea in my answer below. Kindly let me know whether it makes sense and whether there are better ways of doing this.


Solution

  • There are many ways of solving this, but as an alternative I'd be tempted to calculate the probabilities as:

    import numpy as np

    def iweight(k, *, alpha=1):
        # weight each item inversely to its count; alpha > 0 avoids division by zero
        p = 1/(alpha + np.array(k))
        return p / np.sum(p)
    

    which could be used as:

    counts = [0, 0, 0, 20]
    for _ in range(20):
        i = np.random.choice(len(counts), p=iweight(counts))
        print(i)
        counts[i] += 1
    

    The alpha parameter is used in a complementary way to a Dirichlet process: small values make it strongly prefer items with small counts, while large values make the selection closer to uniform.

    What's best will depend on the process you're modelling, e.g. how much should small counts be preferred to medium counts, should the largest counts ever be chosen, etc. It all depends on the distribution you're after and the statistics literature should have many examples of how to start thinking about this.
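
To make the effect of alpha concrete, and to tie this back to the dictionary of counters in the question, here is a minimal sketch. The helper name pick_and_count is hypothetical (it is not from the question or from NumPy), and it uses np.random.default_rng in place of the np.random.choice call above, which is a substitution on my part rather than something the approach requires. It assumes alpha > 0 so that a zero count never causes a division by zero.

```python
import numpy as np

def iweight(k, *, alpha=1):
    # Inverse-count weights, normalised so they sum to 1 (assumes alpha > 0).
    p = 1 / (alpha + np.asarray(k, dtype=float))
    return p / p.sum()

def pick_and_count(counts, n_draws, *, alpha=1, rng=None):
    """Hypothetical helper: draw n_draws keys, incrementing each chosen key's counter."""
    rng = np.random.default_rng() if rng is None else rng
    keys = list(counts)
    for _ in range(n_draws):
        # Recompute the probabilities on every draw, since the counts change.
        probs = iweight([counts[key] for key in keys], alpha=alpha)
        chosen = keys[rng.choice(len(keys), p=probs)]
        counts[chosen] += 1
    return counts

# Same counts, different alpha: small alpha concentrates the probability
# on the smallest count, large alpha pushes the weights towards uniform.
for alpha in (0.1, 1, 100):
    print(alpha, iweight([0, 5, 20], alpha=alpha).round(3))

# Applying it to the dictionary from the question (seeded for repeatability).
counts = pick_and_count({"cats": 0, "dogs": 0, "lions": 0}, 30,
                        rng=np.random.default_rng(0))
print(counts)
```

Because the weights are recomputed after every increment, a key that has just been picked becomes slightly less likely on the next draw, so the counters tend to stay close to one another over time. How tightly they stay together is controlled by alpha.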