python, numpy, probability, normalization

Normalize list of floats to probabilities


I have a list of probability weights, like weights = [3, 7, 4, 2], and I want to normalize it so that the entries sum to 1.

This can later be used for something like "a weighted version of random.choice", via numpy.random.choice.

Currently I am doing something like:

norm_one = sum(weights)
probabilities = [x / norm_one for x in weights]
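
For context, the intended downstream use would look roughly like this (the population and the number of draws are just placeholder values):

import numpy as np

weights = [3, 7, 4, 2]
probabilities = [x / sum(weights) for x in weights]

# placeholder population; draw 10 samples weighted by the probabilities
population = ['a', 'b', 'c', 'd']
sample = np.random.choice(population, size=10, p=probabilities)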

I am wondering whether there is any problem with this approach, since floating-point numbers are represented with a finite number of bits and the sum of the resulting list might not be exactly 1, and whether there is a built-in function for normalizing either a list or a numpy.array that I should use instead (or any better approach).


Solution

  • Technically you are correct: because floating-point numbers have finite precision, the normalized values may sum to 1 ± a very small error. In practice, libraries such as NumPy and Pandas tolerate this; numpy.random.choice, for example, only checks that the probabilities sum to 1 within a small tolerance, so it will just work (see the sketch after the code below).

    As others have pointed out in the comments, use NumPy for the normalization:

    1. It will be much faster when dealing with large arrays
    2. It will be less verbose

    Code:

    import numpy as np
    
    weights = np.array([5, 18, 7])
    probabilities = weights / weights.sum()
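
    To illustrate the rounding point mentioned above, here is a minimal sketch (the weights below are arbitrary example values): the normalized array may not sum to exactly 1.0, but the error is tiny and numpy.random.choice only checks that its p argument sums to 1 within a small tolerance, so drawing samples still works.

    import numpy as np
    
    weights = np.array([0.1, 0.2, 0.3, 0.4, 0.1])
    probabilities = weights / weights.sum()
    
    # the float sum may be off from 1.0 by a few ULPs, but np.isclose
    # confirms it is as close as floating point allows
    print(probabilities.sum())
    print(np.isclose(probabilities.sum(), 1.0))   # True
    
    # numpy.random.choice accepts these probabilities because it only checks
    # the sum against 1 within a small tolerance
    sample = np.random.choice(len(probabilities), size=5, p=probabilities)
    print(sample)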