rsimulationmultinomial

Sample multinomial distribution from the log-probabilities


I have three numbers:

lp_1 <- -316480
lp_2 <- -85041
lp_3 <- -1197042

The goal is to sample a multinomial distribution with probabilities proportional to the exp(lp_i). With these numbers, one gets 0 if one takes the exponential, and even if we set one number to 0, we will not manage to get some probabilities proportional to exp(lp_i) because we will get either 0 or Inf when taking the exponential of the two other numbers.

Do you see any way to achieve this goal nevertheless? I think it's impossible, but I'd like to be sure.


Solution

  • Normally we could do this without loss of generality by subtracting the maximum value (equivalent to dividing through by the largest probability) but this isn't going to help enough here:

    logprobs  <- c(-316480, -85041, -1197042)
    lp2 <- logprobs - max(logprobs)
    

    Let's see how big these probabilities would be relative to the max:

    lp2/log(10)
    [1] -100512.7       0.0 -482935.9
    

    So the second-most-probable outcome is many orders of magnitude smaller than the most probable outcome (larger than a googol, although less than a googolplex). There is no way, before the sun burns out, that you're going to sample anything but the most probable outcome (i.e., category 2).


    Double-checking with a simple example to check my math:

    probs <- log(c(0.0001, 0.01, 0.000001))
    > probs
    [1]  -9.21034  -4.60517 -13.81551
    > (probs - max(probs))/log(10)
    [1] -2  0 -4
    

    This correctly concludes that the first and third outcomes are two and four orders of magnitude less probable than the middle.