I know that for static weights the recommended way to go is the Walker Alias method, because it has O(1) sampling complexity.
However, I have a case where I need to be able to temporarily adjust the probability of specific items dynamically.
Say I have a map of keys with their respective weights:
A = 5, B = 10, C = 15, D = 20
I'm trying to figure out the most efficient way to increase the relative chances of A by 20% and of C by 15%. Right now the chances of A and C are 10% and 30%; they should become 12% and 34.5%.
The most obvious way seems to be:
Apply changes to weights:
a = 5
c = 15
newA = a * 1.2 = 6
newC = c * 1.15 = 17.25
difference = (newA - a) + (newC - c) = (6 - 5) + (17.25 - 15) = 1 + 2.25 = 3.25
Use the difference to proportionally decrease the other weights (b and d) so that the total weight remains the same:
b = 10
d = 20
bAndD = b + d = 30
newB = b - (b / bAndD) * difference = 10 - 1.083333333333333 = 8.916666666666667
newD = d - (d / bAndD) * difference = 20 - 2.166666666666667 = 17.83333333333333
Resulting weights:
a = 6
b = 8.916666666666667
c = 17.25
d = 17.83333333333333
totalWeight remains 50
So the chances now are:
chanceA = 12% (20% increase)
chanceB = ~17.83% (~10.83% decrease)
chanceC = 34.5% (15% increase)
chanceD = ~35.66% (~10.83% decrease)
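In code, the whole adjustment could look roughly like this (the adjustWeights name and the factors map are just my own way of phrasing it):

// Scale the chosen weights and spread the resulting difference proportionally
// over the remaining weights so the total weight stays constant.
// `factors` maps keys to relative multipliers, e.g. { A: 1.2, C: 1.15 }.
function adjustWeights(weights, factors) {
  const adjusted = { ...weights };
  let difference = 0;
  for (const [key, factor] of Object.entries(factors)) {
    adjusted[key] = weights[key] * factor;
    difference += adjusted[key] - weights[key];
  }
  const otherKeys = Object.keys(weights).filter((key) => !(key in factors));
  const otherTotal = otherKeys.reduce((sum, key) => sum + weights[key], 0);
  for (const key of otherKeys) {
    adjusted[key] -= (weights[key] / otherTotal) * difference;
  }
  return adjusted;
}

// adjustWeights({ A: 5, B: 10, C: 15, D: 20 }, { A: 1.2, C: 1.15 })
// -> { A: 6, B: 8.9166..., C: 17.25, D: 17.8333... }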
It's great and all, but ideally I'd like to plug all of this into some efficient structure if possible: a dynamic Walker Alias, a Fenwick tree, or a segment tree. And in that case, if I understand correctly, I probably don't want to update EVERY weight every time I need to temporarily adjust the chances.
Are there any better alternatives where I could modify only the weights of A and C? Or is this the best way? If it is, does that mean all the efficient structures and trees go out the window and I have to use the standard "cumulative weight / weight sum" method to pick an entry after that?
// Standard cumulative-weight selection: O(n) per pick, no precomputation.
function pickIndex(weights) {
  const totalWeight = weights.reduce(
    (sum, weight) => sum + weight,
    0,
  );
  const randomWeight = Math.random() * totalWeight;
  let cumulativeWeight = 0;
  for (let index = 0; index < weights.length; index++) {
    cumulativeWeight += weights[index];
    if (randomWeight < cumulativeWeight) {
      return index;
    }
  }
}
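For example, using the adjusted weights from the worked example above:

// Returns 0, 1, 2 or 3 with probabilities 12%, ~17.83%, 34.5% and ~35.66%.
const index = pickIndex([6, 8.916666666666667, 17.25, 17.83333333333333]);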
UPD: I managed to find a solution that doesn't modify the weights of B and D. This may allow updating a Fenwick tree / dynamic Walker Alias at a lower cost (since we update only the specific elements instead of all of them).
Wa + Wb + Wc + Wd = Wtotal (new total weight)
desired probability:
Wa / Wtotal = 0.12
Wc / Wtotal = 0.345
Wtotal = Wa + Wc + 30
Wa = Wtotal * 0.12
Wc = Wtotal * 0.345
Wtotal * 0.12 + Wtotal * 0.345 + 30 = Wtotal
0.535 * Wtotal = 30
Wtotal = 30 / 0.535
Wtotal = 56.07476635514019
Wa = 6.728971962616822 = 12% chance
Wc = 19.34579439252337 = 34.5% chance
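A rough sketch of this variant (the retargetWeights name and the targets map are mine; it assumes the desired probabilities sum to less than 1):

// Recompute only the targeted weights so they hit the desired probabilities,
// leaving every other weight untouched.
// `targets` maps keys to desired probabilities, e.g. { A: 0.12, C: 0.345 }.
function retargetWeights(weights, targets) {
  const targetSum = Object.values(targets).reduce((sum, p) => sum + p, 0);
  const restSum = Object.entries(weights)
    .filter(([key]) => !(key in targets))
    .reduce((sum, [, weight]) => sum + weight, 0);
  const newTotal = restSum / (1 - targetSum); // Wtotal = 30 / 0.535 in the example
  const adjusted = { ...weights };
  for (const [key, probability] of Object.entries(targets)) {
    adjusted[key] = newTotal * probability;
  }
  return adjusted;
}

// retargetWeights({ A: 5, B: 10, C: 15, D: 20 }, { A: 0.12, C: 0.345 })
// -> { A: 6.7289..., B: 10, C: 19.3457..., D: 20 }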
This may be a case of premature optimization. While O(1) is nice, O(n) and even O(n log n) scale pretty well; O(n^2) and O(n^n) are the trouble-makers. However, you should not have to exceed O(n) complexity here.
The table required for the Walker Alias method takes O(n) time to generate, where n is the number of items. Do you have millions or billions of items? You may not see any performance problems until you reach trillions or quadrillions of items, or unless the platform is very low-powered.
Next, how often do you need to temporarily adjust the probability of specific items? If it's less than multiple times per second, you still may not need to optimize anything.
First, I would try adjusting the weights as you described in your question. I think this operation will take at most O(n) time. Changing only A and C's weights might be possible, but it probably won't result in any significant performance gains (unless you have a lot of items and/or are constantly changing the weights). Even if you change just one item's weight, Walker Alias must still precompute the entire table. (I'm not sure whether there is an optimization that re-uses the previous table...)
Then plug the new weights into your algorithm of choice, like Walker Alias. I'm guessing lookup is the more commonly used operation, which will still only take O(1) constant time.
Even if you have to update the table with every lookup, the run time complexity is still at most O(n).
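For reference, here is a rough sketch of what an alias-table rebuild and lookup look like (a generic Vose-style construction, not tied to any particular library): the rebuild after a weight change is O(n), and each lookup stays O(1).

// Build the alias table in O(n). Each bucket holds a probability and an
// alias index; sampling picks a bucket uniformly and then flips a biased coin.
function buildAliasTable(weights) {
  const n = weights.length;
  const total = weights.reduce((sum, w) => sum + w, 0);
  const prob = new Array(n);
  const alias = new Array(n);
  // Scale the weights so the average bucket holds probability 1.
  const scaled = weights.map((w) => (w * n) / total);
  const small = [];
  const large = [];
  scaled.forEach((p, i) => (p < 1 ? small : large).push(i));
  while (small.length > 0 && large.length > 0) {
    const s = small.pop();
    const l = large.pop();
    prob[s] = scaled[s];
    alias[s] = l;
    scaled[l] = scaled[l] + scaled[s] - 1;
    (scaled[l] < 1 ? small : large).push(l);
  }
  // Whatever remains holds probability 1 (up to floating-point error).
  for (const i of [...small, ...large]) {
    prob[i] = 1;
    alias[i] = i;
  }
  return { prob, alias };
}

// Sample an index in O(1).
function sampleAlias({ prob, alias }) {
  const i = Math.floor(Math.random() * prob.length);
  return Math.random() < prob[i] ? i : alias[i];
}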
You can cache the original weights if the new weights should only be temporary.
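For example (just a sketch):

// Keep an untouched copy so the temporary adjustment can be rolled back later.
const originalWeights = { ...weights };
// ...apply the temporary adjustment, rebuild the table, sample...
const restoredWeights = { ...originalWeights };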