artificial-intelligenceselectiongenetic-algorithmroulette-wheel-selection

Genetic Algorithm roulette wheel selection


I am having issues understanding the algorithm. Here is the most popular one seen online

for all members of population
  sum += fitness of this individual
end for

for all members of population
  probability = sum of probabilities + (fitness / sum)
  sum of probabilities += probability
end for

loop until new population is full
  do this twice
    number = Random between 0 and 1
       for all members of population
          if number > probability but less than next probability 
             then you have been selected
       end for
      end
  create offspring
end loop


for all members of population
  probability = sum of probabilities + (fitness / sum)
  sum of probabilities += probability
end for

^^^this piece in particular confuses me. What are the "sum of probabilities" and even "probability" in the context of an individual in a population? Are these like values individuals have on inception?


Solution

  • That is a very obfuscated piece of code.

    In that second block of code, probability is a variable attached to each member of the population, and sum of probabilities is a global variable for the whole population.

    Now, what the roulette wheel metaphor is saying, is that the entire population can be represented as a roulette wheel, and each member of the population has a slice in that roulette wheel proportional to its relative fitness. That code is doing the dirty work behind that metaphor-- instead of wedges on a wheel, the members are now represented by proportional intervals on the line segment [0,1], which is a customary way to represent probabilities.

    To do that, you technically need two numbers, a start and an end, for each member. But the first member's start is going to be 0; the second member's start is going to be the end of the first member; etc, until the last member, which has an end of 1.

    That's what that code is doing. Sum of probabilities starts out at 0, and the first time through the loop, the probability is what you intuitively thought it would be. It is marking the end point of the first member. Then the "sum of probabilities" is updated. The second time through the loop, the "probability" is what you intuitively thought it would be... shifted over by the "sum of probabilities." And so it goes.

    So the first loop is summing fitness values as a prelude to normalizing things. The second loop, that you ask about, is normalizing and arranging those normalized probabilities in the unit interval. And the third (most complex) loop is picking two random numbers, matching them up with two members of the population, and mating them. (Note that the assumption is that those members are in some array-like format so that you can sequentially check their endpoints against the random number you've rolled.)