luashufflefisher-yates-shuffle

Write a function to shuffle a given list. Make sure that all permutations are equally probable


Whilst reading Programming in Lua book, I came across an exercise which I'm not sure if I solved right, hence I wanted to ask you all.

Problem statement:

Exercise 6.4: Write a function to shuffle a given list. Make sure that all permutations are equally probable.

Background: I've very basic mathematical and programmings knowledge

I researched a bit online, read a bit about combinations vs permutations, how does that come into play in this problem, and then I finally came across Fisher-Yates Shuffle which I think led me to a "working" solution.

My ask:

My solution:

local printList = function (a)
  for _, value in ipairs(a) do
    io.write(value, " ")
  end
  print()
end

local interchangeElements = function(list, i , j)
  local temp
  temp = list[i]
  list[i] = list[j]
  list[j] = temp
  return list
end

local getShuffledList = function(list)
  -- math.randomseed(1) --> gives constant shuffled list on each call
  local j
  for i = #list, 1, -1 do
    j = math.random(1, #list)
    interchangeElements(list, i, j)
  end
  return list
end

printList(getShuffledList({1, 2, 3, 4}))
printList(getShuffledList({1, 2, 3, 4}))
printList(getShuffledList({1, 2, 3, 4}))
printList(getShuffledList({1, 2, 3, 4}))

Output:

3 2 4 1 
3 4 2 1 
4 3 1 2 
1 2 3 4 

[Process exited 0]

Solution

  • Is my solution right?

    This is not a correct Fisher-Yates shuffle, and the results are not equally probable. This is not obvious, but easy to show with some well-known math.

    Let's call the length of the list n. In each iteration of your loop, you choose a number between 1 and n. There are n iterations, so there are n^n equally likely ways that all the numbers can be chosen. The number of permutations of the list is n factorial. (Except in trivial small cases), there is no way to divide n factorial permutations evenly into the n^n outcomes, thus it is impossible for all the permutations to be equally likely. For example with n=4, there are 24 permutations, but 256 ways the random numbers can be chosen. 24 doesn't divide into 256 evenly, so there is no way to assign an permutation to each of the 256 random outcomes so that they all appear the same number of times.

    To get the proper Fisher-Yates algorithm, change j = math.random(1, #list) to j = math.random(1, i). This means that the previous counterargument no longer applies: the first random number is from 1 to n, then 1 to n - 1, etc. So, the number of ways the random numbers can be chosen is also n factorial. This doesn't automatically prove that the algorithm works properly; you need to think about it and realize that each permutation can be reached by exactly one way of choosing the random numbers (or you only have to prove that each permutation is possible since the numbers are the same).

    One thing I noticed is if I add math.randomseed(1) [...], I always get the same results[...] Why does that happen?

    math.random is a pseudo-random number generator. This means that it has some hidden internal state, and each time you call it, it does some mysterious but deterministic operation to modify its internal state and return a mysterious number. But, it is not actually random (because common computer operations are inherently always deterministic) (hence "pseudo-random").

    math.randomseed sets the hidden internal state. Because everything is deterministic, if you start with the same state, you will get the same result. This feature can be useful, e.g. to reproduce the same result later, but it is not useful if you just want "random" numbers.

    In Lua 5.4, Lua tries to automatically choose a different starting seed value each time you run the program, so there is not any great need to manually provide a seed when the program starts (but also a no-argument version of math.randomseed() is added to allow this behavior to be invoked explicitly). In earlier versions of Lua it is common to do something like math.randomseed(os.time()) so that you get a different seed each time (well, it will be different if at least a second of real time has passed).