algorithmprobabilitygreedygame-theory

Optimal Algorithm for Winning Hangman


In the game Hangman, is it the case that a greedy letter-frequency algorithm is equivalent to a best-chance-of-winning algorithm?

Is there ever a case where it's worth sacrificing preservation of your remaining lives, for the sake of a better chance of guessing the correct answer?

Further clarification of the problem:

Motivation: This question is inspired by the interesting discussion at http://www.datagenetics.com/blog/april12012/index.html

They suggest an algorithm for solving the word game "Hangman" optimally.

Their strategy can be summarised thus (edited for clarification):

At each stage, we are guessing the letter (not previously guessed) which occurs in the largest number of remaining possible words.

There is some motivation to like this algorithm - we are always minimally likely to lose a life.

But, it strikes me that this isn't necessarily the best solution: if we're trying to guess the word (within a certain number of lives), it's not necessarily always the case that the most frequently occurring letter is the most useful letter to distinguish between the remaining available words.

I'm not sure, though, as it does seem opportune to avoid losing a life wherever possible. Will it ever be the case that optimal strategy permits us to sacrifice a life for a better opportunity to win?

Question: is it the case that this greedy algorithm is equivalent to a best-chance-of-winning algorithm, or not? And can you prove it?

An example dictionary+game would be ideal to show a disproof.


Solution

  • Assume the following dictionary: ABC ABD AEF EGH. (I'll capitalize unguessed letters.)
    Assume you have only 1 life (makes the proof so much easier...).

    The algorithm proposed above:

    Starting with A, you lose (1/4) or are left with aBC aBD aEF (3/4).
    Now guess B and lose (1/3) or are left with abC abD (2/3).
    Now guess C or D and you lose (1/2) or win (1/2).
    Probability to win: 3/4 * 2/3 * 1/2 = 1/4.

    Now try something else:

    Starting with E, you lose (1/2) or are left with AeF eGH (1/2).
    Now you know what to guess and win.
    Probability to win: 1/2.

    Clearly the proposed algorithm is not optimal.