Tags: python, list, random, probability, coin-flipping

I am working on a Python practice problem that involves calculating the probability of consecutive coin flips. I can't tell why the percentages are over 100%


I am working on this practice problem from the book "Automate the Boring Stuff with Python", the second edition. Here is the exercise I am doing:

Coin flip streaks

For this exercise, we’ll try doing an experiment. If you flip a coin 100 times and write down an “H” for each heads and “T” for each tails, you’ll create a list that looks like “T T T T H H H H T T.” If you ask a human to make up 100 random coin flips, you’ll probably end up with alternating head-tail results like “H T H T H H T H T T,” which looks random (to humans), but isn’t mathematically random. A human will almost never write down a streak of six heads or six tails in a row, even though it is highly likely to happen in truly random coin flips. Humans are predictably bad at being random.

Write a program to find out how often a streak of six heads or a streak of six tails comes up in a randomly generated list of heads and tails. Your program breaks up the experiment into two parts: the first part generates a list of randomly selected 'heads' and 'tails' values, and the second part checks if there is a streak in it. Put all of this code in a loop that repeats the experiment 10,000 times so we can find out what percentage of the coin flips contains a streak of six heads or tails in a row. As a hint, the function call random.randint(0, 1) will return a 0 value 50% of the time and a 1 value the other 50% of the time.

You can start with the following template:

import random
numberOfStreaks = 0
for experimentNumber in range(10000):
    # Code that creates a list of 100 'heads' or 'tails' values.

    # Code that checks if there is a streak of 6 heads or tails in a row.
print('Chance of streak: %s%%' % (numberOfStreaks / 100))

Of course, this is only an estimate, but 10,000 is a decent sample size. Some knowledge of mathematics could give you the exact answer and save you the trouble of writing a program, but programmers are notoriously bad at math.

My attempt, using the template code provided:

import random
numberOfStreaks = 0
for experimentNumber in range(10000):
    # Create a list of 100 heads or tails values.
    coinFlips = []
    for i in range(100):
        if random.randint(0, 1) == 0:
            coinFlips.append('H')
        else:
            coinFlips.append('T')

    # Check if there is a streak of 6 heads or tails in a row
    numberOfHeads = 0
    numberOfTails = 0
    for i in coinFlips:
        if i == 'H':
            numberOfHeads += 1
            numberOfTails = 0  # Reset the streak of tails
        elif i == 'T':
            numberOfTails += 1
            numberOfHeads = 0  # Reset the streak of heads
    
        if numberOfHeads == 6 or numberOfTails == 6:
            numberOfStreaks += 1
            numberOfHeads = 0  # Reset the streak of heads
            numberOfTails = 0  # Reset the streak of tails

print('Chance of streak: %s%%' % (numberOfStreaks / 100))

When I run this code, the output says the chance of a streak is over 100% (around 150% to 153%). I have not been able to figure out why. I'm not sure which part of my code makes the result this high, or whether the problem is in my code at all rather than in the template code. It's also possible that I misunderstood the question, because it took me a minute to really understand what it was asking. Any help would be appreciated. Thanks.


Solution

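  • You're counting more streaks than needed. After a streak is found, your code resets both counters and keeps scanning the same list, so a single experiment of 100 flips can add several streaks to numberOfStreaks. That is how the result ends up above 100%. The exercise only asks whether each list contains a streak, so add a break at the end of the if numberOfHeads == 6 or numberOfTails == 6 block to stop checking once the first streak of 6 heads or tails comes up. The template's numberOfStreaks / 100 is not the problem: with 10,000 experiments it is the same as (numberOfStreaks / 10000) * 100.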

    I'm adding an image because Stack Overflow doesn't seem to support LaTeX.

    [Image not available]

    Here's the code rewritten with the variables I've defined: m flips per experiment, a streak length of k, and n experiments.

    import random
    
    m = 100
    k = 6
    n = 10000
    
    numberOfStreaks = 0
    for experimentNumber in range(n):
        # Create a list of m heads or tails values.
        coinFlips = []
        for i in range(m):
            if random.randint(0, 1) == 0:
                coinFlips.append('H')
            else:
                coinFlips.append('T')
    
        # Check if there is a streak of k heads or tails in a row
        numberOfHeads = 0
        numberOfTails = 0
        for i in coinFlips:
            if i == 'H':
                numberOfHeads += 1
                numberOfTails = 0  # Reset the streak of tails
            elif i == 'T':
                numberOfTails += 1
                numberOfHeads = 0  # Reset the streak of heads
        
            if numberOfHeads == k or numberOfTails == k:
                numberOfStreaks += 1
                # IMPORTANT : A streak of k heads/tails comes up
                break 
    
    print('Chance of streak: %s%%' % ((numberOfStreaks / n)*100))
    

    The output is approximately 80% (it varies slightly from run to run because the simulation is random).
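As a cross-check on that figure, here is a minimal sketch that compares the simulation with an exact calculation. The function names has_streak, simulate and exact_streak_probability are illustrative and not from the book or the code above. The exact part assumes a fair coin and k >= 2, and tracks the probability of each current run length while no streak has occurred yet.

```python
import random


def has_streak(flips, k):
    """Return True if flips contains k or more equal values in a row."""
    run = 0
    previous = None
    for flip in flips:
        run = run + 1 if flip == previous else 1
        previous = flip
        if run >= k:
            return True  # stop at the first streak, like the break above
    return False


def simulate(m, k, n, rng):
    """Estimate the streak probability from n experiments of m flips each."""
    hits = 0
    for _ in range(n):
        flips = [rng.randint(0, 1) for _ in range(m)]
        if has_streak(flips, k):
            hits += 1  # at most one count per experiment
    return hits / n


def exact_streak_probability(m, k):
    """Exact chance of k equal flips in a row within m fair flips (k >= 2)."""
    if m < 1:
        return 0.0
    # no_streak[r]: probability that no streak has occurred so far and the
    # current run has length r, for 1 <= r <= k - 1 (index 0 is unused).
    no_streak = [0.0] * k
    no_streak[1] = 1.0  # after the first flip the run length is always 1
    for _ in range(m - 1):
        following = [0.0] * k
        for r in range(1, k):
            half = no_streak[r] / 2
            following[1] += half  # next flip differs: the run restarts at 1
            if r + 1 < k:
                following[r + 1] += half  # next flip matches: the run grows
            # otherwise the run reaches k and leaves the "no streak" table
        no_streak = following
    return 1 - sum(no_streak)


if __name__ == "__main__":
    rng = random.Random()
    print('Simulated: %.2f%%' % (simulate(100, 6, 10000, rng) * 100))
    print('Exact:     %.2f%%' % (exact_streak_probability(100, 6) * 100))
```

Because has_streak returns as soon as it finds the first streak, each experiment contributes at most one count, so the simulated percentage can never exceed 100%. The two printed values should land close to each other.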