r, statistics, binomial-coefficients

Compute expected (remaining) wins in a playoff series, given the current state of the series


Consider a basketball series, best 4 out of 7. In R, we have the following function for computing the expected number of wins in such a series for a team with a certain single-game win probability wp_a:

get_expected_wins <- function(wp_a = 0.50, num_games = 7, to_win = 4) {
  # compute expected wins for team_a
  # wp_a: a team's probability of winning a single game
  # num_games: the maximum number of possible games remaining in the series
  # to_win: how many more games a team needs to win the series
  # 7,4 correspond to winning a best 4 out of 7 series
  
  # expected wins for the team
  prob_to_win_n_games <- dbinom(x = 0:num_games, size = num_games, prob = wp_a)
  num_wins <- c(0:to_win, rep(to_win, num_games - to_win))
  ewins <- sum(prob_to_win_n_games * num_wins)
  
  # and return
  return(ewins)
}

In the function, prob_to_win_n_games should be the team's probability of winning 0, 1, 2, up to num_games games. Consider a playoff series where a team is trailing 0-3, and we want to compute their expected number of remaining wins in the series. Keep in mind that 1 more loss by the team would end the series. We want to call get_expected_wins(0.5, 4, 4).

In this series, the team has a 50% chance of winning 0 more games (lose the next game), 25% to win 1 game (win, then lose), 12.5% to win 2 games (win, win, then lose), 6.25% to win 3 games (win, win, win, then lose) and 6.25% to win 4 games (win 4x). Their expected number of wins in the series is then 0.5*0 + 0.25*1 + 0.125*2 + 0.0625*3 + 0.0625*4 = 0.9375.
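As a quick check of the arithmetic above, here is a minimal sketch that multiplies the five hand-computed probabilities by their win counts. The variable names are only illustrative.

```r
# Probabilities of winning 0, 1, 2, 3, 4 more games when trailing 0-3
# with a 50% single-game win probability (values taken from the text).
probs <- c(0.5, 0.25, 0.125, 0.0625, 0.0625)
wins  <- 0:4

# The five outcomes should cover every way the series can end.
total_prob <- sum(probs)

# Expected wins is the probability-weighted sum of the win counts.
expected_wins <- sum(probs * wins)

total_prob
expected_wins
```

The probabilities sum to 1 and the weighted sum is 0.9375, which is the target value any corrected function should reproduce for this scenario.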

In this example, num_games = 4 and to_win = 4, and prob_to_win_n_games is incorrectly computed as 0.0625 0.2500 0.3750 0.2500 0.0625. The binomial fails to account for the series ending after one additional loss. It computes a 25% chance of 3 wins, based on the calculation (4 choose 3) * (0.5 ^ 4). However, 3 of the 4 sequences it counts (L W W W, W L W W, W W L W) are not possible in our theoretical playoff series, where one additional loss by the team would end the series. Only W W W L gets the team to 3 wins.
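The mismatch can be illustrated by brute force. The sketch below enumerates all 16 win/loss patterns over 4 games and counts only the wins that occur before the first loss, since the series would stop there. The helper name wins_before_first_loss is hypothetical, and the equal weighting of patterns assumes a 50% single-game win probability.

```r
# Enumerate every win/loss pattern over 4 games (1 = win, 0 = loss).
outcomes <- as.matrix(expand.grid(rep(list(c(1, 0)), 4)))

# Hypothetical helper: wins actually achieved when the series stops at the
# first loss (the team trails 0-3, so a single loss ends it).
wins_before_first_loss <- function(games) {
  first_loss <- match(0, games)  # NA if the team never loses
  if (is.na(first_loss)) length(games) else first_loss - 1
}

actual_wins <- apply(outcomes, 1, wins_before_first_loss)

# With wp_a = 0.5 each of the 16 patterns is equally likely, so the
# distribution of actual wins is a simple tabulation.
series_probs <- table(actual_wins) / nrow(outcomes)
series_probs

# For comparison, the unrestricted binomial distribution.
binom_probs <- dbinom(0:4, size = 4, prob = 0.5)
binom_probs
```

The tabulated probabilities come out as 0.5, 0.25, 0.125, 0.0625 and 0.0625 for 0 to 4 wins, while dbinom spreads the same 16 patterns without regard to when the series ends.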

How can we update this function to correctly compute a team's probability of winning a certain number of games, given the parameters we set for the playoff series?


Solution

  • If the number of wins is less than to_win, the top number in the binomial coefficient (the first argument of choose) must be 1 less than the one dbinom would use.

    The reason for this is that the only way to lose a series is to lose the final game of the series. There is no other restriction on the ordering of the wins and losses for the loser. This means the wins for the series loser can be distributed among all but the last game, which is why we must subtract one from the top number in the binomial coefficient.

    The function below returns the probability of each possible number of wins in 0:to_win (the full distribution, not the expected value itself):

    get_expected_wins <- function(wp_a = 0.50, num_games = 7, to_win = 4) {
      i <- to_win:1
      wins <- choose(num_games - i, to_win - i)*wp_a^(to_win - i)*(1 - wp_a)^(num_games - to_win + 1)
      setNames(c(wins, 1 - sum(wins)), 0:to_win)
    }
    
    get_expected_wins(0.5, 7, 4)
    #>       0       1       2       3       4 
    #> 0.06250 0.12500 0.15625 0.15625 0.50000
    get_expected_wins(0.5, 6, 3)
    #>       0       1       2       3 
    #> 0.06250 0.12500 0.15625 0.65625
    get_expected_wins(0.5, 4, 4)
    #>      0      1      2      3      4 
    #> 0.5000 0.2500 0.1250 0.0625 0.0625
    

    Alternatively, using dbinom and the identity choose(n - 1, k) = choose(n, k)*(n - k)/n:

    get_expected_wins <- function(wp_a = 0.50, num_games = 7L, to_win = 4L) {
      k <- 0:(to_win - 1L)
      n <- (num_games - to_win + 1L):num_games
      wins <- dbinom(k, n, wp_a)*(n - k)/n
      setNames(c(wins, 1 - sum(wins)), 0:to_win)
    }
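Both versions above return the distribution of wins rather than a single expected value. A minimal sketch of the remaining step follows, assuming the choose()-based formula from the answer. The names series_win_probs and expected_series_wins are hypothetical, chosen here so that the function returning a distribution is not called get_expected_wins.

```r
# Hypothetical helper: same formula as the choose()-based version above,
# renamed because it returns a distribution rather than an expectation.
series_win_probs <- function(wp_a = 0.50, num_games = 7, to_win = 4) {
  i <- to_win:1
  # Losses that end the series for this team.
  losses_to_end <- num_games - to_win + 1
  # Probability of losing the series with 0, 1, ..., to_win - 1 wins.
  p_lose_series <- choose(num_games - i, to_win - i) *
    wp_a^(to_win - i) * (1 - wp_a)^losses_to_end
  # The remaining probability mass is winning the series (to_win wins).
  setNames(c(p_lose_series, 1 - sum(p_lose_series)), 0:to_win)
}

# Hypothetical helper: expected wins is the probability-weighted win count.
expected_series_wins <- function(wp_a = 0.50, num_games = 7, to_win = 4) {
  probs <- series_win_probs(wp_a, num_games, to_win)
  sum(probs * (0:to_win))
}

expected_series_wins(0.5, 4, 4)  # trailing 0-3
expected_series_wins(0.5, 7, 4)  # start of a best-of-7 series
```

For the 0-3 scenario this gives 0.9375, matching the hand calculation in the question. For an even best-of-7 series from the start it gives 2.90625.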