# Ivars Peterson's MathTrek

October 27, 2003

## Seven-Game World Series

In professional baseball's World Series, the championship is decided in a best-of-seven format. The first team to win four games takes the title. Curiously, series that go the full seven games appear to occur more often than simple probability arguments would suggest.

Suppose that two evenly matched teams have made it to the World Series. Each one has a 50:50 chance of winning any given game. So, for a given team, the probability of winning one game is .5, the probability of winning two games in a row is .5 x .5, or .25, and so on.

For the series to end after four games, one of the two teams must win four games in a row. The probability of one team or the other doing so is (.5 x .5 x .5 x .5) + (.5 x .5 x .5 x .5), or .1250.

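For the series to end after five games, one team must win exactly three out of the first four games and then win the fifth. There are four ways to arrange three wins among four games, so a given team does this with probability 4 x (.5 x .5 x .5 x .5 x .5), or .1250, and the probability of either team doing so is .2500. By the same counting, the probability of a series ending after six games is .3125, and after seven games it is also .3125.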
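The counting argument above can be checked with a few lines of Python. This is a minimal sketch, not part of the original column: the function name `series_length_probs` and its parameters are made up for illustration, and it assumes every game is independent with the same win probability `p`.

```python
from math import comb

def series_length_probs(p=0.5, wins_needed=4):
    """Probability that a series ends after exactly n games.

    Hypothetical helper: assumes independent games and a fixed
    per-game win probability p for one of the two teams.
    """
    q = 1 - p
    probs = {}
    for n in range(wins_needed, 2 * wins_needed):
        # The eventual winner takes exactly wins_needed - 1 of the
        # first n - 1 games, then wins game n.
        ways = comb(n - 1, wins_needed - 1)
        losses = n - wins_needed
        probs[n] = ways * (p**wins_needed * q**losses
                           + q**wins_needed * p**losses)
    return probs

probs = series_length_probs()
for n, pr in probs.items():
    print(f"{n} games: {pr:.4f}")
```

With `p` left at .5, the four values match the .1250, .2500, .3125, and .3125 quoted above, and they add up to 1.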

It's quite possible that the two teams in the World Series aren't evenly matched. In this case, the same simple model puts the chance of a series going to seven games below 31.25 percent. The superior team should end a series against a weaker opponent in fewer games.
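To see how quickly the chance of a seventh game falls off, here is a short sketch under the same assumptions as before (independent games, a fixed per-game win probability). The function name `prob_seven_games` and the sample values of `p` are illustrative choices, not figures from the data.

```python
from math import comb

def prob_seven_games(p):
    """Chance that a best-of-seven series reaches game seven.

    Hypothetical helper: p is the stronger team's probability of
    winning any single game, assumed constant across the series.
    """
    # A seventh game is needed exactly when the first six split 3-3.
    return comb(6, 3) * p**3 * (1 - p)**3

for p in (0.5, 0.55, 0.6, 0.7):
    print(f"p = {p:.2f}: {prob_seven_games(p):.4f}")
```

In this model the value peaks at .3125 when `p` is .5 and shrinks as the teams become more lopsided; at `p` = .6, for instance, it is about .276.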

What do the data show?

| No. games | Calculated probability (%) | World Series 1952-2002 (%) | World Series 1923-2002 (%) | World Series 1952-1976 (%) |
| --- | --- | --- | --- | --- |
| 4 | 12.5 | 16 | 17.7 | 16 |
| 5 | 25 | 16 | 19.0 | 16 |
| 6 | 31.25 | 20 | 21.5 | 12 |
| 7 | 31.25 | 48 | 41.8 | 56 |

In the last 50 years (1952 to 2002), 48 percent of World Series have gone the full seven games, a rate significantly higher than the 31.25 percent predicted by the simple probability model for evenly matched teams.
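One rough way to gauge how surprising that gap is: ask how often the evenly-matched model would produce at least as many seven-game series by chance. The sketch below is only an illustration. It assumes 50 series in the period, takes 48 percent of 50 as 24 seven-game series, and treats each series as an independent trial; the helper `binomial_tail` is a made-up name.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

n_series = 50      # assumed number of series, 1952 to 2002
seven_game = 24    # assumed count: 48 percent of 50
model_p = 0.3125   # seven-game probability for evenly matched teams

tail = binomial_tail(seven_game, n_series, model_p)
print(f"P(at least {seven_game} of {n_series}) = {tail:.4f}")
```

Under these assumptions the tail probability comes out well below 5 percent, so the excess of seven-game series is hard to attribute to chance alone within the simple model.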

Going back to 1923, the rate drops a bit to 41.8 percent. (Before 1923, some series included tie games and several involved eight games.) Interestingly, the period from 1952 to 1976 saw 14 out of 25 series (56 percent) go the distance, apparently at the expense of six-game series. Since then, 9 out of 25 (36 percent) have done so. That's closer to the calculated probability.

The data suggest that a simple probability model doesn't include factors that influence a series outcome. What are those factors? Some possibilities include the confounding role of home-field advantage and the part played by baseball strategy. Managers and players often make decisions based not only on game situations but also on which game in the series is being played.

Why was there a particularly large imbalance from 1952 to 1976? That's just another historical quirk that statistics-mad baseball fans can argue about.