I follow baseball pretty regularly, and with the 2016 season yet to start, this is of course the time for everyone to talk about team projections for the upcoming season. Most of this discussion focuses on the projected win totals, and there’s lots of talk, both good and bad, about the win totals on various sites. (If you’re interested, I think Phil Birnbaum’s blog has some of the best posts, though I’m too lazy to find the particular posts I’m thinking of right now.)
Finding a real flaw in the mean projected win totals is pretty hard and takes tons of data, so instead of talking about the actual projected win totals let’s think about projecting playoff odds instead. FanGraphs has projected playoff odds to go with their mean win totals in the standings. In other sports you have things like FiveThirtyEight’s basketball playoff odds, or Football Outsiders’ DVOA playoff odds (which they don’t have up right now, of course).
I’m not sure of the exact method used to generate these playoff odds, but I can say I’m pretty confident that most of them are at least somewhat wrong. The main problem is that we can’t be certain what a particular team’s actual true talent level is. While projections, at least in baseball, are pretty good at getting the mean right, there’s still certainly some deviation between a team’s actual true talent level and the projections’ estimate of that team’s true talent level. It turns out that even if you are actually correct on the average, you still get playoff odds wrong by simulating seasons using a fixed true talent level for each team.
There’s a simple illustration of this that I think is convincing, though it’s not perfect. Let’s look at the NL West, and assume that the projection for the Dodgers to win 94 games is exactly correct, and additionally let’s assume that the Diamondbacks, Padres, and Rockies always win fewer than 94 games. So we’re left with just the Giants to consider as contenders to the Dodgers in winning the division.
Given a set true talent level and some assumptions, it’s possible to analytically solve for the probability of a team winning at least a given number of games (in this case 94) in a season. Steve Staude at The Hardball Times created a spreadsheet that does just that, along with simulating playoff series. If we assume that the Giants’ true talent winrate is exactly .540, as the FanGraphs projections have, then they win 94 or more games 13.4% of the time (and I’ll assume this means they always win the division). FanGraphs has the Giants’ division odds at 23.5%, so this estimate doesn’t seem horrifically wrong given that I’m assuming the Dodgers always win exactly 94 games. In reality, of course, there are a lot of times the Dodgers win fewer than 94 games if the projections are correct, and since decreasing variance favors the favorite, pinning the Dodgers at exactly 94 wins means we should expect our quick estimate to be low.
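For anyone who wants to play with this kind of calculation without the spreadsheet, here is a minimal sketch. It assumes the simplest possible model: every game is an independent coin flip with the same win probability, so season wins follow a plain binomial distribution. That may not be exactly what Staude’s spreadsheet does, and the function name prob_at_least is just something made up for this example. Whether a tie at 94 wins counts makes a noticeable difference, so the sketch prints both versions.

```python
from math import comb

def prob_at_least(wins, p_win, n_games=162):
    """P(X >= wins) where X ~ Binomial(n_games, p_win).

    Assumption: games are independent with a fixed win probability.
    """
    return sum(
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(wins, n_games + 1)
    )

giants_talent = 0.540  # true talent winrate used in the text

# "At least 94" and "more than 94" differ by the chance of exactly 94 wins.
print(f"P(at least 94 wins):  {prob_at_least(94, giants_talent):.3f}")
print(f"P(more than 94 wins): {prob_at_least(95, giants_talent):.3f}")
```

Summing the binomial terms directly is plenty fast for 162 games, so there is no need for a normal approximation here.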
Ok, now on to the reason most simulated projections are wrong: let’s add in variation in the Giants’ true talent level. I’m not sure what the actual standard deviation for true talent compared to projections is in baseball, and I’m not aware of anyone looking at that particular question, so I’m just going to make a quick assumption and say that it’s 3-ish games. In fact, since this is just going to be a quick estimate, let’s assume that the Giants’ true talent winrate is .525, .540, or .555, each one-third of the time. Importantly, this means our projections are still correct on average.
With a true talent of .555, the Giants win at least 94 games 23.5% of the time. With a true talent of .540, as I said, they win at least 94 games 13.4% of the time, and with a true talent of .525 they win at least 94 games 6.82% of the time. Averaging these three equally likely cases gives the Giants a combined 14.6% chance of winning the division, instead of our original 13.4% estimate. As expected, variance here favors the underdog.
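The averaging step can be sketched in a few lines. The first part just averages the three percentages quoted above. The second part redoes the same mixture under an assumed plain binomial model (independent games, fixed win probability), which is my simplification and not necessarily how the spreadsheet works; prob_at_least and mixed_prob are hypothetical helper names.

```python
from math import comb

def prob_at_least(wins, p_win, n_games=162):
    """P(X >= wins) for X ~ Binomial(n_games, p_win) (assumed model)."""
    return sum(
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(wins, n_games + 1)
    )

def mixed_prob(wins, talents):
    """Average the tail probability over equally likely talent levels."""
    return sum(prob_at_least(wins, p) for p in talents) / len(talents)

# Average of the three figures quoted in the text.
quoted = [0.235, 0.134, 0.0682]
quoted_average = sum(quoted) / len(quoted)
print(f"Average of quoted figures: {quoted_average:.1%}")

# Same idea under the binomial assumption: fixed talent vs. varying talent.
giants_levels = [0.525, 0.540, 0.555]
print(f"Fixed .540:      {prob_at_least(94, 0.540):.3f}")
print(f"Varying talent:  {mixed_prob(94, giants_levels):.3f}")
```

The mean of the three talent levels is still .540, so any gap between the two printed binomial numbers comes purely from the spread in talent, not from a change in the average projection.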
Running the same exercise with the Giants fixed at 88 wins and looking at the Dodgers, we get an 80.7% chance for at least 88 wins with a fixed .579 true talent, or a 79.2% chance with the same variation in true talent. So it works both ways, as we expect: variance hurts the favorite and helps the underdog.
The problem is that it is absolutely impossible to simulate variable true talent across an entire season using all 30 teams. With a fixed true talent, simulating an entire season once means using 2430 random numbers, essentially. Or, working with just orders of magnitude, 10^3 numbers. Doing this 10,000 times brings us to 10^8. This is entirely doable on a daily basis.
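To make the fixed-talent case concrete, here is a rough sketch of what “one random number per game” looks like for a single team. This is only an illustration under my own assumptions: it ignores opponents and schedules entirely and treats every game as a draw against the team’s own winrate, which is surely cruder than what any real projection system does. The function names are hypothetical.

```python
import random

TEAMS = 30
GAMES_PER_TEAM = 162
# Each game involves two teams, so a full season is 2430 games.
GAMES_PER_SEASON = TEAMS * GAMES_PER_TEAM // 2

def simulate_wins(p_win, rng, n_games=GAMES_PER_TEAM):
    """Simulate one season for one team: one random number per game."""
    return sum(rng.random() < p_win for _ in range(n_games))

def estimate_tail(p_win, threshold, n_seasons=10_000, seed=0):
    """Fraction of simulated seasons with at least `threshold` wins."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = sum(simulate_wins(p_win, rng) >= threshold for _ in range(n_seasons))
    return hits / n_seasons

print(f"Games in one full season: {GAMES_PER_SEASON}")
print(f"Simulated P(at least 94 wins at .540): {estimate_tail(0.540, 94):.3f}")
```

Even in plain Python, 10,000 seasons for one team finishes quickly, which fits the point that the fixed-talent version is cheap.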
Adding in the simplest reasonable true-talent variation is problematic, to say the least. Giving each team three possible true talent levels means you need to do 3^30 times as many simulations, at a minimum (one for each possible combination of team true talent levels). 3^30 is the same order of magnitude as 10^14. Now we’re up to 10^22 random numbers, each of which you then have to compare to another number to determine which team wins. There are only 10^5 seconds in a day, so you need to simulate each game in less than 10^-17 seconds to run this simulation daily, as FanGraphs does for their playoff odds. That’s not feasible on a home computer. You can reduce the number of times you simulate each true-talent combination season from 10,000 down to something lower, but even if you go to 1 I still think it’s not something you can reasonably run daily on a home computer (I might check this at some point).
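The back-of-the-envelope numbers are easy to check with exact arithmetic. This sketch just multiplies out the counts from the text (30 teams, three talent levels each, 10,000 seasons per combination, 2430 games per season); the variable names are mine. The exact products land a little under the rounded powers of ten, but the per-game time budget comes out at the same order of magnitude.

```python
TEAMS = 30
GAMES_PER_TEAM = 162
LEVELS_PER_TEAM = 3          # three possible true talent levels per team
SEASONS_PER_COMBO = 10_000   # simulated seasons per talent combination
SECONDS_PER_DAY = 24 * 60 * 60

# Each game involves two teams.
games_per_season = TEAMS * GAMES_PER_TEAM // 2

# One combination for every way of assigning a level to each team.
talent_combos = LEVELS_PER_TEAM ** TEAMS

total_games = talent_combos * SEASONS_PER_COMBO * games_per_season
seconds_per_game = SECONDS_PER_DAY / total_games

print(f"Games per season:       {games_per_season}")
print(f"Talent combinations:    {talent_combos:.2e}")
print(f"Total simulated games:  {total_games:.2e}")
print(f"Time budget per game:   {seconds_per_game:.2e} seconds")

# Dropping to a single season per combination still leaves this many games.
print(f"Games at 1 season each: {talent_combos * games_per_season:.2e}")
```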
Since this is the simplest reasonable case, and it’s already too complicated to actually simulate, it’s safe to say that this form of true-talent variation is most likely not what FanGraphs does for their simulations. So, unless they compensate for the increased variance in some other way in their playoff odds, their playoff odds are wrong. I think they’re pretty good–the effect here is maybe a percentage point or so in magnitude–but I certainly wouldn’t trust the decimal points.