There was something about the number 72 that seemed perfectly appropriate, an echo from Baseball Prospectus history. Our forecasting system PECOTA's most famous projection, if a projection can ever be really famous, came in 2007, when Nate Silver's machine spit out a 72 for the Chicago White Sox. They were coming off a 90-win season, which had itself followed a World Series title. Oh, did people fume -- a Chicago Tribune response was so angry it earned a Fire Joe Morgan fisking, while Kenny Williams snarked that the projection was "a good sign for us because usually they're wrong about everything regarding our dealings." And then the White Sox won ... 72 games.
So on the night before we released our PECOTA projections at Baseball Prospectus this winter, I saw with terror that we had put a 72 on the Royals. PECOTA keeper Rob McQuown and I went over every detail of that projection to make sure it wasn't a mistake, but I was also amused. Maybe we'd hit another 72?
Nah. The Royals passed 72 in mid-August and are on pace to win 95. Once again, a 72 has become one of our most famous projections -- drawing, once again, snark from team execs, this time justified. What happened? Let's perform an autopsy.
There are three foundations to a projection system: predicting who will play, predicting how well they will play, and converting those individual performances into team wins. Every projection "misses," in the sense that none predicts with 100 percent accuracy, but we can see, in five critical projections, how the Royals especially diverged from their team-wide outlook. (Stats through Sept. 11.)
1. Wade Davis
Projected: 3.54 ERA, 0.6 WARP
Actual: 0.94 ERA, 2.0 WARP
Here's an example of a projection where one might plausibly argue that humans, even with all our biases and fallacies and short memories and selective logic, are "smarter" than a system like PECOTA. The computer gets smarter with more data, but in the case of Davis it had very little relevant data to go on and lots of data that might almost qualify as irrelevant: just 72 innings of historically good relief in 2014, following a career spent mostly as a mediocre starter. It's very hard to change PECOTA's mind with 72 innings when it has hundreds of previous innings that it isn't designed to forget. When a player's outlook changes fast and obviously, as Davis' seemed to have upon achieving relief-ace dominance, you and I can adjust more quickly than PECOTA can.
But Davis isn't just a single example; he's the Royals in a nutshell -- a dominant reliever who is quite possibly more valuable than his WARP suggests and who didn't project (for any of a number of reasons) to be as dominant as he has been. Bullpens, with frequent personnel changes, higher rates of injury and seasons that never quite get out of small-sample territory, are perhaps the most difficult challenge for a projection system. (PECOTA was way too optimistic on Greg Holland, for instance; relievers are hard!) The best pitchers in that bullpen also have an outsized effect on wins and losses, pitching as they do almost exclusively in high-leverage situations. The Royals have had the best bullpen ERA in the American League this season; anytime you hear that about a team, you can bet that a lot went unexpectedly right. You can probably also bet that PECOTA shot low on that team's win total.
2. Lorenzo Cain
Projected: .258 True Average, 2.1 WARP
Actual: .313 True Average, 6.4 WARP
The most optimistic view of the Royals entering the season held that Eric Hosmer and Mike Moustakas, two former blue-chip prospects who had disappointed to different degrees, might finally each take steps forward. (This view was supported by each player's postseason heroics last fall.) PECOTA has a long memory -- it remembers each of these players' minor-league excellence -- but had largely given up on Moustakas and tempered its previous expectations of stardom for Hosmer:
There are five unforecast wins in that table -- but, of course, every team is going to have a couple players outperform projections. The Phillies have the worst record in the league, even though they got five more wins out of Odubel Herrera and Cesar Hernandez than PECOTA projected. Zoom out and see that the Royals also had plenty of players underperform their projections.
So, Hosmer and Moustakas outperformed their projections by a bunch, but then Omar Infante and Alex Rios and Sal Perez and Christian Colon and Alex Gordon underperformed theirs. But there's one huge gap there, between the projected value and the actual value of Lorenzo Cain. In fact, almost the entire difference between the Royals' lineup projection and lineup reality can be attributed to Cain emerging as an MVP candidate. This is effectively the fourth consecutive season that Cain has improved as a Royal, and given his backstory -- he took to the game later than most, so later development makes some sense, even after he was a fairly mediocre 27-year-old -- this might have been foreseeable. Of course, you can find an exception for almost every player if you want to; PECOTA, wisely, prefers consistency. Players like Cain, who have unique development paths, sometimes make us pay for that.
3. Edinson Volquez
Projected: 4.71 ERA, -0.6 WARP
Actual: 3.49 ERA, 2.0 WARP
This isn't so much about Volquez as it's about the Royals defense. Volquez himself doesn't really change. Here is his cFIP- by season since 2009 -- cFIP being the most predictive pitching estimator (100 = average, higher = worse):
Any improvements he has actually made have been slight -- and this is a pitcher who had a 4.46 ERA in that timespan, before this year, so a 4.71 ERA in a harder league would pass the sniff test. However, what Volquez has now is the league's best defense behind him. While PECOTA tries to adjust its pitcher projections for defense, it's obviously very difficult with publicly available data to fully assess a team's defense or fully account for its effect on each individual pitcher. We saw a team with a shaky rotation that would struggle without James Shields; in retrospect, we didn't give the Royals' defense enough credit.
4. Ben Zobrist
Projected: Oakland Athletic
Actual: Kansas City Royal
Playing time should be the most controversial part of a projection system -- it's the one part that's entirely subjective. While performance projections are based on formulas and algorithms that minimize human bias, the playing time is based on ... members of our staff, surveying the team's roster, debating among themselves, factoring in each player's health history, and (in effect) attempting to predict what the manager's whim will be on some random day game in August.
But PECOTA did a pretty swell job projecting the Royals' playing time:
With one exception: We didn't think Ben Zobrist would play for the Royals at all. Imagine that. But this is one area where projection systems have an impossible task: Teams change the strength of their roster as the year goes on based on how close to a playoff spot they are. You could make a case, if you're developing your own projection system, that winning teams' projected win totals should be bumped up a notch or two based on the expectation that competitive teams load up at the trade deadline, and the reverse for losing teams. But, as the Royals show, it's also not easy to pick in March which teams are going to be those reloaders in July; if we'd tried that based on our projections, we would have had the Royals probably unloading pieces in July, and perhaps dropping down to 70 wins, or 69.
Anyhow, Zobrist and Johnny Cueto have already added about 2 WARP that we couldn't have projected; there's two of our missing wins.
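To see why that deadline adjustment would have backfired here, consider a minimal sketch of the idea. Nothing in it is part of PECOTA: the function name, the 81-win break-even line and the two-win bump are all assumptions chosen only to mirror the "notch or two" described above.

```python
def deadline_adjusted_wins(projected_wins, bump=2.0, break_even=81.0):
    """Hypothetical trade-deadline adjustment (not a PECOTA feature).

    Teams projected above break_even are assumed to add talent in July,
    teams projected below it are assumed to sell. Both bump and
    break_even are illustrative assumptions.
    """
    if projected_wins > break_even:
        return projected_wins + bump
    if projected_wins < break_even:
        return projected_wins - bump
    return projected_wins


# The Royals' preseason projection was 72 wins, so this rule treats
# them as sellers and pushes the forecast further from reality.
royals_adjusted = deadline_adjusted_wins(72)
print(f"Adjusted Royals projection: {royals_adjusted:.0f} wins")
```

The rule only helps when the March projection already has the team on the right side of the line, which is exactly what went wrong with the Royals.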
5. Kendrys Morales
Projected: .275 True Average, 1.2 WARP
Actual: .290 True Average, 2.0 WARP
Hey, pretty good projection! We thought Morales would be a pretty good hitter, and he was in the neighborhood -- about a win better than his forecast, within the margin of error for this sort of thing. Of course, he has also been phenomenally "clutch" -- the sixth-most clutch player in baseball, according to FanGraphs; his OPS goes up 300 points with men on, 350 with runners in scoring position -- and if you swap out his context-free hitting stats for something like Win Probability Added, he turns into a three- or four-win player. Suddenly, our projection of a below-average player looks way low for a guy who has arguably been as valuable as an All-Star -- even though we were pretty close to projecting his raw performance.
Here, again, we have a "Royals in a nutshell" projection. Yes, PECOTA undersold the Royals. It expected they would score 657 runs and allow 719; instead, they're on pace to score 728 and allow 646. We saw a below-average team, and, like Morales, the Royals clearly have been above average; by third-order wins, which is generally our best way of measuring a team's true, observed talent, they are on pace to win 87 games. That they'll win closer to 95, instead, goes to the Royals' own "clutch" performance this year (they're tops among offenses in FanGraphs' measure of it), and perhaps to an ability to further leverage defense and bullpen excellence to win close games, and perhaps to some other element of success that analysts haven't figured out yet.
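For a rough feel of how those run totals translate into wins, here is a sketch using the familiar Pythagorean expectation with an exponent of 2. This is a common rule of thumb, not PECOTA's actual conversion and not how third-order wins are computed; the function names and the 162-game default are assumptions for illustration.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Rule-of-thumb winning percentage from runs scored and allowed.

    The exponent of 2 is the classic choice; other exponents are in
    common use. This is an assumption, not PECOTA's method.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Scale the rule-of-thumb winning percentage to a full season."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Run totals quoted in the text: projected 657/719, on pace for 728/646.
projected = expected_wins(657, 719)
on_pace = expected_wins(728, 646)
print(f"Projected run totals imply about {projected:.1f} wins")
print(f"On-pace run totals imply about {on_pace:.1f} wins")
```

Under this rule of thumb the projected run totals land in the low-to-mid 70s, near the 72 we published, while the on-pace totals land around 90. That still leaves a gap to 95 that run totals alone don't explain.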
While PECOTA aspires to be perfect, what it really does is this: It projects players, individually; it converts those performances into expected runs, based on how baseball usually works; then it converts those runs into expected wins, based on how baseball usually works. At each step along the way, it gets harder to be perfect, and the Royals demonstrate that challenge well. Some players did better than we expected; some offered incomplete data on which to project them; some were added to the roster at midseason; some found the right fit. None of us is arrogant enough to think that projection systems are magic; baseball is impossible to predict with the sort of precision that avoids situations like the 2015 Royals. We all know we can't outrun the bear.