Yankees win the SALCS…

Laird's perfect day helps Surprise win again
Jon Heyman wants Joba in the pen

…Where the S stands for spreadsheet. Baseball Prospectus’s Clay Davenport reran his LCS projection numbers and has come up with new figures for the odds of each team advancing to, and then winning, the World Series. Before we go into how greatly the computer favors the Yankees, I want to quote from Davenport’s post, because his methodology is a special kind of wonkiness.

Game 3, LA vs Philadelphia, expecting Kuroda (for the Dodgers) and Lee to pitch. The Phillies had a team EQA of .276; in a 4.5 rpg environment that works out to 5.22 runs (.276 divided by .260, raised to the 2.5, times 4.50 = 5.22). Home game, so raise by 4% to get 5.43. They’re going against a RHP, and they had a .779 OPS against RHP, and .781 overall. Run scoring changes with the ratio of OPS, squared, but we can only count on the starter to be in the game for about six innings (and frequently less). So we’ll have six innings with a run rate of 5.43 * (779/781)^2, and three innings where we’ll use the 5.43 rate, so now we have them at 5.41. Their opponent, Kuroda, carries a 4.82 NRA but, once again, he’s only in the game for six innings. The other three go to the Dodger bullpen, which we’ve rated – by taking the average NRA of the five relievers most likely to be used – at 2.88. The total Dodger team rating with Kuroda becomes 4.17. So we take the Philly run total of 5.41, multiply by 4.17/4.50, to get an estimate of 5.01 runs.

If we do the same math for the Dodgers, we end up with an estimate of 3.89 runs. The win probability for the Phillies is just the Pythagorean percentage from 5.01 runs scored and 3.89 allowed – or .624.
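For anyone who wants to follow along at home, here is a rough Python sketch of the arithmetic Davenport describes for the Phillies in Game 3. The function and parameter names are my own invention, not anything from Baseball Prospectus, and the Pythagorean exponent of 2 is an assumption that happens to match the quoted .624. The code carries full precision instead of rounding after each step the way the quote does, so expect the results to differ from the quoted 5.01 and .624 by a hundredth or so.

```python
def expected_runs(team_eqa, split_ops, overall_ops,
                  opp_starter_nra, opp_bullpen_nra,
                  home=True, league_rpg=4.5, starter_innings=6):
    """Estimate runs scored in one game, following the quoted steps.

    All names here are hypothetical; only the arithmetic comes from the quote.
    """
    # EQA to runs per game: (EQA / .260) ** 2.5 * league run environment
    base = (team_eqa / 0.260) ** 2.5 * league_rpg
    # Home teams get a 4% bump
    if home:
        base *= 1.04
    relief_innings = 9 - starter_innings
    # Against the starter, scale by the platoon OPS ratio, squared
    vs_starter = base * (split_ops / overall_ops) ** 2
    # Blend starter innings and bullpen innings over nine
    offense = (starter_innings * vs_starter + relief_innings * base) / 9
    # Opposing staff rating: innings-weighted NRA of starter and bullpen
    staff = (starter_innings * opp_starter_nra
             + relief_innings * opp_bullpen_nra) / 9
    # Adjust the offense by how the staff compares to league average
    return offense * staff / league_rpg


def pythagorean(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean win percentage; the exponent of 2 is an assumption."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Phillies at home against Kuroda, using the figures from the quote
phillies_runs = expected_runs(0.276, 0.779, 0.781, 4.82, 2.88, home=True)
dodgers_runs = 3.89  # taken directly from the quote, not recomputed
print(round(phillies_runs, 2))
print(round(pythagorean(phillies_runs, dodgers_runs), 3))
```

The Dodgers' 3.89 is plugged in as given, since the quote does not list the inputs behind it.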

Because we don’t typically use NRA here, a definition is in order: “Normalized Runs Allowed. ‘Normalized runs’ have the same win value, against a league average of 4.5 and a pythagorean exponent of 2, as the player’s actual runs allowed did when measured against his league average.” Now that we have the spreadsheet nerd business out of the way, we can see how much the computer favors the Yankees.

Using the above-described simulation, the Yankees would win the ALCS against the Angels 73.34 percent of the time. That’s a pretty heavy advantage, and I suspect the teams are a bit more evenly matched than that. Even more remarkably, the Yankees win the World Series in 40.55 percent of these simulations, against 8.3 percent for the Angels, 28.4 percent for the Dodgers, and 22.7 percent for the Phillies.
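To get a feel for what a series number like that implies, here is a small sketch that converts a single per-game win probability into a best-of-seven series probability. This is a simplification and not Davenport's method: his simulation assigns a different probability to each game based on the pitching matchup and home field, while this assumes every game is independent with the same odds. The function name and the sample per-game probabilities of .550 and .600 are purely illustrative.

```python
from math import comb  # requires Python 3.8+


def series_win_probability(p, wins_needed=4):
    """Chance of winning a series given a fixed per-game win probability p.

    Sums over the number of losses (0 to wins_needed - 1) taken before the
    clinching win. Assumes independent games with identical odds, which is
    a simplification of the simulation described in the post.
    """
    q = 1 - p
    return sum(
        comb(wins_needed - 1 + losses, losses) * p ** wins_needed * q ** losses
        for losses in range(wins_needed)
    )


# Illustrative per-game probabilities, not figures from the simulation
for p in (0.55, 0.60):
    print(p, round(series_win_probability(p), 3))
```

Under these assumptions a 55 percent per-game favorite wins the series about 61 percent of the time, and a 60 percent favorite about 71 percent of the time.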

Unfortunately for the computers, they’ll play the real games on the field. But we can still have fun with the numbers these players produced during the season. If nothing else, this shows just how dominant the 2009 Yankees were, and should continue to be.

  • Dela G

    i just want #27

    pretty please?

  • PaulF

    How many SWS titles does the franchise have? I’m sure Jeter has his SWS ring from 1998.

    I love math.

    • http://www.secondavenuesagas.com Benjamin Kabak

      The Yanks totally won the 2001 SWS and the 2004 SWS too. That’s two right there.

      • Chris

        But didn’t they lose in 2000?

  • http://twitter.com/JamalG_BB Jamal G.

    I’m glad he noted that CC Sabathia’s probable three starts, and Joe Saunders starting Game Two were factored in.

  • dkidd


  • pat

    Raging clue.. RAGING CLUEEEEEEEE

  • Jake K.

    Are these results official? Does this count?

  • Accent Shallow

    I actually buy the RLYW results in the SALCS over the BP results. They have the Yankees winning the series “only” 60.6% of the time or so.


    • Rob in CT

      SG > Davenport. :)

  • http://twitter.com/hopjake Jake H

    Why even play the games.

  • Mike

    I do love numbers, but these statistical evaluations are now ridiculous!! This is a great site for fans and I appreciate all your work to keep us informed. With that being said, I think RAB should stick to news updates and opinions. Leave the “advanced” statistical analysis to Theo and James.

    • JobaWockeeZ

      This isn’t really advanced.

    • Stryker

      i, for one, love the statistical analysis and i’m glad the guys have dived in. it’s a part of the game – why completely disregard it?

      • Mac

        + 1

    • pat

      They didn’t do this themselves, they’re reporting other people’s findings.

  • wilcymoore

    Statistical predictions are nice, but you are familiar with the following phrase? … “That’s why they play the games.”

    • http://www.riveraveblues.com Joseph Pawlikowski

      “Unfortunately for the computers, they’ll play the real games on the field. ”

      Uh, considering that’s exactly what I wrote, I’d say I am.

  • a realist

    Yes!! See you in the Spreadsheet Canyon of Heroes!!

    • Nady Nation

      Are the Spreadsheet World Series Champion locker room shirts and hats available at Modell’s yet?

  • V

    It’s always nice to see how many of the people responding don’t know the purpose of a forecast (I work in financial forecasting).

    No one is saying the Yankees WILL win. They’re saying that, sight unseen, the Yankees have a higher percentage chance of winning. So, if someone offered you the following wager: you put up $1, and if the Yankees win, I’ll give you $2, you should take it.

  • Tank Foster

    Yeah that’s all well and good, but you know even with spreadsheet games, you don’t play them on a spreadsheet.

    [That was a feeble attempt at humor intended to catch the attention of TSJC]

    • http://twitter.com/tsjc68 tommiesmithjohncarlos a/k/a Ridiculous Upside
      • Tank Foster

        Wow. I expected love.

        • dkidd

          tough room

  • toad

    I’m very dubious about this, for two reasons.

    First, just about every number that goes into this sort of thing seems to be based on empirical data that necessarily contains measurement error. That is, if you throw an OBP into the simulation you’re saying that the number you use is the true long-run OBP, which it probably isn’t. It might be close, but mash together a lot of numbers and relationships that all contain error and the final result is very imprecise. Or consider run environment. Isn’t that likely to be affected by the weather, and wouldn’t that change the results?

    I’d be more impressed if Davenport told us the confidence interval, or something like it, for that number.

    Second, the number doesn’t check against some simple calculations. Suppose the Yankees are 55% to win any individual game against the Angels. Then their chance of winning the series is about 60%. Are the Yankees much more than 55% to beat the Angels in a given game? Make it an unrealistic 60% and you get a series win probability of .710, still lower than the simulation result.

    It would be interesting to know what the simulation would predict about the regular season series.
