# How many championships has Mariano Rivera been worth?

*The following is a guest post by Rebecca Glass of This Purist Bleeds Pinstripes. You can read her (slightly longer) version on her site, in four parts: Part 1, Part 2, Part 3, and Part 4. We’re republishing it here because a) it took a lot of work and b) it’s really meant to be read as one article, anyway.*

**Special acknowledgment:** This is far and away the most advanced, in-depth thing I’ve ever tried. Without question, the best comparison I can come up with is asking someone who’s taken only a high school Economics course to run the IMF; that’s basically what’s happened. As with any such endeavor, most of the actual work was done by others. **With thanks to Jonathan Mayo, Will Moller, Joe Pawlikowski, Mike Axisa, Jim Johnson, Jamal Granger, Dave Cameron, Brent Nycz, Joshua Rosenberg, Dan Dilworth and Greg Fertel.**

In this article, Rob Neyer dares us to come up with a way to measure how many Championships Mariano has been worth. Guess who enjoys masochism?

So, as you may know, there’s a myriad of stats out there, many of which I can only understand in theory, but there’s one measure that’s been created for the regular season that is very useful. You may have heard of it, as it’s called WAR — wins above replacement player.

NOTE: There are two measures we could use here, WAR and WARP, which try to accomplish the same thing (discussed below), but use two different sets of stats/data to do so. I’m going to stick with WAR because I think it sounds cooler. ANYWAY. So to understand WAR, two concepts are crucial: replacement level and leverage. I understand that many of you reading this will already be familiar with both of these, but since my hope is that those who don’t delve into stats very often can follow, and for the sake of my sanity, I hope you won’t begrudge me a refresher.

**Replacement Level**

The idea behind replacement level is that you take any player in any lineup on any given day and replace him with someone whose level of performance is what an average team can expect when trying to replace a player at minimal cost. In English, it’s saying that if, say, Andrew McCutchen went down on the Pirates with the flu, what’s the baseline production that the Pirates could expect from John Doe, who’s the cheapest available player to fill the spot? That production is replacement-level production.

Why not just use a league-average performance as a replacement? The answer is that MLB statistics are largely skewed: MLB “regulars,” the guys putting up big enough numbers to stay in lineups every day, are a minority, while fringe players, those who struggle to stay in the big leagues, are much more common. Simply put, it’s easier to find a player who hits .250 than one who hits .330, but, like that student you wanted to kill because he got an A on that Spanish test while no one else did better than a C, the one who hits .330 destroys the curve.

So, instead, you take into consideration what a GM and manager are likely to go for in the event of a player suddenly going down for a game or two–i.e., your utility infielder. Most teams–and the Yankees, of course, are not most teams–will go for whatever option is least costly–dipping into the pool of fringe Major Leaguers, the pool considered “freely available talent.” Of course, if a player is lost for a season, it’s an entirely different thing, but that gets beyond our scope.

What you end up with is this: on one end, you have your normal team–say, the 2009 Yankees–and on the other, a replacement-level team with a lineup where Wil Nieves is your best hitter and Sidney Ponson is your best pitcher. What WAR does, then, is like having Nick Swisher go up to Joe Girardi before game six, and say, “Dude, I gave the Yanks, like x number more wins this season than you would have had if Jerry Hairston had been your every day right fielder.”

(Note: via Fangraphs, Hairston’s 2009 registered a WAR of 1.0, which indicates he performed above replacement level. Actually, this is helpful to give you an idea of how poorly a team with all replacement-level players would perform over the course of a season. Replacement Level is *not* the bench guys on the Yankees; it’s the bench guys on the Nationals and the Pirates.)

So before we move on, let’s make sure we understand everything that’s been discussed:

- The concept of Replacement Level enables us to compare performances of MLB “regulars” vs low-cost, “freely-available” replacement players.
- WAR is designed to measure how many more wins player X will net his team over player Replacement Level (i.e., our Swisher/Hairston faux metaphor).
- The values set for what a replacement-level performance entails vary by position–i.e., shortstops aren’t supposed to hit like right fielders, etc. Pitchers, too, have WAR. Over here you can see the rankings for pitchers, by WAR, for the 2009 season. To no one’s surprise, Zack Greinke tops the list. The type of season he had will do that to you.

Now here, what we want to do is find the WAR for *only* Mo’s postseason innings, and then convert that to Championships–one championship being eleven wins. A reliever’s WAR is likely to be lower than a starter’s because a reliever pitches so many fewer innings–and innings pitched/endurance is a relevant stat–i.e., when you’re looking for “innings-eaters” and the like, that’s what you are referring to.

That said, a reliever’s innings — especially a closer’s — are often more high stress and involve more critical game situations. So what we need, then, is a way to account for leverage, one of the main components of a reliever’s WAR. You’ve seen leverage stats before–just think about those WPA graphs you see. This is the WPA graph from Game Six of the World Series:

*(WPA graph: Game Six of the World Series.)*

Game Six doesn’t exactly have a ton of high leverage situations; the Yankees took a lead fairly early and then built on it, eventually leading 7-1, and the game was never in much doubt.

To explain further: The closer to the top or bottom of the graph that the line gets, the more in favor the outcome of a game is for a particular team. For example, in this game, we see the line go more and more towards the top of the graph–and on the side you see the top half labeled as “Yankees.” So the more this game went on, the more in favor of the Yankees it was. Many times when a reliever comes in, that line is closer to the middle, not pointing decidedly towards either team, and the bars on the bottom of the graph give you an indication as to how important that particular situation in the game is: the higher the bar, the more critical the situation.

Now here’s where, depending on your outlook, things get really cool or, if you’re me, your head explodes: you can calculate WPA with “series probability added,” which takes into account the current situation in a series–that is, are the Yankees up 3-1, down 1-2, tied with the Phillies or something else. Without going into specifics for the moment, this is a pretty simple concept: the deeper into a series you go, the more high leverage each at bat becomes.

Think about it this way: in 2004, before Roberts steals second, the Yankees are up 3-0 in the series and up in that game. The likelihood they’re going to win–only a few outs away from the World Series–is probably around, say 90% (this is a total guess, but you can probably find the data somewhere) for the game–and probably the series too. Now, move forward a few days and it’s game seven, and things have changed drastically. The series is now 3-3, and thus every pitch thrown matters that much more, every at bat that much more high leverage. Of course, there was that early Damon grand slam, and whatnot, but, yet again, I digress.
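
To put rough numbers on that intuition, here is a minimal sketch that looks only at the series score (not the inning or the score of the game in progress) and assumes every game is a 50/50 coin flip. Both the coin-flip assumption and the function names are mine, for illustration; real series probability added uses actual win expectancy data.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def series_win_prob(wins, losses, p=0.5, needed=4):
    """Chance of winning a best-of-seven from a given series score.

    Assumes each remaining game is won independently with probability p
    (a coin flip by default), which is a simplification.
    """
    if wins == needed:
        return 1.0
    if losses == needed:
        return 0.0
    return (p * series_win_prob(wins + 1, losses, p, needed)
            + (1 - p) * series_win_prob(wins, losses + 1, p, needed))


def game_swing(wins, losses, p=0.5, needed=4):
    """How much the series odds move between winning and losing the next game."""
    return (series_win_prob(wins + 1, losses, p, needed)
            - series_win_prob(wins, losses + 1, p, needed))


print(series_win_prob(3, 0))  # series odds when up 3-0
print(game_swing(3, 0))       # what Game 4 is worth when up 3-0
print(game_swing(3, 3))       # what Game 7 is worth
```

Under these assumptions a team up 3-0 wins the series about 94% of the time, close to the 90% guess above, and Game 7 swings the series odds eight times as much as Game 4 does.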

What does all this matter? It comes down to this: *You cannot accurately measure a reliever’s WAR, especially a postseason WAR, without taking into account the leverage situations in which they pitch.*

Now, of course, the question is how do we do this? Some suggest that we use WPA–win probability added–but there are problems with this. WPA is a team-oriented stat at its core, and the idea is to measure how much more likely a team is to win a game given a certain event, and it’s so dependent on leverage that it’s not a decent measure of raw talent. Read up for more info. I especially recommend clicking on the link to *The Hardball Times* that explains how WPA is more about a “feeling” than anything concrete.

Now, here’s where it gets really cool: The Leverage Index on which WPA is based is not an abstract concept; it’s based on real numbers. I direct you to this article from *The Hardball Times* that explains the math and then the shiny finished product and the chart of Leverage Index, which has the actual numbers.

We can find out, for any team, any inning and any situation–men on base and number of outs–exactly how much more or less likely that team is to win the game, and how important that particular situation is. For individual pitchers (since we’re dealing with pitchers), we can also calculate their average leverage index based on a) a player’s LI for all game situations (pLI), b) a pitcher’s LI for when he enters the game–as in a reliever that enters the game in the seventh, etc. (gmLI), c) a pitcher’s LI based on the inning in which he enters (inLI) and d) a pitcher’s LI when he leaves the game (exLI).

If you go back to this chart, you can see Rivera’s *average* numbers for these situations. For comparison’s sake, here are Ryan Madson’s numbers of the same. As you can see, the leverage situations in which the two pitchers enter are roughly the same: they’re both back-end relievers pitching the most or some of the most critical innings for their teams. Their WPAs, however, are different.

Pitchers can’t control the leverage when they enter a game, but the leverage is a good indicator of the stress a pitcher might face coming in to relieve. How that pitcher responds to it is perhaps the most valuable measure of how composed he remains in a tight spot. Hence our need for the Leverage Index.

So what do we have so far?

- We’re attempting to figure out Mariano’s WAR for his postseason innings and then convert that to the number of championships Mariano is worth all by his lonesome.
- We’ve explained the theory behind WAR and replacement level (though we haven’t gotten into the nitty gritty just yet)
- We’ve discussed why it’s still all about the leverage, how really smart people have come up with absolute numbers for every conceivable innings-baserunners-outs situation, and how WPA, while shiny and a fun toy, is not as helpful as we would like because it’s a probability stat more than an absolute number.

Fortunately for us, since Fangraphs provides us with LI numbers, it’ll save us a bit of work.

The other key component of WAR is FIP, or fielding-independent pitching. The goal of FIP is simple: figure out how well a pitcher pitches in terms of events that are not dependent on the fielders–strikeouts, walks and home runs. A little more advanced: The theory here is that things such as singles, doubles and triples, may be affected as much by the way the fielders play as by the way the pitcher pitches. The only events a pitcher directly controls occur when a batter does not make contact with the baseball, or when he hits a home run. Basically, all or nothing.

The formula provided by Beyond the Box Score for FIP is:

(HR*13+(BB+HBP-IBB)*3-K*2)/IP.

This formula will give you an odd looking decimal result; generally speaking you add 3.20 to it to get the FIP.
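
As a sketch, the formula above translates directly into a few lines of Python. The function name and the sample stat line are made up for illustration (they are not Rivera’s numbers), and as noted below, other sources use slightly different formulas and constants.

```python
def fip(hr, bb, hbp, ibb, k, ip, constant=3.20):
    """Fielding-independent pitching, per the Beyond the Box Score formula.

    ip is innings pitched as a true decimal (16 1/3 innings is 16.333,
    not the box-score shorthand 16.1).
    """
    raw = (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / ip
    return raw + constant  # the "add 3.20" step


# Hypothetical line: 1 HR, 4 BB (1 intentional), 0 HBP, 14 K in 16 innings.
print(fip(hr=1, bb=4, hbp=0, ibb=1, k=14, ip=16.0))
```

Notice that the raw part can go negative for a dominant strikeout line, which is why the result looks odd before the constant is added.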

A note of caution: I was never able to get the formula to add up to the same results given by Fangraphs; since different sources do use different formulas (some don’t account for hit-by-pitches, some don’t account for intentional walks, etc.), I’m going to attribute the difference to using a different formula than Fangraphs–since my results were generally close. What we need to do, then, is to figure out the WAR numbers.

Let me state this as simply as I can: I have no issue understanding the theory behind WAR, as hopefully I have successfully explained above; however, WAR is a very complex calculation that is way beyond the scope of my doing. Fangraphs and Beyond the Box Score give formulas as to how to calculate it for pitchers here and here, but the problem is that every time I tried doing the formula myself, I’d end up with very, very wacky numbers.

Like, 74.

I know Mo’s the Hammer of God, and all, but even the Hammer of God isn’t worth 74 wins all by his lonesome.

To make a very long and frustrating story short, I was saved by an Angel of Mercy (who has asked their identity not be revealed — which, alas, means no revelation of formula. Don’t worry, though, I still can’t figure it out). Said AoM sent me a magical calculator thingy (okay, a spreadsheet), and, well, now it’s just a question of doing the following:

- Calculating Mo’s WAR for each of his postseasons.
- Adding the totals together.
- Converting them to a scale that will allow us to compare what he’s done in the postseason with what he’s done in the regular season.

Simple, right?

Alas, this is the part that is incredibly time-consuming. WAR involves constants in its formulas that change from year to year, and since Mo’s been pitching in postseasons since 1995, that’s a lot of constants to go back and find.

Anyway, before I go into the hard data and the results I got, here are a couple of caveats so if you want to try this on your own, you might choose to adjust accordingly. I’d recommend that if you try, you use the link to the Fangraphs explanation, which calculates WAR for Felix Hernandez from scratch. The theory’s easy to follow, but because there are so many components to it, just one not-so-hot constant can throw it off. This is, of course, why they pay people to do these things.

1. We need to understand park factors, but this is a simple concept. Some ballparks favor pitchers and some favor hitters; the park factor is simply a number that attempts to describe whether a ballpark favors hitters or pitchers. A neutral park that favors neither pitchers nor hitters will have a park factor of 1. The park factor used for 2009 is 0.975, which would make Yankee Stadium (and here you will laugh) a slight pitcher’s park. What’s that you say? That can’t be right?

Well…yes. And no.

ESPN has a handy-dandy Park Factor sheet. Now you’ll see for 2009 the New Yankee Stadium is actually *middle of the pack* and registers 0.965. So why increase it to 0.975? New Yankee Stadium has only been around a year, which is a very small sample size. The extra 0.010…let’s just say it’s a nod to all those home runs.

For all the years before 2009, I took an average of park factors from 2001 to 2008 and came up with 0.962. Don’t worry–2005 had a park factor of 1.4+, but this was balanced by an absurdly low 2004. Our 0.962 constant is really right there in the median.

(Going backwards, the numbers we use are 1.040+0.987+0.877+1.403+ 0.694+0.933+0.957+0.805)
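
If you want to check the 0.962 yourself, here is a quick sketch. The year labels are my reading of “going backwards” from 2008 (which lines up with the high 2005 and low 2004 mentioned above); the values are the ones listed.

```python
from statistics import mean

# Park factors as listed above, assumed to run from 2008 back to 2001.
park_factors = {
    2008: 1.040, 2007: 0.987, 2006: 0.877, 2005: 1.403,
    2004: 0.694, 2003: 0.933, 2002: 0.957, 2001: 0.805,
}

average_pf = mean(park_factors.values())
print(round(average_pf, 3))  # 0.962
```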

2. The leverage used is a modified gmLI (see above) that gives the pitcher some credit for the leverage of his situation, but not all–basically it says that it’s not Mo’s fault that Brian Bruney left the bases chucked with no one out, but if Mo puts his own runners on base and then lets *them* score, he’s gotta be accountable for that, too.

3. The FIP stats are from Fangraphs.

4. As for the constants in the formulas? People get paid to figure those things out. I rely on the AoM’s Magical Calculator Thingy.

So how does one use the Magical Calculator Thingy? One needs five pieces of data:

- The pitcher’s FIP,
- The league’s RA (this is the average runs per game per team. Here we’re going to use the postseason).
- The park factor-we’re using the 0.975 constant for 2009 and the average of 0.962 for every other year.
- The pitcher’s innings pitched.
- The modified gmLI.

Since numbers 1, 2, 4 and 5 will change season to season, the calculations have to be done separately for each season. Depending on the park factor you use, number 3 can also vary.

As I’ve said before, calculating FIP is itself complex, but we can cheat and just look at advanced pitching stats from Fangraphs. Like many of the advanced stats, FIP can be calculated with various formulas depending on the publication you are reading. Anyway, the bottom line here is that we use Fangraphs’ data because super smart people have already done this for us.

Doing the RA isn’t hard, but it IS tedious. RA is simple–it’s just the total runs scored, divided by games played, and then divided again so you get an average runs scored per team. This takes into account all runs, not just earned runs–since earned runs are a somewhat sketchy stat. It’s easy enough to find the RAs for a particular season, since ESPN handily lists average runs at its stat page, but the numbers haven’t been done for the postseason.

What does this mean?

We’ve got to total every run scored by every team in every game in the ALDS, NLDS, ALCS, NLCS and World Series, and then divide that by the total games played in the postseason, and then divide again by two to get the number per team. Normally you’d just use it for the AL or the NL, but I like to consider the postseason a league in its own right.

Again, this isn’t hard, it’s just tedious.

Innings Pitched. This one’s easy. Google “Mariano Rivera stats” and click on any of the links. Since WAR is heavily dependent on innings pitched in terms of value, relievers’ PREWAR (postseason reliever WAR, don’t look at me like that, I just think it sounds cool) will be low–anything over 1 being positively insane.

To compare it to what a WAR would be at the same FIP over a full season, we’ll also plug in the numbers not just for postseason IP, but also for 70 IP, which is about what a closer will pitch over a full season. This will give us an idea of how valuable Mariano would be if he pitched at that same level over the course of a regular season (I’ve explained it more after we get our results).

When we get our numbers and we’re doing our aggregate totals, we’ll use both the PREWAR numbers and the converted PREWAR numbers, so if you want to try some fancy stats work on your own, you can go right ahead and do so.

Let’s start with the 2009 postseason.

We take the Magic Calculator Thingy and input the following:

Innings pitched: 16.

Fangraphs’ FIP: 2.28

Next, we go to Baseball Reference and click on the ‘postseason’ tab. We look at every box score of every postseason game and add up every single run.

ESPN also has run totals here, per team, but they only go back to 2002 and we will need to (eventually) go all the way back to 1995, so knowing how to do it just looking at the BR box scores is of some use.

That gives us a total of 260 runs scored.

We then add up the total number of games played–13 in the division series (three 3-game sets, one four gamer), 11 in the League Championship Series (six and five) and six in the World Series, for a total of thirty games played.

We divide 260/30 to get a total of 8.67 runs scored per game, and we divide that by two and get a total of 4.33 runs scored per team on average.
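
The same arithmetic as a sketch, in case you want to repeat it for other postseasons. The function name is just a label I made up; the run and game totals are the 2009 figures above.

```python
def postseason_ra(total_runs, total_games):
    """Average runs scored per team per game across a whole postseason."""
    runs_per_game = total_runs / total_games  # both teams combined
    return runs_per_game / 2                  # split between the two teams


# 2009: 13 division series games, 11 LCS games, 6 World Series games.
games_2009 = 13 + 11 + 6
ra_2009 = postseason_ra(260, games_2009)
print(round(ra_2009, 2))  # 4.33
```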

Whew.

We’ve got our first three variables, the fourth, park factor, has been preset at 0.975, so now we just need the fifth, the leverage index. We’re going to use gmLI, because that’s the leverage index that gives us an average leverage number for when a pitcher enters a game, and, well, Mariano loves him some high leverage.

As we journey back to Fangraphs, we find a gmLI of 1.45, but don’t enter that in just yet. As discussed above, we need to modify it a little bit. We do this by adding one (1.00, a neutral leverage) and then splitting the sum. That gives us our split leverage of 1.225.

**NOTE:** Postseason gmLI has only been calculated for the 2002 postseason onwards. To get Mariano’s PREWARs for the years before, the regular season gmLIs will be used, minus .44–which is the average difference between regular season gmLIs and postseason gmLIs taken from the years 2002-2009. It should be noted, however, that this figure is slightly skewed by the years in which Rivera appeared in just one postseason inning (mid 00s are chock full of these), and that the actual postseason leverage is probably a tad higher.
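
Both leverage adjustments are simple enough to sketch. The function names are mine, and the 1.80 regular-season gmLI in the second example is a made-up value for illustration, not one of Rivera’s actual figures.

```python
NEUTRAL_LI = 1.00          # average-leverage situation
POSTSEASON_OFFSET = 0.44   # adjustment described in the note above


def split_leverage(gmli):
    """Halfway point between the pitcher's gmLI and neutral leverage."""
    return (gmli + NEUTRAL_LI) / 2


def estimated_postseason_gmli(regular_season_gmli):
    """Stand-in for postseason gmLI in years before it was tracked (pre-2002)."""
    return regular_season_gmli - POSTSEASON_OFFSET


print(split_leverage(1.45))  # 2009 postseason input

# Pre-2002 style calculation with a hypothetical regular-season gmLI of 1.80.
print(split_leverage(estimated_postseason_gmli(1.80)))
```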

So we take our variables, enter them into the Magical Calculator, bada bing, bada boom, we come out with **0.542** PREWAR for the 2009 postseason.

WAIT! You say, how do you know that it works?

It’s pretty simple–plug in the values for the regular season (which uses pLI), and compare the results to the WAR listed on the Fangraphs’ leaderboard. Since those numbers are equal, we can assume they are correct and thus proceed.

So we have our 0.542 PREWAR. How does that compare to regular season WAR?

We change the IP from 16 to 70, or roughly a closer’s regular season innings, and get a result of **2.37**, which is better than Mariano’s 2009 regular season WAR, though less than his 3.1 WAR in 2008 (3.1 is an utterly monstrous number for a reliever, even a closer, and Mariano’s 2008 *was* that good).
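
The calculator’s formula isn’t public, so take this only as a sanity check: the 2009 numbers are consistent with WAR scaling in direct proportion to innings pitched when everything else is held fixed. That proportionality is my inference from the two results, not a description of the spreadsheet.

```python
def scale_to_innings(war, innings_pitched, target_innings=70):
    """Prorate a WAR figure to a different innings total.

    Assumes WAR is proportional to innings with FIP, RA, park factor
    and leverage all held constant.
    """
    return war * target_innings / innings_pitched


print(round(scale_to_innings(0.542, 16), 2))  # 2.37
```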

Anyway, so, in our PREWAR spreadsheet, we can fill in our first three columns, under the columns “year”, “PREWAR” and “converted PREWAR”.

2009: 0.542, 2.37

Now, it’s a question of doing the same for every postseason from 1995 onwards, with the exception of 2008, because the Yankees weren’t in it, and 2009, because, well, we just did that.

Anyway, here’s what we’ve got in terms of a final tally:

| Year | PREWAR | Converted PREWAR |
|------|--------|------------------|
| 1995 | 0.200 | 2.64 |
| 1996 | 0.145 | 0.712 |
| 1997 | -0.18 | -3.8 |
| 1998 | 0.389 | 2.04 |
| 1999 | 0.476 | 2.71 |
| 2000 | 0.213 | 0.957 |
| 2001 | 0.535 | 2.34 |
| 2002 | 0.0221 | 1.55 |
| 2003 | 0.882 | 3.86 |
| 2004 | 0.686 | 3.81 |
| 2005 | 0.077 | 1.80 |
| 2006 | 0.0045 | 0.317 |
| 2007 | 0.178 | 2.71 |
| 2009 | 0.542 | 2.37 |

This will give us our totals:

**PREWAR: 4.17
CONVERTED PREWAR: 24.016**
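
If you’d like to check the addition, here is a small sketch that sums both columns of the table. The values are copied from the table; only the variable names are mine.

```python
# year: (PREWAR, converted PREWAR), as listed in the table above
prewar_by_year = {
    1995: (0.200, 2.64),  1996: (0.145, 0.712), 1997: (-0.18, -3.8),
    1998: (0.389, 2.04),  1999: (0.476, 2.71),  2000: (0.213, 0.957),
    2001: (0.535, 2.34),  2002: (0.0221, 1.55), 2003: (0.882, 3.86),
    2004: (0.686, 3.81),  2005: (0.077, 1.80),  2006: (0.0045, 0.317),
    2007: (0.178, 2.71),  2009: (0.542, 2.37),
}

total_raw = sum(raw for raw, _ in prewar_by_year.values())
total_converted = sum(conv for _, conv in prewar_by_year.values())

print(round(total_raw, 2))        # 4.17
print(round(total_converted, 3))  # 24.016
```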

Now, before we can go into what this data actually means, we need a couple of notes:

- The data is slightly skewed because of the years in which the Yankees lost in the first round of the postseason. Just look at how much lower the numbers are for 2002, 2005, 2006 and 2007 to get an idea.
- The Sandy Alomar Jr home run in 1997 kills Mariano’s PREWAR. For comparison’s sake: in 1997, Mariano’s postseason FIP was over 8.6(!) In 2003, his best postseason (and it’s not even close), the number is 1.28.

The raw, unconverted PREWAR figure is 4.17, so let’s do that one first.

The unconverted number says that Mariano is worth over four wins in the postseason–the equivalent of one round, all by his lonesome self–but there’s a caveat here. The raw numbers here measure a win as having the same value as a win during the regular season–i.e., one win in 162 games. In the postseason, one win is worth a lot more. Since Mariano pitches relief innings only, his innings totals in the postseason are thus suppressed–he’s never thrown more than 16 innings in a postseason–which in turn suppresses the value for WAR.

Now, the raw PREWAR numbers are useful, but they will be most useful when we can compare them to other postseason relievers–this is the epilogue post that you will see following this one, which, if I can figure out how to make one, will have a nice shiny graph. Anyway, enough with the digressing.

So what we want to do here, then, is to convert Mariano’s PREWAR numbers to a number that would be representative of what Mariano would be worth if he pitched at the same scale in the regular season.

The conversion has been done in the table above, but just a refresher: to convert the numbers using the Magical Calculator Thingy, you change the input for Innings Pitched to 70, which is roughly what a closer would pitch over a full season (since becoming a full-time closer, Mariano has pitched between 60 and 80 innings per year, so this number actually works very well).

When we total up the CONVERTED PREWAR numbers, we get 24.016.

That would be, then, 24 wins. Now, let’s go back and remember our very basic assumption, that it takes eleven wins to win a Championship. Twenty four divided by 11 is, of course, just over two.
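
The final conversion is a single division, sketched here with the eleven-wins-per-title assumption spelled out as a constant. The variable names are mine; the totals come from the table.

```python
WINS_PER_CHAMPIONSHIP = 11  # the article's basic assumption

converted_prewar_total = 24.016
raw_prewar_total = 4.17

championships_converted = converted_prewar_total / WINS_PER_CHAMPIONSHIP
championships_raw = raw_prewar_total / WINS_PER_CHAMPIONSHIP

print(round(championships_converted, 2))  # 2.18
print(round(championships_raw, 2))        # 0.38
```

The second figure is included only so that anyone who prefers the raw, unconverted numbers can see what the same division gives for them.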

This means, adjusted to a regular-season scale, **the Yankees have won two of their last five World Series, potentially for no other reason than that Mariano Rivera, and not another closer, was on the mound in the ninth inning.**

Every time we go and we think that Rivera is the Hammer of God, something else comes around to show us that he’s even greater…

wow…just..wow.

….bow wow?

*waits patiently for head explosions to commence*

It’s only been 11 minutes and this thing’s 4000 words long. Wait for it.

…did I, or did I not say I was waiting patiently, man? I’ve got a 500 page book I just started anyway. I’m good. =P

wow… impressive rebecca….

:head explodes:

(slow clap)

damn this is a good read

very impressive

=)

Excellent post. One of the things that’s really difficult with assessing Mo is that there are no real comps for him. And comps are a fundamental way baseball players are understood.

Mo is so good that trying to put some boundaries on his brilliance is a real worthwhile endeavor.

One note – as brilliant as Mo may look from Rebecca’s analysis, FIP will systematically underestimate Mo’s skill set as a pitcher and will not fully bring to light how good he is.

ERA would have been a more interesting way of doing this analysis in many respects.

Either way it’s a great piece of analysis and writing.

Mo’s FIP this year was 2.89. David Robertson’s was 3.05 despite a BB/9 three times higher than Mo’s. FIP tends to overvalue the big strikeout guys, hence Randy Johnson’s 3.61 FIP in 1992 despite walking 144 guys (AJ led the AL this year with 97).

that’s why Gagne’s 4.5 WAR has to be about the upper limit of what a reliever can achieve. Fangraphs only goes back to 1974, but Gagne’s 2003 FIP (minimum 30 innings) of 0.86 was the lowest one on record, and no one else is even close.

*bows down before Rebecca*

*dubs thee sir Evan*

It’s funny, before I read the article, just looking at your title, I said to myself, two.

Thanks for all the hard work to confirm that wild ass guess.

You’re welcome =)

I Like It.

Whoo!

Excellent article. Well done Rebecca.

Thanks!

So here’s my major issue with the conclusion: I know that replacement level is generally our baseline for comparison, and I understand why. But the flaw here is that, assuming that we’re looking only at Rivera’s postseason numbers and assuming that the Yankees would have made the postseason with a closer we’ll call not-Rivera, we can’t really assume that not-Rivera would have been only replacement level. Not-Rivera would be at least average and probably above average. In that sense, can we still credit Rivera with two full World Series championships?

I understand what you’re saying, but the assumption in this article is how much Mariano is worth over the replacement-level closer.

I understand how it sort of flaws the argument because it supposes a vacuum, but for the sake of argument, we’re pretending the New York Yankees are not the New York Yankees. =)

My intuition is that yes, you probably can. What if the Yankees had kept John Wetteland as their closer after 1996? Let’s use him as our Not-Rivera. If Mariano is worth 22 wins, and thus two World Series Championships, maybe Wetteland is worth at least 11, or one championship.

I could very well be mistaken, as I’m also not sure how to scale a good-to-All-Star caliber non-Mariano reliever to Mariano based on Rebecca’s results.

One of the followups I have planned is to compare Rivera to the other postseason relievers in 2009, just so we get an idea as to how it relates.

The problem with postseason comparisons is that data is relatively limited. We know what Rivera’s impact on the “just getting to October” scale is. It’s far more challenging to figure out October impact, and I think “replacement level closer” for the regular season is far different from “replacement level playoff closer.”

If the worst closer in the postseason is, say, Joe Nathan because he threw few innings and they weren’t good, then, that’s to whom you have to compare Rivera at least during the playoffs, right?

Hence my comment about a follow up post.

But then your conclusions will change, and Rivera probably will be responsible for one of five instead of two.

I’m not pushing back to minimize Rivera’s contributions, but baseball watchers tend to overvalue closers as a whole. Most relievers–and note “most”–can close out a game. The Yanks probably would have won Game 6 without using Rivera to protect a four-run lead over five outs. The idea is to pinpoint where Mo differs from that replacement-level playoff closer, not the generic replacement level closer, and what benefit the Yanks derive from that.

At the same time, we’ve seen how blowing one single lead can cost the team an entire postseason a la 2003. Problem is, we can’t possibly know what would have happened if for instance Fuentes doesn’t serve one up to A-Rod this year. Angels could have ended up winning the World Series if not for that. This is by far the closest I think anybody will actually come to quantifying his impact.

In statistical analysis, blaming Rivera for what I think you mean to be 2004 is a logical fallacy. Rivera’s actions — the inability to prevent the Red Sox from tying the game — in Game 4 and, to a lesser extent, in Game 5 caused the Yanks to lose each of those games. The result of that was eventually a series loss, but you can’t pin that solely on Rivera through WPA or any sort of leverage analysis. The rest of the team failed miserably in Games 6 and 7.

In legal terms, Rivera was a but-for cause of the Yanks’ loss. But for Rivera’s pitching, they could have won, but he wasn’t, as you seem to allege, the sole but-for cause of their loss.

I may be a little rusty with this but-for stuff, but I think you’re kind of misusing it a bit here. Saying “but-for Rivera’s pitching, they COULD have won” doesn’t really mean anything – it doesn’t actually make Rivera a but-for cause of the loss. The but-for test would only mean something if “but for” Rivera’s pitching the Yankees WOULD have won. So I don’t think he was a but-for cause of the 2004 loss, if we’re being technical about it.

(Waiting for colleagues in RAB LLP to tell me I don’t remember the but-for test correctly and that I’m wrong.)

How about taking a different approach? Look at the Yankees’ average run differential in games that are close after the 7th inning when Mo pitches and compare it to other playoff teams’ average run differential when their closer pitches after the seventh. Add it all up and voila!

The data might not be readily available, but then again I’m sure a lot of research went into the WAR attempt.

FanGraphs gives only about half credit for the leverage a reliever finds himself in. This is because of “chaining” which you alluded to. Rivera wouldn’t be replaced by a replacement-level pitcher, he’d be replaced by a replacement level closer, AKA an 8th inning guy.

Uh… why would you use 24 and not 4 as the final number? It should be fairly obvious that 24 is way too high for wins added over only 88 games. That’s a little more than a season, and for comparison Albert Pujols has been worth about 24 wins above replacement over his last three seasons combined.

The 24 is just a conversion to give an idea of scale. It’s not 24 over 88 games, it’s 24 over 70 innings per 162 games times fifteen years.

Ok, but you go on to use that number in the context of 88 games again at the end of the article. Mariano Rivera is not the sole cause of two of the Yankees’ championships.

No, no one’s saying he’s the sole cause–simply that the Yankees may have very well not won two of their championships without him.

There is a difference, however fleeting, hah.

At any rate, if the conversion doesn’t work for you (it seems to work for some and not for others), stick with the raw numbers =).

“the Yankees have won two of their last five World Series, potentially for no other reason than that Mariano Rivera, and not another closer, was on the mound in the ninth inning.”

Yeah, my wording sucks. It’s the very last thing I write after three days of non stop work. I’m tired and have the sore throat of doom. Work with me, man =)

Fair enough, great article besides that one point.

Rebecca,

Did you give Rivera full credit for his leverage or only half-credit?

Took me a good 40 mins coupled in with some breaks for eating/T.V.

This is hardcore stuff

But how many World Series is Rivera responsible for losing?

Just 2001 if any at all.

Pinning that on Mo would be so incredibly stupid though.

04?

Holy. Crap.

…MIND EXPLOSION!

Sabermetrically, Mo’s been worth 2 championships. Realistically, he’s been worth 27 (duh!).

I’m going to sit down and really read this tomorrow (ah, it’s 3:20 AM). But, based on what I “see,” there was a lot of effort put into this. That alone gets props.

Wow, never has a lurker like me been given more of an incentive to respond. Very impressive methodology, good job. Perhaps the only thing that is to be said is that as a caveat, “correcting” the WAR values by extrapolation assumes that his performance level is constant – which of course is not the case.

But that’s just nitpicking. Well done, well done!

This was well worth the read. Some of these stats are kind of confusing, but I like the main point that Mo’s worth 40% of his rings. Take that Papelbon!

“So what we want to do here, then, is to convert Mariano’s PREWAR numbers to a number that would be representative to what Mariano would be worth if he pitched at the same scale in the regular season.”

So basically, you are prorating Mariano’s results from each postseason over a 162 game season and then adding them up to get to 24.016. That doesn’t show how many wins Mariano is responsible for during his postseason career.

Doing the playoff WAR numbers yourself is pretty impressive on your part, and impressive for Mo. I don’t think they get any more impressive by prorating them because, like you said, they are basically in line with his regular season numbers.

Mariano’s effect on the Championship teams is never going to be explained in statistics. He affects the way every other Yankee pitcher is used. He also puts more pressure on the opposing team to get a lead by the 8th so the Sandman doesn’t come in.

I think the easiest way to show people Mo’s value in the future will be to show clips of analysts (regardless of how little they may know about baseball) talking about how all the Yankees need to do is have the lead after 7 innings and they are a lock to win.

Hmmm… no.

“Mariano’s effect on the Championship teams is never going to be explained in statistics.”

Funny, this article did just that, and, even more simply, just throwing out his postseason stats would do that.

“I think the easiest way to show people Mo’s value in the future will be to show clips of analysts (regardless of how little they may know about baseball) talking about how all the Yankees need to do is have the lead after 7 innings and they are a lock to win.”

Seriously, just seriously, facts >>>>> what some random baseball analyst says.

How does prorating Mariano’s postseason stats over a full season explain the effect he had on innings that he didn’t pitch in? The idea that all the Yankees have needed to do over his career is get a lead by the 8th inning and they win the game, and the pressure that puts on the opponent, can’t be shown in statistics.

I understand not liking the idea of using analysts to explain Mo’s dominance, what I meant is that it’s going to be stories about Mariano that explain his greatness.

I still don’t understand how you apply the prorated WAR to postseason wins.

Do the Yanks pay people to crunch numbers and stats like this? If not, should Rebecca be on the payroll? I think Ben and Rebecca replace Sterling and Waldman because hey – you can’t predict baseball…..Great piece

No, the Yankees do not pay anyone to crunch numbers. Cashman has always surrounded himself with mystics and soothsayers. I believe this year they used more blood and fewer bones in their ceremonies, and that’s how they got Nick Swisher.

After all the stat-crunching, converting the wins to a 162 game schedule and then throwing the new regular season figure back into a postseason context seems pretty arbitrary. You would get very different numbers if the regular season were shorter or longer than it is, even though that would have no bearing on what Mo has done in the postseason.

The raw WAR numbers don’t seem to do him justice, and I like the idea of using LI to bring that out, but there is a lot of hand waving in this.
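The dependence on season length is easy to see in a sketch. This assumes a simple linear proration by innings, using the “70 innings per 162 games” figure mentioned earlier in the comments; it may not match the article’s exact conversion, and `prorate_to_season` is a hypothetical name:

```python
def prorate_to_season(raw_war, postseason_innings,
                      season_games=162, innings_per_162=70.0):
    """Scale a postseason WAR figure up to a full-season closer workload.

    Assumes WAR scales linearly with innings, which is a simplification.
    """
    season_innings = innings_per_162 * season_games / 162
    return raw_war * season_innings / postseason_innings

# Same postseason line (made-up numbers), two different season lengths:
standard = prorate_to_season(1.0, 10.0)                   # 162-game season
shorter = prorate_to_season(1.0, 10.0, season_games=81)   # half-length season
```

Halving the schedule halves the converted figure even though nothing about the postseason performance changed, which is the objection being raised.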

Rebecca,

First of all, awesome job. This analysis is a huge undertaking, and it takes a lot of guts to put something like this out there for public consumption.

Just a few constructive criticisms:

1) Ben said this earlier, but I think postseason replacement level is different than regular season. However, figuring out the new baseline for postseason replacement level is a huge undertaking in itself, so great job working with what you had. I look forward to the follow up.

2) I think putting his numbers into a regular season scale is sort of arbitrary, as it doesn’t really show his value in the postseason alone. That said, I understand why you used that methodology.

Thanks for the good read, and once again, awesome job.

Congrats, Rebecca. This was a massive, massive undertaking. Despite some flaws, you did a damn good job on this.

this hurts my brain, but warms my heart.

bloody brilliant.

This is the kind of analysis that I don’t really think is worth doing because it ignores context and relies on small comparative samples. Also, in the post season, games probably do have a greater impact on each other (e.g., manager’s planning, pressure faced by players, etc.), so isolated game-by-game analysis doesn’t account for that either.

The effort is definitely impressive, but there really isn’t much to give me confidence in the conclusion.

Well done Rebecca.

http://www.youtube.com/watch?v.....re=related (safe)

Of course you can’t mathematically decide exactly how valuable a player has been because baseball (like all sports) isn’t played in a vacuum. But having statistical breakdowns like this, along with judging with your eyes, and having an understanding for the game can help anyone judge player value. It’s not the be-all and end-all, but it’s a great tool/concept to have in your pocket.

Nice job!

“Of course you can’t mathematically decide exactly how valuable a player has been because baseball (like all sports) isn’t played in a vacuum”

Yes you can. The sport isn’t played in a vacuum, but the calculations made to determine a player’s value are not made in a vacuum either. The context of the season changes every single year and the adjustments are made for the calculations of value.

+1 for understanding relative win values. It may not be calculated in a vacuum, but it doesn’t need to be in order to be accurate. We just need to understand the variations in the baseline from year to year.

Why go into stat nonsense when you can just use common sense and the eyeball test???

If this post season doesnt tell you that Rivera is the most valued Yankee and player of the last 15 yrs you arent watching the games. They may have zero titles without him. Hes irreplaceable. Just ask the managers and fans of the Angels, Twoinns, and Phils.

Oh, Bo. So reliable.

Hahah! You make me laugh, Bo. The “eyeball test” and “zero titles without him” are just ludicrous statements to make. Well played.

By the way, what city do the Twoinns play in? Minnnneaopolis?

Just write “tl;dr,” it’ll save you the time.

First off, amazing work.

Is WAR for closers different than WAR for relievers, I.E. is the replacement level closer different than just a replacement level reliever? I don’t believe so, but correct me if I’m wrong.

If it’s the case then we’re comparing WAR to the available reliever, which is realistically never going to be a closer, especially not for a team in the playoffs. So Mo is 2 WS wins better than a terrible closer, but the better comparison would be to a league avg closer, which obviously would take a ton more work, and I believe basically what Ben mentioned.

Rebecca – It looks like there’s some decent constructive criticism here in the comments (and I’m sure over at your blog, too). It would be awesome, if you’re not too sick of this research to revisit it, if you went back and maybe took a look at some of it and decided if it would maybe improve your work to implement some of the ideas here. Then again, if you’re just over it at this point, that couldn’t be more understandable.

All that aside, though, this was awesome. This is an intimidating subject to take on, so good for you for even attempting it and for seeing it through to the end. Very interesting stuff, and a very interesting read.

The amount of work that went into this report was Mo-like in and of itself (I follow Rebecca on the Twitterz and saw the hell she put herself through to make this).

Well done Girl. Well Done!

My brain hurts.

My question is, if you do this for all the players that played on the Yankees’ 5 championships, will it equal 5? It’s hard to imagine that all the other players combined are only worth 3 championships. This just doesn’t pass the smell test. Like people mentioned earlier, the Yankees could have had Robertson closing this year, and someone else closing in 2000 and 1999. Maybe they wouldn’t have won all 5, who knows, but it seems like they wouldn’t have lost 2 of the 5 if they had someone else closing.

Well, no, because they won postseason games in years they didn’t win a championship. Again, the idea is that 11 wins = championship, but those 11 wins could really come at any time.

OK, I’ll admit it. When I first read this, I read about two paragraphs, and then scrolled down to see what the answer was, and then later on came back to read it in its entirety.

Wow.

Very good article. And I love all the work you showed, rather than just say 24!!!! WAR

Rivera is the greatest reason why the Yankees are the Yankees in his period of playing. Without him, and a mere mortal hurling balls at the plate in the 9th, we wouldn’t be able to overcome the deficiencies in the bullpen that we experienced for so many years.

Wow.

I think I just fell in love with Rebecca.

This is some ridiculously intense work.

Consider my lack-of-brains blown away.

Rebecca,

Congratulations for the heroic effort. You managed to stay on track, carry us through to the end, and stay charming throughout. I may be in the minority here in that I generally tune out most of the sabermetric calculations floating around, but your engaging explanation did a lot to help me understand them.

And your effort was a noble one, because it actually resulted in a measurable stat. And it’s one that has a strong feel of plausibility, and that right there captures my weakness in the face of the new wave of statistics.

Nice work, but I question whether eleven wins equals a championship. Obviously, the right eleven wins does, but the ratio of post-season games won to World Series won is a lot higher than eleven.

To take a simple example, the Phillies won 20 post-season games the past two years. I don’t think that it would be correct to say that a Phillie who had a converted postseason WAR of 5.5 over the two years should get credit for half a championship.

Maybe that falls out in the calculations somewhere and I just missed it.
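For reference, the arithmetic being questioned is just a division by eleven. A minimal sketch, using the 24.016 and 5.5 figures quoted in the comments (the function name is hypothetical):

```python
WINS_PER_CHAMPIONSHIP = 11  # the article's premise: 11 postseason wins = 1 title

def championships_from_war(converted_war):
    """Convert a converted postseason WAR figure into 'championships'."""
    return converted_war / WINS_PER_CHAMPIONSHIP

rivera = championships_from_war(24.016)  # the article's converted total
phillie = championships_from_war(5.5)    # the hypothetical Phillie above
```

The conversion treats every win as interchangeable, so it cannot tell the “right” eleven wins from any other eleven, which is the point of the objection.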

This is one of the best articles I have ever read, from both print sources and online. The level to which you took the statistics is phenomenal! I found it very easy to follow and understand (though this may have something to do with the fact that I adore statistics and am going into a field where I will deal with them on a daily basis. ;D)

I agree with whoever stated above that you should look into taking the given constructive criticism and revisiting this. I’m especially curious about the question regarding if everyone else adds up to three wins.

One OT end note: the gmLI stat? I’ve always read as “Gimli” and I think you’re one of the few people who can appreciate that

Great job! I can’t wait to read more from you in the future!
