By the numbers, A-Rod not un-clutch

Derek Jeter and Lou Gehrig all tied up
Bronx baseball businesses cast a wary eye toward 2009

For the next nine years, we’ll continue having the same debate over and over again: Is A-Rod clutch? Clearly, he’s not doing himself any favors this year. His lack of timely hitting is pretty indisputable, though he has brought his average with RISP up to .268, from .248 at the end of August.

Many people think that clutch is unquantifiable and/or a luck-prone stat and disregard it. That’s been a popular sentiment since Baseball Prospectus became relatively mainstream. While I’m not sure where I fall on the issue, I do know that there seems to be a perfect stat to quantify clutch situations: Leverage Index.

We saw this stat last year, when we ran some WPA graphs after early-season games. I’ve linked to the definition of Leverage Index above, but the premise is that the higher the leverage index, the more critical the situation. This takes into consideration score differential, outs, runners on, and inning. Basically, it answers the question: How important is this at-bat to the fate of my team?
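
As a rough illustration of the idea (not the published formula), one way to read Leverage Index is as the expected swing in win probability for the current situation, divided by the swing of an average situation. The sketch below assumes that reading. The function name `leverage_index`, its inputs, and every number in the example are hypothetical, chosen only to show the shape of the calculation.

```python
def leverage_index(outcomes, average_swing):
    """Ratio of this situation's expected win-probability swing to an average one.

    outcomes: list of (probability, win_probability_change) pairs covering the
              possible results of the at-bat (hypothetical inputs).
    average_swing: expected absolute swing in a typical situation
                   (hypothetical input), so an ordinary at-bat scores 1.0.
    """
    expected_swing = sum(p * abs(delta) for p, delta in outcomes)
    return expected_swing / average_swing


# Made-up late, close-game at-bat: a 30% chance of a result worth +0.20 in
# win probability and a 70% chance of a result worth -0.10.
late_and_close = [(0.30, 0.20), (0.70, -0.10)]
li = leverage_index(late_and_close, average_swing=0.035)
print(round(li, 2))
```

Under this reading, a value near 1.0 is an ordinary at-bat, and larger values mark the more critical ones.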

Last week, Carl Bialik, The Wall Street Journal’s Numbers Guy, examined A-Rod’s clutchiness. He uses A-Rod’s OPS in high, medium, and low-leverage situations. He finds:

His career OPS in high-leverage situations is .975. In medium-leverage, it’s .960. And in low-leverage, it’s .972. That’s consistent with the American League as a whole during his career, when each year batters in high-leverage situations hit somewhere between 1% worse and 6% better than they did in low-leverage situations.

Since we’re talking about A-Rod’s failures this year, Bialik shows us that yes, A-Rod hasn’t been that clutch in 2008:

In 2004, he hit 19% better in high-leverage situations than in low-leverage ones. In 2005 and 2006, he hit 17% worse. Last year, he hit 15% better. And this year, he’d hit 32% worse, through Monday.
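
The excerpt doesn't say exactly how the "% better" and "% worse" figures were computed. A minimal sketch, assuming they are the simple relative gap between high-leverage and low-leverage OPS, applied here to the career numbers quoted above (the helper name `percent_gap` is made up for illustration):

```python
def percent_gap(high_ops, low_ops):
    """Percent by which high-leverage OPS exceeds (or trails) low-leverage OPS."""
    return (high_ops - low_ops) / low_ops * 100.0


# Career figures from the quoted column: .975 high-leverage, .972 low-leverage.
career = percent_gap(0.975, 0.972)
print(f"{career:+.1f}%")
```

On that assumption the career gap is about three tenths of a percent, which is why the single-season swings stand out.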

Bialik went to Jim Albert, Bowling Green State University statistician, for further findings.

The problem is small sample size: In a typical season with the Yankees, Mr. Rodriguez only gets about 130 plate appearances in clutch situations. That’s also why we can’t learn much from his 44 at bats in the last three postseasons, when his performance was abysmal.

Prof. Albert was apologetic about his findings: “Sorry for not giving you better news — no significance is generally not front-page stuff — but this illustrates the dangers of trying to make too much from this type of situational data.”
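
To get a feel for why about 130 plate appearances is so little to go on, here is a back-of-the-envelope sketch. It is not Prof. Albert's method. It assumes each plate appearance is an independent trial with a fixed success rate, and it pairs the .268 RISP average mentioned earlier with the 130-appearance figure purely for illustration, even though those two numbers describe different splits.

```python
import math


def standard_error(rate, trials):
    """Standard error of an observed rate under a simple binomial model."""
    return math.sqrt(rate * (1.0 - rate) / trials)


def rough_interval(rate, trials, z=2.0):
    """Range of roughly two standard errors around the observed rate."""
    margin = z * standard_error(rate, trials)
    return rate - margin, rate + margin


# Illustrative only: a .268 rate observed over 130 trials.
low, high = rough_interval(0.268, 130)
print(f"{low:.3f} to {high:.3f}")
```

With these inputs the range runs from about .190 to about .346, wide enough to cover both a poor hitter and an excellent one.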

Take what you will from this. For me, it’s just more uncertainty in the perennial clutch debate.

  • radnom

    The reason you saber guys fail in any discussion about “clutch” is that you fail to consider the players mental state and only consider stats. You can’t deny some people thrive under pressure, while others flounder.
    This is not just a misconception.

    Now, this gets muddled from a regular fans perspective in baseball because of the high rate of failure for a batter and relatively small sample sizes but do you not see it in other sports? You can’t tell me there is no such thing as a big game pitcher.

    That being said, yeah Arod is having a down year this year in certain situations, but the Arod haters are ridiculous. I refuse to even debate them.

    • steve (different one)

      The reason you saber guys fail in any discussion about “clutch” is that you fail to consider the players mental state and only consider stats. You can’t deny some people thrive under pressure, while others flounder.

      i don’t understand this. if someone “flounders” or “thrives”, wouldn’t the stats capture that?

      • radnom

        Small sample sizes.

        Even if someone is a monster in clutch situations, say would hit +30 above average, they will still make an out a majority of the time and that means if they hit a few bullets right at someone in those situations it seriously hurts their “clutch” stats. So luck DOES play a role, but only in how the stats turn out. How confident and geared up someone gets for a clutch situation is not affected by luck.
        Make sense now?

        Anyway, I don’t think Arod is a victim of bad luck this year, I think he just got into a (mental) funk and never recovered.

    • Joseph P.

      Sorry, this comment pissed me off:

      “The reason you saber guys fail in any discussion about “clutch” is that you fail to consider the players mental state and only consider stats.”

      Yes, because the non-saber guys are fucking experts on the psychological states of people they’ve never met.

      • radnom

        There is a difference between being a “fucking expert” and at least considering a major aspect of the game. They are human you know, not robots.
        This is not necessarily an attack on you, but some people get so wrapped up in sabermetrics that they immediately discredit anything that doesn’t fit in their model.

        • Joseph P.

          Trust me, no one thinks robots play baseball. However, I think that looking at stats in high leverage situations warrants just as much consideration in the clutch argument as the perception of a player’s mental state.

          • radnom

            I didn’t say that didn’t warrant consideration, and I don’t know why you are taking this as an affront to your post. Surely you know that there are a lot of people out there who completely and immediately dismiss any notion of ‘clutchness’ as anything more than pure luck. In fact you basically said the same thing yourself:

            “Many people think that clutch is unquantifiable and/or a luck-prone stat and disregard it.”

            I don’t think it’s necessarily a bad thing to look at it from a stat side, all I was implying was that these people sometimes get too in love with the stats, and any discussion about the topic is really incomplete unless you consider how a player mentally handles certain situations. (Unless of course, you feel that every player responds to pressure situations in exactly the same manner….)

            • Joseph P.

              Sorry. Your original post made it seem like we should disregard the stats and just focus on a player’s mental state.

        • ceciguante

          i couldn’t agree more with this comment. well said.
          stats are great, but they are just models of reality, not the reality. many (certainly not all) people citing stats on controversial issues as “clutchness” make the mistake, in my view, of assuming that randomness, luck, etc. — or anything not included in that statistical formula — just “evens out” over time. leads to lots of ridiculous ‘if you doubt my stat, you’re stupid, b/c numbers don’t lie’ type statements.

          there is no denying the confidence and mental state of a player — ask yogi berra. i’m a huge believer in the mental side of the game. arod was visibly more comfortable at the plate last year (reg season) compared to the playoffs (any year) or this season. it’s no wonder the stats on leverage index reflect this. i aint no psychologist, but it doesn’t take an excel spreadsheet to know he looked calm and deadly at the plate last season, while he’s squeezed the bat to dust at other times.

        • The Honorable Congressman Mondesi

          I understand the sentiment underlying your point but I think your statement is a bit misguided. Sabermetrics, or really the study of statistics in general, is about identifying quantifiable trends based on actual performance. The best evidence of such actual performance lies in the numbers, the stats, as opposed to anecdotal evidence, gut feelings and subjective observation. Sabermetrics is not about identifying the REASON for actual performance. I don’t think Bill James would tell you that statistics measuring a player’s record in particular situations mean that player was either affected by mental state or not. He’d simply tell you that “hey, this is what happened, whatever the underlying reason.” I can’t speak for others, but the problem I have with your statement, as someone who believes in the study of statistics as the most accurate and objective method of evaluating performance, is that you assume I think those statistics mean something that I don’t.

          • radnom

            “is that you assume I think those statistics mean something that I don’t.”

            No one said that you, “Honorable Congressman Mondesi”, thought that. But a lot of people do.

            As ceciguante said: “leads to lots of ridiculous ‘if you doubt my stat, you’re stupid, b/c numbers don’t lie’ type statements.”

            It’s a shame that attitude is so prevalent because it undermines what is a lot of very good and valid stuff. I work in a field that relies heavily on math/stats and I do appreciate what it means for the game. And, like you, I understand its limitations.

            Like Joseph said, the “popular sentiment” has been to completely disregard all notion of “clutchness” since sabermetrics became popular. That seems to be inferring a bit of reason behind the stats, don’t you think??

            • The Honorable Congressman Mondesi

              I guess I had a different impression of what Joe meant in his post. I thought he was saying that stats used to identify “clutch” have been seen as unreliable and not dispositive. He said “clutch is (an) unquantifiable and/or a luck-prone stat and (many people) disregard it.” (Parentheticals added by me.) I didn’t read it as an indictment of the very concept of clutch – the concept that there are some situations that are higher leverage situations than other situations and that some players may perform at a higher level in those situations than other players.

              As far as “ridiculous ‘if you doubt my stat, you’re stupid, b/c numbers don’t lie’ type statements”… I don’t know… I mean, numbers DON’T lie. I think what you mean to argue against is certain people’s interpretation of those numbers, and that’s fair game. But Joe, in his post, certainly didn’t echo that sentiment at all, so I just think saying it in this context is kind of tilting at windmills.

              Whatever… In the end, I don’t think I’m really disagreeing with you so much. I’m just saying that in the same way you think it’s wrong for “Saber-people” to dismiss the opinions of others, I think it’s also unfair to assume all people who are interested in statistical analysis are completely devoid of any notion of human emotion and the effect it may or may not have on people’s performance. I know you didn’t say “You, anonymous commenter who goes by the asinine name ‘The Honorable Congressman Mondesi,’ think that human emotion and mental state don’t affect performance because you clutch to your nerdy little numbers.” But you did say “the reason you saber guys fail.” Saying something like that just touches a nerve in the old “crusty old baseball mind vs. saber geek” fight. (Not saying you used those words, just explaining why, to me, they undercut your argument a bit with certain members of your audience.)

              • The Honorable Congressman Mondesi

                And just by the way, as a PS, I appreciate your point of view and the discussion/argument in general and hope you don’t take my comments as an attack on you. I’m a strong believer in learning through arguing and testing opinions against others. We learn nothing by all agreeing and cheerleading each other.

              • radnom

                (Not saying you used those words, just explaining why, to me, they undercut your argument a bit with certain members of your audience.)

                Understandable. Perhaps I could have phrased it less confrontationally. My bad.

  • dan

    What looking at clutch statistics shows us is what a player has done in the past, and shows nothing about what he will do in the future. In other words, clutch stats are not indicative of skill and are not a forward-looking statistic.

    Remember how baseball players used to be labeled as “rbi guys” just because they racked up high rbi totals while hitting 5th? I think we’re at the point where we know rbi are not a good way of evaluating a player. Clutch is similar–if a guy is good in the clutch one season it has no bearing on the next season.

    radnom: Did A-Rod’s mental state change that much from last season to this season? Did he go from being supremely confident (while playing under the pressure of a walk year, no less) to being a nervous wreck in just a few months?

    Just to put this in perspective for some people who don’t like when “small sample size” is used in an argument. Would you ever evaluate the whole of a player’s season, making sweeping conclusions about it, on May 21st of that season? I didn’t think so. If he’s bad, you’d say, “he’s just getting off to a slow start, it’s too early to tell.” In 2008, A-Rod has had 99 at bats in “high-leverage” situations. On May 21st of this season, he had 98 at bats total.

    Now how can anyone dispute that?

    • radnom

      radnom: Did A-Rod’s mental state change that much from last season to this season? Did he go from being supremely confident (while playing under the pressure of a walk year, no less) to being a nervous wreck in just a few months?

      No. But like you said, statistically small sample size accounts for some of it. It’s also not so much that he is a “nervous wreck” right now, but he had perhaps the best season of his life last year. He got off to a great start, hit some HUGE 9th inning home runs in April to win some games last year and never looked back. This year he got injured early, and while he has hit well overall, he hasn’t seemed to have that same comfort level (speculating) in tight spots. Also count in a number of times he’s smoked the ball right at someone, with the fact that it’s under 100 at bats, and you have some less than pretty numbers.

      • dan

        “smoked the ball right at someone”

        Which means that it is not his mental state that is causing him to fail, it’s him getting unlucky. I can’t stay and chat, but I’ll respond later if you want to continue.

        • radnom

          Can’t it be some of both?

    • Nady Nation

      You can’t just use the contract year as being pressure-filled and not take into account the other main part of contract years: the fact that you’re playing for more money, which A-Rod is clearly motivated by (i.e. his decision to sign with a last place club b/c they were paying him the most). And maybe he’s just trying to do too much in high-leverage situations this year, who knows? I’m not trying to kill A-Rod here, nor am I discounting the evaluation of statistics in high-pressure situations to evaluate “clutchness”, but I do agree with radnom in the sense that the player’s emotional state should also be factored in to the “clutch” argument.

      • Nady Nation

        Unfortunately, it’s impossible to get into someone else’s head, making it hard to quantify the emotional aspect of “clutchness.” I don’t think I’m on either side, I’m mainly just trying to say that I understand radnom’s point

        • radnom

          Not to mention that someone’s mental state can change from day to day, year to year. Brad Lidge sure took a while to get back on track after one bad playoff series.

          • The Fallen Phoenix

            Except the numbers don’t really bear that out: he was still a really, really good pitcher, even after he gave up that home run to Albert Pujols.

  • Old Ranger

    Stats…the intricate, nitty stats I hate. I go by what I can see and with a little help from stats.
    Baseball is a game of stats (more than any other sport); you can’t live without them when judging a ballplayer. Let’s face it, many times we have heard “his fastball is in the low 90s or high 80s, everything is smoke and mirrors”. Come to find out hitters can’t center the ball very well against this guy, but his stats show 2 Ks per 9, 10 hits, 0 walks and era of .250 and a 12-4 record…going by the stats is this guy any good?
    Stats are needed, but I will always tie them into the man himself…let me see him for a year or 1/2 year, then tie everything together.

    • mustang

      “I go by what I can see and with a little help from stats.”

      “Stats are needed, but I will always tie them into the man himself…let me see him for a year or 1/2 year, then tie everything together.”

      I could not agree more. I think in the past baseball placed less worth on stats, and now they drive the sport. Sometimes great stats don’t explain everything that happens on the field; I think A-Rod is an excellent example of that.

  • Bo

    Use common sense and use your eyes. You can twist stats to fit any model.

    A-Rod was awful in the 7-9th innings this year. Especially in July-Aug.

    • mustang


      • steve (different one)

        and how would stats disprove that A-Rod was awful in 7th-9th innings from July-August?

        if he was awful in those situations, he’ll have awful stats.

        what are you even trying to argue?

        • mustang

          That it’s not just stats alone. I could tell you about his stats with RISP and you could tell me about his OPS, but in the end, what did we both see in clutch situations in the must-have games in July-August?

    • Mike A.

      Did you ever see Memento?

      Memory can change the shape of a room; it can change the color of a car. And memories can be distorted. They’re just an interpretation, they’re not a record, and they’re irrelevant if you have the facts.

      You’ll remember the strikeout with the bases loaded long after you’ve forgotten the 2-out 2-run double to the opposite field.

      • mustang

        “You’ll remember the strikeout with the bases loaded long after you’ve forgotten the 2-out 2-run double to the opposite field.”

        Then how come with someone like David Ortiz or Manny it works the other way?
        There is a reason why A-Rod is looked at in this way.

    • Old Ranger

      Hell, the whole team underachieved, so I don’t think one can put the blame on any one person. I don’t think you were doing that, there is enough to go around.

  • Manimal

    I’m curious to see what Jamal has to say about this.

    • mustang

      You know what he would say are you kidding me.

  • tommiesmithjohncarlos

    Comparison Time!

    Player – Low LI OPS / Medium LI OPS / High LI OPS (all career)

    Morneau — .861 / .811 / .914
    The Swedish Sheff — .884 / .925 / .938
    Larry Wayne Jr. — .975 / .940 / .938
    Delgado Del-got-it — .902 / .945 / .953
    Vlad the Impaler — .922 / 1.000 / .963
    Alexander Emmanuel — .972 / .960 / .977
    David Orcheats — .936 / .909 / .990
    Sexy Texy — .905 / .880 / 1.002
    Manny Beating Manny — .983 / 1.015 / 1.017
    Reggie Stocker — 1.026 / 1.080 / 1.037
    Poo-Holes — 1.043 / 1.008 / 1.146

    What do the stats “say”? That while ARod is probably no better or worse of a player in game situations of increased importance, many–perhaps most–of his peers actually improve their performance in those high-leverage situations, some quite significantly.
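
    The comparison can be checked by subtracting each player's low-leverage OPS from his high-leverage OPS. A minimal sketch using the numbers exactly as listed in the table above (the helper name `high_minus_low` is made up for illustration):

```python
# Career OPS by leverage, copied from the table: (low, medium, high).
career_ops = {
    "Morneau": (0.861, 0.811, 0.914),
    "The Swedish Sheff": (0.884, 0.925, 0.938),
    "Larry Wayne Jr.": (0.975, 0.940, 0.938),
    "Delgado Del-got-it": (0.902, 0.945, 0.953),
    "Vlad the Impaler": (0.922, 1.000, 0.963),
    "Alexander Emmanuel": (0.972, 0.960, 0.977),
    "David Orcheats": (0.936, 0.909, 0.990),
    "Sexy Texy": (0.905, 0.880, 1.002),
    "Manny Beating Manny": (0.983, 1.015, 1.017),
    "Reggie Stocker": (1.026, 1.080, 1.037),
    "Poo-Holes": (1.043, 1.008, 1.146),
}


def high_minus_low(splits):
    """High-leverage OPS minus low-leverage OPS, rounded to three places."""
    low, _medium, high = splits
    return round(high - low, 3)


gaps = {name: high_minus_low(splits) for name, splits in career_ops.items()}

# Print from the smallest gap to the largest.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:22s} {gap:+.3f}")
```

    These are raw differences with no adjustment for sample size, so they describe what happened rather than establish a skill.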

    And for the record, his peers are the elite, middle of the order bats; that’s who he should be compared to. Defending ARod as being “clutch” by saying that his RISP OPS is 200 points better than the league average doesn’t mean much, because he’s supposed to be way better than league average; he’s a premier player and a significant part of our resource allocation strategy is committed to him being a great player in great moments. Matching up the statistical evidence with the anecdotal evidence, there’s a natural reason why many of these players would produce better outputs in high-leverage situations – namely, the game is structured strategically to be placed in their hands. ARod and Manny and Ortiz and Pujols come up countless times in their careers in late innings with runners on and a chance to get a hit that will determine the outcome of the game because the team and the lineup around those players is structured to give them those opportunities. ARod hasn’t proven to be consistently good in those situations, not as consistently good as his peers.

    • The Fallen Phoenix

      For a fair number of those jumps (say, for Delgado, Tex, Vlad, Morneau, and Sheffield), they’re starting at a lower low-leverage OPS “base” than A-Rod does; they need to elevate their game *just* to get to A-Rod’s “base” level of performance.

      In that light, it doesn’t really seem fair to “dock” A-Rod points for failing to raise his level of performance even higher than that. I think it’s perfectly fair to reward, say, Manny Ramirez for doing it – his numbers are simply unreal. The same goes for Pujols. But you can laud one or two truly exceptional hitters without turning A-Rod into a goat; again, over his career he has performed just as well in high-leverage situations as low-leverage situations, and that’s still *really* damned good.

      And we get so caught up in “leverage” and “the clutch” that we forget that what happens in “low leverage” situations (say, the first through third innings, when there’s no score) can also play a really big role in the game. I’d rather have someone perform well in *all* situations – as most elite baseball players, A-Rod included, do – than someone who is lesser the majority of the time, but does slightly better in situations that may or may not otherwise arise if those same players were better the rest of the time.

      Clearly that wouldn’t really apply to someone like Manny, Pujols, or – arguably – Ortiz, but I think it’s fair to say that applies to a Morneau, Delgado, or Sheffield. Not that an .880-.900 OPS is something to sneeze at, or something that isn’t valuable, but you’re still talking between 60 and 90 points of difference to A-Rod’s “base” performance, and that’s going to lead to more runs over the course of the season.

      They won’t be distributed evenly, obviously, but I’ll take the player who will produce more runs over the course of the season than the player who doesn’t, since there’s certainly a higher chance that simply from producing a higher volume of runs, there will be a statistically better chance (however remote) of those runs impacting your team’s record more often than not.

      • tommiesmithjohncarlos

        Excellent points. I’m just trying to say, not every person in the world who says something critical of ARod thinks he’s a bum; I certainly don’t. But, those who do criticize ARod do have a sliver of truth in their criticism, because while he’s clearly head and shoulders better than 95% of the baseball world, there are some other really good players who seem to handle pressure AB’s better than he does, and that is a bit disconcerting. Because I think we need him to be better in those situations in order to have an offense capable of keeping us in title contention for the duration of his stay in pinstripes.

        • Haggs

          Very well said.

          He’s not nearly as horrible in big spots as some make him out to be (myself included at times), but he could most certainly be much better than he has been.

          I often wonder how Arod would be perceived today if Dave Roberts got caught stealing in Game 4 of ’04 and the Yanks moved on to the WS. ARod might have been the ALCS MVP, instead he vanished in games 5-7 and for pretty much the rest of his Yankee playoff career.

          It’s a little different for Alex, but it doesn’t take much to go from zero to hero. Tino Martinez was gawd-awful in the playoffs until his grand slammer in ’98. He got benched in ’96.

          Damon was about 0 for 40 until his grand slam in game six of ’04. Now both players are considered clutch.

          Bonds was awful in the playoffs with the Pirates and heard a lot of the same criticism ARod hears today. And even though the Giants lost to the Angels, Bonds (and some steroids) had a playoff run for the ages that year and everyone stopped talking about how he never hit in the clutch.

          So hopefully this is not a permanent condition. But I think it’s definitely a current condition.

  • Gavino

    The problem with a stat like “Leverage Index” is that it masks true clutchness with an overall sense of the “value” of the player to a team. Thus people like A-rod, who, to the naked eye of people who watch every game, is not remotely clutch, come out looking more clutch than they are.

    I have been following MLB since 1960, and I can’t think of a single “big spot” that occurred before the 7th inning. So for the index to even give a little weight to anything before the 7th inning just skews what it is we’re trying to actually measure.

    I would claim that “clutchness” measures how a player does – pitcher or batter (and maybe even fielder as well) in a “big spot”. The problem then becomes one of defining what a “big spot” is.

    I submit it is this:
    – 7th inning or later
    – The batter’s team is down 4 or fewer runs

    I have considered the impact of runners on base. We could explore the relationship of runners on base to runs behind – thus, ostensibly, if the batter’s team is down, say, 2, and there are 2 or 3 men on base, that “big spot” would contain more pressure. But for now, I’m not going to include that.

    I would further define a “productive plate appearance” as follows:
    – A hit
    – A walk (ie, Base on Balls, NOT HBP)
    – A sac that scores a run
    – A productive out, ie, one that moves runners into scoring position
    – NOT an error
    – NOT a HBP (since the batter generally has no control over that)
    – NOT an unproductive out (K, any out that does not advance the runner(s))
    – NOT a DP, UNLESS a runner is advanced who eventually comes in to score without the game ending in a loss in that inning.

    Again, we could give weights to rank these productive plate appearances – anything that does not result in an out might be ranked higher than anything that does; anything that scores a run might rank higher than anything that does not. But I feel that just adds complexity which does not shed more light on what we are trying to measure – how much of the time does the player have a productive plate appearance in a big spot.

    So, then, the clutch percentage would simply be:

    # productive plate appearances in big spots/# of big spots
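
    The definition above can be sketched in a few lines. This is only one reading of it, under stated assumptions: "down 4 or fewer runs" is taken to mean trailing by 1 to 4 (ties excluded), the double-play exception is reduced to a single flag, and all names (`PlateAppearance`, `is_big_spot`, `is_productive`, `clutch_percentage`, the outcome labels) are hypothetical.

```python
from dataclasses import dataclass

# Outcomes counted as productive; everything else (strikeout, other
# unproductive outs, errors, HBP) is treated as not productive.
PRODUCTIVE_OUTCOMES = {"hit", "walk", "sac_run", "productive_out"}


@dataclass
class PlateAppearance:
    inning: int
    run_deficit: int  # runs the batter's team trails by (0 = tied, negative = ahead)
    outcome: str      # hypothetical labels such as "hit", "walk", "strikeout"
    dp_runner_later_scored: bool = False  # simplified stand-in for the DP exception


def is_big_spot(pa):
    """7th inning or later, trailing by 1 to 4 runs (assumed reading)."""
    return pa.inning >= 7 and 1 <= pa.run_deficit <= 4


def is_productive(pa):
    """Apply the productive-plate-appearance rules, including the DP exception."""
    if pa.outcome == "double_play":
        return pa.dp_runner_later_scored
    return pa.outcome in PRODUCTIVE_OUTCOMES


def clutch_percentage(plate_appearances):
    """Productive plate appearances in big spots divided by big spots."""
    big_spots = [pa for pa in plate_appearances if is_big_spot(pa)]
    if not big_spots:
        return None  # no big spots, so the ratio is undefined
    return sum(is_productive(pa) for pa in big_spots) / len(big_spots)


# Made-up sample: five big spots, three of them productive.
sample = [
    PlateAppearance(8, 2, "hit"),
    PlateAppearance(9, 1, "strikeout"),
    PlateAppearance(7, 4, "walk"),
    PlateAppearance(3, 1, "hit"),   # too early to count as a big spot
    PlateAppearance(8, 5, "hit"),   # down by more than 4
    PlateAppearance(9, 3, "double_play", dp_runner_later_scored=True),
    PlateAppearance(7, 1, "hbp"),
]
print(clutch_percentage(sample))
```

    Note that the binary productive/unproductive split is kept on purpose here, since the weighting idea is set aside above.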

    I claim this simple stat would accurately reflect a batter’s clutchness as well as any other.

    I further claim that the very same stat could be used to measure a pitcher’s clutchness – with the understanding that the lower this ratio the MORE CLUTCH a pitcher is (whereas for a batter the opposite is true).

    I further claim that this same stat could be measured for teams, BOTH from a hitting AND a pitching standpoint.

    Finally, I suspect it could even measure a fielder’s clutchness, assuming there were a statistically significant number of fielding opportunities. That is, for a fielder, a “big spot” would have to have an additional condition – that the play should have involved them. This gets murky – for example, a bunt. Whether the catcher should have fielded it or the 3rd baseman or the pitcher is a judgement call. And where the ball should be thrown – get the lead runner at third? Get the sure out at first? – is also up for debate.

    What is clear, however, is that this stat is much easier to track, much clearer in definition and, I claim, as much if not more accurate than the description of the Leverage Index you link to.