Getting to know BABIP

Open Thread: Continuing the Torre story
Poor Bobby lowers his contract demands

Sometimes in baseball things happen that we just can’t explain, and when they do, we call it luck. Good luck, bad luck, whatever. One of the biggest statistical luck fiends is BABIP, or Batting Average on Balls in Play. Nick Swisher posted a career-low batting average last year (.219) despite a career-high line drive percentage (20.9%). How? Bad luck, evidenced by his absurdly low .251 BABIP, fourth lowest in baseball. Daisuke Matsuzaka posted the third-best ERA (2.90) despite the worst walk rate in the league (5.05 BB/9, worst by 0.55). How? Ridiculously good luck, like fourth-lowest-BABIP-in-the-league (.267) good luck.
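For anyone who wants to run the numbers themselves, here is a minimal sketch of the usual BABIP formula, (H - HR) / (AB - K - HR + SF). The function name and the season line in the example are made up for illustration; they are not any real player's stats.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF).

    Walks never show up here because a walk is not an at-bat,
    and strikeouts and home runs are removed from the denominator.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, invented purely for illustration.
example = babip(hits=150, home_runs=20, at_bats=550, strikeouts=100, sac_flies=5)
print(f"{example:.3f}")
```

Only batted balls that stay in the park count, which is why a pitcher's walk and strikeout rates can look one way while his BABIP looks another.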

Derek Carty over at THT took a look into all the different ways to calculate BABIP yesterday, while Rich Lederer at Baseball Analysts dug deeper into how groundball rate will affect a hitter’s BABIP today. Both are interesting reads and worth your time. Check ’em out.

  • Manimal

    So what is considered a lucky BABIP and an unlucky BABIP?

    • Mike A.

      League average is around .305-.310. If you start dipping below .290ish or going above .330ish, luck comes into play.

      • Manimal

        So you could say Cano’s BABIP of .286 was slightly unlucky?

        • A.D.

          Considering his career is .323, very unlucky

          • Pablo Zevallos

            Actually, it was just a statistical correction. He should be at the norm by (at most) 2011 (so that he corrects for the two above-average years).

      • E-ROC

        Albert Pujols has been lucky on more than one occasion.

        • Chris

          Not exactly. The typical BABIP is different for each hitter.

          For pitchers, BABIP tends to regress to the league average (or more specifically, the team average, which depends on team defense). For batters, BABIP regresses to their individual expected value. The easiest (and crudest) way to predict this is career BABIP. The first article linked goes into more detail on other formulas that do a better job of predicting future BABIP.
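A rough sketch of what "regressing" toward a baseline can look like in practice: blend the observed BABIP with the baseline, giving the observed number more weight as the sample grows. This is a generic shrinkage estimate, not the estimator from the linked article. The `stabilizer` value and the 500 balls in play are assumptions chosen for illustration, not fitted numbers.

```python
def regress_babip(observed, balls_in_play, baseline, stabilizer=800):
    """Blend an observed BABIP with a baseline, weighted by sample size.

    `stabilizer` is a hypothetical number of balls in play at which the
    observed rate and the baseline get equal weight; it is not a fitted value.
    """
    weight = balls_in_play / (balls_in_play + stabilizer)
    return weight * observed + (1 - weight) * baseline


# Pitcher: pull toward league average (about .305 per the thread).
# The 500 balls in play is an assumed sample size.
pitcher = regress_babip(0.267, 500, baseline=0.305)

# Hitter: pull toward his own career BABIP instead (.286 season, .323 career).
hitter = regress_babip(0.286, 500, baseline=0.323)

print(f"pitcher: {pitcher:.3f}, hitter: {hitter:.3f}")
```

The only difference between the two cases is the baseline: league (or team) average for the pitcher, career average for the hitter.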

  • Artist formerly known as ‘The’ Steve

    I have no time for your silly stats, I’m busy hating Joe Torre.

    • tommiesmithjohncarlos a/k/a Ridiculous Upside

      I have no time to hate Joe Torre, I’m busy making sweet love to these spreadsheets.

  • Steve H

    How can I judge BABIP with my eyes?

    • Jamal G.

      Back-to-back awesome comments by the Steves.

  • BJ

    Carty’s article was not about “different ways to calculate BABIP”, it was about the best way to estimate BABIP given previous data, or in other words the best way to project a player’s BABIP next year.

    The interesting point is that people thought the expected BABIP was determined by Line Drive %, but in reality the correlation is very low. Carty tests a specifically designed BABIP estimator that actually works very well.

    • Mike A.

      Yeah my bad, I worded that wrong.

  • Ben B.

    Here’s hoping for a little regression for both Swish and Dice-K.

  • Manimal

    Look at Gardner’s minor league BABIP. He had a very lucky minor league career.

    • Ben B.

      He scrappy, though. Like putting Pedroia in CF.

    • Jamal G.

      Hmm, I wonder how much of that can be attributed to his speed when he hits ground-balls to the left side.

      • Manimal

        I’m sure some of it could be but that doesn’t account for all of it.

        • Ed

          The lower quality of minor league defense may account for the rest.

          Gardner’s minor league BABIP numbers are similar to Ichiro’s major league numbers, so it is possible to sustain numbers like that. Too bad Ichiro never played in the minors, that would be an interesting comparison.

          • tommiesmithjohncarlos a/k/a Ridiculous Upside

            “Too bad Ichiro never played in the minors, that would be an interesting comparison.”

            NPB = AAA

            Somebody pull up his Japanese baseball stats, and there you go!

    • A.D.

      Well, actually, it comes down to career norms: hitters such as Jeter, Pujols or Gwynn generally have higher BABIPs.

      Essentially if you’re going to hit for a high avg, you either need a shitload of HR or a higher than avg BABIP

  • Reggie C.

    Matsuzaka lucky? I’m sorry. That’s where the K pitch comes in real handy. He’s just real good. Let’s not kid ourselves there.

    • Mike R.

      If you have a “K pitch” why would you wait until you’ve loaded the bases to use it?

      • Reggie C.

        Yeah. I made a blanket statement. hehehe… i usually don’t.

        seriously tho, of course there’s some luck involved, but i’m not going to take anything away from Dice-K’s very good season. inferring bad luck or good luck from walk rate just seems like a fool’s errand.

        • tommiesmithjohncarlos a/k/a Ridiculous Upside

          Reggie, walk rate and strikeout rate don’t factor into BABIP. It’s simply what happens with balls hit into play. If DiceK strikes one guy out on three pitches, walks the next guy on four pitches, and then gives up a hit on the first pitch to the third batter he sees, only that last pitch to that last batter is calculated into BABIP.

          And the stats show that, despite both striking out a good percentage of the batters he faced and walking a good percentage of the batters he faced, the batters that either made a hit or an out by hitting the ball into the field of play made an abnormally low number of hits and an abnormally high number of outs.

          Even if he continues to strike out and walk guys at the same rate, he was very lucky last season and stands to not remain that lucky over a larger sample size.

          • Jack

            Dude, you said balls.

            • tommiesmithjohncarlos a/k/a Ridiculous Upside
              • Jack

                Anyway, a reasonable person might think, “Wow, putting a cast aluminum pair of freakishly oversized replica testicles on a vehicle is kinda gay.” But not White Trash People, again, what they are really saying is “I love balls,” which is totally not the least bit gay at all.


    • dan

      BABIP has absolutely nothing to do with strikeouts

      • Mike R.

        Provide analysis to back up your statement.

        • dan

          Ummm. It’s a fact. Here’s the equation:

          (Hits-home runs)/Balls in play

          I don’t see strikeouts anywhere.

          • Jack

            I think (hope) he was joking.

          • A.D.

            it was a

            *****Sarcasm Alert*****

            • dan

              I didn’t think it was sarcasm based on the response he made to Reggie C.

              If it wasn’t serious, then my bad.

              • tommiesmithjohncarlos a/k/a Ridiculous Upside

                It wasn’t a sarcasm alert, it was a running gag/inside joke/meme alert.

              • Mike R.

                No problem. My comment to Reggie was serious (lighthearted, but serious). I didn’t add a wink or anything at the end because you’ve been around so long I thought you were in on the joke.

    • Steve H

      Then what about his 2007 season? He had more K/inning, more K/BB and an identical WHIP, yet his ERA was 1.50 runs higher. His LD/GB/FB %’s were identical.

      His BABIP was about a league average .306 in 2007. There’s your difference. Luck was clearly a HUGE factor in his ERA dropping 1.50 runs.

  • dan

    Before this goes too far, the comment above about .305 being normal applies to pitchers. Hitters have their own “true talent” BABIP that can be anywhere from like .260 to .360 or so. For instance, Jeter and Ichiro have consistently high BABIPs, so you would regress more towards each player’s career average than league average.

    Seeing who was lucky and unlucky with BABIP for hitters is a MUCH bigger process than just seeing how far he was from .300

    • A.D.

      exactly, hitters, you have to look at their career.

    • tommiesmithjohncarlos a/k/a Ridiculous Upside

      Couldn’t you also say, though, that some pitchers have a “true talent” BABIP? That some pitchers, like sinkerballers, establish their own lower BABIP standard baselines where the balls put in play against them have a lower natural success rate than the average pitchers?

      • A.D.

        It can, Johan Santana’s career BABIP is .286 & Wakefield .280

        So you can see that some will have less, though not as far off the .300 as some hitters

      • dan

        There are people who do believe that. Some also say that knuckleballers have a lower true-talent BABIP. It may be true. But the fact is, the spread in talent is so small that it’s almost always canceled out by random fluctuation. For instance, Pettitte has one of the highest BABIP figures of the past 10 seasons or so and it’s only .316

  • Kevin G.

    I thought a low BABIP meant the player was unlucky. Is it the opposite for pitchers or is it based on the hitters they face?

    • Manimal

      Yeah, unlucky hitter means lucky pitcher and vice versa.

    • A.D.

      Opposite… so a pitcher with a low BABIP is lucky, where high is unlucky

      • Kevin G.

        Thanks, to both of you. That means Pettitte, Burnett, and Mussina were the 3rd, 4th, and 7th unluckiest pitchers in baseball last year respectively.

        • BigBlueAL

          Is it a coincidence that Pettitte and Moose were both in the Top 7 of “unluckiest” pitchers considering the Defense playing behind them????

          • Edwantsacracker

            I remember reading an article about a pitcher who liked stats. People thought it was funny that he liked the stats because his BABIP was very low. I think he played for KC?

            Anyway he thought that if he was constantly getting ahead in the counts that players would be making defensive swings and making poorer contact, affecting the stat. It’s just a theory.

            • steve (different one)

              it was bannister

  • Matt

    BABIP is a factor of luck but at the same time, it’s not the be all, end all. If you hit the ball hard, chances are the ball is going to fall somewhere–like with Albert Pujols.

    But in a situation like Cano and Swisher last year–high LD rates, low BA–that’s when a low BABIP can come into play.

    • tommiesmithjohncarlos a/k/a Ridiculous Upside

      Yeah, BABIP can both be used to see which players or pitchers have one year statistical anomalies (like Swisher, Pettitte, and Dice-K) that will likely course correct, or can be used to reinforce which players are good enough/talented enough to consistently be better than others (like Pujols).

      It’s just a matter of drawing smart conclusions from what the stats tell you.

      • Edwantsacracker

        But how much can it tell us about Cano? With our eyes, we saw that Cano was swinging away all the time, not taking pitches, not working the count, and as a result was often getting into terrible counts.

        In those terrible counts he was taking bad defensive swings, making bad contact and of course not getting hits.

        I’m not anti-stat, I just question the value of this one.

        • Matt

          Cano’s case was very strange this year. He had a low BABIP in the first half which really killed his BA then, but it more or less worked itself out because he had a solid second half. He hit .307/.333/.482 after the ASB, and those are essentially his career numbers (.303/.336/.468). I’d, of course, love for his OBP to creep up but I’ll take it if it comes with a high SLG.

          What was more interesting about Cano was how low his BA was when he had a high line drive percentage, which one assumes would turn into hits. But, he also had his lowest GB/FB ratio, made up of a career low GB% (grounders turn into hits easier than fly balls) and a career high FB% (his ratio here has been climbing each year, something to keep an eye on). So it seems that Cano was hitting the ball hard at a decent rate but it was going right to people, and the hits just weren’t falling. But on the other hand, he was also getting under the ball a lot more, causing him to pop up more.

  • Greg G.

    BABIP is a great stat: it’s fun to pronounce…right up there with “RBI=ribbie”

    /crawls back under rock

    • pat

      I enjoyed that greg.

  • dan

    Also, the main thing that gets some people is that they equate BABIP with pitcher skill. Pedro Martinez had two of the best seasons of all time in 1999 and 2000. You know what his BABIPs were for those two seasons?

    .343 and .253. One extraordinarily high, the other extremely low. I don’t think his true talent level changed that much in one season. It’s random fluctuation (luck).
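To put a rough number on "random fluctuation": if every ball in play is treated as an independent coin flip, the spread of a single-season BABIP follows from the binomial formula. The sketch below assumes a true-talent BABIP of .300 and 500 balls in play per season. Both are illustrative assumptions, not his actual totals, and the independence assumption is a simplification.

```python
import math


def babip_standard_error(true_babip, balls_in_play):
    """Standard deviation of observed BABIP if each ball in play is an
    independent trial with hit probability `true_babip` (a simplification)."""
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)


# Assumptions: true-talent BABIP of .300 and 500 balls in play in a season.
se = babip_standard_error(0.300, 500)

for observed in (0.343, 0.253):
    z = (observed - 0.300) / se
    print(f"{observed:.3f}: {z:+.1f} standard errors from .300")
```

Under those assumptions one standard error is about 20 points of BABIP, so both seasons sit roughly two standard errors from .300, in opposite directions. That is unusual, but it is the kind of swing chance alone can produce without any change in talent.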

  • A.D.

    Ty Cobb: career .372 BABIP
    Rod Carew: .361
    Tony Gwynn .345
    Ruth: .340
    Larry Walker .336

    Great hitters will have much higher BABIP than the pitcher averages.

    • tommiesmithjohncarlos a/k/a Ridiculous Upside

      Ty Cobb: career 1.000 guy stabbed

  • BigBlueAL

    Hey Mike or anybody, is the Hardball Times Season Preview book worth buying???? I always buy the Baseball Prospectus, but was reading a bit about the Hardball Times season preview and it seems pretty good too. Only a bit less than 300 pages though (well “only” compared to the Prospectus) and costs 20 bucks so maybe too much to pay for (gotta get the Prospectus and Torre’s book first!!!). Do love the website though (Baseball Analysts great too).

  • ortforshort

    Statistics can only take you so far. They’re an approximation of real phenomena. They should be looked at as a conversation starter, not a conversation ender. When the stat doesn’t match real life, the stat-head will say “luck” and that’s supposed to explain it. What it really means is that the stat isn’t doing a very good job and you need to understand the process that you’re trying to measure better and go back to the drawing board if you want to approximate life better. “Luck” is a cop out.

    • steve (different one)

      i’d say dismissing stats is also a cop out.

      it means you don’t have to learn anything.

    • Evan

      I agree that particular metrics should be used as premises for a conclusion rather than conclusions in themselves. Also, there are times when there’s more to the discrepancy between expected vs. actual performance than just random variation. That said, it does not eliminate the value of quantifying certain phenomena. These metrics are helpful in discerning things that sense experience may have trouble understanding and thus explaining to others. When done with care, they do a decent enough job to explain much of what we can see with similar extensive observational analysis. Even with the most accurate and precise measurement that one can come up with, there is always a chance that a highly unlikely outcome will come about if given a small enough sample. To say that elimination of luck is needed for a measurement to be worth anything, particularly for something like baseball, is asking a lot.

  • Slugger27

    I went to look up jeters stats, specifically his BABIP… got sidetracked by his wikipedia page… where the first sentence says hes been an all star pitcher 18 times and is captain of the los angeles lakers

    weird… i never knew about him… hes a bigger badass than i thought

    • Chris

      He also apparently joined the Lakers in 1915.

      • Slugger27

        that part is particularly impressive… then was drafted by the seattle seahawks in the 92 draft…

        3 sports in one lifetime?? GREATEST. ATHLETE. EVER.

    • Short Porch

      Well all that must have gone down the memory hole since last night.

      What remains on Wikipedia is a searing indictment of his defense. They should really hand him a centerfielder’s glove, move A-Rod to short.

      Jeter’s defense has been the subject of criticism from a number of sabermetricians, including Bill James, Rob Neyer and the publication, Baseball Prospectus.[22][23][24][25][26] The book The Fielding Bible by John Dewan contains an essay by James in which he concludes that Jeter “was probably the most ineffective defensive player in the major leagues, at any position.”[22] A 2008 study by researchers at the University of Pennsylvania found that from 2002-2005 Jeter was the worst defensive shortstop in the Major Leagues.[27] Jeter responded to this criticism by saying “I play in New York, man. Criticism is part of the game, you take criticism as a challenge.”

      I am a Yankee fan to the bone, but Jeter just is not a good shortstop.

  • josh

    i will be the first to tell you i am ignorant when it comes to babip, but doesnt it matter how hard (or how well) the batter hits the ball? i mean if you make a lot of contact but hit the ball weakly isnt there less of a chance you will get on base than if you make solid contact? maybe someone can explain this to me

    • Chris

      Basically, the argument is that BABIP is controlled primarily by the hitter and defense.

      So a HOF pitcher should only have a slightly better BABIP than a replacement level pitcher. A hitter on the other hand can maintain a significantly higher BABIP over his career because he has much more control over whether a ball hit in play becomes a hit.

  • RollingWave

    We can look at some pitcher with long careers and their career BABIP

    Clemens: .286
    Maddux: .286
    RJ: .295
    Pedro: .282

    here you have 4 guys who are probably among the top ten pitchers ever in terms of domination.

    here you have some not so good pitchers that happened to last quite a while too.

    Moyer: .287
    Livan Hernandez: .310
    Tim Redding: .306
    Jon Garland: .288

    so there’s a pretty mild difference. Garland and Moyer both have similar BABIPs to guys like Pedro / Clemens / Maddux, and I highly doubt anyone would think they are similar in ability. so while successful pitchers tend to be able to keep their BABIP below league average, the difference is very mild.

    whereas in the hitter’s case, the difference can be profound. guys that hit the ball really hard, and guys that run really fast, usually end up with pretty high BABIPs, whereas obviously, the opposite end produces some rather meh results.

    Guys who hit the ball really hard.

    Pujols: .319
    A-rod: .322
    Vlad: .319

    Guys who are REALLY fast (and preferably lefty)

    Ichiro!: .354 !
    Jose Reyes: .310
    Carl Crawford: .328
    Juan Pierre: .317

    whereas guys who are slow farts and/or weak hitting. (there aren’t too many guys who are a combination of both though, at least those that can carve out any semblance of a career.)

    • Short Porch

      You forgot the category ‘easily defensed.’ If you are spraying the ball all over the place, that’s one thing.

      If you are as predictable on a good percentage of your batted balls, if you are a dead pull hitter for instance, or are easily jammed and just pound the ball to 3B/Shortstop regularly, that will skew the stats.

      When Bernie got jammed, he’d hit a lazy two hopper to second, seemingly every time. I am sure you can come up with other examples of how a certain player often makes his outs, and how that is a function of how he can be pitched to and defensed.

      • D.B.H.O.F. p.k.a Don Corleone

        “I am sure you can come up with other examples of how a certain player often makes his outs, and how that is a function of how he can be pitched to and defensed.”

        Very true

        • 27 this year

          Giambi is a good case. His BA and probably his BABIP were so low because he always hit the ball toward the part of the field with an extra defender.

  • D.B.H.O.F. p.k.a Don Corleone

    While I am not the biggest stat guy out there, statistics that break down line drives, pop ups, ground balls are very important in understanding the year a guy had if you did not see it first hand day in and day out. As I did not see much of Swisher last year it was interesting to read this.

    • Rick in Boston

      I think it’s also good for helping explain why players you see every day are struggling. Like the example above of Cano – his line drives were consistent with his past, but his fly ball rate increased.

  • the most felonious vocalist in the wide world of showbusiness

    I think a lot of people missed one of the more important conclusions of the article: Line Drive % is a terrible way of predicting future BABIP. Even folks who read BP and idolize Bill James often think that LD% is a good way to predict BABIP. It’s not. The previous season’s BABIP is a much better indicator of future BABIP than LD%. So, to say that Nick Swisher’s high LD% was evidence of his unlucky BABIP is false. The article shows us that his 2009 BABIP can be better predicted with his 2008 BABIP than with his 2008 LD%. Of course, there are far more effective ways to predict his BABIP, but those are kind of complicated and the formulas aren’t readily available. But using LD% to predict future BABIP is kind of like using pitcher Wins to predict future ERA.