Getting to know BABIP
Sometimes in baseball things happen that we just can’t explain, and when they do we call it luck. Good luck, bad luck, whatever. One of the biggest statistical luck fiends is BABIP, or Batting Average on Balls in Play. Nick Swisher posted a career-low batting average last year (.219) despite a career-high line drive percentage (20.9%). How? Bad luck, evidenced by his absurdly low .251 BABIP, fourth lowest in baseball. Daisuke Matsuzaka posted the third-best ERA (2.90) despite the worst walk rate in the league (5.05 BB/9, worst by 0.55). How? Ridiculously good luck, like fourth-lowest-BABIP-in-the-league (.267) good luck.
Derek Carty over at THT took a look into all the different ways to calculate BABIP yesterday, while Rich Lederer at Baseball Analysts dug deeper into how groundball rate will affect a hitter’s BABIP today. Both are interesting reads and worth your time. Check ‘em out.




So what is considered a lucky BABIP and an unlucky BABIP?
League average is around .305-.310. If you start dipping below .290ish or going above .330ish, luck comes into play.
So you could say Cano’s BABIP of .286 was slightly unlucky?
Considering his career is .323, very unlucky
Actually, it was just a statistical correction. He should be at the norm by (at most) 2011 (so that he corrects for the two above-average years).
Albert Pujols has been lucky on more than one occasion.
Not exactly. The typical BABIP is different for each hitter.
For pitchers, BABIP tends to regress to the league average (or more specifically, the team average, which depends on team defense). For batters, BABIP regresses to their individual expected value. The easiest (and crudest) way to predict this is career BABIP. The first article linked goes into more detail on other formulas that do a better job of predicting future BABIP.
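A rough sketch of what "regressing" toward a baseline can look like, for anyone who wants to play with it. This is a generic weighted-average shrinkage, not the formula from either linked article. The `stabilizer` weight and the balls-in-play counts are hypothetical numbers picked for illustration; only the BABIP figures come from this thread.

```python
def regress_babip(observed, balls_in_play, baseline, stabilizer=1000):
    """Shrink an observed BABIP toward a baseline.

    `stabilizer` is a HYPOTHETICAL weight (in balls in play) given to the
    baseline. It is an illustrative assumption, not an estimated constant.
    """
    if balls_in_play < 0 or stabilizer <= 0:
        raise ValueError("balls_in_play must be >= 0 and stabilizer > 0")
    total_weight = balls_in_play + stabilizer
    return (observed * balls_in_play + baseline * stabilizer) / total_weight


# Pitcher: baseline is roughly league average (.305 per this thread).
# The 450 balls in play is a made-up sample size.
pitcher_estimate = regress_babip(0.267, 450, baseline=0.305)

# Hitter: baseline is his own career BABIP (Cano: .286 last year, .323 career).
# The 500 balls in play is a made-up sample size.
hitter_estimate = regress_babip(0.286, 500, baseline=0.323)

print(f"pitcher: {pitcher_estimate:.3f}")  # lands between .267 and .305
print(f"hitter:  {hitter_estimate:.3f}")   # lands between .286 and .323
```

The only real difference between the two cases is the choice of baseline: league (or team) average for the pitcher, the player's own history for the hitter.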
I have no time for your silly stats, I’m busy hating Joe Torre.
I have no time to hate Joe Torre, I’m busy making sweet love to these spreadsheets.
http://www.marieclaire.co.uk/i.....559923.jpg
Heh.
That was the only decent thing that came up when I searched Google Images for “Spreadsheet love”. I think we just found the exception to Rule 34.
I so need to get this tattoo…
http://j-walkblog.com/images/iloveexcel2007.jpg
How can I judge BABIP with my eyes?
Back-to-back awesome comments by the Steves.
Carty’s article was not about “different ways to calculate BABIP”, it was about the best way to estimate BABIP given previous data, or in other words the best way to project a player’s BABIP next year.
The interesting point is that people thought the expected BABIP was determined by Line Drive %, but in reality the correlation is very low. Carty tests a specifically designed BABIP estimator that actually works very well.
Yeah my bad, I worded that wrong.
Here’s hoping for a little regression for both Swish and Dice-K.
Look at Gardner’s minor league BABIP. He had a very lucky minor league career.
http://www.fangraphs.com/stats.....F#advanced
He’s scrappy, though. Like putting Pedroia in CF.
Hmm, I wonder how much of that can be attributed to his speed when he hits ground-balls to the left side.
I’m sure some of it could be but that doesn’t account for all of it.
The lower quality of minor league defense may account for the rest.
Gardner’s minor league BABIP numbers are similar to Ichiro’s major league numbers, so it is possible to sustain numbers like that. Too bad Ichiro never played in the minors, that would be an interesting comparison.
“Too bad Ichiro never played in the minors, that would be an interesting comparison.”
NPB = AAA
Somebody pull up his Japanese baseball stats, and there you go!
Well, actually it means that relative to career norms, hitters such as Jeter or Pujols or Gwynn generally have higher BABIPs.
Essentially if you’re going to hit for a high avg, you either need a shitload of HR or a higher than avg BABIP
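To put numbers on that, here is a small sketch that rebuilds batting average from home runs and BABIP by rearranging the usual BABIP formula, (H - HR) / (AB - K - HR + SF). The stat lines are hypothetical, made up only to show the two routes to a high average.

```python
def batting_average(at_bats, strikeouts, home_runs, sac_flies, babip):
    """Batting average implied by a BABIP and a set of counting stats.

    Rearranges BABIP = (H - HR) / (AB - K - HR + SF) to solve for hits.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    hits = home_runs + babip * balls_in_play
    return hits / at_bats


# All three lines are HYPOTHETICAL: 600 AB, 100 K, 5 SF.
baseline = batting_average(600, 100, home_runs=10, sac_flies=5, babip=0.300)
high_babip = batting_average(600, 100, home_runs=10, sac_flies=5, babip=0.340)
high_power = batting_average(600, 100, home_runs=40, sac_flies=5, babip=0.300)

print(f"10 HR, .300 BABIP: {baseline:.3f}")
print(f"10 HR, .340 BABIP: {high_babip:.3f}")
print(f"40 HR, .300 BABIP: {high_power:.3f}")
```

With these made-up inputs, adding 40 points of BABIP and adding 30 home runs move the average by about the same amount.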
Matsuzaka lucky? I’m sorry. That’s where the K pitch comes in real handy. He’s just real good. Let’s not kid ourselves there.
If you have a “K pitch” why would you wait until you’ve loaded the bases to use it?
Yeah. I made a blanket statement. hehehe… i usually don’t.
seriously tho, of course there’s some luck involved, but i’m not going to take anything away from Dice-K’s very good season. inferring bad luck or good luck from walk rate just seems like a fool’s errand.
Reggie, walk rate and strikeout rate don’t factor into BABIP. It’s simply what happens with balls hit into play. If DiceK strikes one guy out on three pitches, walks the next guy on four pitches, and then gives up a hit on the first pitch to the third batter he sees, only that last pitch to that last batter is calculated into BABIP.
And the stats show that, despite both striking out a good percentage of the batters he faced and walking a good percentage of the batters he faced, the batters that either made a hit or an out by hitting the ball into the field of play made an abnormally low number of hits and an abnormally high number of outs.
Even if he continues to strike out and walk guys at the same rate, he was very lucky last season and stands to not remain that lucky over a larger sample size.
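The three-batter example above can be sketched in a few lines. This is a simplified illustration: the outcome labels are hypothetical names invented for the example, and details like sacrifice bunts are ignored.

```python
# Outcomes that never count toward BABIP (the ball is not "in play").
NOT_IN_PLAY = {"strikeout", "walk", "hit_by_pitch", "home_run"}
# Balls in play that go for hits; anything else in play counts as an out.
HITS_IN_PLAY = {"single", "double", "triple"}


def babip_from_outcomes(outcomes):
    """BABIP for a list of plate-appearance outcomes, or None if no balls in play."""
    in_play = [o for o in outcomes if o not in NOT_IN_PLAY]
    if not in_play:
        return None
    hits = sum(1 for o in in_play if o in HITS_IN_PLAY)
    return hits / len(in_play)


# Strikeout, walk, then a hit: only the third batter counts.
three_batters = ["strikeout", "walk", "single"]
print(babip_from_outcomes(three_batters))  # 1.0

# Piling on more strikeouts and walks changes nothing.
wild_inning = ["strikeout", "walk", "walk", "strikeout", "single"]
print(babip_from_outcomes(wild_inning))  # 1.0
```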
Dude, you said balls.
http://stuffwhitetrashpeopleli...../28-balls/
Anyway, a reasonable person might think, “Wow, putting a cast aluminum pair of freakishly oversized replica testicles on a vehicle is kinda gay.” But not White Trash People, again, what they are really saying is “I love balls,” which is totally not the least bit gay at all.
Heh.
BABIP has absolutely nothing to do with strikeouts
Provide analysis to back up your statement.
Ummm. It’s a fact. Here’s the equation:
(Hits-home runs)/Balls in play
I don’t see strikeouts anywhere.
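For reference, "balls in play" in that equation is usually expanded as AB - K - HR + SF, which is where strikeouts get subtracted out. A minimal sketch from season totals follows; the stat line is hypothetical.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """(H - HR) / (AB - K - HR + SF), the commonly published BABIP formula."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# HYPOTHETICAL season line: 150 H, 20 HR, 550 AB, 100 K, 5 SF.
example = babip(hits=150, home_runs=20, at_bats=550, strikeouts=100, sac_flies=5)
print(f"{example:.3f}")  # 130 / 435
```

Strikeouts show up only as something removed from the denominator, and walks never enter at all, since a walk is not an at-bat.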
I think (hope) he was joking.
it was a
*****Sarcasm Alert*****
I didn’t think it was sarcasm based on the response he made to Reggie C.
If it wasn’t serious, then my bad.
It wasn’t a sarcasm alert, it was a running gag/inside joke/meme alert.
I thought I knew all of those inside joke things we had here. I guess I’ve been slacking recently.
http://riveraveblues.com/2009/.....nces-6550/
Review, and enjoy.
No problem. My comment to Reggie was serious (lighthearted, but serious). I didn’t add a wink or anything at the end because you’ve been around so long I thought you were in on the joke.
Then what about his 2007 season? He had more K/inning, a higher K/BB and an identical WHIP, yet his ERA was 1.50 runs higher. His LD/GB/FB %’s were identical.
His BABIP was about a league average .306 in 2007. There’s your difference. Luck was clearly a HUGE factor in his ERA dropping 1.50 runs.
Before this goes too far, the comment above about .305 being normal applies to pitchers. Hitters have their own “true talent” BABIP that can be anywhere from like .260 to .360 or so. For instance, Jeter and Ichiro have consistently high BABIPs, so you would regress more towards each’s career average than league average.
Seeing who was lucky and unlucky with BABIP for hitters is a MUCH bigger process than just seeing how far he was from .300
Exactly. For hitters, you have to look at their career.
Couldn’t you also say, though, that some pitchers have a “true talent” BABIP? That some pitchers, like sinkerballers, establish their own lower BABIP standard baselines where the balls put in play against them have a lower natural success rate than the average pitchers?
It can, Johan Santana’s career BABIP is .286 & Wakefield .280
So you can see that some will have less, though not as far off the .300 as some hitters
There are people who do believe that. Some also say that knuckleballers have a lower true-talent BABIP. It may be true. But the fact is, the spread in talent is so small that it’s almost always canceled out by random fluctuation. For instance, Pettitte has one of the highest BABIP figures of the past 10 seasons or so and it’s only .316
I thought a low BABIP meant the player was unlucky. Is it the opposite for pitchers or is it based on the hitters they face?
Yeah, unlucky hitter means lucky pitcher and vice versa.
Opposite… so a pitcher with a low BABIP is lucky, where high is unlucky
Thanks, to both of you. That means Pettitte, Burnett, and Mussina were the 3rd, 4th, and 7th unluckiest pitchers in baseball last year respectively.
Is it a coincidence that Pettitte and Moose were both in the Top 7 of “unluckiest” pitchers considering the Defense playing behind them????
I remember reading an article about a pitcher who liked stats. People thought it was funny that he liked the stats because his BABIP was very low. I think he played for KC?
Anyway, he thought that if he was constantly getting ahead in the count, hitters would be taking defensive swings and making poorer contact, affecting the stat. It’s just a theory.
It was Bannister.
BABIP is a factor of luck but at the same time, it’s not the be all, end all. If you hit the ball hard, chances are the ball is going to fall somewhere–like with Albert Pujols.
But in a situation like Cano and Swisher last year–high LD rates, low BA–that’s when a low BABIP can come into play.
Yeah, BABIP can both be used to see which players or pitchers have one year statistical anomalies (like Swisher, Pettitte, and Dice-K) that will likely course correct, or can be used to reinforce which players are good enough/talented enough to consistently be better than others (like Pujols).
It’s just a matter of drawing smart conclusions from what the stats tell you.
But how much can it tell us about Cano? With our eyes, we saw that Cano was swinging away all the time, not taking pitches, not working the count, and as a result was often getting into terrible counts.
In those terrible counts he was taking bad defensive swings, making bad contact and of course not getting hits.
I’m not anti-stat, I just question the value of this one.
Cano’s case was very strange this year. He had a low BABIP in the first half which really killed his BA then, but it more or less worked itself out because he had a solid second half. He hit .307/.333/.482 after the ASB, and those are essentially his career numbers (.303/.336/.468). I’d, of course, love for his OBP to creep up but I’ll take it if it comes with a high SLG.
What was more interesting about Cano was how low his BA was when he had a high line drive percentage, which one assumes would turn into hits. But, he also had his lowest GB/FB ratio, made up of a career low GB% (grounders turn into hits easier than fly balls) and a career high FB% (his ratio here has been climbing each year, something to keep an eye on). So it seems that Cano was hitting the ball hard at a decent rate but it was going right to people, and the hits just weren’t falling. But on the other hand, he was also getting under the ball a lot more, causing him to pop up more.
BABIP is a great stat: it’s fun to pronounce…right up there with “RBI=ribbie”
/crawls back under rock
I enjoyed that greg.
Also, the main thing that gets some people is that they equate BABIP with pitcher skill. Pedro Martinez had two of the best seasons of all time in 1999 and 2000. You know what his BABIPs were for those two seasons?
.343 and .253. One extraordinarily high, the other extremely low. I don’t think his true talent level changed that much in one season. It’s random fluctuation (luck).
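A back-of-the-envelope way to see how much a single season can bounce around: treat every ball in play as an independent coin flip with a fixed chance of becoming a hit. That is a simplifying assumption, and the .300 true rate and 500 balls in play below are hypothetical round numbers.

```python
import math


def babip_standard_error(true_babip, balls_in_play):
    """Standard error of observed BABIP under a simple binomial model."""
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)


# HYPOTHETICAL inputs: a .300 true-talent BABIP over 500 balls in play.
se = babip_standard_error(0.300, 500)
low, high = 0.300 - 2 * se, 0.300 + 2 * se

print(f"standard error: {se:.4f}")
print(f"two standard errors: {low:.3f} to {high:.3f}")
```

Under those assumptions the two-standard-error band is roughly 40 points wide in each direction, so large year-to-year swings do not need a change in talent to explain them.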
Ty Cobb: career .372 BABIP
Rod Carew: .361
Tony Gwynn .345
Ruth: .340
Larry Walker .336
Great hitters will have much higher BABIP than the pitcher averages.
Ty Cobb: career 1.000 guy stabbed
Hey Mike or anybody, is the Hardball Times Season Preview book worth buying???? I always buy the Baseball Prospectus, but was reading a bit about the Hardball Times season preview and it seems pretty good too. Only a bit less than 300 pages though (well “only” compared to the Prospectus) and costs 20 bucks so maybe too much to pay for (gotta get the Prospectus and Torre’s book first!!!). Do love the website though (Baseball Analysts great too).
Statistics can only take you so far. They’re an approximation of real phenomena. They should be looked at as a conversation starter, not a conversation ender. When the stat doesn’t match real life, the stat-head will say “luck” and that’s supposed to explain it. What it really means is that the stat isn’t doing a very good job and you need to understand the process that you’re trying to measure better and go back to the drawing board if you want to approximate life better. “Luck” is a cop out.
i’d say dismissing stats is also a cop out.
it means you don’t have to learn anything.
I agree that particular metrics should be used as premises for a conclusion rather than conclusions in themselves. Also, there are times when there’s more to the discrepancy between expected vs. actual performance than just random variation. That said, it does not eliminate the value of quantifying certain phenomena. These metrics are helpful in discerning things that sense experience may have trouble understanding and thus explaining to others. When done with care, they do a decent enough job to explain much of what we can see with similar extensive observational analysis. Even with the most accurate and precise measurement that one can come up with, there is always a chance that a highly unlikely outcome will come about if given a small enough sample. To say that elimination of luck is needed for a measurement to be worth anything, particularly for something like baseball, is asking a lot.
I went to look up jeters stats, specifically his BABIP… got sidetracked by his wikipedia page… where the first sentence says hes been an all star pitcher 18 times and is captain of the los angeles lakers
weird… i never knew about him… hes a bigger badass than i thought
He also apparently joined the Lakers in 1915.
that part is particularly impressive… then was drafted by the seattle seahawks in the 92 draft…
3 sports in one lifetime?? GREATEST. ATHLETE. EVER.
Well all that must have gone down the memory hole since last night.
What remains on Wikipedia is a searing indictment of his defense. They should really hand him a centerfielder’s glove, move A-Rod to short.
Jeter’s defense has been the subject of criticism from a number of sabermetricians, including Bill James, Rob Neyer and the publication, Baseball Prospectus.[22][23][24][25][26] The book The Fielding Bible by John Dewan contains an essay by James in which he concludes that Jeter “was probably the most ineffective defensive player in the major leagues, at any position.”[22] A 2008 study by researchers at the University of Pennsylvania found that from 2002-2005 Jeter was the worst defensive shortstop in the Major Leagues.[27] Jeter responded to this criticism by saying “I play in New York, man. Criticism is part of the game, you take criticism as a challenge.”
I am a Yankee fan to the bone, but Jeter just is not a good shortstop.
I will be the first to tell you I am ignorant when it comes to BABIP, but doesn’t it matter how hard (or how well) the batter hits the ball? I mean, if you make a lot of contact but hit the ball weakly, isn’t there less of a chance you will get on base than if you make solid contact? Maybe someone can explain this to me.
Basically, the argument is that BABIP is controlled primarily by the hitter and defense.
So a HOF pitcher should only have a slightly better BABIP than a replacement level pitcher. A hitter on the other hand can maintain a significantly higher BABIP over his career because he has much more control over whether a ball hit in play becomes a hit.
We can look at some pitchers with long careers and their career BABIP:
Clemens: .286
Maddux: .286
RJ: .295
Pedro: .282
Here you have 4 guys who are probably among the top ten pitchers ever in terms of domination.
Here you have some not-so-good pitchers who happened to last quite a while too.
Moyer: .287
Livan Hernandez: .310
Tim Redding: .306
Jon Garland: .288
So there’s a pretty mild difference. Garland and Moyer both have similar BABIPs to guys like Pedro / Clemens / Maddux, and I highly doubt anyone would think they are similar in ability. So while successful pitchers tend to be able to keep their BABIP below league average, the difference is very mild.
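Averaging the two lists makes the gap concrete. This is a crude unweighted mean of the career figures quoted above (not weighted by innings or balls in play), so take it as a rough illustration only.

```python
from statistics import mean

# Career BABIP figures as quoted in this comment.
elite = {"Clemens": 0.286, "Maddux": 0.286, "RJ": 0.295, "Pedro": 0.282}
others = {
    "Moyer": 0.287,
    "Livan Hernandez": 0.310,
    "Tim Redding": 0.306,
    "Jon Garland": 0.288,
}

elite_avg = mean(elite.values())
others_avg = mean(others.values())
gap = others_avg - elite_avg

print(f"elite:  {elite_avg:.4f}")
print(f"others: {others_avg:.4f}")
print(f"gap:    {gap:.4f}")  # about 10 points of BABIP
```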
Whereas in the hitter’s case, the difference can be profound. Guys that hit the ball really hard, and guys that run really fast, usually end up with pretty high BABIPs, whereas, obviously, the opposite end produces some rather meh results.
Guys who hit the ball really hard.
Pujols: .319
A-rod: .322
Vlad: .319
Guys who are REALLY fast (and preferably lefty)
Ichiro!: .354 !
Jose Reyes: .310
Carl Crawford: .328
Juan Pierre: .317
Whereas guys who are slow farts and/or weak hitting. (There aren’t too many guys who are a combination of both, though, at least among those that can carve out any semblance of a career.)
You forgot the category ‘easily defensed.’ If you are spraying the ball all over the place, that’s one thing.
If you are predictable on a good percentage of your batted balls, if you are a dead pull hitter for instance, or are easily jammed and just pound the ball to 3B/shortstop regularly, that will skew the stats.
When Bernie got jammed, he’d hit a lazy two hopper to second, seemingly every time. I am sure you can come up with other examples of how a certain player often makes his outs, and how that is a function of how he can be pitched to and defensed.
“I am sure you can come up with other examples of how a certain player often makes his outs, and how that is a function of how he can be pitched to and defensed.”
Very true
Giambi is a good case. His BA and probably his BABIP was so low because he always hit the ball toward the part of the field with an extra defender.
While I am not the biggest stat guy out there, statistics that break down line drives, pop ups, ground balls are very important in understanding the year a guy had if you did not see it first hand day in and day out. As I did not see much of Swisher last year it was interesting to read this.
I think it’s also good for helping explain why players you see every day are struggling. Like the example above of Cano – his line drives were consistent with his past, but his fly ball rate increased.
I think a lot of people missed one of the more important conclusions of the article, Line Drive % is a terrible way of predicting future BABIP. Even folks who read BP and idolize Bill James often think that LD% is a good way to predict BABIP. It’s not. The previous season’s BABIP is a much better indicator of future BABIP than LD%. So, to say that Nick Swisher’s high LD% was evidence of his unlucky BABIP is false. The article shows us that his 2009 BABIP can be better predicted with his 2008 BABIP than with his 2008 LD%. Of course, there are far more effective ways to predict his BABIP, but those are kind of complicated and the formulas aren’t readily available. But using LD% to predict future BABIP is kind of like using pitcher Wins to predict future ERA.