How well does A-Rod hit outside pitches?

Alex Rodriguez changed the narrative of his career last fall. After a few years of playoff futility, he came back with two especially huge hits: a game-tying home run in the bottom of the ninth in Game 2 of the ALDS, and a game-tying home run in the bottom of the 11th in Game 2 of the ALCS. The first was an outside pitch about waist high, and the second was an outside pitch up in the zone.

We know A-Rod has tremendous power to the opposite field, and it’s one of the reasons he’s succeeded at Yankee Stadium, which isn’t quite as friendly to righties as lefties. But did you know that on 132 swings on pitches in the middle-outside and middle-up zones, A-Rod created negative runs in 2009? Jeremy Greenhouse of The Baseball Analysts examines how each hitter in the league fares against pitches in certain parts of the zone, and that’s what his data suggests.

There are other factors to account for, of course. This analysis does not break down the zone by pitch, so for all we know A-Rod could have fallen victim to off-speed and breaking stuff while demolishing outside fastballs. It also doesn’t give us an idea of sequence, which would be helpful, because a hitter is probably more likely to swing and miss, or make poor contact, on a pitch low and away if he’s been set up inside. Thankfully, pitch sequencing appears to be the next frontier. The strategy nerd in me is excited.

You can check out the article, linked above, or you can just head to the Google Doc spreadsheet. Each location resides on a different sheet, and is in order of runs created. Just one quick observation: Generally, the hitters who fared well on down and away pitches took fewer swings at them. The ones who fared poorly took a lot of swings — in Ryan Howard’s case, over three times the number of the leader in that zone, Carlos Gonzalez. Yet Ichiro took 121 swings in that zone, far more than anyone around him. The dude is just that good.

Open Thread: Rethinking the box score

We all love box scores, or at least I hope everyone else does. It’s going to be strange soon, having a generation of baseball fans who didn’t grow up waiting for their dads to finish reading the sports section so they could read all the box scores from the previous day. But while the actual experience will change, the box scores will not. We’ll still see them, just in pixels instead of on paper with dried ink.

One great thing about the internet, though, is that it doesn’t have the space limitations of a newspaper. Box scores caught on because they fit a lot of data into a relatively small space, allowing papers to print them all on one page. Today we don’t have to limit ourselves to that. We can still present the box score in its original format, because people find it familiar, but we can also experiment with new ways that might take up a bit more space. FanGraphs’s WPA chart is just one example. We can post this along with the original box score, and it really doesn’t matter because we’re not limited to a certain space.

At Baseball Analysts, Dave Allen imagines a box score of his own. It’s in a linear format, like the WPA charts, but it does it in a different way. His post includes an example, so make sure you read it for the full gist of what he’s doing.

In 2010 we plan to add more game data to our recaps. The regular box score will likely be part of that, but we also want to think about other ways we can present information from the game. If you’re so inclined, leave your ideas in the comments below, and we’ll sift through and see if there are any we can use. Nothing’s off-limits now that we’re not constricted by space, so think as out of the box as possible.

If you’d rather just BS, well, that’s why this is an open thread. Toronto visits the Devils, the Nets are in Boston, and Milwaukee plays the Knicks at the Garden.

A-Rod’s 500th home run ball sells for $103,579

One day, maybe I’ll have this kind of money to spend on a baseball. Alex Rodriguez’s 500th home run ball sold in an anonymous online auction last night for $103,579. I’m wondering the same thing as Big League Stew’s Kevin Kaduk: did A-Rod himself buy the ball? He earned $28.5 million in 2007, the year he hit it: $27 million in salary and a $1.5 million bonus for winning his third MVP award under the old contract. The ball would have cost him about four-tenths of a percent of his salary that season. At that point, why not?

Lackey vs. Vazquez

Both the Yankees and Red Sox imported a big-time starting pitcher this offseason, though they went about it in very different ways. The small market Sox managed to convince Lackey to take a massively below market deal in order to fit him into their tiny payroll (/hyperbole), while the Yanks traded an excellent prospect and two spare parts to bring Javy back. Jay at Fack Youk compares the two pitchers with regard to their polar opposite reputations of clutchiness, and shows that the two aren’t as different as you may think. Make sure you check it out. It’s some great stuff.

2009 Yankees didn’t fare well in one-run games, blowouts

The 2009 Yankees won more games than almost everyone expected. That’s almost always the case when a team wins over 100 games, but it holds particularly true in the three-team dogfight that is the AL East. They got to the 103 win mark because they were a good team that got hot at the right time, but also because they got lucky in some ways. Their 915 runs scored and 753 runs allowed work out to a 95-67 Pythagorean record, which they outperformed by eight wins. Does this mean the Yankees got lucky?

That’s a tough question to answer. Their Pythag record suggests that they did, but that takes gross runs scored and gross runs allowed and factors them into an equation. It doesn’t take into account how the team scored those runs. Over the course of a 162-game season, the reasoning goes, the issues of how a team scores runs should even out. Sometimes they don’t. Perhaps we can learn a bit more about the nature of the 2009 season by looking at a breakdown of the results.
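For anyone curious about the equation itself, here is a minimal sketch of the Pythagorean calculation. The exponent of 1.83 is an assumption on my part, since the number above doesn't come with a stated exponent, though 1.83 does reproduce the 95-67 figure. The function name is made up for illustration.

```python
def pythagorean_record(runs_scored, runs_allowed, games=162, exponent=1.83):
    """Return (expected_wins, expected_losses) from the Pythagorean formula.

    The 1.83 exponent is an assumption; the classic version uses 2.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    win_pct = rs / (rs + ra)
    wins = round(win_pct * games)
    return wins, games - wins


# 2009 Yankees: 915 runs scored, 753 runs allowed, 103 actual wins
expected_wins, expected_losses = pythagorean_record(915, 753)
print(f"Pythagorean record: {expected_wins}-{expected_losses}")
print(f"Wins above expectation: {103 - expected_wins}")
```

Note that the inputs are season totals only, which is exactly the limitation described above: the formula has no idea how those runs were distributed from game to game.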

Brandon Heipp of Walk Like A Sabermetrician looks at the 2009 run distribution of all 30 MLB teams. He breaks games down into one-run games, blowouts (a difference of five or more runs), and the games in between. Here’s the odd thing: The Yankees underperformed their overall record in both one-run games and blowouts. It’s the latter that seems odd, since a team with an offense like the Yankees figures, intuitively, to have a favorable record in blowouts.

The Yankees played in 38 one-run games and went 22-16, a .579 win percentage. Overall the Yankees had a .636 win percentage, and had a .653 win percentage in non-one-run games. They played in 51 blowouts and went 32-19, a .627 win percentage. This is more in line with their season total, but still a bit below. Still, no playoff team had as big a difference in blowouts as the Yankees. In those middle games, obviously, the Yankees killed, going 49-24 in 73 games, a .671 win percentage.
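If you want to check the splits yourself, the arithmetic is simple enough to sketch in a few lines. The records are the ones quoted above; the variable and function names are just mine.

```python
# Win-loss splits quoted above for the 2009 Yankees
splits = {
    "one-run": (22, 16),
    "blowout": (32, 19),
    "middle": (49, 24),
}


def win_pct(wins, losses):
    """Winning percentage as a fraction of games played."""
    return wins / (wins + losses)


# The three categories cover the whole season, so they sum to the full record
total_wins = sum(w for w, _ in splits.values())
total_losses = sum(l for _, l in splits.values())
overall = win_pct(total_wins, total_losses)

for name, (w, l) in splits.items():
    pct = win_pct(w, l)
    print(f"{name:8s} {w}-{l}  {pct:.3f}  ({pct - overall:+.3f} vs. overall)")
print(f"overall  {total_wins}-{total_losses}  {overall:.3f}")
```

The last column shows how far each split sits from the full-season mark, which is the comparison the paragraph above is making.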

I’m not sure we can discern much from this data. It’s just interesting to see that while the Yankees had an over .500 win percentage in all three categories, they still didn’t do exceptionally well in blowouts. I would have thought, since they outperformed their Pythag, that maybe they were inordinately good in one-run games. They weren’t. Though maybe, and this is just a guess, their tempered record in blowouts led to their Pythag underestimating their record.

Heipp also looks at the team win percentage when scoring X number of runs. When a team scores one run, for example, it wins 7.5 percent of those games. The magic number, it appears, is four runs, as that is where teams cross the .500 line. Scoring a fourth run also adds more marginal win percentage than any other run. That is, the jump from scoring three to scoring four runs, in terms of winning percentage, is .186. Going from four to five runs adds just .106 to the win percentage.

Here’s another interesting bit: the Yankees played in 28 games where they scored four runs, and won just 14 of them. The average MLB team played in 21 games where they scored four runs, but we should have figured that the Yankees were above average there. I would say that reflects poorly on their pitching and defense, but then I saw that they outperformed the average when scoring two and three runs. When scoring two runs, the average MLB team won 20.8 percent of the time, while the Yankees won 28.6 percent. They won 41.2 percent of games when they scored three runs, against an MLB average of 33.7 percent. In both instances they played in fewer games than the MLB average.

Again, I’m not exactly sure what we can take this data to mean. I’m not sure that, by itself, we can take it to mean anything definitive. I do think it’s interesting to note these trends. In some ways it bucks intuition. In other ways it gives us another way to view the 2009 Yankees as a team. They did well here, but not well there. Since this post contains a lot of random data, we’ll close with another random bit. The Yankees allowed 11 runs just once this season, and won the game. When scoring 11 runs, MLB teams went 110-6. Glad that the Yankees accounted for one of those six losses.

The stats we use: FIP

In previous editions of this series we’ve discussed UZR, a defensive statistic, and wOBA, an offensive one. Today we’ll move on to a pitching one. It won’t be the only pitching one we’ll discuss, just as wOBA won’t be the only offensive one. To the best of my abilities, here’s an explanation of Fielding Independent Pitching, or FIP.

Understanding DIPS

The roots of FIP extend back to 2001. In Baseball Prospectus’s annual book, Voros McCracken presented the case that pitchers have little to no control over what happens to balls put in play. The article itself is pretty easy to understand, so if you have a spare five minutes I suggest giving it a read. If not, I’ll provide the most important of McCracken’s findings.

He looked at how hits per balls in play fluctuated from year to year, and found that “pitchers who are the best at preventing hits on balls in play one year are often the worst at it in the next.” He then cites Greg Maddux, who had a poor rate of hits on balls in play in 1999, but was among the best in 1998. Pedro Martinez saw a similar trend, performing horribly in 1999 and excellently in 2000 on balls in play.

You can see for yourself. Here’s Pedro’s BABIP in 2000, .253, tops in the majors, and here’s his BABIP in 1999, third worst among qualifying starters. You can see Greg Maddux on that list as well, seventh worst among qualifying pitchers, while he finished sixth best in 1998. So if pitchers as accomplished as Maddux and Martinez can go from among the best to among the worst in the span of one season, it should say something about the nature of a pitcher’s ability to control the outcome of balls put in play.
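In case BABIP is new to you, here is a rough sketch of how it is usually calculated: hits minus home runs, divided by at-bats minus strikeouts minus home runs plus sacrifice flies. The function name and the stat line below are hypothetical and only there to show the arithmetic. They are not anyone's real numbers.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical line allowed by a pitcher, NOT a real season:
# 170 hits (20 of them homers) in 720 at-bats with 200 strikeouts
print(f"{babip(hits=170, home_runs=20, at_bats=720, strikeouts=200):.3f}")
```

Home runs and strikeouts come out of both sides because neither one ends up as a ball the fielders can make a play on.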

So what does a pitcher have control over? Tom Tango lists it in a spectrum, from 100 percent pitching to 100 percent fielding. On the 100 percent, or near-100 percent, pitching side: balks, pick-offs, HBP, K, BB, HR. Then there’s a gray area, where it’s partly the pitcher, partly the fielding, though tough to determine which. These outcomes include wild pitches, stolen bases, caught stealings, singles, doubles, triples, batting outs, and passed balls. On the 100 percent fielding side are running outs. The focus of FIP, then, is on the 100 percent pitching part of the spectrum.

Weighing homers, walks, and strikeouts

In our wOBA and UZR primers, we talked about linear run estimators. As a one-sentence recap, linear run estimators put a value on outcomes based on how they contribute to actual run scoring, based on years of historical data. In order to weigh home runs and walks as negative outcomes, and strikeouts as positives, we need to use the linear run estimators to create a ratio, so that we properly weigh the value of each. For those who don’t want to see formulas, skip to the next section. For those who want to see the actual numbers, here goes.

FIP = (13*HR + 3*BB - 2*K) / IP

Why the 13:3:2 ratio? We need look no further than the linear run estimator. That’s the ratio of value between homers, walks, and strikeouts.

Scaling it to ERA

One attractive quality of many new statistics is that they scale to existing stats. That makes it easier for us to transition. Looking at raw wOBA, for instance, you might not be able to immediately recognize how well a player performed. But, because it’s scaled to OBP, we can look at the number with a sense of familiarity. It runs along the same scale, so if we know that a hitter with a .335 OBP is near league average, we can assume the same of a player with a .335 wOBA. Except, of course, that wOBA tells us more than OBP by itself.

To align to ERA, we simply add 3.2 to the FIP. That number can apparently fluctuate sometimes — I’ve seen Tango mention adding 3.1 as recently as 2008. But more recently he’s gone with the 3.2 number.
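As a rough sketch of the arithmetic in this section, here is the calculation in Python. It assumes the usual form of the formula, (13*HR + 3*BB - 2*K) / IP, plus the 3.2 constant mentioned above. The function name and the stat line are made up for illustration.

```python
def fip(home_runs, walks, strikeouts, innings, constant=3.2):
    """FIP on the ERA scale: (13*HR + 3*BB - 2*K) / IP + constant.

    `innings` is actual innings pitched as a number (200 1/3 innings is
    200.333..., not the box score shorthand 200.1).
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    raw = (13 * home_runs + 3 * walks - 2 * strikeouts) / innings
    return raw + constant


# Hypothetical season, NOT a real pitcher: 20 HR, 50 BB, 200 K in 200 IP
print(f"{fip(home_runs=20, walks=50, strikeouts=200, innings=200):.2f}")
```

The constant is left as a parameter because, as noted above, it can move around a bit. Passing constant=3.1 gives the older version.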

A note on xFIP

In browsing stats on sites like FanGraphs, you might notice a stat called xFIP. This takes the idea of pitcher control a bit further, positing that in addition to having little control over outcomes on balls in play, pitchers have little control over the rate at which their fly balls go for home runs. So, to normalize for this variance, xFIP looks at the number of outfield flies hit off the pitcher, and takes 11 percent of that, which is the league average percentage of fly balls hit for home runs. The rest of the equation remains the same, with that expected home run total used in place of the actual one.

I like this because pitchers see more consistency in their year-to-year strikeouts and walks than in their home runs. There’s still some year-to-year correlation with home runs, but it’s just not as strong. Is that enough to warrant a further normalization? That’s for you to decide. Chances are, however, that we’ll stick to just FIP here when talking about the things pitchers do.
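The xFIP adjustment described in this section can be sketched the same way. This is only an illustration, assuming the usual 13:3:2 weights over innings pitched, the 11 percent fly ball rate, and the 3.2 constant from earlier in the post. The function name and the numbers are hypothetical.

```python
def xfip(fly_balls, walks, strikeouts, innings, hr_per_fb=0.11, constant=3.2):
    """xFIP sketch: FIP with actual home runs replaced by expected home runs.

    Expected home runs are the league-average HR/FB rate (assumed here to be
    11 percent) times the outfield fly balls the pitcher allowed.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    expected_hr = hr_per_fb * fly_balls
    return (13 * expected_hr + 3 * walks - 2 * strikeouts) / innings + constant


# Hypothetical season, NOT a real pitcher: 200 fly balls, 50 BB, 200 K, 200 IP
print(f"{xfip(fly_balls=200, walks=50, strikeouts=200, innings=200):.2f}")
```

Notice that actual home runs never enter the function. Two pitchers with the same fly balls, walks, strikeouts, and innings get the same xFIP no matter how many of those flies cleared the fence.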

It’s not all about luck

A common misconception is that FIP treats outcomes on balls in play as luck. This is not true. As explained above, outcomes on balls in play represent a gray area, where we don’t know to what degree the pitcher and fielders are responsible. FIP just strips those plays out of the equation. See the section below for further elaboration.

A good way to think about this is how Tango put it. What we want is ERA to equal FIP plus fielding dependent pitching, plus fielding, plus luck. Luck, therefore, is just one component stripped out of FIP. There are two other components stripped out as well, both of which are probably more important than luck.

Remember: it tells us one thing

The most important thing to note about FIP is exactly what it tells us. It does not make claims about luck, per se. What it tells us is how a pitcher fared on events that were close to 100 percent in his control. Since we know that factors like luck and defense play into ERA, it’s valuable to know how a pitcher does in terms of events for which he’s solely responsible.

Later in the series we’ll get to tRA, which considers batted ball type, and SIERA, Baseball Prospectus’s take on the matter, which will be revealed in the upcoming Baseball Prospectus 2010.


The original DIPS article
Defensive Responsibility Spectrum
Tango elaborates on FIP