Monitoring Sabathia’s workload

One of the selling points of CC Sabathia’s Cy Young case is the incredible volume of innings he’s amassed this season. Fans have long grown accustomed to the bulky lefty putting up outsized innings pitched totals, and for this reason it’s easy to gloss over how prolific he has been. This year, he’s thrown 176.2 innings, a number eclipsed by only Justin Verlander of the Detroit Tigers. In the last two years, he’s thrown the third most innings in baseball, behind Roy Halladay and Felix Hernandez. In the last five years no one in baseball has thrown more innings than Sabathia. He’s thrown 1,138.1, leading Halladay by 8 innings. The next closest is Dan Haren with 1,072.2. If you add in the seventy-odd innings he’s thrown in the postseason since 2007, his lead over Halladay only widens further.

This is a cause for pride and for concern. Sabathia has earned his reputation as a durable ace, and there’s no current reason to think he’ll suddenly get injured or break down. Still, one could be forgiven for wondering if he’ll be able to do this in perpetuity. If he won’t, then when exactly will the decline begin? This is a particularly relevant question this season, as CC is currently on pace to threaten his previous highs in innings pitched and total pitches thrown. Below is a chart detailing the past five years of work, and projecting what he might achieve if current trends hold.

As it currently stands, Sabathia is throwing around 108 pitches per start. This is a mark reminiscent of his last contract year with the Milwaukee Brewers. If he keeps up his current pace, Sabathia will pitch close to 250 innings again and throw around 3,700 pitches, 100 pitches or so more than he threw in 2007, 2009 and 2010 and, again, closer to his 2008 campaign. Of course, the postseason counts too. It doesn’t show up in Sabathia’s initial Baseball-Reference page, but the pitches he’s hurled with that left arm in October count just as much (if not a little more, given the stress of the event) as the ones in April.

Obviously, the 2011 totals could vary a great deal depending on how far the Yankees go into the postseason. In the scenario that minimizes the number of postseason pitches thrown for Sabathia (the worst-case scenario for the Yankees, they go home in the ALDS), Sabathia makes one start. In the scenario that maximizes the number of postseason pitches thrown, Sabathia makes 8 starts (2 in the ALDS, 3 in the ALCS and 3 in the WS – heart-attack city). Spitballing it, his 2009 numbers seem like a fair enough estimate for what he might do in this year’s postseason, but even so he averaged 7 innings and close to 110 pitches per outing that year. Scaling it back to 5 starts, 500 pitches and 30 innings is a bit more conservative. This isn’t any sort of serious projection, to be clear; no one knows how far the Yankees will go into October and how many starts Sabathia will make. There’s nothing wrong with spitballing though as long as you admit you’re spitballing it! Here’s the cumulative data on Sabathia, including the regular season and postseason.

To recap, Sabathia is likely looking at around 250 innings and 3,650-3,700 pitches in the regular season. This would be his highest mark since 2008. If he throws 5 postseason starts of six innings and 100 pitches apiece (a conservative estimate that could vary wildly), his innings pitched and pitches thrown totals will creep up to an all-time high, well past the 265.4 IP and 4,134 pitches thrown mark he set back in 2009. It’s not inconceivable that he could crack 280 innings and 4,200 pitches. If he were to make 2 starts in the ALDS, 3 in the ALCS and 3 in the World Series, he’d easily surpass the 300 innings pitched mark.
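The back-of-the-envelope math here can be sketched in a few lines of Python. The regular-season baseline (250 innings, 3,700 pitches) and the per-start postseason rates are the rough figures used in this post, not a real projection, and the function name is purely illustrative.

```python
def projected_totals(reg_ip, reg_pitches, starts, ip_per_start, pitches_per_start):
    """Add a hypothetical postseason workload to a regular-season baseline."""
    total_ip = reg_ip + starts * ip_per_start
    total_pitches = reg_pitches + starts * pitches_per_start
    return total_ip, total_pitches

# Rough regular-season pace quoted in the post (assumed round numbers).
REG_IP, REG_PITCHES = 250, 3700

# Early exit: a single ALDS start at 6 innings and 100 pitches.
minimal = projected_totals(REG_IP, REG_PITCHES, 1, 6, 100)

# Conservative case: 5 starts at 6 innings and 100 pitches apiece.
conservative = projected_totals(REG_IP, REG_PITCHES, 5, 6, 100)

# Heavy case: 8 starts at roughly his 2009 rate of 7 innings and 110 pitches.
heavy = projected_totals(REG_IP, REG_PITCHES, 8, 7, 110)

print(minimal)       # (256, 3800)
print(conservative)  # (280, 4200)
print(heavy)         # (306, 4580)
```

The spread between the one-start and eight-start cases is 50 innings and close to 800 pitches, which is why the postseason piece dominates the uncertainty.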

This is all a moot point if the Yankees get bounced before the World Series, but it’s at least worth monitoring for several reasons. For one, the last time he cracked 250 innings in the regular season (after pitching on short rest for what seemed like weeks) he was ineffective in his only NLDS start with the Brewers. He looked run-down, and the Brewers went home early. It doesn’t even need to be said, but the Yankees need a sharp CC to do well this October. Secondly, Sabathia will likely opt out of his current contract and re-up with the Yankees on another long-term deal this winter. If he’s going to be around for a while and making big bucks, it might be a good idea to look after his long-term interests.

One easy way to do this would be to continue to roll with the six-man rotation in August. As Moshe Mandel of The Yankee Analysts noted in great detail, the six-man rotation this month would result in one fewer start for CC Sabathia. Hughes and Nova both pitched well in their last outings, so there doesn’t seem to be much harm in allowing them to continue to battle it out this month, and hopefully it would result in slightly lower innings pitched and pitches thrown totals for Sabathia. He’ll still have ridiculously high numbers by the standard of almost any other pitcher, but there’s only so much that can be done. The Yankees should do what they can to keep him fresh for October and beyond, but at some point they’ll simply have to roll the dice and hope for the best.

Handicapping the AL Cy Young Race

As the season moves into the dog days of August, some of the discussion in baseball circles naturally turns towards end of season awards ballots. This is usually a lot of fun, particularly when the old school, traditional camp goes head to head with the sabermetric camp and acrimony and recriminations ensue. There’s nothing quite like watching a reporter argue for a pitcher based on the win-loss record against someone who hasn’t looked at a W-L record all season. In anticipation of this, I’ve set out to handicap the American League Cy Young race, and have done so by trying to consider all relevant factors. Plenty of voters really do prefer looking at win-loss record, earned run average and overall team success. Other voters are comfortable looking past that and examining stats like FIP, strikeout and walk rates, and other more advanced measures of pitcher success. I’m not arguing for a particular voter rationale as much as trying to predict which one of the American League’s best pitchers will garner enough support from voters to take home the bacon. It’s a very good crop of pitchers this year, so the debate should be lively.

Honorable Mentions: C.J. Wilson, Felix Hernandez and Justin Masterson. All three have had fantastic seasons in their own right, but it’s hard to imagine any of them cracking the top 3 of the ballot as things stand right now. Of the three, King Felix seems the strongest candidate to move up the ballot if he finishes strong and other candidates slip. He’s won before, and he’s having another superb year in Seattle.

The Fringe

Dan Haren, photo courtesy of AP.

Dan Haren

My preseason pick for Cy Young is having another typically superb season. Haren is a bit of a fly ball pitcher, so pitching in Angel Stadium with good outfield defenders has really helped him so far. This year, Haren’s strikeout rate has dipped into the 7.5 K/9 range, down a little from his usual ~8 K/9 mark. However, he’s been stingier than ever with the free passes, walking only 1.36 batters per nine innings. As a result, Haren leads the American League in K/BB ratio with a 5.65 mark, ahead of Justin Verlander’s 4.97.

Haren’s win-loss record is currently a modest 10-6. With a dozen or so starts left on the season, he seems unlikely to win twenty games this year, so he’s not likely to pick up any support from the traditional crowd in that area. His ERA is 3.01, certainly a respectable mark but nothing as shiny as some of the other candidates’. His ERA doesn’t diverge too wildly from his FIP (2.65) or xFIP (3.12), so there’s no reason to expect him to tail off as the season moves on, except for the fact that he usually pitches better in the first half of the season than the second.

As a result, I expect Haren to wind up in the top 3 of a few ballots, but he likely won’t be a serious contender for the award. Aside from K/BB ratio, he doesn’t lead the league in any of the “important” metrics, whether they be traditional or sabermetric, and there just isn’t a whole lot of buzz about his season. It’s been an excellent year for Haren, but probably not one good enough to win him the award. This is a friendly reminder that the Angels obtained him using Joe Saunders as the primary trade chip. Moving on.

Josh Beckett

Sadly, this was the only photo of Beckett available on the Internet. C'est la vie. Photo courtesy of AP.

Josh Beckett would likely be a serious contender for the Cy Young if not for the fact that he’s thrown roughly 30 to 40 fewer innings than some of the other heavy hitters on this list. As in other seasons, Beckett has had a few struggles with his health this year, but he’s still managed to put together a good campaign and has several factors working in his favor for his Cy Young bid. For one, he’s a very well known pitcher with a reputation as an ace, and he pitches in Boston and gets plenty of exposure. Further, he has a very low ERA, currently at 2.17. Those two factors alone mean that he’ll show up on plenty of Cy Young ballots around the nation.

Beckett is having a good year, no way around it. Yet, interestingly, his very low ERA is slightly misleading. It’s not as if this is a breakout year for Beckett. His strikeout rate has dipped a bit from career norms, and his xFIP is right in line with his career average. In fact, he posted a lower xFIP in each one of his 2007-2009 seasons. This shouldn’t obscure the fact that Beckett has had great success in the run prevention category, and if he cracks the 200 inning mark and the Red Sox win over 100 games he might find himself creeping up the ballot for plenty of voters. It won’t be undeserved. But it will be an interesting testament to the importance that a sub-3 ERA has on the psyche of the Cy Young voting community.

The Contenders

Jered Weaver

Aside from the pitcher deemed the Favorite, Jered Weaver has perhaps the strongest case for the AL Cy Young this year. Not only is his win-loss record a solid 14-4, but he also boasts a rather anemic 1.79 ERA. Weaver has a good strikeout rate thus far, punching out around seven and a half batters per nine innings, and he walks around two batters per nine. The key to explaining his tremendous success at run prevention this year is his astronomically low home run rate, 0.34 HR per nine innings. Weaver has given up only 6 home runs the entire year, well below what one would consider normal. Only 3% of his fly balls have turned into home runs this year; league average is around 10%, and Weaver himself is a career 7.5% HR/FB pitcher. It’s really an odd situation, particularly because Weaver is such an extreme fly ball pitcher. As a result, several run estimators expect Weaver to start yielding home runs at a much higher rate. His xFIP is 3.61, nearly two runs higher than his ERA.
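To see why the run estimators are skeptical, here is a minimal sketch of the home run arithmetic. It uses only the rounded figures quoted above (6 home runs, a 3% HR/FB rate, a 7.5% career rate and a roughly 10% league rate), so the implied fly ball count is an approximation rather than Weaver’s actual total, and the function names are made up for illustration.

```python
def implied_fly_balls(home_runs, hr_per_fb):
    """Back out an approximate fly ball count from HR and HR/FB rate."""
    return home_runs / hr_per_fb

def expected_home_runs(fly_balls, hr_per_fb):
    """Home runs the same fly balls would yield at a different HR/FB rate."""
    return fly_balls * hr_per_fb

# 6 HR at a (rounded) 3% HR/FB implies roughly 200 fly balls allowed.
fly_balls = implied_fly_balls(6, 0.03)

# The same fly balls at Weaver's career rate and at the league rate.
at_career_rate = expected_home_runs(fly_balls, 0.075)
at_league_rate = expected_home_runs(fly_balls, 0.10)

print(round(fly_balls))       # 200
print(round(at_career_rate))  # 15
print(round(at_league_rate))  # 20
```

In other words, the gap between Weaver’s ERA and his xFIP is mostly the gap between 6 home runs and the 15 to 20 that the same fly balls would typically produce.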

I’m not advocating that Weaver be penalized in any way for maintaining such a low home run to fly ball ratio. If he ends the year with a 3% HR/FB ratio and a sub-2 ERA, he’ll likely win the Cy Young and it’ll be hard to argue that he doesn’t deserve it. The historical record is what it is, even if it’s not likely sustainable or repeatable. The season isn’t over just yet though. Weaver has a decent amount of time left and it’s reasonable to expect his HR/FB ratio going forward to be somewhere around his career rate of roughly 7.5%, which means more home runs and a higher ERA. Weaver may be a front-runner for the award at the moment, but it’s possible that he loses some steam as some of those fly balls turn into home runs and his ERA regresses in the last two months of the season. If not, and he finishes with 20 wins, a sub-2 ERA and a 90 win Los Angeles Angels team, he very well may take home his first ever Cy Young.

Justin Verlander

Justin Verlander in the state of Karl Welzein. Photo courtesy of Getty Images.

Another pitcher sure to get some love from Cy Young voters is Justin Verlander. Verlander is currently posting his third straight sub-3 FIP season, but this year he finally has the ERA to match it (2.34). Verlander currently boasts an elite strikeout rate with an 8.79 K/9, but is walking a career low 1.77 batters per nine. For a career 2.81 BB/9 guy, this is a substantial reduction, and it leaves him with the second-best K/BB ratio in the American League. Like Weaver, Verlander is also well on his way to twenty wins, currently sporting a 14-5 win-loss record.
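Since both rates are expressed per nine innings, the innings cancel and K/BB is simply one rate divided by the other. A quick sketch using Verlander’s rates from above (the function name is illustrative):

```python
def k_per_bb(k_per_9, bb_per_9):
    """Strikeout-to-walk ratio from per-nine-inning rates."""
    return k_per_9 / bb_per_9

verlander = k_per_bb(8.79, 1.77)
print(round(verlander, 2))  # 4.97
```

That lines up with the 4.97 mark cited for Verlander in the Haren section.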

The biggest obstacle to Verlander clinching his first ever Cy Young may be the risk of batted ball regression. His BABIP is currently .239, well below his career mark of .288. Yet even if that inches up a couple dozen points, Verlander is still likely to have a very compelling case for Cy Young. He’s going to have the wins, the ERA and the peripheral stats to support him. He’s also thrown a ton of innings, more than CC Sabathia, and he’s thrown a no-hitter this season. If Detroit wins the Central, he may get an even bigger boost from voters. Verlander’s 2011 is absolutely superb. Whether he’s able to beat out Weaver and others is another question.

The Favorite 

"Throw ya hands in the air if youse a true playa". Photo courtesy of Getty Images.

CC Sabathia 

All the stars are lining up for Sabathia to win the second Cy Young award of his rather illustrious career. On the traditional side, the big fellow currently leads the American League in wins with 15. It’s absolutely true that he gets loads of run support, which is why wins aren’t the best barometer of pitcher skill, but plenty of voters still consider the factor. CC has 10 or 11 starts left this season, which means he’s a really good bet to end the year with more than twenty wins, a feat he accomplished for the first time last season. Like Weaver and Verlander, Sabathia also sports a very low ERA, currently 2.56. If Weaver’s ERA ticks up north of 2, it’s likely to make CC’s case look stronger.

Sabathia also has the benefit of pitching for a team contending for a playoff spot, perhaps directly against his competitors. Personally, I don’t think a better pitcher should be penalized for pitching on a worse team, but it seems clear that plenty of voters put a sort of premium on whether the Cy Young contender’s team makes the playoffs. If Sabathia’s Yankees win the Wild Card and Jered Weaver’s Angels or Justin Verlander’s Tigers do not, it can only help Sabathia’s case.

Sabathia should receive a good amount of support from the stat community. His case rests on more than just win totals, ERA and the Yankees making the playoffs. He currently sports the lowest FIP in the American League (2.49) and the highest fWAR total (5.6). He has an elite strikeout rate, a good walk rate and he’s getting loads of groundballs. His BABIP is fairly normal, and the only thing that could hurt him going forward is his relatively low HR/FB ratio, currently about half of his career rate. As a result his xFIP is 3.03, a touch higher than Verlander’s but still lower than Jered Weaver’s 3.61 mark. In other words, there’s nothing too fluky about Sabathia’s performance. Anyone who has watched him lately knows that already. He’s been virtually untouchable of late, in a stretch reminiscent of his now-famous performance with Milwaukee in 2008. Sabathia is an ace in his prime, pitching in a tough division and racking up all sorts of indicators of dominance. In the midst of a very good year of pitching in the American League, Sabathia may end up with the best case for American League Cy Young. If the big fella wants a new contract, he’s doing a really good job of showing the world just how good he can be.

The trade market that wasn’t

It all keeps coming back to Cliff Lee. A year ago, the Yankees were on the precipice of acquiring Lee from the Mariners, a feat which would have given them one of the best rotations in baseball. They failed, and a short time later were bounced from the playoffs by a team led by Cliff Lee. Soon after, they saw Cliff Lee spurn them for the Phillies in free agency. By my count, that’s three separate instances of Cliff Lee-induced pain. When Andy Pettitte retired a few months after Lee went to Philadelphia, Cashman pivoted. In a manner reminiscent of the Red Sox in 2009, the Yankees decided to build the rotation on the cheap, allowing Freddy Garcia, Bartolo Colon and Ivan Nova to battle it out for the two remaining rotation slots (the other three being occupied by Sabathia, Burnett and Hughes). Once Hughes went down with an injury, Colon took his spot and performed admirably. Garcia has been fantastic too. Yet all along it’s seemed as if the plan for the Yankees’ rotation was to run with these guys until a better option arose on the trade market. Freddy Garcia’s nice and all, but shouldn’t the Yankees go into battle in October with a serious complement to Sabathia? Yet here we stand a mere week or so away from the trade deadline and there seems to be no complement available. Where are the pitchers? Where are the targets? Where are the potential upgrades?

A few big names have arisen, to be sure. Ubaldo Jimenez was the target last week, but it doesn’t seem that Colorado is serious about trading him. Some have suggested that they were simply recognizing that the market was very weak and seeing if some team (like the Yankees) would be willing to panic and overpay for their lanky and affordable ace. In the absence of that a deal seems unlikely. James Shields has also been rumored to be available, but not to the Yankees. If Tampa decides to move the putative ace it won’t likely be an intra-divisional move. Hiroki Kuroda would be a potential option, one for whom I’ve long advocated, but his no-trade clause puts him in the driver’s seat and means that he’ll determine whether he gets traded and to where he gets traded. John Danks would be a nice upgrade, but there’s no indication that the White Sox are looking to move a starter and the teams don’t even match up particularly well for a trade anyway. Who’s left, Jason Marquis?

A year ago the Yankees came close to having a very good rotation and no Jesus Montero when they offered Seattle Montero for Lee. That deal fell through. A few months later, they came close to having a very good rotation and Jesus Montero when they tried to get Cliff Lee for nothing more than money. That deal fell through. The plus side is that the Yankees still have Montero, of course. Whether they really want him is another question. They don’t seem to have any interest in calling him up any time soon, and Cashman has gone out of his way to make it clear that Montero is available in trades. Yet there is no Cliff Lee on the market this year, no pitcher for whom Montero would be a suitable return. Right now the effort to swap Montero for a pitcher looks a day late and a buck short.

There is serious downside risk in relying on the trade market. Sometimes the targets don’t materialize and other times your assets don’t match up with the best available targets. This shouldn’t be interpreted as a criticism of Cashman. No one that I’m aware of predicted that the Yankees would whiff on Lee twice, lose Pettitte to retirement, and then find themselves unable to upgrade the rotation via the trade market at all. It sounds like a worst-case scenario dreamt up on a Red Sox message board. Yet, as of July 23rd that’s exactly what’s happened. The best pitcher truly on the market seems to be Kuroda, a pitcher with a no-trade clause and a disinclination to leave Los Angeles. It’s not the situation the Yankees hoped to be in at this point.

The old saying is that a bird in the hand is worth two in the bush. You can always hope that better opportunities arise later, but your risk goes up the further away you are from the acquisition target. This entire market could change very quickly, and that’s what makes the trade deadline so exciting. Yet, as of today it looks like the Yankees are dancing alone. The most realistic option at this point seems very unlikely, but I suppose there’s no harm in continuing to beat the drum once more, until the deadline passes. Help us, Hiroki. You’re our only hope!

On WAR and the duties of the analyst

One of the biggest ironies of the new stat age is that the development of sophisticated and nuanced analytical tools like WAR provides the reader with a shortcut and enables the same type of lazy, simplistic analysis the tools were created to avoid. One doesn’t need to travel very far to find instances of this sort: see the use of single-season fWAR to settle debates on All-Star selections/snubs or MVP ballots. For the uninitiated, this is “doing it wrong”. WAR is composed of many components: baserunning, fielding, and offense, among others. When it comes to fielding, a large sample of data is required in order to ensure reliability. In fact, many say that 3 seasons of UZR data is a good sample size. But single-season fWAR considers only the UZR data in that given year. This doesn’t mean that single-season fWAR is useless, just that some caution and editorial discretion are required in its application.

This isn’t the fault of WAR’s framework, although one can be forgiven for wishing there were some sort of warning sign attached to it on the Fangraphs leaderboards with a blinking light and a flashing message: “BEWARE! Small sample sizes still apply! Especially with the defensive component!”. Rather, the misuse of the framework is more user error than anything else. Drivers are responsible for knowing how to properly operate a car; analysts are responsible for knowing how to use WAR. They’re also responsible for not intentionally misusing WAR, or any other stat, to serve a preexisting agenda.

This gets us to a simple point, which is this: it’s the duty of the analyst to use the tools and frameworks wisely, with humility and honesty, and to create a margin of space which allows for tolerance and ambiguity. This is decidedly antithetical to the approach found so often in many popular forums: assert a controversial opinion, get pageviews, profit. But it’s a better approach. It’s not easy, and it requires far more work than making a few clicks on Fangraphs and spouting off an opinion on “Who’s better this year”, but it’s the way to circumvent the dogmatism and unsophisticated analysis we find so distasteful when we see it anywhere else.

One of the most hair-raising parts of George Orwell’s book Animal Farm is when the animals look through the window and find that the faces of the pigs have become indistinguishable from the faces of the humans that they all worked together to overthrow. The symbolism is unmistakeable: once they achieved their goals they became what they hated. The message is of course a political one, but it has bearing in the world of baseball analysis. This movement – call it a SABR movement, a stat movement, a mouth-breathing basement-dwelling movement, whatever you like – is only gathering more and more steam. WAR is on Baseball Tonight. David Cone broadcasts the virtues of FIP to the entire YES Network audience. It’s only getting bigger and stronger.

As the movement expands it will become easier to develop a more rigid orthodoxy. This isn’t necessarily bad. In a religious sense orthodoxy maintains the purity of a belief system, prevents false doctrines from taking root amongst believers, and roots out heretics. In the world of baseball analysis it is far less coherent, systematic or discursive. But there’s still orthodoxy. There’s still a set of rules, however loose, analysts are playing by. There’s nothing wrong with this per se, but the risk is that orthodoxy can turn into dogmatism, which will stifle the innovative and free-thinking spirit which animated the movement in the beginning. Then the movement will stop growing, and it will be dead and boring. Consider this a call to keep that spirit alive, to keep hustling and thinking outside the box, to not use single-season WAR in an irresponsible way and to be ready to set aside WAR and any other metric, stat or framework as inferior when the next innovation comes along.

The great Cano vs. Pedroia debate

This post originally ran Saturday morning but quickly got buried by the news of Alex Rodriguez’s torn meniscus, so we’re bumping it back up because it’s really good and you should read it. Enjoy.

Recently Patrick Sullivan of Over the Monster and Baseball Analysts fame ignited a debate when he said the following: “You know who’s not as good as Dustin Pedroia? Like, not at all? Robinson Cano”. Them’s fightin’ words, pal. Sullivan later said that he dug in so stridently for fun on Twitter, but there’s an honest debate to be had here over the value of the two players. Is he right? Who is better, Cano or Pedroia? In order to answer the question, we need to evaluate all aspects of each player’s game: offense, base running and defense. We’ll run through each category, then examine each player’s fWAR. We’ll also introduce a variation on WAR which I’ve lovingly dubbed RABWAR. Let’s get to it.

Offense: light tower power vs. the little on-base machine that could

Robinson Cano and Dustin Pedroia are both elite offensive forces at the plate. They just go about their business in different manners. Cano is impatient. He rarely takes a base on balls, preferring to attack early in the count. As a result, he averages a walk rate of about 5% every year, a subpar showing. He makes up for this by hitting for average and for power. He’s a lifetime .308 hitter with a career slugging percentage of .492. The latter mark understates his true power skill, though. His power has been far more substantial in the past three years, and he’s slugged .520, .534 and .526 (including 2011).

For a second baseman, Cano’s power is superlative. Since 2009 his slugging percentage is .526, the highest in baseball among second basemen. The next closest is Chase Utley at .478. Cano also has the highest batting average among second basemen since 2009. Cano is the owner of a career .358 wOBA. Like his slugging, this mark is well below his totals in the past three years: .370, .389 and .375. It’s true that using 2009 as a start point is both arbitrary and favorable to Cano, but it’s also worth noting that he’s entering his physical prime. As a matter of true talent and future expectations, his 2009-2011 data would seem to be more relevant than what he did in his early 20s. This is the book on Cano: an elite hitter with poor on-base skills but who hits for average and power better than nearly anyone at his position.

Dustin Pedroia is a different animal. Like Cano, Pedroia hits for average (career .301 hitter). He’s also shown a decent amount of power with a .455 career slugging percentage, although this is well below Cano’s. Where he really sets himself apart is his on-base ability. Pedroia’s career walk rate is almost 10%, and this year he’s notched a 15% mark. He’s very patient at the plate and is extremely difficult to strike out, although he’s struck out more recently. Over the past 3 years, Pedroia has an on-base percentage of .376, a mark second only to Chase Utley’s .391. Overall, Pedroia has a career wOBA of .366, 8 points higher than Robinson Cano’s .358. Unlike Cano, Pedroia does not benefit from using a sample of only the past three years. His wOBA from 2009 to 2011 is .366, identical to his career average. Who’s the better overall hitter then?

As you can see, Cano has edged Pedroia out in wOBA since the start of 2009, but Pedroia has been more consistent since 2007. It’s also worth noting that Pedroia outperforms Cano slightly in wRC+, which is like a wOBA-based version of OPS+. Pedroia has a career mark of 120, and Cano’s career wRC+ is 118. In the past three years, Pedroia’s respective wRC+ marks are 113, 132 and 129. Cano’s are 121, 142 and 137.  In terms of overall offensive production, the two are very, very close. I’d like to give the category to Cano because of his tremendous upside, but his lack of a respectable walk rate means that his overall production is more likely to be the victim of the capricious whim of the BABIP dragons. This one’s a tossup.

Base running: don’t even think about it vs. the constant threat

Yankees fans know that Robinson Cano should never try to steal a base. He still tries though, and manages to swipe about 5 bases a year, giving him a career total of 26 stolen bases. He’s been caught a staggering 24 times though, meaning that his success rate is just over 50%. Pedroia is far better at stealing bases. He’s stolen 72 bases in his career and averages around 20 a year when he’s healthy. Unlike Cano, he hasn’t gotten thrown out that often; he’s been caught stealing 15 times, giving him a success rate of around 83%.
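For anyone who wants to check the math, success rate is just steals divided by total attempts. A minimal sketch using the career totals quoted above (the function name is illustrative):

```python
def sb_success_rate(stolen_bases, caught_stealing):
    """Share of stolen base attempts that succeeded."""
    attempts = stolen_bases + caught_stealing
    return stolen_bases / attempts

cano = sb_success_rate(26, 24)
pedroia = sb_success_rate(72, 15)

print(f"Cano: {cano:.1%}")        # Cano: 52.0%
print(f"Pedroia: {pedroia:.1%}")  # Pedroia: 82.8%
```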

There’s more to base running than just stealing bases, though. For that we can turn to two very good base running stats, both of which attempt to quantify how many runs are contributed by a player’s advancement on the bases by considering ground, air and hit advancements. Baseball Prospectus’ version is EqBRR, short for Equivalent Base Running Runs. In addition to ground, air and hit advancements it also includes stolen bases and other advancements like wild pitches. Fangraphs’ version does not include these considerations. According to EqBRR, Robinson Cano has been worth only 1.2 runs on the base paths for his entire career, while Dustin Pedroia has been worth 7.5 runs. This is despite the fact that Cano has played in over three hundred more games than Pedroia. It’s worth noting that Cano’s mark was negative prior to this season; he’s only in the black because he’s been worth 1.5 runs on the base paths in 2011, bolstered by very high scores on ground and air advancement. In sum, by Baseball Prospectus’ measure Pedroia’s been worth about half a win more than Cano on the bases.
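The “half a win” figure can be reproduced with a quick conversion. This sketch assumes the common rule of thumb of roughly 10 runs per win, which is an assumption on my part rather than something EqBRR itself specifies, and the constant and function names are illustrative.

```python
# Assumed conversion rate: about 10 runs per win (rule of thumb, not from EqBRR).
RUNS_PER_WIN = 10.0

def runs_to_wins(runs, runs_per_win=RUNS_PER_WIN):
    """Convert a run value into approximate wins."""
    return runs / runs_per_win

cano_eqbrr = 1.2     # career EqBRR quoted above
pedroia_eqbrr = 7.5  # career EqBRR quoted above

gap_in_runs = pedroia_eqbrr - cano_eqbrr
gap_in_wins = runs_to_wins(gap_in_runs)

print(round(gap_in_runs, 1))  # 6.3
print(round(gap_in_wins, 2))  # 0.63
```

Under that assumption the career gap is a bit more than half a win, which is small next to the differences in the hitting and fielding numbers.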

Fangraphs’ base running stat is UBR, or Ultimate Base Running, and you can read about it here. This metric grades Cano out much better than Pedroia, a surprising result. By UBR’s reckoning, Cano has been worth 4.1 runs on the base paths, while Pedroia has been worth -0.4. As mentioned, UBR does not include stolen bases, and we know that there’s a gigantic discrepancy between the two players when it comes to this factor. As such, EqBRR is probably a better indicator of base running value here, which means Pedroia gets the nod in this category.

Defense: depends on who you ask

It’d be really easy to provide the relevant UZR scores for each player and call it a day. It would also be incomplete. Astute readers know that there are some serious difficulties present in UZR and other defensive metrics. Baseball Prospectus’ Colin Wyers has been cleaning the glass like Dennis Rodman on the topic for quite some time now and has proposed an alternative, FRAA. For a primer on the issue, see this piece on the serious problems with most defensive metrics, this piece which summarizes the park-scorer and range biases problems and proposes a way forward, and this piece which examines FRAA against UZR on the topic of Derek Jeter. Colin Wyers summarizes FRAA accordingly:

Simply put, we count how many plays a player made, as well as expected plays for the average player at that position based upon a pitcher’s estimated ground-ball tendencies and the handedness of the batter. There are also adjustments for park and the base-out situations; depending on whether there are runners on base, as well as the number of outs, the shortstop may position himself differently, and we account for that in the average baselines.

The other metrics use other data to come to their estimate of expected outs—in the cases of UZR and DRS, it’s batted-ball and hit location data measured by BIS video scouts. In the cases of TZ and FRAA, it’s data collected by press box stringers working for MLB’s Gameday product.

So we have two different metrics both attempting to quantify defensive value, just in different ways. How do the two second basemen, Cano and Pedroia, stack up against each other using UZR and FRAA? We’ll start with Cano:

Wowza. UZR hates Cano’s performance with the white hot intensity of a supernova, grading him out at -39.3 runs above average at second base. It’s given him a negative value for every year but 2007, although the worst scores came early in his career. The overwhelming majority of Cano’s poor UZR mark comes from his range. He grades out at nearly average in terms of double play and error runs above average, but has a -36.4 runs above average mark for range. Unlike UZR, FRAA is a huge fan, grading him at 31.2 runs above average. This is a difference of over 70 runs and clearly raises big questions. Other defensive metrics aren’t as harsh on Cano as UZR is, but none are as positive as FRAA. Where you come down on Cano’s defense, then, is likely informed by your own subjective evaluation from watching him. I’d split the difference. Cano certainly doesn’t strike me as a lousy defender: he gets to plenty of balls and turns a double play more smoothly than anyone. At the same time, I wouldn’t call him an elite defender. He simply doesn’t strike me as being cut from the same elite defensive cloth as someone like Adrian Beltre or Mark Ellis.

As with Cano, UZR and FRAA see Pedroia differently. He grades out superbly by UZR’s standards, clocking in at 32.5 runs above average for his career, but looks far worse according to FRAA, scoring -1.2 runs above average. From a subjective standpoint, I’d argue that Pedroia is a very good defender. Whether he’s as good as UZR purports him to be is difficult to say. There are serious issues surrounding defensive metrics, so declaring a winner in this category is difficult. In this situation it’s wise to follow the advice of Tom Tango, who recommends we assume that all sides have something to add and take the midpoint. In that case, this category goes to Pedroia if only because of how poorly UZR grades Cano.
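For anyone who wants to see the midpoint approach in numbers, here is a minimal sketch in Python. It uses only the career UZR and FRAA figures quoted above; the equal weighting of the two metrics and the helper names (`midpoint`, `metric_gap`) are my own assumptions for illustration, not anything Tango or either site publishes.

```python
# Career defensive runs above average, as quoted in the text.
defense = {
    "Cano": {"UZR": -39.3, "FRAA": 31.2},
    "Pedroia": {"UZR": 32.5, "FRAA": -1.2},
}


def midpoint(ratings):
    """Average the available metrics with equal weight (an assumption)."""
    return sum(ratings.values()) / len(ratings)


def metric_gap(ratings):
    """Spread between the most and least favorable metric."""
    return max(ratings.values()) - min(ratings.values())


midpoints = {player: midpoint(r) for player, r in defense.items()}
gaps = {player: metric_gap(r) for player, r in defense.items()}

for player in defense:
    print(f"{player}: midpoint {midpoints[player]:+.2f}, gap {gaps[player]:.1f} runs")
```

Splitting the difference puts Cano at roughly four runs below average and Pedroia at roughly fifteen and a half above, which is why the category tips toward Pedroia even though the two systems disagree about both players.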

Conclusion: the final countdown

“What WAR gives us is a systematic, consistent framework to value the accomplishments of players. The good thing about a framework is that each person is free to create his own implementation. Not all houses are built the same, but they all follow the same principle. That’s what WAR gives us.” – Tom Tango.

Fangraphs’ WAR, which uses UBR for baserunning and UZR for defense, grades the two players accordingly:

By this standard, Pedroia is the clear winner. Give Pedroia some 1200 more plate appearances, and he would lead Cano by a wide margin. But as we know, fWAR relies on Fangraphs’ UBR and UZR. So let’s swap out UBR and UZR for Baseball Prospectus’ EqBRR and FRAA, respectively. We’ll call this little SABR-demon spawn RABWAR.

Here Cano is the clear winner, thanks largely to the difference in the way their defense is scored. So who is better: Cano or Pedroia? The offense is a toss-up, the base running goes to Pedroia and the defense is a toss-up leaning towards Pedroia. At the end of the day, whether you pick Pedroia or Cano will likely hinge on which defensive metric you prefer, or which team you prefer. Cano and Pedroia are both incredibly talented second basemen and it’s tough to see any daylight between their two respective statistical profiles. In this sense, the claim that Cano is not “nearly as good” as Pedroia simply doesn’t ring true. If I were forced to pick between the two and were able to erase their prior team affiliations from my mind I’d likely go with Pedroia, in no small part because of my preference for his approach at the plate. It’s a very difficult choice though, unless I’m allowed to pick from the other division rival and take Ben Zobrist. Now there’s a second baseman.

Special thanks to Joe Pawlikowski and Moshe Mandel for their contributions to this piece.

Bartolo’s Indian Summer

“None of us can predict what’s gonna happen” – Joe Girardi, March 22, 2011 on the decision to begin the season with Bartolo Colon in the bullpen

As weird as it seems now, Bartolo Colon began the season in the bullpen after losing the 5th starter battle to Freddy Garcia back in March. While Colon had out-pitched Garcia in Spring Training, and while his stuff looked fantastic, there were serious questions about his durability. This wasn’t exactly an unfounded concern – Colon is 38 years old, didn’t pitch at all in the majors last season, and last topped 100 innings back in 2005. “Eater” is a word that comes to mind when one thinks of Bartolo, but it’s in connection with food, not innings.

So thank goodness, in a sense, for Phil Hughes’ dead arm. As it turned out, Hughes’ injury opened the door for Colon and allowed the Yankees to see what they really had in him. Despite a rough outing last time out, he has really come through in spades for the team. Aided by a shoulder rejuvenated by a controversial stem-cell procedure, Colon has been the second-best starter on the 2011 Yankees. As comeback stories go, this one is almost a bit too good to be true. Indeed, this veteran and two-time All-Star is having the best season of his lengthy career, even better than when he beat out Johan Santana for the Cy Young in 2005.

In 2005 Colon pitched 222.2 innings of 3.48 ERA ball for the Angels. His W-L record was sterling, 21-8, and was no doubt the driving force behind him winning the Cy Young. Colon’s K rate was 6.35/9, not exactly the highest strikeout rate of Cy Young winners, but he only handed out 1.74 walks per nine innings. His FIP on the year was 3.75, and his xFIP was 3.91. This year he’s doing even better. He’s struck out 7.90 batters per nine innings so far in 2011 while maintaining his typically low walk rate of 2.20/9. His BABIP is a touch lower than his career norm (and his last outing certainly helped bring it closer to average), but other than that there’s no indication that Colon has benefited from anything unsustainable or odd. By all measures, this is a career year for Bartolo Colon, and he looks fantastic. His two-seam fastball is a jaw-dropper when it’s on. You can see it here at 0:54.
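A quick aside on how those per-nine rates work, sketched in Python. The only inputs are the 2005 figures quoted above (222.2 innings, 6.35 K/9, 1.74 BB/9); the helper names are hypothetical, and the counts it produces are back-calculated from the rounded rates rather than taken from a box score.

```python
def innings_to_float(ip_notation):
    """Convert box-score innings such as "222.2" (222 innings plus 2 outs) to true innings."""
    whole, _, outs = ip_notation.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def per_nine(count, innings):
    """Rate stat: events per nine innings pitched."""
    return 9 * count / innings


def implied_count(rate_per_nine, innings):
    """Invert a per-nine rate to estimate the underlying event count."""
    return rate_per_nine * innings / 9


ip_2005 = innings_to_float("222.2")              # 222 and two-thirds innings
implied_strikeouts = implied_count(6.35, ip_2005)
implied_walks = implied_count(1.74, ip_2005)

print(f"2005 innings: {ip_2005:.2f}")
print(f"implied strikeouts: {implied_strikeouts:.0f}, implied walks: {implied_walks:.0f}")
```

The ".2" in an innings total means two outs, not two tenths of an inning, so it has to be converted before dividing. Done that way, the quoted rates work out to roughly 157 strikeouts and 43 walks over the season.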

A lot of analysts have been expecting the Yankees to be in the market for front-line pitching. By all indications, they are. But a lot of the preseason speculation on the topic was predicated on the notion that either A.J. Burnett or Phil Hughes represented the Yankees’ #2 starter, and that Garcia/Colon/Nova were simply back-end guys designed to soak up innings to be moved out when the reinforcements arrived. No one expected Colon to become that #2 starter for the team. But that’s what he is, and it’s not a mirage. He has the 11th lowest SIERA of any AL pitcher, better than Jon Lester, CC Sabathia and Josh Beckett, albeit in fewer innings.

It’s hard to imagine that a story on the New York Yankees would go relatively underreported, but it seems as if that’s what’s happened with Bartolo. A fair amount of attention has been given to his surgery, but not enough has been given to the fact that he’s having a career season at the ripe old age of 38. There is concern about his durability – the last time he cracked even 100 innings was 2005 – and perhaps the Yankees would be wise to monitor his workload down the stretch. But the fact remains that as of today he represents a viable #2 starter behind CC Sabathia, giving the Yankees flexibility as they head into the trade deadline. These 90 innings from Bartolo and ~1.6 fWAR are no small reason the Yankees are tied with Boston in the loss column for first place in July.

The chapter on this season’s New York Yankees isn’t written yet. It’s barely halfway through. No matter what happens with this club – whether they miss the playoffs, get knocked out, or cruise down the Canyon of Heroes in November through a shower of praise and confetti – there’s no doubt that Bartolo Colon has contributed in a large way to the success of this team. Something Girardi said back in Spring Training now rings true, not only for Spring Training but also as an epitaph for the season at its midpoint: “Bartolo was the wild card in all of this”.

Midseason offensive walk and strikeout rates

Over at The Process Report, Chris St. John recently examined a few select offensive statistics for the Tampa Bay Rays. In particular St. John keyed in on strikeout rate, line drive rate, and pitches per plate appearance. He contrasted each player’s current 2011 numbers with their career numbers (and didn’t include the 2011 numbers in the career numbers). This is a worthy exercise as we’re at a point in the season where plenty of statistics find a large enough sample size to stabilize. I’ve followed his lead using two basic plate approach statistics: walk rate and strikeout rate. Like St. John, I’ve excluded the 2011 numbers from career totals. I’ve also calculated strikeout rate using plate appearances, rather than at-bats, as the denominator. Fangraphs uses at-bats, but plate appearances is a more helpful and logical choice. Data is current through Friday morning, and we’ll kick it off with walk rate.
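For the curious, here is a rough Python sketch of the two calculations just described: backing the current season out of the career totals, and dividing strikeouts by plate appearances instead of at-bats. The batting lines are made up purely for illustration (they belong to no real player), and the function names are my own.

```python
def exclude_season(career, season):
    """Career totals with the current season subtracted out."""
    return {stat: career[stat] - season[stat] for stat in season}


def rate(events, opportunities):
    """Plain rate stat: events divided by opportunities."""
    return events / opportunities


# Hypothetical batting lines, for illustration only.
career_through_today = {"PA": 3000, "AB": 2600, "SO": 600, "BB": 300}
season_2011 = {"PA": 400, "AB": 340, "SO": 68, "BB": 50}

prior = exclude_season(career_through_today, season_2011)

k_per_pa = rate(season_2011["SO"], season_2011["PA"])   # denominator used here
k_per_ab = rate(season_2011["SO"], season_2011["AB"])   # at-bat denominator
bb_per_pa = rate(season_2011["BB"], season_2011["PA"])
k_change = k_per_pa - rate(prior["SO"], prior["PA"])

print(f"K% per PA: {k_per_pa:.1%}, K% per AB: {k_per_ab:.1%}, BB%: {bb_per_pa:.1%}")
print(f"Change in K% vs. prior career: {k_change:+.1%}")
```

Because walks are plate appearances but not at-bats, the at-bat version inflates the strikeout rate of a patient hitter. Using plate appearances for both rates also keeps walk rate and strikeout rate on the same footing.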

Walk rate

The two big movers up are Granderson and Swisher. Swisher in particular is notable given his slow start. Despite a low BABIP and poor power numbers, particularly from the left side, Swisher is currently posting the best walk rate and on-base percentage of his entire career. As a result, he’s assembled offensive numbers well above league average, albeit in a depressed offensive environment league-wide. This is good to see. Swisher has had a rough go of it this year, a year which in essence represents a contract year, and even though he’s found himself on the short end of the stick luck-wise he’s still been able to maintain his patience at the plate. Plate discipline is both talent and skill, and Nick Swisher has both.

On the down side is Jorge Posada. It’s not terribly hard to read between the lines on this one. As he gets older and his bat slows down it would seem logical that Posada would find more pitchers challenging him in the zone. His slow start, no matter how much it was founded on ill-fortune, likely did nothing to discourage this. Despite the dip, it’s worth noting that his walk rate is still above league average.

Strikeout rate

Don’t be confused by the color scheme change. Green is still good, and red is still bad. Here we see Jeter, Teixeira and Swisher lopping off a decent number of strikeouts against their historical averages. The cynic would argue that Jeter is striking out less because he’s grounding out to second on the first pitch more. Jeter is actually seeing more pitches per plate appearance this year than in years past, but perhaps more work in this area is required to draw conclusions. It’s also nice to see Swisher reduce his strikeout rate. Peripherals-wise he’s having a very respectable year. It’ll be easier to believe it as the results continue to follow.

On the other side, Martin and Cano are striking out more than they have in the past. Granderson is also striking out more, but no one’s complaining about his year whatsoever. Ultimately, these guys are only halfway through their season and have plenty of time to sort things out at the plate. There’s nothing extremely problematic here, aside from the burr in the saddle that is Robinson Cano’s plate discipline, and in one respect (Nick Swisher) this data is tremendously encouraging. Will this data hold, or regress to career norms? Time will tell.