The stats we use: WPA and LI

Past Trade Review: David Justice
Why Jermaine Dye never fit with the Yankees

Now that we have discussed our favorites among defensive, offensive, and pitching stats, we’re going to move on to a pair of more general ones. Today I’m going to explain Win Probability Added, or WPA, and Leverage Index, or LI. Both are pretty simple concepts, but we use them enough on RAB that I’d like to add them to our stats guide.

Quick terms

I like the term WPA, because it accurately describes what the stat tells us. You’ll also see me talk about WE, or Win Expectancy. Put simply, WE represents a team’s chances of winning at any point in the game, and WPA represents the play-by-play swing in WE.

Calculating win expectancy

It’s the bottom of the fifth, two outs, runners on first and third, home team down by one. What are the chances of the home team winning that game? Thanks to the abundance of freely available data, we can pore through historical records and find out. (Where would we be without Retrosheet?) With over 70,000 games played since 1977, we have plenty of data to draw from.

The answer to the above question, according to Walk Off Balk’s Win Expectancy Finder, is that the home team won 49.2 percent of the time. If the batter singled in the runner from third, tying the score and placing runners on first and second with two out, the home team’s win expectancy rises to 57.1 percent, a swing of 7.9 percentage points. When we calculate Win Probability Added, the hitter gets credited with .079, and the pitcher gets debited the same amount.
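To make the arithmetic concrete, here is a minimal sketch of that single play in Python. The function name is my own, and the 0.492 starting value is an assumption: it is the starting point that agrees with the 57.1 percent result and the .079 credit quoted above.

```python
def win_probability_added(we_before: float, we_after: float) -> float:
    """Swing in the batting team's win expectancy produced by one play."""
    return we_after - we_before


# Home team's win expectancy before and after the game-tying single.
# 0.492 is an assumed starting value, chosen to agree with the .079 swing.
we_before = 0.492
we_after = 0.571

swing = win_probability_added(we_before, we_after)
batter_credit = round(swing, 3)    # the hitter is credited with the swing
pitcher_debit = round(-swing, 3)   # the pitcher is debited the same amount

print(f"batter: {batter_credit:+.3f}, pitcher: {pitcher_debit:+.3f}")
```

The credit and the debit always cancel out, which is why the WPA totals for the two teams in a game mirror each other.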

As we’ll see in a second, however, this particular Win Expectancy Finder contains certain flaws.

Stripping out bias

I first started following WPA in 2005 when I wrote some blog that no one read. To calculate it I used Dave Studeman’s WPA spreadsheet, which was based on the Win Expectancy Finder. All I had to do was input the game’s play-by-play results, and the spreadsheet would track WPA throughout the game, assigning blame and credit to pitchers and hitters, and in the end creating a neat graph. It seemed like the perfect implementation.
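For a rough idea of what that kind of bookkeeping involves, here is a small Python sketch. It is not Studeman’s spreadsheet, just my own simplified version of the same idea: every play carries the home team’s WE before and after, and the swing is credited to the batter and debited from the pitcher. The player names and WE values are made up for illustration.

```python
from collections import defaultdict


def tally_wpa(plays):
    """Sum WPA per player.

    Each play is (batter, pitcher, home_we_before, home_we_after,
    batter_is_home). WE is always stated from the home team's side.
    """
    totals = defaultdict(float)
    for batter, pitcher, we_before, we_after, batter_is_home in plays:
        home_swing = we_after - we_before
        # Flip the sign when the visiting team is at bat.
        batter_swing = home_swing if batter_is_home else -home_swing
        totals[batter] += batter_swing
        totals[pitcher] -= batter_swing
    return dict(totals)


# Hypothetical plays; names and WE values are invented for this example.
plays = [
    ("Visitor A", "Home P", 0.500, 0.480, False),  # visiting batter reaches
    ("Visitor B", "Home P", 0.480, 0.530, False),  # visiting batter makes outs
    ("Home A", "Visitor P", 0.530, 0.600, True),   # home batter drives in a run
]

totals = tally_wpa(plays)
for player, wpa in sorted(totals.items()):
    print(f"{player}: {wpa:+.3f}")
```

Because every credit has a matching debit, the totals across all players in a game sum to zero.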

Then, when RAB started in 2007, I discovered FanGraphs. They tracked the Win Expectancy of all games, basically doing the job the spreadsheet did. Since I had switched to a Mac that winter, and since Studes’s spreadsheet didn’t work on Excel 2003 for Mac, I found this a viable solution. Yet there are differences in how FanGraphs calculates Win Expectancy and how the WE Finder does.

The biggest difference between the two is run environment. Some years teams score more runs than others. I’m not sure if the WE Finder adjusts for this, depending on the year range you select, but FanGraphs does. The site uses the most up-to-date Win Expectancy tables, while the WE Finder runs only through the 2006 season. Those all help the accuracy of FanGraphs’s WPA measures.

The final aspect might seem a bit controversial to some, but it’s really not. In the WE Finder, the game begins already slanted to the home team. Since home teams won 54 percent of games between 1977 and 2006, the game starts with the home team having 0.540 WE. That means if the home team holds the visitors scoreless in the top of the first, it has a nearly 60 percent WE when coming to bat. This might make sense at first, but after further examination I prefer the FanGraphs method, where the WE starts at 50 percent.

The main question people ask upon hearing this is, “If a home team wins 54 percent of the time, shouldn’t we take that into account?” If we take that into account, however, where do we stop? We know that Johan Santana wins a certain percentage of his games. Why not adjust WE at the start of the game to reflect this? Why not adjust for day and night games? Weekday and weekend? There are so many pre-game factors involved that it’s best to strip all bias and start everyone on equal footing.

What about those weird graphs?

Above is the WPA graph for World Series Game 6. Pretty boring, eh? If that were a normal game in June, we wouldn’t much care for it. Unfortunately, the WPA graph doesn’t adjust for the home team’s fans’ excitement.

The graph is relatively self-explanatory. The green line tracks the WE as the game goes along. As it draws closer to the bottom, the visiting team has the advantage. As it draws closer to the top, the home team has the advantage.

For a more interesting WPA graph:

Next up: what’s that bar graph at the bottom?

Leverage Index

The concept of clutch hitting has permeated baseball since its inception. Some players rise to the occasion, while others don’t. Until LI, we had no real way of measuring clutch ability. We just worked off anecdotal evidence of writers and fans touting some players while eviscerating others. With Leverage Index, though, we can determine just how important a situation is, and then how players performed in those situations.

A situation with an LI of 1 is considered average. The higher the number, the more crucial the situation. If the number falls below one, it is considered a relatively unimportant situation. Leverage Index considers the inning, base, out, and score situation, so at-bats in the ninth inning of a one-run game will count for much more than a comparable situation in the third.

For example, if the home team has the bases loaded with two outs in the bottom of the second, down by one run, the LI in that situation is 3.1. The same situation, but in the bottom of the ninth, yields the highest possible LI, 10.9. You can find a full chart of LI by inning/base/out situation in the resources section.
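If you want a feel for where a number like that comes from, here is a simplified Python sketch. It assumes the usual way of thinking about LI: take the expected size of the WE swing in a given situation and divide it by the average swing across all situations, so that an average spot comes out to 1. The published tables are built from real run and win expectancy data; the outcome probabilities, WE changes, and the average swing below are made-up illustration values, not figures from any table.

```python
def leverage_index(outcomes, average_swing):
    """Ratio of this situation's expected WE swing to the average swing.

    outcomes: list of (probability, we_change) pairs for one situation.
    average_swing: expected absolute WE change across all situations.
    """
    expected_swing = sum(prob * abs(change) for prob, change in outcomes)
    return expected_swing / average_swing


# Assumed average swing, invented for this example.
AVERAGE_SWING = 0.035

# Hypothetical situations: (probability of outcome, change in WE).
quiet_spot = [(0.7, -0.01), (0.3, 0.02)]   # little can change either way
tense_spot = [(0.7, -0.10), (0.3, 0.25)]   # the game can turn on one swing

print(round(leverage_index(quiet_spot, AVERAGE_SWING), 2))
print(round(leverage_index(tense_spot, AVERAGE_SWING), 2))
```

The quiet spot lands well below 1 and the tense spot well above it, which matches how the scale is meant to be read.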

Any questions?

This is the toughest statistic for me to explain, because I’m so familiar with it. I’ve been using and examining WPA for almost five years now, so what seems self-evident to me might not to others. Make sure to ask any questions in the comments, or email them to me. I’m more than willing to edit this guide so it’s as accurate and comprehensive as possible when we create our full guide.

Resources

The One About Win Probability
Get to Know: Leverage Index
Crucial Situations: Part 1, Part 2, Part 3
Leverage Index table
Win Expectancy Finder

  • king of fruitless hypotheticals

    you know what i like about that twins graph? it goes to 11.

    • pat

      I enjoyed this comment.

  • 28 next year

    Is that 7.9 swing figure right? 42.9 to 57.1 is not 7.9 or double that. Could you clarify as to how that is calculated?

    • king of fruitless hypotheticals

      49.2

      • 28 next year

        oh, that makes sense. That is what I thought. Thanks

  • pete

    the one issue i have with WPA is the fundamentally flawed nature of win probability to begin with. Or its common interpretation, rather. For me, because it is inherently situational, its primary use is basically for fun – a way to look back at a game or a season and gauge individual significance. Unfortunately, it does somewhat fail in this regard, because the situation itself is far from identical to the generic one plugged into WPA. That is to say, there is a huge difference between singling the runner in in the aforementioned bottom 5th, 1st and 3rd, 2 out situation with ramiro pena hitting behind you, and doing so with alex rodriguez hitting behind you.

    Of course, the more microanalytical you get, the less astute your assessment of a player’s abilities, but this is not, to my understanding, the purpose of WPA. While WPA does give us a good way of looking at a player’s “clutch” abilities, the fact that they so rarely differ, in a large enough sample, from a player’s “regular” abilities, renders it a fairly useless tool for player assessment. Thus I believe its real use is for reflective purposes – it is meant to provide us with an accurate and unbiased summary of a player’s true contributions (again, NOT abilities, but contributions) to the success of a team. Thus the fact that it “evens out in the end” is, unlike with other stats, irrelevant, in my opinion. Its flaws are, however, essentially uncorrectable, since there is no way to ever accurately assess any player’s likelihood of succeeding against a particular pitcher or hitter. In my opinion, it’s a fun stat, but it’s not a particularly useful one.

    • http://www.riveraveblues.com Joseph Pawlikowski

      I’ve always viewed it as a narrative tool. The real reason I wrote this is for LI — that can provide some insight, at least on a game-to-game basis — but it’s hard to explain LI without WPA. Or, at least, it seemed like a logical combination.

      • king of fruitless hypotheticals

        LI shows, neatly i think, the impact of a pitcher in a given situation…like bringing in a reliever with bases loaded no outs up by one in the 8th, etc.

        i was deeply disturbed by your reference to ‘clutch’ however :)

    • 28 next year

      Like Joe said, it just tells a tale of how a game played out. A graph of WPA can tell you what happened in an individual game and it is fun to look at but overall, it doesn’t provide much value outside the context of the game. I don’t think it was meant to nor should be looked at in that way. I love following the graphs of games just cause it is fun.

  • MTG

    Pretty interesting that home teams have won 54% of the time since 1977. I’m guessing this might have a lot to do with not traveling to play the game, but I wonder if there’s some sort of mental aspect as well in terms of being comfortable when playing at home.

    Also, congrats Joe on the FanGraphs piece. One of my favorite websites, after RAB of course.

    • Sej

      Matt Swartz had a good series on Home Field Advantage on BP a while ago. The article discussed the effects of Travel and series length. I think it’s an open article if you wanted to take a look at it.
      HFA Part 4 of 5

  • PaulF

    What exactly does fangraphs do to get rid of the homefield advantage?

  • Brian

    Since I had switched to a Mac that winter, and since Studes’s spreadsheet didn’t work on Excel 2003 for Mac, I found this a viable solution.

    computing fail. should have gotten linux as it would have still worked for you and its free and open source


