Feb 11

The stats we use: WPA and LI

Now that we have discussed our favorites among defensive, offensive, and pitching stats, we’re going to move on to a more general pair. Today I’m going to explain Win Probability Added, or WPA, and Leverage Index, or LI. Both are pretty simple concepts, but we use them enough on RAB that I’d like to add them to our stats guide.

Quick terms

I like the term WPA, because it accurately describes what the stat tells us. You’ll also see me talk about WE, or Win Expectancy. Put simply, WE represents a team’s chances of winning at any point in the game, and WPA represents the play-by-play swing in WE.

Calculating win expectancy

It’s the bottom of the fifth, two outs, runners on first and third, home team down by one. What are the chances of the home team winning that game? Thanks to the abundance of freely available data, we can pore through historical records and find out. (Where would we be without Retrosheet?) With over 70,000 games played since 1977, we have plenty of data to draw from.
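The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular Win Expectancy Finder is built: the shape of the game-state key and the function name build_we_table are assumptions, and the toy records are invented rather than taken from Retrosheet.

```python
from collections import defaultdict

def build_we_table(records):
    """Map each game state to the share of past games the home team won from it.

    records: iterable of (state, home_won) pairs, one per time a state
    occurred in a historical game. The state layout is hypothetical.
    """
    counts = defaultdict(lambda: [0, 0])  # state -> [home wins, times seen]
    for state, home_won in records:
        counts[state][0] += int(home_won)
        counts[state][1] += 1
    return {state: wins / seen for state, (wins, seen) in counts.items()}

# Hypothetical key: (inning, half, outs, runners, home team's lead in runs)
state = (5, "bottom", 2, "1st and 3rd", -1)

# Invented toy history: the home team won 3 of the 7 games that reached this state.
records = [(state, True)] * 3 + [(state, False)] * 4

table = build_we_table(records)
print(round(table[state], 3))
```

With a real play-by-play archive the same counting would simply run over many thousands of games instead of seven.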

The answer to the above question, according to Walk Off Balk’s Win Expectancy Finder, is that the home team won 42.9 percent of the time. If the batter singles in the runner from third, tying the score and placing runners on first and second with two outs, the home team’s win expectancy rises to 57.1 percent, a swing of 14.2 percentage points. When we calculate Win Probability Added, the hitter gets credited with .142, and the pitcher gets debited the same amount.
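As a rough sketch of the bookkeeping, WPA for a play is the win expectancy after the play minus the win expectancy before it, credited to the hitter and debited from the pitcher. The code below takes the two win expectancy figures quoted above (42.9 and 57.1 percent) at face value; the function names and the ledger dictionary are made up for illustration.

```python
def win_probability_added(we_before, we_after):
    """Swing in the batting team's win expectancy caused by one play."""
    return we_after - we_before

def credit_play(ledger, batter, pitcher, we_before, we_after):
    """Credit the batter and debit the pitcher by the same swing."""
    swing = win_probability_added(we_before, we_after)
    ledger[batter] = ledger.get(batter, 0.0) + swing
    ledger[pitcher] = ledger.get(pitcher, 0.0) - swing
    return swing

# Placeholder player names; win expectancies are from the batting team's side.
ledger = {}
swing = credit_play(ledger, "batter", "pitcher", we_before=0.429, we_after=0.571)

print(f"swing: {swing:+.3f}")
print({name: round(total, 3) for name, total in ledger.items()})
```

Running the same function over every play in a game and summing per player gives each player's WPA for that game.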

As we’ll see in a second, however, this particular Win Expectancy Finder contains certain flaws.

Stripping out bias

I first started following WPA in 2005 when I wrote some blog that no one read. To calculate it I used Dave Studeman’s WPA spreadsheet, which was based on the Win Expectancy Finder. All I had to do was input the game’s play-by-play results, and the spreadsheet would track WPA throughout the game, assigning blame and credit to pitchers and hitters, and in the end creating a neat graph. It seemed like the perfect implementation.

Then, when RAB started in 2007, I discovered FanGraphs. They tracked the Win Expectancy of all games, basically doing the job the spreadsheet did. Since I had switched to a Mac that winter, and since Studes’s spreadsheet didn’t work on Excel 2003 for Mac, I found this a viable solution. Yet there are differences in how FanGraphs calculates Win Expectancy and how the WE Finder does.

The biggest difference between the two is run environment. Some years teams score more runs than others. I’m not sure if the WE Finder adjusts for this, depending on the year range you select, but FanGraphs does. FanGraphs also uses the most up-to-date Win Expectancy tables, while the WE Finder runs only through the 2006 season. Both help the accuracy of FanGraphs’s WPA measures.

The final aspect might seem a bit controversial to some, but it’s really not. In the WE Finder, the game begins already slanted to the home team. Since home teams won 54 percent of games between 1977 and 2006, the game starts with the home team having 0.540 WE. That means if they put up a scoreless first, they have a nearly 60 percent WE when coming to bat. This might make sense at first, but after further examination I prefer the FanGraphs method, where the WE starts at 50 percent.

The main question people ask upon hearing this is, “If a home team wins 54 percent of the time, shouldn’t we take that into account?” If we take that into account, however, where do we stop? We know that Johan Santana wins a certain percentage of his games. Why not adjust WPA at the start of the game to reflect this? Why not adjust for day and night games? Weekday and weekend? There are so many pre-game factors involved that it’s best to strip all bias and start everyone on equal footing.

What about those weird graphs?

Above is the WPA graph for World Series Game 6. Pretty boring, eh? If that were a normal game in June, we wouldn’t much care for it. Unfortunately, the WPA graph doesn’t adjust for the home team’s fans’ excitement.

The graph is relatively self-explanatory. The green line tracks the WE as the game goes along. As it draws closer to the bottom, the visiting team has the advantage. As it draws closer to the top, the home team has the advantage.

For a more interesting WPA graph:

Next up: what’s that bar graph at the bottom?

Leverage Index

The concept of clutch hitting has permeated baseball since its inception. Some players rise to the occasion, while others don’t. Until LI, we had no real way of measuring clutch ability. We just worked off the anecdotal evidence of writers and fans touting some players while eviscerating others. With Leverage Index, though, we can determine just how important a situation is, and then how players performed in those situations.

A situation with an LI of 1 is considered average. The higher the number, the more crucial the situation. If the number falls below one, it is considered a relatively unimportant situation. Leverage Index considers the inning, base, out, and score situation, so at-bats in the ninth inning of a one-run game will count for much more than a comparable situation in the third.

For example, if the home team has the bases loaded with two outs in the bottom of the second, down by one run, the LI in that situation is 3.1. The same situation, but in the bottom of the ninth, yields the highest possible LI, 10.9. You can find a full chart of LI by inning/base/out situation in the resources section.
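The text above doesn’t spell out how an LI number is produced. One common description, assumed here rather than taken from this post, is that LI compares the expected size of the win expectancy swing in a situation with the average swing across all situations. The sketch below follows that description; the outcome probabilities, the swing sizes, and the 0.04 average are all invented, so the result is not a real LI value.

```python
def expected_swing(outcomes):
    """Probability-weighted average size of the win expectancy change.

    outcomes: list of (probability, change in win expectancy) pairs
    covering everything that could happen in the plate appearance.
    """
    return sum(prob * abs(delta) for prob, delta in outcomes)

def leverage_index(outcomes, average_swing):
    """Expected swing in this situation relative to the average situation."""
    return expected_swing(outcomes) / average_swing

# Invented numbers for one hypothetical situation.
outcomes = [
    (0.70, -0.05),  # out: batting team's win expectancy drops a little
    (0.20, +0.10),  # single
    (0.10, +0.25),  # extra-base hit
]
average_swing = 0.04  # hypothetical average swing over all situations

li = leverage_index(outcomes, average_swing)
print(round(li, 2))
```

Under this reading, a situation where the possible outcomes move win expectancy twice as much as usual gets an LI of 2, and one where nothing much can change gets an LI well below 1.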

Any questions?

This is the toughest statistic for me to explain, because I’m so familiar with it. I’ve been using and examining WPA for almost five years now, so what seems self-evident to me might not to others. Make sure to ask any questions in the comments, or email them to me. I’m more than willing to edit this guide so it’s as accurate and comprehensive as possible when we create our full guide.

Resources

The One About Win Probability
Get to Know: Leverage Index
Crucial Situations: Part 1, Part 2, Part 3
Leverage Index table
Win Expectancy Finder

Categories : Analysis

22 Comments»

  1. king of fruitless hypotheticals says:

    you know what i like about that twins graph? it goes to 11.

  2. 28 next year says:

    Is that 7.9 swing figure right? 42.9 to 57.1 is not 7.9 or double that. Could you clarify as to how that is calculated?

  3. pete says:

    the one issue i have with WPA is the fundamentally flawed nature of win probability to begin with. Or it’s common interpretation, rather. For me, because it is inherently situational, its primary use is basically for fun – a way to look back at a game or a season and gauge individual significance. Unfortunately, it does somewhat fail in this regard, because the situation itself is far from identical to the generic one plugged into WPA. That is to say, there is a huge difference between singling the runner in in the aforementioned bottom 5th, 1st and 3rd, 2 out situation with ramiro pena hitting behind you, and doing so with alex rodriguez hitting behind you.

    Of course, the more microanalytical you get, the less astute your assessment of a player’s abilities, but this is not, to my understanding, the purpose of WPA. While WPA does give us a good way of looking at a player’s “clutch” abilities, the fact that they so rarely differ, in a large enough sample, from a player’s “regular” abilities, renders it a fairly useless tool for player assessment. Thus I believe its real use is for reflective purposes – it is meant to provide us with an accurate and unbiased summary of a player’s true contributions (again, NOT abilities, but contributions) to the success of a team. Thus the fact that it “evens out in the end” is, unlike with other stats, irrelevant, in my opinion. Its flaws are, however, essentially uncorrectable, since there is no way to ever accurately assess any player’s likelihood of succeeding against a particular pitcher or hitter. In my opinion, it’s a fun stat, but it’s not a particularly useful one.

    • I’ve always viewed it as a narrative tool. The real reason I wrote this is for LI — that can provide some insight, at least on a game-to-game basis — but it’s hard to explain LI without WPA. Or, at least, it seemed like a logical combination.

      • king of fruitless hypotheticals says:

        LI shows, neatly i think, the impact of a pitcher in a given situation…like bringing in a reliever with bases loaded no outs up by one in the 8th, etc.

        i was deeply disturbed by your reference to ‘clutch’ however :)

    • 28 next year says:

      Like Joe said, it just tells a tale of how a game played out. A graph of WPA can tell you what happened in an individual game and it is fun to look at but overall, it doesn’t provide much value outside the context of the game. I don’t think it was meant to nor should be looked at in that way. I love following the graphs of games just cause it is fun.

  4. MTG says:

    Pretty interesting that home teams have won 54% of the time since 1977. I’m guessing this might have a lot to do with not traveling to play the game, but I wonder if there’s some sort of mental aspect as well in terms of being comfortable when playing at home.

    Also, congrats Joe on the FanGraphs piece. One of my favorite websites, after RAB of course.

    • Sej says:

      Matt Swartz had a good series on Home Field Advantage on BP a while ago. The article discussed the effects of Travel and series length. I think it’s an open article if you wanted to take a look at it.
      HFA Part 4 of 5

  5. PaulF says:

    What exactly does fangraphs do to get rid of the homefield advantage?

  6. Brian says:

    Since I had switched to a Mac that winter, and since Studes’s spreadsheet didn’t work on Excel 2003 for Mac, I found this a viable solution.

    computing fail. should have gotten linux as it would have still worked for you and its free and open source

  7. [...] metrics yesterday called Shutdowns and Meltdowns. It’s a simple idea involving relievers and win probability. A Shutdown (SD) is when a reliever accumulates at least 0.06 WPA in an outing, while a Meltdown [...]

  8. [...] tacked on two more runs in the next inning on a Shane Victorino homer, which WPA says was the biggest hit of the game. It wasn’t a terrible pitch, an 86 mph cutter in the on [...]

  9. [...] as part of a seven pitch at-bat. Technically, that was a biggest Yankee hit of the game at 17.3 WPA, but for all intents and purposes Tex’s shot was just as important. His checked in at 17.2 [...]

  10. [...] door, his time on the Yanks may be limited. But on Sunday, he gave the team its biggest late-inning WPA bump as the White Sox’s win expectancy dropped from 40.9 to 24.8 with that caught stealing. [...]

  11. [...] it only tied the game, the double improved the Yankees chances of winning by 16.9%, the biggest WPA swing of the [...]

  12. Good to see your blog. Thanks for the info with graph. All are wonderful resources.

  13. [...] going to look at the five biggest hits of the Yankees season in terms of Win Probability Added, or WPA. Hannah did something very similar a few weeks ago, looking at the biggest hits by Leverage Index, [...]

  14. [...] some help from win probability added, or WPA, we’re going to take a look at New York’s five biggest hits of this season. If [...]
