# Now *this* is a nerd

Maybe Nate Silver can predict the success or failure of a baseball team, but can he predict how many times they’ll head to extra innings? Well, probably, but he’s not the only one. From the Albany Times Union, a senior at Russell Sage College has published a study predicting certain events in baseball games. It has won Rebecca Gregory a place at the Conference on Undergraduate Research at the University of Wisconsin in April.

For those who don’t have a degree in mathematics:

“The purpose of this research was to apply probability theory to determine if the 2008 New York Mets and New York Yankees baseball teams followed the same scoring patterns that other baseball teams have historically done…In addition to analyzing the 2008 New York Mets and New York Yankees scoring patterns, this paper extends the research by Glass & Lowry to include the probability that an average game would require extra innings…This research concludes that the 2008 New York Mets and New York Yankees had scoring patterns that:…were similar to the historical scoring patterns of the average major league baseball team.”

The rest talks about quasigeometric distributions, so I was lost before I even started. I did Google it, though every single result is a reference to “Quasigeometric Distributions and Extra Inning Baseball Games,” which is the work cited and elaborated on by Gregory.

I haven’t seen the whole study, but here are some highlights, from the Times Union article:

The bottom line is that Gregory determined that a baseball team, in theory, has an 11.5 percent chance of scoring two runs during a game. When analyzing the data for the 2008 Mets, she found the team scored two runs 11.9 percent of the time. The Yankees score was a much smaller percentage.

…

She also determined that, theoretically, 10.2 percent of Major League games should reach extra innings. But in reality 9.2 percent of all baseball games went into extra innings. And there was more, having to do with the odds of getting a first and second run.

Wow, that sounds like pretty complicated stuff. Sometimes I feel like these “super statistical analysis” things suck some of the pleasure out of the game for me, but then I go back to reading Baseball Prospectus.

Also, neat tattoo.

It’s really not that complicated; the jargon just makes it sound that way. I haven’t read the quasigeometric distribution work that is cited in the article, but the name suggests it is just a variation of a geometric distribution, which also sounds more complicated than it is. A geometric distribution just gives the probability that it takes a given number of tries before something first occurs.

A simple example is flipping a coin. Everyone knows the probability of heads is .5. What’s the probability that you will get a heads on the first flip? That’s .5, of course. What’s the probability that you will get your first heads on the second flip? Well, you would need to get a tails on the first flip (that’s a .5 probability) and a heads on the second flip (again, .5 prob.), so the probability that both of those things happen is .5 times .5, which is .25. What’s the probability the first heads appears on the third flip? You would need two tails then a heads, which is .5 x .5 x .5, or .125. Keep doing this for every flip number and you have built a geometric probability distribution. Really simple. It gets a little more complicated when you are talking about extra inning games, but not a ton more complicated.
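If it helps to see it in code, the coin-flip arithmetic above can be sketched in a few lines of Python. This is only an illustration of the plain geometric distribution, not the quasigeometric model from the study; the function name `geometric_pmf` and the parameter `p` are made up for this example.

```python
def geometric_pmf(k, p=0.5):
    """Probability that the first success (heads) lands on try number k.

    Hypothetical helper for illustration: k counts tries starting at 1,
    and p is the chance of success on any single try (.5 for a fair coin).
    """
    if k < 1:
        raise ValueError("k must be 1 or greater")
    # k - 1 failures (tails) in a row, followed by one success (heads)
    return (1 - p) ** (k - 1) * p


# First heads on flip 1, 2, or 3, matching the .5, .25, .125 worked out above
for k in range(1, 4):
    print(k, geometric_pmf(k))
```

Setting `p` to something other than .5 gives the same kind of distribution for an event that isn’t a fair coin flip.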

who has time for this stuff? just kidding…it’s pretty cool

Well, it appears that it may have been a thesis for class. Thus she was able to wind her hobby and schooling together.

Hi,

Saw this question over at yankee universe.

My question: has this analysis been done?

Is it easy to look at all AL players and rank them, say by OBP + SLG vs pitchers with ERA+ of 110?

—-

…the ability to hit better pitching. We’ve all seen minor leaguers who mash at one level, yet can’t hack it at the next. Isn’t it sensible to think that major leaguers are the same way? Isn’t a hitter’s ability to hit top pitching a better way to evaluate a player than looking at his numbers in “bases loaded” “scoring position, 2 out” “late and close” or “runners on” situations, particularly if you’re a team like the Yankees who is very interested in how a player will perform in playoff situations, where the pitching is much better?

—–

you can use the baseball-reference play index if you pony up the nominal fee to do so.

Or, if you’re REALLY a nerd, you can just hack that biznatch.

Unfortunately I’m not going to do this, mainly due to my own time constraints.

But I think it’s an interesting question and probably more important than “clutch”. I would rather know who performs well against the best pitchers (who you will probably face in the postseason).

Is there anyone up for the challenge?

50 percent of the time it works every time.

It’s made with real bits of panther… so you know it’s good.

If the thick glasses fit….