Earlier today, based on a post at The Book Blog, I proposed Closeness scores, the average of the lead at the end of every half inning. I calculated the score for each game since 1974 out to two decimal places, and the following graph shows the distribution of those scores:
This looks a bit like a gamma distribution. What fascinates me are the peaks at integer values. They may be telling us that once a team achieves a lead, that lead is sticky. If a team gets out to an early 2-0 lead, the lead tends to stay two runs throughout the game. I’d love to hear a numbers expert’s take on the shape of this histogram.
If you enjoy this kind of research, consider donating to the Baseball Musings Pledge Drive.



I think what makes scores “sticky” is simply that zero is by far the most frequent number of runs scored. So leads will often last.
There are a lot of ways to get the 1.0 score. For example:
*Visitor scores 1 in top of 1st, wins 1-0.
*Visitor scores 1 in top of first, then gives up 2 runs in bottom of ANY subsequent inning to lose 2-1.
*H team scores 1 bottom of 1st, V scores 3 in T9, H scores 1 in B9 (V wins 3-2).
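For anyone who wants to check those three games, here is a rough Python sketch. It assumes "lead" means the absolute run difference no matter which team is ahead, and that a game is given as runs per inning for each side. The `closeness` function and the line scores are made up for illustration; this is not the code behind the graph in the post.

```python
from fractions import Fraction


def closeness(visitor_runs, home_runs):
    """Average lead at the end of every half inning that was played.

    Assumption: "lead" is the absolute run difference. home_runs may be
    one inning shorter than visitor_runs when the bottom of the last
    inning is not played.
    """
    v_total = h_total = 0
    leads = []
    for inning, runs in enumerate(visitor_runs):
        # End of the top half.
        v_total += runs
        leads.append(abs(v_total - h_total))
        # End of the bottom half, if it was played.
        if inning < len(home_runs):
            h_total += home_runs[inning]
            leads.append(abs(v_total - h_total))
    return Fraction(sum(leads), len(leads))


# Visitor scores 1 in the top of the 1st and wins 1-0 (18 half innings).
game_a = closeness([1] + [0] * 8, [0] * 9)

# Visitor scores 1 in the top of the 1st, home scores 2 in the bottom of
# the 4th and wins 2-1 (bottom of the 9th not played, 17 half innings).
game_b = closeness([1] + [0] * 8, [0, 0, 0, 2, 0, 0, 0, 0])

# Home scores 1 in the bottom of the 1st, visitor scores 3 in the top of
# the 9th, home scores 1 in the bottom of the 9th (visitor wins 3-2).
game_c = closeness([0] * 8 + [3], [1] + [0] * 7 + [1])

print(game_a, game_b, game_c)
```

Using `Fraction` keeps the averages exact, so a score that is truly 1 is not confused with one that only rounds to 1.00.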
@Guy: Yes, the stickiness turned out to be an artifact of the rounding.
Ah, that makes sense. Is there any bump at the integers at all if you take it out to 3 or 4 places, or smooth?