March 29, 2011

Fighting Regression

With the 2011 season about to begin, here’s a reminder of how tough it is to win every year. The following graph shows how team wins changed from year to year, looking at consecutive 162-game seasons (I left out any pair of seasons in which a work stoppage caused fewer than 162 games to be played).

Graph of wins changes from year to year.

Wins in a season as a function of wins in the previous year, consecutive 162 game seasons.

It turns out that one half of the previous year’s win total plus 40 is an okay estimate of a team’s wins in the following season. The regression equation pulls all teams toward 81 wins, which of course has to be the result, since baseball is a zero-sum game in terms of wins. If some team is going to win five more games in a season, those five wins need to come out of other teams’ totals.
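As a quick illustration of that rule of thumb, here is a minimal Python sketch. The function name `predict_wins` is mine, and the coefficients are the rounded ones quoted above (one half and 40), not the exact fitted values from the graph, so treat the numbers as approximate.

```python
def predict_wins(prev_wins: float) -> float:
    """Rough next-season estimate: half of last year's wins plus 40.

    Uses the rounded rule of thumb from the post, not the exact fit.
    """
    return 0.5 * prev_wins + 40


# A bad team, an average team, and a very good team.
for prev in (60, 81, 100):
    print(prev, "->", predict_wins(prev))
```

A 60-win team is projected up to 70 and a 100-win team down to 90, so both move toward the middle. With these rounded coefficients the break-even point (where the estimate equals the previous total) works out to 80 wins rather than exactly 81, which is presumably just rounding in the rule of thumb.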

Note that there is a lot of noise in the data, hence the low R-squared. The biggest outliers are the 1999 Diamondbacks, who went from 65 wins to 100, and the 1998 Marlins, who dropped from 92 wins to 54. Note, too, the high end of the scale, the teams that win over 105 games. Six of those teams beat the regression equation the next year, two are about right on, and two do worse than the regression predicts. That much talent on a team tends to hold up.
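To put rough numbers on those two outliers, the sketch below compares their actual win totals with the half-plus-40 rule of thumb. The helper names are hypothetical, and because the rule uses rounded coefficients rather than the exact fitted equation, the gaps are approximate.

```python
def predict_wins(prev_wins: float) -> float:
    """Rounded rule of thumb: half of last year's wins plus 40."""
    return 0.5 * prev_wins + 40


def residual(prev_wins: float, actual_wins: float) -> float:
    """Actual wins minus the rule-of-thumb estimate (positive = beat it)."""
    return actual_wins - predict_wins(prev_wins)


# (previous season wins, wins in the named season), as given in the post.
outliers = {
    "1999 Diamondbacks": (65, 100),
    "1998 Marlins": (92, 54),
}

for team, (prev, actual) in outliers.items():
    print(f"{team}: estimate {predict_wins(prev):.1f}, "
          f"actual {actual}, residual {residual(prev, actual):+.1f}")
```

By this rough measure the Diamondbacks beat the estimate by about 27 wins, and the Marlins fell about 32 wins short of it.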

The reason for much of the noise is that teams do try to fight against this regression. Long-term success means not resting on your laurels, but constantly trying to make small improvements to keep a team competitive. (The Braves serve as a great example of this during their 1990s run of playoff appearances. They seemed to make one significant change every year, upgrading a position that wasn’t bad, but could be made better.) The following table shows how organizations fared in beating the regression equation (accumulating more wins than the regression formula predicts):

TeamName Seasons Beat PctBeat
Yankees 42 29 69
Dodgers 41 25 61
Reds 41 24 58.5
Braves 41 24 58.5
Cardinals 41 24 58.5
Blue Jays 28 16 57.1
Athletics 42 24 57.1
Red Sox 42 24 57.1
Giants 39 22 56.4
Phillies 41 23 56.1
Twins/Senators 42 23 54.8
Angels 42 22 52.4
Tigers 42 22 52.4
Marlins 14 7 50
Pirates 41 20 48.8
Astros 41 20 48.8
Orioles 42 20 47.6
Mets 41 19 46.3
White Sox 42 19 45.2
Royals 34 15 44.1
Mariners 28 12 42.9
Indians 40 17 42.5
Diamondbacks 12 5 41.7
Cubs 41 16 39
Padres 34 13 38.2
Rockies 14 5 35.7
Rangers/Senators 42 15 35.7
Nationals/Expos 34 12 35.3
Brewers/Pilots 33 10 30.3
Rays/Devil Rays 12 2 16.7
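
The percentage column appears to be simply the number of seasons a team beat the regression divided by the number of seasons counted, rounded to one decimal place. A small Python sketch, using a few rows copied from the table (the function name `pct_beat` is mine):

```python
def pct_beat(beat: int, seasons: int) -> float:
    """Share of counted seasons in which the team beat the regression, in percent."""
    return round(100 * beat / seasons, 1)


# (team, seasons counted, seasons beating the regression), from the table above.
rows = [
    ("Yankees", 42, 29),
    ("Braves", 41, 24),
    ("Cubs", 41, 16),
    ("Rays/Devil Rays", 12, 2),
]

for team, seasons, beat in rows:
    print(f"{team}: {pct_beat(beat, seasons)}")
```

These reproduce the table’s values for the rows shown (69, 58.5, 39 and 16.7).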

Not surprisingly, the Yankees and Dodgers, with their great resources over the years, were consistently able to do better than expected. The Athletics, one of the early adopters of sabermetrics, also rank high.

At the other end, there are a bunch of expansion teams and the Cubs. You can see just how poorly the Rays front office performed until recently; the only two seasons in which they beat the regression came in 2008 and 2010. There’s no excuse for the Cubs. They have a huge fan base and a sold-out park. They really should be up there with the Yankees and Dodgers. No wonder they haven’t reached the World Series since 1945.
