November 28, 2006
Probabilistic Model of Range, Third Basemen, 2006
There's been a suggestion to present the data in a different format, so I'm going to try that with the third basemen. I'm also reporting only the mixed velocity/distance model here, since people seem to like that model better. At some point, I'll redo the tables for the positions posted earlier. Here's the ranking of the third basemen based on difference in DER.
Probabilistic Model of Range, Third Basemen. Model is Based on 2006 Data Only. Minimum 1000 Balls in Play. Uses Distance for Fly Balls.
Player | In Play | Actual Outs | Predicted Outs | DER | Predicted DER | Difference |
---|---|---|---|---|---|---|
Joe Crede | 3962 | 436 | 397.55 | 0.110 | 0.100 | 0.00971 |
Freddy Sanchez | 2527 | 285 | 265.88 | 0.113 | 0.105 | 0.00757 |
Pedro Feliz | 4278 | 420 | 391.93 | 0.098 | 0.092 | 0.00656 |
Brandon Inge | 4278 | 506 | 479.75 | 0.118 | 0.112 | 0.00614 |
Adrian Beltre | 4159 | 416 | 393.60 | 0.100 | 0.095 | 0.00539 |
Maicer E Izturis | 2069 | 182 | 171.49 | 0.088 | 0.083 | 0.00508 |
Scott Rolen | 3788 | 390 | 371.79 | 0.103 | 0.098 | 0.00481 |
Mike Lowell | 3990 | 429 | 411.96 | 0.108 | 0.103 | 0.00427 |
Morgan Ensberg | 2917 | 289 | 276.96 | 0.099 | 0.095 | 0.00413 |
Ryan W Zimmerman | 4383 | 382 | 365.01 | 0.087 | 0.083 | 0.00388 |
Andy M Marte | 1348 | 141 | 135.81 | 0.105 | 0.101 | 0.00385 |
Corey Koskie | 1847 | 189 | 182.02 | 0.102 | 0.099 | 0.00378 |
David Bell | 3716 | 347 | 334.10 | 0.093 | 0.090 | 0.00347 |
Willy Aybar | 1388 | 106 | 102.60 | 0.076 | 0.074 | 0.00245 |
Eric Chavez | 3607 | 362 | 353.27 | 0.100 | 0.098 | 0.00242 |
Nick Punto | 2256 | 217 | 212.22 | 0.096 | 0.094 | 0.00212 |
Miguel Cabrera | 4010 | 349 | 342.51 | 0.087 | 0.085 | 0.00162 |
Vinny Castilla | 1755 | 161 | 158.59 | 0.092 | 0.090 | 0.00138 |
Chad A Tracy | 3930 | 339 | 337.78 | 0.086 | 0.086 | 0.00031 |
Hank Blalock | 3374 | 293 | 292.07 | 0.087 | 0.087 | 0.00027 |
Melvin Mora | 4109 | 372 | 372.59 | 0.091 | 0.091 | -0.00014 |
David A Wright | 4041 | 356 | 359.00 | 0.088 | 0.089 | -0.00074 |
Troy Glaus | 3586 | 324 | 326.88 | 0.090 | 0.091 | -0.00080 |
Aramis Ramirez | 3934 | 333 | 336.63 | 0.085 | 0.086 | -0.00092 |
Chipper Jones | 2811 | 247 | 250.06 | 0.088 | 0.089 | -0.00109 |
Mark T Teahen | 2954 | 286 | 289.22 | 0.097 | 0.098 | -0.00109 |
Abraham O Nunez | 1876 | 182 | 184.40 | 0.097 | 0.098 | -0.00128 |
B.J. Upton | 1326 | 114 | 115.79 | 0.086 | 0.087 | -0.00135 |
Mark DeRosa | 1098 | 97 | 99.17 | 0.088 | 0.090 | -0.00197 |
Alex Rodriguez | 3968 | 330 | 338.71 | 0.083 | 0.085 | -0.00219 |
Wilson Betemit | 1831 | 142 | 146.67 | 0.078 | 0.080 | -0.00255 |
Garrett Atkins | 4385 | 358 | 375.87 | 0.082 | 0.086 | -0.00408 |
Edwin Encarnacion | 2908 | 252 | 265.44 | 0.087 | 0.091 | -0.00462 |
Aubrey Huff | 2133 | 193 | 203.79 | 0.090 | 0.096 | -0.00506 |
Aaron Boone | 2748 | 221 | 235.26 | 0.080 | 0.086 | -0.00519 |
Tony Batista | 1354 | 114 | 124.03 | 0.084 | 0.092 | -0.00741 |
Rich Aurilia | 1109 | 101 | 112.09 | 0.091 | 0.101 | -0.01000 |
As you can see, Joe Crede earned that Gold Glove. Now here's the same list using just outs, and sorted by 100*Actual Outs/Predicted Outs.
Probabilistic Model of Range, Third Basemen. Model is Based on 2006 Data Only. Minimum 1000 Balls in Play. Uses Distance for Fly Balls. Sorted by Out Ratio.
Player | In Play | Actual Outs | Predicted Outs | Out Difference | Out Ratio |
---|---|---|---|---|---|
Joe Crede | 3962 | 436 | 397.55 | 38.45 | 109.67 |
Freddy Sanchez | 2527 | 285 | 265.88 | 19.12 | 107.19 |
Pedro Feliz | 4278 | 420 | 391.93 | 28.07 | 107.16 |
Maicer E Izturis | 2069 | 182 | 171.49 | 10.51 | 106.13 |
Adrian Beltre | 4159 | 416 | 393.60 | 22.40 | 105.69 |
Brandon Inge | 4278 | 506 | 479.75 | 26.25 | 105.47 |
Scott Rolen | 3788 | 390 | 371.79 | 18.21 | 104.90 |
Ryan W Zimmerman | 4383 | 382 | 365.01 | 16.99 | 104.65 |
Morgan Ensberg | 2917 | 289 | 276.96 | 12.04 | 104.35 |
Mike Lowell | 3990 | 429 | 411.96 | 17.04 | 104.14 |
David Bell | 3716 | 347 | 334.10 | 12.90 | 103.86 |
Corey Koskie | 1847 | 189 | 182.02 | 6.98 | 103.84 |
Andy M Marte | 1348 | 141 | 135.81 | 5.19 | 103.82 |
Willy Aybar | 1388 | 106 | 102.60 | 3.40 | 103.32 |
Eric Chavez | 3607 | 362 | 353.27 | 8.73 | 102.47 |
Nick Punto | 2256 | 217 | 212.22 | 4.78 | 102.25 |
Miguel Cabrera | 4010 | 349 | 342.51 | 6.49 | 101.89 |
Vinny Castilla | 1755 | 161 | 158.59 | 2.41 | 101.52 |
Chad A Tracy | 3930 | 339 | 337.78 | 1.22 | 100.36 |
Hank Blalock | 3374 | 293 | 292.07 | 0.93 | 100.32 |
Melvin Mora | 4109 | 372 | 372.59 | -0.59 | 99.84 |
David A Wright | 4041 | 356 | 359.00 | -3.00 | 99.17 |
Troy Glaus | 3586 | 324 | 326.88 | -2.88 | 99.12 |
Aramis Ramirez | 3934 | 333 | 336.63 | -3.63 | 98.92 |
Mark T Teahen | 2954 | 286 | 289.22 | -3.22 | 98.89 |
Chipper Jones | 2811 | 247 | 250.06 | -3.06 | 98.78 |
Abraham O Nunez | 1876 | 182 | 184.40 | -2.40 | 98.70 |
B.J. Upton | 1326 | 114 | 115.79 | -1.79 | 98.46 |
Mark DeRosa | 1098 | 97 | 99.17 | -2.17 | 97.82 |
Alex Rodriguez | 3968 | 330 | 338.71 | -8.71 | 97.43 |
Wilson Betemit | 1831 | 142 | 146.67 | -4.67 | 96.82 |
Garrett Atkins | 4385 | 358 | 375.87 | -17.87 | 95.25 |
Edwin Encarnacion | 2908 | 252 | 265.44 | -13.44 | 94.94 |
Aubrey Huff | 2133 | 193 | 203.79 | -10.79 | 94.71 |
Aaron Boone | 2748 | 221 | 235.26 | -14.26 | 93.94 |
Tony Batista | 1354 | 114 | 124.03 | -10.03 | 91.91 |
Rich Aurilia | 1109 | 101 | 112.09 | -11.09 | 90.11 |
As you can see, the order is almost exactly the same. From this chart, the Indians should be happier with Marte at third than Boone. And Alex Rodriguez must have made up for all those errors someplace else, since he's only down 8 outs. Freddy Sanchez did it all, winning a batting title and playing a great third base. The Joe Randa injury was the best thing to happen to Pittsburgh last year.
Please let me know which presentation you like better in the comments.
I like the "100 is average" scale a LOT better. Interesting stuff! Thanks!
Now if only Double-E could quit making so many E's for my Reds!
beltre may be a huge bust with the bat but he plays stellar defense.
Your positive rating of Miguel "The Butcher" Cabrera is perplexing to say the least.
I like the order of magnitude difference - the percentages are much easier on the eyes than the raw DER difference, especially for individuals.
I like more what it does to the data itself. Using the out ratio removes bias for players who got easy-to-field balls (high predicted DER).
Example:
Third basemen A and B both see 2500 balls in play.
Player A sees many balls hit toward third base and weakly, has a high predicted DER and is expected to make 300 outs. He fields well and makes 320 outs. His DER delta is .008 and his out ratio is 106.67.
Player B sees few balls hit toward third base and hard, has low predicted DER and is expected to make 200 outs. He fields extremely well and makes 217 outs. His DER delta is .0068 and his out ratio is 108.5.
DER Difference rates player A higher but Out Ratio (correctly) rates player B higher. Player A got more "extra" outs but player B's outs are more impressive given the opportunities he got.
Math:
Oa = Actual Outs
Op = Predicted Outs
B = Balls in Play
Out Ratio = 100*Oa/Op
Out Ratio is a rate statistic of how many outs made per outs the player "should have made."
DER Difference = (Oa-Op)/B
DER Difference is a rate statistic of how many extra outs are made (or lost) per ball seen by the *team*. It ignores some of the information we have available: how many balls the player should have caught at his position.
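To make the two definitions concrete, here is a minimal Python sketch that applies them to the hypothetical players A and B from the example. The function names are made up for illustration, and the ratio is scaled by 100 to match the "100 is average" column in the post's second table.

```python
def der_difference(actual_outs, predicted_outs, balls_in_play):
    """Extra outs made (or lost) per ball in play: (Oa - Op) / B."""
    return (actual_outs - predicted_outs) / balls_in_play


def out_ratio(actual_outs, predicted_outs):
    """Actual outs per predicted out, scaled so that 100 is average."""
    return 100.0 * actual_outs / predicted_outs


# Hypothetical players A and B from the example; both see 2500 balls in play.
players = {
    "A": {"actual": 320, "predicted": 300, "in_play": 2500},
    "B": {"actual": 217, "predicted": 200, "in_play": 2500},
}

for name, p in players.items():
    diff = der_difference(p["actual"], p["predicted"], p["in_play"])
    ratio = out_ratio(p["actual"], p["predicted"])
    print(f"Player {name}: DER difference {diff:.4f}, out ratio {ratio:.2f}")
```

Player A comes out ahead on DER difference and player B comes out ahead on out ratio, which is the disagreement the example is meant to show.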
I like the "100 is average" tweak as well. That is one of the reasons why I have always been fond of ERA+ and OPS+.
On another note, I thought you were partial to using both distance AND velocity, whereas now you are using only distance... May I ask why?
One more vote for the second chart... love the 100-base scale.
Not surprisingly, I like the 2nd table as well. Now if you drop the decimals in the last 3 columns (which are distracting and imply far more precision than is real), I think you'll have a really great presentation of the data.
I think you're insinuating that Crede won a Gold Glove this year, and he didn't. Chavez robbed him.
Veteran White Sox watchers will agree, Crede does have solid defense. It is to their credit that they have stuck with him while his bat catches up. With his contract coming to an end, and Boras as an agent, he's going to be a very rich young man. I like the ranking using "100"; it's easy to follow and understand.
The "out difference" should be rounded to the whole number.
If we use that as a precision-level, then the out ratio needs to be rounded to the first decimal place.
That is:
400/400*100=100.00
401/400*100=100.25
402/400*100=100.50
403/400*100=100.75
404/400*100=101.00
Each one is one out more, so no need to show it to the second decimal place. You can reasonably argue that the out ratio should be to the zero-th decimal place, since a guy who is +0 and a guy who is +4 are probably the "same" when you consider the uncertainty level of the metric.
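The granularity argument can be checked with a short Python sketch: one additional out moves the 100-based out ratio by 100 divided by the predicted outs. The function name is hypothetical, and the predicted-out totals other than 400 are only illustrative values.

```python
def out_ratio_step(predicted_outs):
    """Change in the 100-based out ratio caused by one additional out."""
    return 100.0 / predicted_outs


# With 400 predicted outs, each extra out is worth a quarter of a point.
for predicted in (100, 200, 400):
    step = out_ratio_step(predicted)
    print(f"{predicted} predicted outs: one out moves the ratio by {step:.2f}")
```

For a player with fewer predicted outs the step is larger, so the second decimal place carries even less information for part-time players.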
***
As for maynard's reasoning of differential and ratio, it doesn't apply as much as he's stating. The extreme among the 3B "predicted" is .080 to .112. If you have 4000 balls in play, one guy has 320 expected outs, and the other guy has 448 expected outs. If they are both 10% above (i.e., out ratio of 110), one is +32 and the other is +45.
However, in terms of significance, the first guy, through no fault of his own, actually did have fewer opps, and therefore, his 110 score is actually less meaningful than the second guy's 110 score.
The "true" answer is to do what I do, and convert the scores into z-scores. You will find that the "true" answer will lie somewhere between the "out difference" and the "out ratio".
Nonetheless, since David is now showing both side-by-side, we have no issue. Great job to David.
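The comment does not say how the z-scores are computed, so the following Python sketch is only one possible reading. It assumes every ball in play is an independent trial with the same out probability (predicted outs divided by balls in play), which is a simplification of a model that assigns a separate probability to each ball. The function name is hypothetical.

```python
import math


def binomial_z_score(actual_outs, predicted_outs, balls_in_play):
    """Outs above prediction, in standard deviations.

    Assumption: outs follow a binomial distribution with a single
    probability p = predicted_outs / balls_in_play for every ball.
    """
    p = predicted_outs / balls_in_play
    std_dev = math.sqrt(balls_in_play * p * (1.0 - p))
    return (actual_outs - predicted_outs) / std_dev


# The two fielders from the comment: 4000 balls in play, both 10% above
# prediction, with predicted DER of .080 and .112 respectively.
low_opps = binomial_z_score(352.0, 320.0, 4000)
high_opps = binomial_z_score(492.8, 448.0, 4000)

print(f"predicted DER .080: z = {low_opps:.2f}")
print(f"predicted DER .112: z = {high_opps:.2f}")
```

Under this assumption the fielder with more expected outs gets the larger z-score for the same out ratio of 110, which agrees with the point that his 110 is the more meaningful of the two.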
yeah, Crede was robbed again for Gold Glove, but somehow he won 3b Silver Slugger over A Rod.
"yeah, Crede was robbed again for Gold Glove, but somehow he won 3b Silver Slugger over A Rod."
Cash wise, this probably evens out. Chavez better watch out: unless he hits and fields well next year that award is Crede's.
I'm inclined to like the first table a bit better. What about sorting the data by out difference, however? That would give the clearest picture of who the most valuable defenders actually were.
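A quick Python sketch of the suggested sort, using only the top six rows of the out-ratio table from the post. The variable names are made up for illustration.

```python
# (player, actual outs, predicted outs) copied from the second table.
rows = [
    ("Joe Crede", 436, 397.55),
    ("Freddy Sanchez", 285, 265.88),
    ("Pedro Feliz", 420, 391.93),
    ("Maicer E Izturis", 182, 171.49),
    ("Adrian Beltre", 416, 393.60),
    ("Brandon Inge", 506, 479.75),
]

# Sort by out difference (actual minus predicted), largest first.
by_out_difference = sorted(rows, key=lambda r: r[1] - r[2], reverse=True)

for player, actual, predicted in by_out_difference:
    print(f"{player}: {actual - predicted:+.2f}")
```

Sorting this way moves full-time players such as Feliz and Inge ahead of Sanchez and Izturis, who have higher ratios but fewer balls in play.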