February 10, 2011

Objective PMR

In November I made note of this post, “Jeter Uncertainty Principle,” which linked to a Baseball Prospectus article that wondered whether the batted-ball fielding systems were overestimating the range of players.

the spread of observed performance in metrics like UZR and DRS is much, much smaller than that of metrics like nFRAA (or Tom Tango’s With Or Without You system, which is similarly down on Jeter’s fielding ability).

This got me thinking, but unfortunately my computer crashed and I lost my MS SQL database with my fielding data. I recently downloaded the Retrosheet data through 2009 and decided to run an experiment.

The Probabilistic Model of Range uses six parameters to determine the probability of a ball in play being turned into an out. Three of those, the direction, velocity, and batted ball type, are subjective measures. The other three, the handedness of the batter, the handedness of the pitcher, and the park, are objective. If I used just those three objective measures to construct the model, would I see a bigger spread in the data?
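As a rough illustration of what a model built from only the three objective parameters could look like, the sketch below groups balls in play by (batter hand, pitcher hand, park) and uses the fraction turned into outs in each group as the out probability. The record layout and the names `BallInPlay`, `build_objective_model`, and `predicted_outs` are assumptions made for this example, and the park code is made up; this is not the actual PMR code.

```python
from collections import defaultdict
from typing import NamedTuple

class BallInPlay(NamedTuple):
    batter_hand: str   # "L" or "R"
    pitcher_hand: str  # "L" or "R"
    park: str          # park code (hypothetical values below)
    is_out: bool       # True if the ball in play was turned into an out

def build_objective_model(balls):
    """Return {(batter_hand, pitcher_hand, park): P(out)} from observed balls in play."""
    counts = defaultdict(lambda: [0, 0])  # key -> [outs, balls in play]
    for b in balls:
        key = (b.batter_hand, b.pitcher_hand, b.park)
        counts[key][0] += int(b.is_out)
        counts[key][1] += 1
    return {key: outs / total for key, (outs, total) in counts.items()}

def predicted_outs(model, balls):
    """Sum the model's out probabilities over a set of balls in play.

    Assumes every (batter_hand, pitcher_hand, park) bucket seen here
    also appeared in the data used to build the model.
    """
    return sum(model[(b.batter_hand, b.pitcher_hand, b.park)] for b in balls)

# Tiny made-up sample: four R-vs-R balls in play (three outs), one L-vs-R out.
sample = [
    BallInPlay("R", "R", "PARK1", True),
    BallInPlay("R", "R", "PARK1", True),
    BallInPlay("R", "R", "PARK1", False),
    BallInPlay("R", "R", "PARK1", True),
    BallInPlay("L", "R", "PARK1", True),
]
model = build_objective_model(sample)
```

With real data, the model would be built from one set of games and `predicted_outs` applied to a team's balls in play from the season being rated.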

I last computed PMR in 2008, and here is the team listing from that year. I built the Retrosheet model a bit differently. I use only visiting fielders, so that a great or terrible home fielder doesn’t overly influence the model, but I built it from a different set of data. Instead of using the same year, I used the previous three seasons and the following season. So for 2008, I used the 2005, 2006, 2007, and 2009 seasons. Here are the results for the teams:
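The data selection described above can be sketched roughly as follows. The function names and the dictionary keys (`season`, `fielding_team`, `visiting_team`) are hypothetical and chosen only for illustration; they are not Retrosheet field names.

```python
def training_seasons(target_year):
    """Previous three seasons plus the following one, excluding the target year."""
    return [target_year - 3, target_year - 2, target_year - 1, target_year + 1]

def visiting_fielding_plays(plays, seasons):
    """Keep plays from the chosen seasons where the fielding team was the visitor."""
    wanted = set(seasons)
    return [
        p for p in plays
        if p["season"] in wanted and p["fielding_team"] == p["visiting_team"]
    ]

# Made-up plays to show the filter; only the first one qualifies for a 2008 model.
plays = [
    {"season": 2007, "fielding_team": "BOS", "visiting_team": "BOS"},
    {"season": 2007, "fielding_team": "NYA", "visiting_team": "BOS"},  # home team fielding
    {"season": 2008, "fielding_team": "BOS", "visiting_team": "BOS"},  # target year, excluded
]
selected = visiting_fielding_plays(plays, training_seasons(2008))
```

Leaving the target year out of the training data means a team's own 2008 fielding does not feed into the expectation it is measured against.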

Objective PMR, 2008 Teams, model built with visiting team data from 2005, 2006, 2007, and 2009
| Team | In Play | Actual Outs | Predicted Outs | Actual DER | Predicted DER | Index |
|------|---------|-------------|----------------|------------|---------------|-------|
| BOS | 4229 | 2954 | 2825.582 | 0.699 | 0.668 | 104.5 |
| TBA | 4265 | 3024 | 2908.787 | 0.709 | 0.682 | 104.0 |
| TOR | 4217 | 2962 | 2881.598 | 0.702 | 0.683 | 102.8 |
| CHN | 4164 | 2930 | 2857.973 | 0.704 | 0.686 | 102.5 |
| ANA | 4374 | 3024 | 2973.781 | 0.691 | 0.680 | 101.7 |
| OAK | 4292 | 2992 | 2944.710 | 0.697 | 0.686 | 101.6 |
| MIL | 4362 | 3048 | 3006.376 | 0.699 | 0.689 | 101.4 |
| SLN | 4604 | 3198 | 3159.799 | 0.695 | 0.686 | 101.2 |
| FLO | 4342 | 3005 | 2970.709 | 0.692 | 0.684 | 101.2 |
| PHI | 4399 | 3061 | 3023.389 | 0.696 | 0.687 | 101.2 |
| ATL | 4392 | 3040 | 3008.062 | 0.692 | 0.685 | 101.1 |
| KCA | 4416 | 3039 | 3008.497 | 0.688 | 0.681 | 101.0 |
| NYN | 4341 | 3030 | 3004.247 | 0.698 | 0.692 | 100.9 |
| NYA | 4351 | 2963 | 2941.234 | 0.681 | 0.676 | 100.7 |
| COL | 4535 | 3075 | 3056.998 | 0.678 | 0.674 | 100.6 |
| CLE | 4514 | 3094 | 3078.160 | 0.685 | 0.682 | 100.5 |
| HOU | 4298 | 2999 | 2988.713 | 0.698 | 0.695 | 100.3 |
| BAL | 4539 | 3120 | 3112.376 | 0.687 | 0.686 | 100.2 |
| DET | 4536 | 3107 | 3104.271 | 0.685 | 0.684 | 100.1 |
| WAS | 4420 | 3044 | 3044.363 | 0.689 | 0.689 | 100.0 |
| LAN | 4277 | 2951 | 2950.631 | 0.690 | 0.690 | 100.0 |
| MIN | 4584 | 3141 | 3143.573 | 0.685 | 0.686 | 99.9 |
| PIT | 4688 | 3166 | 3187.588 | 0.675 | 0.680 | 99.3 |
| ARI | 4236 | 2903 | 2927.417 | 0.685 | 0.691 | 99.2 |
| SEA | 4514 | 3069 | 3095.687 | 0.680 | 0.686 | 99.1 |
| SDN | 4426 | 3080 | 3109.792 | 0.696 | 0.703 | 99.0 |
| SFN | 4237 | 2904 | 2942.056 | 0.685 | 0.694 | 98.7 |
| CHA | 4395 | 3006 | 3058.851 | 0.684 | 0.696 | 98.3 |
| TEX | 4671 | 3126 | 3185.941 | 0.669 | 0.682 | 98.1 |
| CIN | 4313 | 2904 | 2990.839 | 0.673 | 0.693 | 97.1 |
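
The columns in the table appear to be related by simple arithmetic: DER is outs divided by balls in play, and the index is actual outs as a percentage of predicted outs. That relationship is inferred from the numbers rather than stated in the post, so treat the sketch below as an assumption; the function names are made up for the example. It reproduces the Boston row.

```python
def der(outs, balls_in_play):
    """Defensive efficiency: share of balls in play turned into outs."""
    return outs / balls_in_play

def pmr_index(actual_outs, predicted_outs):
    """Actual outs as a percentage of predicted outs (100 = as expected)."""
    return 100 * actual_outs / predicted_outs

# Boston's 2008 row from the table.
in_play, actual, predicted = 4229, 2954, 2825.582

actual_der = round(der(actual, in_play), 3)        # 0.699
predicted_der = round(der(predicted, in_play), 3)  # 0.668
index = round(pmr_index(actual, predicted), 1)     # 104.5
```

An index above 100 means the team turned more balls in play into outs than the model expected from the batters, pitchers, and parks involved.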

The number of balls in play does not match up precisely between BIS and Retrosheet, and I’m exploring why (in the case of Baltimore, it was a foul ball error). The ordering is different, which isn’t surprising given the different parameters and the way the model was built. What I would like you to notice, however, is that the spread of the index is indeed wider than the original PMR model shows. So a probabilistic model that uses only objective parameters also shows a larger spread.
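One way to put a number on "spread" is to take the range and standard deviation of the Index column. The sketch below does that for the 30 team indexes in the table above; the choice of these two summary statistics is an assumption for illustration, not necessarily how the comparison with the original PMR was made.

```python
import statistics

# Index column from the 2008 Objective PMR team table, top to bottom.
indexes = [
    104.5, 104.0, 102.8, 102.5, 101.7, 101.6, 101.4, 101.2, 101.2, 101.2,
    101.1, 101.0, 100.9, 100.7, 100.6, 100.5, 100.3, 100.2, 100.1, 100.0,
    100.0, 99.9, 99.3, 99.2, 99.1, 99.0, 98.7, 98.3, 98.1, 97.1,
]

index_range = round(max(indexes) - min(indexes), 1)  # best team minus worst team
index_stdev = statistics.pstdev(indexes)             # population standard deviation

print(f"range: {index_range}, standard deviation: {index_stdev:.2f}")
```

Running the same calculation on the index column of the original 2008 PMR team listing would give the figures to compare against.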

I’m going to try to get more comfortable with the data, making sure the differences are just foul pops. Then, I’ll build some positional models to see what happens there.

3 thoughts on “Objective PMR”

  1. dondbaseball

    I profess to being a neophyte on the defensive metric systems because, as you noted, most are based on subjective opinion, and I find it really irritating when writers, bloggers and fans quote these systems as gospel despite the drastic differences in rating players between them. I view this as a work in progress, similar to what pitch f/x once was (and still is): until ballparks can set up a grid-like system using radar-type tracking (or whatever else they use for pitching determination), we will not have a reliable system. That being said, I like your continued exploration of defensive evaluation. Because of who he is, I feel Jeter is much maligned by everyone due to flawed systems. He is observed by everyone and usually with a critical eye, meaning people want to see his weaknesses. Do I think he has some range deficiencies? Yes, but not to the extent that UZR and DRS do nor to the extent that Tango keeps mouthing about. Defensive metrics are not concrete like any of the batting metrics and should not be considered as such but people seem to forget that…I look forward to your continued evaluation on this. Keep up the good work!

  2. Petr S

    Nice to see the “return” of PMR, David; thanks! The fielding talks at the last Sportvision summit were interesting. One of these is online at the web address http://www.whowins.com/wherefieldersfield201007.pdf. It was on SABR-L too. From the later pages in that pitch, it seems like the subjective parameters you identify may soon become objective, especially if field f/x is made public. Thanks again.
