(I have previously done this for offensive statistics)
Each stabilization point is a six-year calculation, including the current and five previous years (so for example, 2014 incudes 2009-2014 data, 1965 includes 1959 - 1965 data, etc.). There's not a mathematical or baseball reason for this choice - through trial and error it just seemed to provide enough data for estimation that the overall trend was apparent, with a decent amount of smoothing. Data includes only starting pitchers from each year, and for cutoff values (the minimum number of TBF, BIP, etc. to be included in the dataset) I used the same values as in my previous post. Years were split for the same player. Raw counts are used, not adjusted in any form. Relief pitchers are excluded.
All of the code that I used to create these can be found in my github, though I make no claims to efficiency or ease of operation. Because I added this code several months after the article was originally posted, I did not clean and annotate it as I normally would have - I just posted the raw code. The code is a modified form of the code used to calculate offensive stabilization points over time.
All of the plots shown below, and more, can be found in my imgur account.
For some statistics, I will show plots for both the mean of a statistic over time and the stabilization point. The stabilization point driven largely by the underlying population variance of talent levels, which tends to be more difficult to estimate then the mean - hence the reason that, even with six years of moving data, the 'stabilization point' will appear to fluctuate quite a bit. I recommend not reading too much into the fluctuations, but rather looking for more general patterns.
Firstly, the ground ball, fly ball, and line drive rates (per BIP) only have recent data available. In that time, neither the fly ball or ground ball stabilization points have changed much
More interesting is batting average on balls in play, for which we have more data. The standard error for BABIP is approximately 500 balls in play, so it's not wise to trust small fluctuations in this plot as representing real shifts - however, it does appear that there is a positive trend in the stabilization point, indicative of the spread in BABIP values getting smaller. (A plot with 95% error bounds at each point can be found here, though I don't necessarily care for it)
What does coincide with the increase in stabilization point is the decrease in population variance over time, as seen in this animated plot with the observed strikeout rates (histogram) and estimated true talent distribution (dashed line) - the spread in both is constantly increasing over time.
Also interesting is the earned run rate (per inning pitched, min 80 IP).
Beginning in the early 2000s, it dropped to a very low point, relative to its history, and has remained there more consistently than in the past. Meanwhile, the stabilization point for walk rate (min 400 BF) has increased in recent years, after reaching a maximum in the 1980s and decreasing.
On-base percentage and hit by-pitch rate have all fluctuated within a relatively stable area over time.
For some statistics, such as WHIP and home run rate, and it is difficult to discern a pattern other than fluctuations within a certain range.
An interesting look at how certain things have changed over time - though, as I mentioned before, I would encourage not reading too much into these plots.
Post a Comment