24 May, 2016

2016 Win Prediction Totals (Through May 22)

These predictions are based on my own silly estimator, which I know can be improved with some effort on my part.  There's some work related to this estimator that I'm trying to get published academically, so I won't talk about the technical details yet (not that they're particularly mind-blowing anyway).

I set the nominal coverage at 95% (meaning that, the way I calculated them, the intervals should contain the final win total 95% of the time), but based on tests at this same point in earlier seasons, the actual coverage is slightly under 94%, with intervals being off by one game if and when they are off.
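To make "actual coverage" concrete, here is a minimal sketch of how it could be checked against past seasons. The function name and the toy numbers are made up for illustration; they are not my actual test data or code.

```python
def empirical_coverage(intervals, actual_wins):
    """Fraction of inclusive (lower, upper) intervals that contain the actual win total."""
    hits = sum(
        1
        for (lower, upper), wins in zip(intervals, actual_wins)
        if lower <= wins <= upper
    )
    return hits / len(intervals)


# Hypothetical toy data: four intervals and the final win totals they tried to cover.
toy_intervals = [(70, 90), (60, 80), (85, 100), (50, 70)]
toy_actuals = [75, 81, 85, 60]

# The second interval misses by one game; the other three cover.
coverage = empirical_coverage(toy_intervals, toy_actuals)
```

A nominal 95% interval whose coverage comes out slightly below 0.95 on historical data is the situation described above.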

Intervals are inclusive. All win totals assume a 162-game schedule.

\begin{array} {c c c c c c} 
\textrm{Team}  & \textrm{Lower}  & \textrm{Mean} & \textrm{Upper} & \textrm{True Win Total}  & \textrm{Current Wins}\\ \hline

ARI & 65 & 79.58 & 94 & 81.81 & 21 \\
ATL & 48 & 61.91 & 77 & 67.95 & 12 \\
BAL & 74 & 89.11 & 104 & 85.19 & 26 \\
BOS & 80 & 94.48 & 109 & 92.65 & 27 \\
CHC & 88 & 102.56 & 117 & 99.31 & 29 \\
CHW & 75 & 89.64 & 104 & 87.36 & 26 \\
CIN & 49 & 63.47 & 78 & 66.52 & 15 \\
CLE & 71 & 85.5 & 100 & 85.03 & 22 \\
COL & 66 & 80.94 & 96 & 80.92 & 21 \\
DET & 66 & 80.55 & 95 & 81.05 & 21 \\
HOU & 57 & 71.08 & 86 & 74.86 & 17 \\
KCR & 64 & 79.12 & 94 & 77.75 & 22 \\
LAA & 62 & 76.87 & 91 & 78.08 & 20 \\
LAD & 68 & 82.33 & 97 & 83.53 & 22 \\
MIA & 67 & 81.46 & 96 & 80.95 & 22 \\
MIL & 57 & 71.51 & 86 & 73.47 & 18 \\
MIN & 47 & 61.06 & 76 & 68.16 & 11 \\
NYM & 72 & 87.09 & 102 & 84.52 & 25 \\
NYY & 63 & 77.99 & 93 & 77.6 & 21 \\
OAK & 58 & 71.82 & 86 & 73.12 & 19 \\
PHI & 67 & 81.6 & 96 & 77.71 & 25 \\
PIT & 69 & 83.7 & 98 & 81.94 & 23 \\
SDP & 59 & 72.83 & 87 & 74.53 & 19 \\
SEA & 76 & 90.55 & 105 & 87.86 & 26 \\
SFG & 72 & 85.92 & 100 & 82.28 & 27 \\
STL & 73 & 87.24 & 102 & 88.18 & 23 \\
TBR & 68 & 82.68 & 98 & 83.92 & 20 \\
TEX & 70 & 84.81 & 99 & 82.1 & 25 \\
TOR & 65 & 79.6 & 94 & 80.45 & 22 \\
WSN & 78 & 92.88 & 107 & 90.45 & 27 \\  \hline\end{array}

As you would expect, it's really, really difficult to predict how many games a team is going to win only a quarter of the way through the season, and intervals are necessarily going to be very wide. A couple of things stand out, though - at this point we can be confident that the Chicago Cubs will finish above 0.500 and the Minnesota Twins, Cincinnati Reds, and Atlanta Braves will finish below 0.500. For every other team, we just don't have enough information yet.
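The "confident above or below 0.500" reading is just a check of whether the whole interval sits on one side of 81 wins (exactly 0.500 over 162 games). A small sketch, using a few rows from the table above; the function name is my own invention for this example:

```python
BREAK_EVEN = 81  # 81 wins out of 162 games is exactly 0.500


def classify(lower, upper, break_even=BREAK_EVEN):
    """Classify an inclusive win-total interval relative to a 0.500 record."""
    if lower > break_even:
        return "above"
    if upper < break_even:
        return "below"
    return "undetermined"


# (lower, upper) bounds copied from the table for a handful of teams.
bounds = {
    "CHC": (88, 117),
    "MIN": (47, 76),
    "CIN": (49, 78),
    "ATL": (48, 77),
    "BOS": (80, 109),
}

verdicts = {team: classify(lo, hi) for team, (lo, hi) in bounds.items()}
```

Boston shows how close a team can come without qualifying: its lower bound of 80 is still on the wrong side of 81, so it stays undetermined.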

To explain the difference between "Mean" and "True Win Total"  - imagine flipping a fair coin 10 times. The number of heads you expect is 5 - this is what I have called "True Win Total," representing my best guess at the true ability of the team over 162 games. However, if you pause halfway through and note that in the first 5 flips there were 4 heads, the predicted total number of heads becomes $4 + 0.5(5) = 6.5$ - this is what I have called "Mean", representing the expected number of wins based on true ability over the remaining schedule added to the current number of wins (from the beginning of the season until May 22).
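The coin-flip arithmetic can be written out as a short sketch. The function and parameter names are hypothetical, and it only covers the "Mean" calculation as described here, not the estimator that produces the "True Win Total" itself:

```python
def projected_mean(current_wins, games_played, true_win_total, season_length=162):
    """Current wins plus expected wins over the remaining games.

    The true win total is converted to a per-game win rate, which is then
    applied only to the games that have not been played yet.
    """
    win_rate = true_win_total / season_length
    remaining_games = season_length - games_played
    return current_wins + win_rate * remaining_games


# Coin example from the text: 10 flips, 5 expected heads overall,
# and 4 heads observed in the first 5 flips.
coin_mean = projected_mean(current_wins=4, games_played=5,
                           true_win_total=5, season_length=10)
```

With no games played the function simply returns the true win total, which is why the two columns drift apart only as the season goes on.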

These quantiles are based on a distribution - I've uploaded a picture of each team's distribution to imgur. The bars in red are the win total values covered by the 95% interval. The blue line represents my estimate of the team's "True Win Total" based on its performance - so if the blue line is to the left of the peak, the team is predicted to finish "lucky" - more wins than would be expected based on their talent level - and if the blue line is to the right of the peak, the team is predicted to finish "unlucky" - fewer wins than would be expected based on their talent level.
