Simple Football Predictor: Week 9 Predictions Including the Spread
Riding high after a record-high week 7, the model regressed to predicting only 50% of the games correctly in week 8. See this week’s predictions below, along with a new Linear Regression model to help predict the spread.
Week 8 Results:
According to the mainstream sports media and football punditry, week 8 was full of upsets, so I shouldn’t be too worried about last week’s weak performance, particularly because I did about as expected in my more confident tiers. Additionally, unlike the week before, there is no evidence of home bias in my incorrect results: I hit 50% regardless of whether the model predicted the home team or the road team to win.
For the season (not including week 5, when my model was all over the place due to a lack of data), the model is correctly predicting the winner roughly 55% of the time. Once again, the model is doing about as well as expected in the higher confidence picks but struggling (thanks in large part to this week) in the lower confidence levels, particularly the 50–55% range. Still, we are only looking at 42 games, so the model should correct itself over time.
Improvements:
This week I will be running 2 models. The first is the Logistic Regression model I’ve been using for the last 4 weeks to predict whether the home or road team will emerge victorious. That model remains the same as last week, so there is no need to delve into the details again (you can find all previous predictions here). The second is a new Linear Regression model to help predict the margin of victory. In general form, a linear regression model looks like this:
Y(hat) = b0 + (b1 x X1) + (b2 x X2) + ... + (bn x Xn)
Without delving too much into the math behind arriving at each ‘b’, the linear regression model above can be interpreted as follows:
Y(hat) represents the predicted outcome, in our case margin of home victory (a negative value would indicate a road victory by that margin).
b0 is the intercept, meaning if all X’s were to equal 0, we would predict this value. In our case, b0 can be interpreted as the predicted margin of victory if both teams entered the game averaging 0 points scored and allowed.
The rest of the b’s are the effects on the predicted value of a 1-unit increase in the associated X, holding the other X’s fixed. For example, if X1 is the average points for by the home team and b1 = .9, then for each 1 point increase in that average, the predicted victory margin will increase by .9 points.
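To make the interpretation above concrete, here is a minimal sketch in plain Python. The coefficients are hypothetical and chosen only for illustration (b1 = .9 echoes the example above); they are not the fitted values from my model, and the `predict` helper is a name I made up for this sketch.

```python
def predict(b0, bs, xs):
    """Linear model prediction: b0 + b1*x1 + ... + bn*xn."""
    return b0 + sum(b * x for b, x in zip(bs, xs))

# Hypothetical coefficients, for illustration only.
b0 = 3.0
bs = [0.9, -0.4]  # b1, b2

# When every X is 0, the prediction is just the intercept b0.
at_zero = predict(b0, bs, [0.0, 0.0])

# Raise X1 by 1 point while holding X2 fixed.
base = predict(b0, bs, [24.0, 20.0])
bumped = predict(b0, bs, [25.0, 20.0])

# The prediction moves by b1 (up to floating-point rounding).
effect = bumped - base
print(at_zero, round(effect, 2))
```

The difference between the two predictions depends only on b1, no matter what values the other X’s take, which is what makes each coefficient readable on its own.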
After using Python to create the regression model, we arrive at the formula below for predicting the margin of victory.
Home_win_margin = 5.3097 + (.3575 x 3_wk_avg_points_for_home) + (-.3609 x 3_wk_avg_points_against_home) + (-.4123 x 3_wk_avg_points_for_away) + (.2604 x 3_wk_avg_points_against_away)
For example, running this formula against last week’s Giants - Buccaneers game gives the following result:
Giants_win_margin = 5.3097 + (.3575 x 25) + (-.3609 x 26) + (-.4123 x 34) + (.2604 x 16.67) = -4.81, or a Buccaneers victory of about 5 points. In actuality, the Buccaneers won by only 2.
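For anyone who wants to plug in their own numbers, the formula can be wrapped in a small Python function. This is a sketch rather than my actual modeling code: the coefficients are copied from the formula above, while the function and argument names are placeholders chosen for readability.

```python
# Coefficients copied from the fitted formula in this post.
INTERCEPT = 5.3097
B_POINTS_FOR_HOME = 0.3575
B_POINTS_AGAINST_HOME = -0.3609
B_POINTS_FOR_AWAY = -0.4123
B_POINTS_AGAINST_AWAY = 0.2604


def home_win_margin(pts_for_home, pts_against_home, pts_for_away, pts_against_away):
    """Predicted home margin from 3-week scoring averages.

    A negative result means the road team is predicted to win by that margin.
    """
    return (
        INTERCEPT
        + B_POINTS_FOR_HOME * pts_for_home
        + B_POINTS_AGAINST_HOME * pts_against_home
        + B_POINTS_FOR_AWAY * pts_for_away
        + B_POINTS_AGAINST_AWAY * pts_against_away
    )


# Giants (home) vs Buccaneers (away), using the averages from the example above.
giants_margin = home_win_margin(25, 26, 34, 16.67)
print(round(giants_margin, 2))  # -4.81
```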
Finally, the R-squared for this model is only .093, meaning the model explains only 9.3% of the variability in win_margin, so we shouldn’t expect very accurate predictions from it. Hopefully, with more features in future weeks, we can improve the R-squared.
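For readers unfamiliar with R-squared, it is commonly computed as 1 minus the ratio of the model’s squared errors to the squared errors of always guessing the average. Below is a minimal sketch of that calculation; the margins in it are made up for illustration and are not my actual data, and `r_squared` is just a helper name I chose.

```python
def r_squared(actual, predicted):
    """R-squared = 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Squared error of the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of always predicting the mean of the actual values.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up home margins of victory and a cautious model's predictions.
actual = [7, -3, 14, -10, 2]
predicted = [2, 1, 3, -1, 0]

r2 = r_squared(actual, predicted)
print(round(r2, 3))
```

A value near 0 means the model does little better than always predicting the average margin, while a value of 1 would mean every margin was predicted exactly.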
Predictions:
Predictions vs spread by game (spread as of 11:59PM Wednesday)
Packers (-7) vs 49ers: 49ers
Broncos vs Falcons (-4): Falcons
Seahawks (-2.5) vs Bills: Bills
Ravens (-2.5) vs Colts: Colts
Texans (-7) vs Jaguars: Jaguars
Panthers vs Chiefs (-11): Chiefs
Lions vs Vikings (-4): Lions
Bears vs Titans (-6): Bears
Giants vs Washington (-3): Giants
Raiders vs Chargers (-1): Chargers
Dolphins vs Cardinals (-4.5): Dolphins
Steelers (-13.5) vs Cowboys: Cowboys
Saints vs Buccaneers (-5): Buccaneers
Patriots (-7) vs Jets: Jets
Quick Observations:
- The model for margin of victory seems to be very cautious, predicting double-digit victories only twice and 8 out of 14 games finishing within 3 points. Hopefully this will correct itself a bit when we add more features starting next week.
- The Jets and Vikings are both predicted to win (due to the home bias in the model, not actual skill), yet by a negative margin. The discrepancy arises because the winner and the margin are predicted by 2 separate models. In both models, these two games are among the closest to a 50/50 toss-up, so they could go either way.
- Next week we will be adding 3-week average yardage to our models. This will include Passing, Rushing, and Total Yards, both for and against. We will again test multiple models to see which has the best predictive power.
I am currently enrolled in the Applied Data Science Professional Certificate program at the Thayer School of Engineering at Dartmouth College. Please reach out to me via LinkedIn (https://www.linkedin.com/in/nathanielselevan/) or comment below with any questions, comments, or project suggestions, or if you would like to act as a mentor. All likes and comments are much appreciated.