No variables apart from team vs team results. This is using an ELO rating system for each team, similar to the one used in chess to score players. It gives teams more points for beating teams rated better than them, and takes more points away when they lose to teams rated weaker than them. The good thing about this is that it can be run historically to get an idea of which team had the most dominance over their peers. I might try and collect some more data and see which teams come out on top. Yep! Started from 2000 (although because teams' ratings partially reset each season, starting earlier would have virtually no impact). However, ratings from 2014/13/12 are still taken into account in this model.
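The chess-style update described above can be sketched as follows. The 400-point logistic scale and K = 32 are the usual chess defaults, not necessarily the values BEBot uses:

```python
# The standard chess-style ELO update: the 400-point logistic scale and
# K = 32 are the usual chess defaults, not necessarily BEBot's values.

def expected_score(rating_a, rating_b):
    """Probability that team A beats team B under the ELO model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating_a, rating_b, result_a, k=32):
    """result_a: 1 for an A win, 0.5 for a draw, 0 for a loss."""
    exp_a = expected_score(rating_a, rating_b)
    rating_a_new = rating_a + k * (result_a - exp_a)
    rating_b_new = rating_b + k * ((1 - result_a) - (1 - exp_a))
    return rating_a_new, rating_b_new

# An upset win over a stronger side moves the ratings further than a
# routine win over a weaker one.
print(update(1500, 1600, 1))   # underdog gains about 20 points
print(update(1600, 1500, 1))   # favourite gains about 11.5 points
```

The zero-sum structure (A gains exactly what B loses) is what lets the same ratings be replayed over decades of results.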
I've spent today adding 1960-1999 into the database. Looking from 1960 forwards, we can now do a historical comparison between teams of different eras:
BEBot 1.0 tips this weekend's round of footy:
North Melbourne v Western Bulldogs
Melbourne v St Kilda
Adelaide Crows v Fremantle
GWS Giants v Hawthorn
Richmond v Port Adelaide
Geelong Cats v Gold Coast Suns
Brisbane Lions v Sydney Swans
Carlton v Essendon
West Coast Eagles v Collingwood
BEBot 1.1 in progress - now factors in game margins when adjusting teams' ratings after a win or loss. This probably won't make much sense immediately, but this is the process I'm using to optimise the parameters of my ELO system. One of the main factors is the K value, or recency bias as I've deemed it. How the K value affects ELO:

New_Rating = Old_Rating + K * (Result - Expected)

So, if a team with a base rating of 1500 is a 50% chance of victory based off their ELO rating and wins the game, the formula would look like:

New_Rating = 1500 + K * (1 - 0.5)
New_Rating = 1500 + 0.5K

With BEBot 1.1 in progress, I am again modifying this ratings process:

New_Rating = Old_Rating + K * (Result - Expected) * Margin_Adjustment

This Margin_Adjustment is a new feature. Ideally, with a margin of 0, the adjustment would be equal to 1, meaning that the K value stays the same. The trick is now finding the best function for the margin adjustment, such that a 100-point victory increases the rating by an additional... well, how much should it really? Here's an example function I've used for BEBot 1.1:

Margin_Adjustment = (our_score/their_score)^alpha

As alpha increases, the margin adjustment increases as well, giving the margin more impact on the ratings change. Plotting different values of K and alpha using this function gives us the following heatmap, with darker values representing an improved prediction accuracy. It can be seen that as K gets too high, the ELOs go a little crazy and overreact to the most recent results, lowering our prediction accuracy. Where does the true balance lie?
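As a rough sketch, the margin-adjusted update might look like this in code. The K and alpha values (and the example scores) are illustrative placeholders, not the optimised parameters from the heatmap:

```python
# Sketch of the BEBot 1.1 margin-adjusted ELO update. k and alpha here
# are placeholder values, not the optimised parameters.

def margin_adjustment(our_score, their_score, alpha):
    """(our_score / their_score) ** alpha -- equals 1 when scores are level."""
    return (our_score / their_score) ** alpha

def new_rating(old_rating, result, expected, our_score, their_score,
               k=40.0, alpha=1.0):
    """New_Rating = Old_Rating + K * (Result - Expected) * Margin_Adjustment."""
    adj = margin_adjustment(our_score, their_score, alpha)
    return old_rating + k * (result - expected) * adj

# The worked example from the text: a 1500-rated team at 50% wins.
# With level scores the adjustment is 1 and the gain is 0.5 * K; a
# 100-point blowout (150 to 50) triples the gain at alpha = 1.
print(new_rating(1500, 1, 0.5, 80, 80))    # 1520.0 (adjustment = 1)
print(new_rating(1500, 1, 0.5, 150, 50))   # 1560.0 (adjustment = 3)
```

Finding the heatmap is then just a grid search: replay the historical results for each (K, alpha) pair and record the tipping accuracy.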
Some slightly disappointing results - I was hoping for something interesting here. This graph shows each team's difference from their season average this week, based on their performance last week. I was hoping that teams coming off big losses would actually improve on their season average through something like a 'rebound' effect, but the results seem fairly random. Teams winning by 36-60 points get complacent the week after, but teams winning by 61+ are just in good form and continue to play well? Statistical variation? Who knows.
BEBot 2.0 complete. Instead of performing rating adjustments based on a win, loss or draw, a new function is used which calculates an expected margin given the ELO difference of the two sides. If a team performs better than this expected margin, their ELO rating goes up - otherwise, it goes down. It's given me about a 0.3-0.5% increase in accuracy when comparing results from 1995-2016. Running it on the 2015 season results in a performance of 140/197 (71.066%), one tip ahead of the best Herald Sun human tipster for the year. However, it only averages 66.9% across all games from 1995 onwards, and 67.7% across games from 2005 onwards.
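A minimal sketch of that style of update. The linear ELO-to-margin mapping and both constants below are my assumptions for illustration, since the actual function isn't given:

```python
# Margin-based update in the spirit of BEBot 2.0: ratings move with the
# margin "surprise" rather than just win/loss. The linear mapping and
# both constants are assumptions, not BEBot's actual values.

POINTS_PER_ELO = 0.04   # assumed: 25 ELO points ~ 1 point of margin
K_MARGIN = 0.5          # assumed weight on each point of margin error

def expected_margin(rating_a, rating_b):
    """Expected winning margin for team A, given the ELO difference."""
    return (rating_a - rating_b) * POINTS_PER_ELO

def update(rating_a, rating_b, actual_margin_a):
    """Team A's rating rises only if it beats its expected margin."""
    surprise = actual_margin_a - expected_margin(rating_a, rating_b)
    return rating_a + K_MARGIN * surprise, rating_b - K_MARGIN * surprise

# A 1600-rated side expected to win by 4 but winning by only 2 still
# loses rating despite the win.
print(update(1600, 1500, 2))   # (1599.0, 1501.0)
```

The key behavioural difference from win/loss ELO: a narrow win over a much weaker side now counts against you.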
6/9 for the round, with Melbourne, Hawthorn and Essendon letting down BEBot 1.0. The tips that BEBot 2.0 would have made differently:
Western Bulldogs instead of North Melbourne - WRONG
Carlton instead of Essendon - CORRECT
BEBot 2.0 2016 Round 7 Tips:
Richmond v Hawthorn
Collingwood v Carlton
Geelong v West Coast
Sydney v Essendon
Gold Coast v Melbourne
Western Bulldogs v Adelaide
Fremantle v GWS
St Kilda v North Melbourne
Port Adelaide v Brisbane Lions
Nope. It's still fairly basic, and doesn't take home advantage/rest/other factors into consideration yet. Despite that, I think it's still performing pretty well so hopefully these factors only continue to improve it.
As quoted from MatterOfStats, "These days, I reckon I know what a good margin forecaster looks like. Any person or algorithm - and I'm still at the point where I think there's a meaningful distinction to be made there - who (that?) can consistently predict margins within 30 points of the actual result is unarguably competent. That benchmark is based on the empirical performances I've seen from others and measured for my own forecasting models across the last decade of analysing football." I've decided to give my (slightly tweaked) algorithm a run at margin prediction and measure its error. It looks like I'm still a little off the elusive <= 30-point zone, but I'm still performing very reasonably. Here are the results over some different eras:
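The benchmark being chased here is just the mean absolute error of the predicted margins. The sample margins below are made up for illustration:

```python
# Scoring a margin forecaster against the ~30-points-per-game benchmark:
# mean absolute error between tipped and actual margins. The sample
# margins are made up for illustration.

def mean_absolute_error(predicted, actual):
    """Average of |predicted margin - actual margin| across games."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

predicted = [12, -5, 30, 8]     # home-team margins the model tipped
actual    = [20, -40, 25, 1]    # what actually happened
print(mean_absolute_error(predicted, actual))   # 13.75
```

Averaged over a full season or era, anything consistently under 30 clears the MatterOfStats bar.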
Now to answer the question: which team was worse upon introduction to their competition - Gold Coast in 2011, or GWS Giants in 2012? <iframe width="560" height="315" src="https://www.youtube.com/embed/wv2hr42Nxvo" frameborder="0" allowfullscreen=""></iframe> <iframe width="560" height="315" src="https://www.youtube.com/embed/uzfXdtH-XCU" frameborder="0" allowfullscreen=""></iframe> By testing the model's performance using a variety of different ELO ratings for GWS/GC upon introduction to the competition, the results are as follows: GWS were narrowly better than Gold Coast with an optimal introductory rating of 1381.5, ahead of Gold Coast's 1374.5. They did, however, finish last for their first 2 seasons...
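The search behind those optimal introductory ratings can be sketched as a simple one-dimensional sweep, with `replay_error` standing in for the real backtest over the expansion seasons (the range and step below are illustrative):

```python
# Sweep candidate introductory ratings and keep the one whose replayed
# seasons give the smallest prediction error. replay_error is a stand-in
# for the real backtest; the range and step are illustrative.

def best_intro_rating(replay_error, lo=1300.0, hi=1500.0, step=0.5):
    """Return the candidate starting rating with the lowest replay error."""
    best, best_err = None, float("inf")
    r = lo
    while r <= hi:
        err = replay_error(r)
        if err < best_err:
            best, best_err = r, err
        r += step
    return best

# Toy check: if the backtest error bottoms out at some rating, the
# sweep recovers it.
print(best_intro_rating(lambda r: abs(r - 1381.5)))   # 1381.5
```

Running this once per expansion club gives directly comparable starting ratings, which is how the GWS-vs-Gold Coast question gets answered.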
I'm now working on a home advantage algorithm, and it seems to be working well. I've decided to scrap the typical distance-travelled/familiarity approach and instead go with one that detects whether a team has over- or under-performed at a ground relative to the model's prediction. I've made a bar graph of every team's performance at every ground used in the last 10 years, but I'm not going to post 18 images - if you're interested, request a team and I'll post it up. Here's a table of the top 25 ground advantages, and the bottom 25 ground performers. 'Difference' is the difference in expected margin when playing a team rated as average (1500).

TOP 25

BOTTOM 25

The biggest ground discrepancy is between West Coast and the Western Bulldogs at Subiaco. If both teams were equally rated, West Coast would expect a 17.8-point advantage on average - effectively a 3-goal head start.
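The residual-based idea above can be sketched as follows: for each (team, ground) pair, average how far actual margins beat the model's predicted margins. The data layout and field order are assumptions, and the sample numbers are made up:

```python
# Sketch of the residual-based ground advantage idea: for each
# (team, ground) pair, average how far actual margins beat the model's
# predicted margins. Data layout and sample numbers are assumptions.

from collections import defaultdict

def ground_advantages(games):
    """games: iterable of (team, ground, actual_margin, predicted_margin).

    Returns {(team, ground): mean over/under-performance in points}.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for team, ground, actual, predicted in games:
        entry = totals[(team, ground)]
        entry[0] += actual - predicted   # positive = beat the model here
        entry[1] += 1
    return {key: total / n for key, (total, n) in totals.items()}

# Two games where a team beat the model's line at a ground by 18 and 6
# points average out to a +12 point ground effect.
print(ground_advantages([("West Coast", "Subiaco", 30, 12),
                         ("West Coast", "Subiaco", 10, 4)]))
```

Because it works off prediction residuals rather than travel distance, anything the base model already captures (team strength, form) is automatically netted out of the ground effect.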
After optimising the crap out of BEBot I feel like it has hit its limits, averaging a shade under 30 points per game when attempting to predict the margin of the match. I've started work on a new model and have had some unbelievable results... to the point where I feel like I might have screwed up in implementing the testing. Let's just say this will be blowing 70% out of the park if it's actually legitimate. Edit: Turns out I messed it up, as I was expecting. However, importantly, it's still delivering an improvement over BEBot!
Retrospectively fitting tips from the old model (BEBot v5.0). This will most likely be the last we see from it.

Round 7
Richmond vs. Hawthorn, M.C.G., Prediction: Hawthorn by 20 - CORRECT
Collingwood vs. Carlton, M.C.G., Prediction: Collingwood by 12 - WRONG
Geelong vs. West Coast, Kardinia Park, Prediction: Geelong by 2 - CORRECT
Sydney vs. Essendon, S.C.G., Prediction: Sydney by 33 - CORRECT
Gold Coast vs. Melbourne, Carrara, Prediction: Melbourne by 4 - CORRECT
Western Bulldogs vs. Adelaide, Docklands, Prediction: Western Bulldogs by 4 - CORRECT
Fremantle vs. Greater Western Sydney, Subiaco, Prediction: Greater Western Sydney by 10 - CORRECT
St Kilda vs. North Melbourne, Docklands, Prediction: North Melbourne by 24 - CORRECT
Port Adelaide vs. Brisbane Lions, Adelaide Oval, Prediction: Port Adelaide by 15 - CORRECT

8/9 in an admittedly somewhat easy round for tipping.

Round 8
Adelaide vs. Geelong, Adelaide Oval, Prediction: Geelong by 8 - CORRECT
Essendon vs. North Melbourne, Docklands, Prediction: North Melbourne by 32 - CORRECT
Hawthorn vs. Fremantle, York Park, Prediction: Hawthorn by 22 - CORRECT
Greater Western Sydney vs. Gold Coast, Sydney Showground, Prediction: Greater Western Sydney by 18 - CORRECT
Richmond vs. Sydney, M.C.G., Prediction: Sydney by 24 - WRONG
Brisbane Lions vs. Collingwood, Gabba, Prediction: Collingwood by 14 - CORRECT
Carlton vs. Port Adelaide, Docklands, Prediction: Port Adelaide by 19 - WRONG
Melbourne vs. Western Bulldogs, M.C.G., Prediction: Western Bulldogs by 15 - CORRECT
West Coast vs. St Kilda, Subiaco, Prediction: West Coast by 28 - CORRECT

7/9. Definitely some room for improvement in the margin tipping. However, a fairly impressive display from this model tipping autonomously - I don't think I would have made many different tips if I was predicting myself. Next, I'm going to be posting up the equivalent tips from my new model.