I haven’t had the time to write any articles lately, so I thought I would post some recent data visuals I have sent out to various clients over the past few weeks, with brief explanations of what they entail. Enjoy, and if you have any questions feel free to contact me on my social media channels.
A recent model I have been working on is an expected points model, some of which I have posted on here before (search this site and you will find explanations of the model). This one is an updated SkyBet League 1 table. I compile these concurrently for all English leagues and keep them up to date on a weekly basis; they are a useful tool for seeing how a team ‘should’ be doing.
These are my brand new pass sonars, inspired by the guys at @AnalysisEvolved. They are designed to show the pass directions of a team. This example is of PSG in a 3-4-2-1 shape. Again, I can compile these for any team in the world if I have the data. Below is Thiago Silva’s individual pass sonar.
The first graphic in this set is an xG timeline designed to show the xG of two teams in a certain game and the second and third graphics are pass maps of the two teams from the same game.
This viz shows the shot map of Brentford’s Neal Maupay so far this season (it needs updating, as it is from earlier in the season).
This screenshot is from a recent opposition report I compiled for a few clients. It shows a brief set piece analysis but is part of a detailed 20-page report.
Another xG graphic, this time showing Aston Villa’s rolling xG and xGA for the season so far.
RB Leipzig’s Elo rating over the last few years. This one was created using Python.
Using a geometric shape called a convex hull, we can show the area in which a given player tends to intercept the ball. This viz shows Hernandez and Pavard’s success for France at the 2018 World Cup in Russia, and was inspired by David Sumpter, a well-known mathematician, after I had read his book Soccermatics.
Again using a convex hull, we can map out some of the passing networks in the Spanish team against Russia. Once again this is from the 2018 World Cup.
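For anyone curious about what a convex hull actually is, here is a minimal sketch in Python. It is not the code behind the visuals above. It uses a standard technique (Andrew's monotone chain) and made-up pitch coordinates, and the names `cross`, `convex_hull` and `interceptions` are purely illustrative. The idea is to find the smallest convex outline that wraps around all of a player's event locations.

```python
def cross(o, a, b):
    """Cross product of vectors OA and OB; positive means a left (counter-clockwise) turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last point while it would make a right turn or a straight line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


# Hypothetical interception locations (x, y), for illustration only.
interceptions = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
hull = convex_hull(interceptions)
print(hull)
```

The two interior points are dropped and only the four corners remain, which is the outline you would then draw on the pitch.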
Above is some tactical analysis from the Dortmund Bayern game.
And finally, player radars. These are becoming an increasingly popular tool when scouting players, and a lot of data can be shown in one eye-pleasing visual.
Trying to predict a team’s position at the end of the season is always going to be tricky. In most people’s eyes it is probably near impossible, and in some people’s opinions pointless, but there are ways in which we can get reasonably close. Given it’s the international break (yawn!) I thought this was a good time to publish what I have been working on.
Goals per game, goals against per game and points per game have all been used in the past to project the final league positions of clubs across various leagues. But the future is nigh: expected goals gave us a better measure of a team’s performance, not only in a single game but also across a whole season, and with it we gained a more reliable way of assessing how a league table may look at the end of the season.
I am going to be using two metrics: expected goal difference and xP (expected points). I believe both are sound choices, as they have been used in previously published articles and are fairly accurate. For expected goal difference I will take the difference between xG and xGA per game and multiply it by the remaining games to get goal difference totals and final league positions. For the xP component I will use a more intricate method, detailed below.
For this article I will be focusing on the English Premier League, and in a few days I will publish another article covering the Championship and Leagues 1 and 2.
Expected Goals (make a) Difference
Goal difference is probably not the first thing that comes to mind when trying to predict a final league table. Work has been published in the past using simple goal difference, and it has been shown to be a good indicator of the longer-term sustainability of a team’s results. In this article I will put forward the case for what I believe are the benefits of using this measure, but also explain why it may not be the best indicator.
So let’s start with the methodology used in creating the goal difference tables, using some examples from a previous season.
The best teams earn the most points, and it’s fair to say they will score the most goals and, in theory, concede the fewest over the course of the season, but does this dictate their final league position? In the mind’s eye, of course, but is it the same in reality?
In an excellent, albeit brief, article published in 2012 on 5addedminutes.com, Newcastle were used as an example of this, so rather than searching through seasons and seasons’ worth of data to find something similar I will use them as an example too.
The EPL does not seem to change much in competitiveness from year to year, and it is often said that there are three ‘mini leagues’ within it. In the 2011-12 season Newcastle United finished 5th in the table on 65 points with a goal difference of +5 (normally this would equate to around 55 points and only be good enough for 7th or 8th position), so in theory they overachieved by 10 points, with eight wins by only a single goal and a trio of big defeats. These slender one-goal victories can inflate a team’s final league position and, although results fluctuate, this may in part explain the Magpies’ final position in the table. Of course, there is no reason to believe that winning by these small margins is sustainable. The following year Newcastle finished in 16th position with a -23 goal difference and a 41-point total, showing that it wasn’t a sustainable way to play over the course of a season.
The model above, as I explained, merely uses standard goal difference, so what happens if we use expected goal difference to produce an ‘alternate’ final goal difference table?
The graphic below shows the EPL table at the end of the season when we apply this model.
The method for this is simply (xGF per game – xGA per game) × 36 (games remaining) = final goal difference. We can then assign a final league position based on each club’s goal difference.
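As a rough sketch of that calculation in Python: the team names and per-game figures below are hypothetical placeholders rather than numbers from my table, and `projected_goal_difference` is just an illustrative helper name. It multiplies the per-game xG difference by the games remaining and then ranks the teams on the result.

```python
def projected_goal_difference(xgf_per_game, xga_per_game, games_remaining=36):
    """Per-game xG difference scaled up over the remaining games."""
    return (xgf_per_game - xga_per_game) * games_remaining


# Hypothetical (xGF per game, xGA per game) pairs, for illustration only.
teams = {
    "Team A": (2.4, 0.6),
    "Team B": (1.5, 1.0),
    "Team C": (0.9, 1.7),
}

projections = {
    name: projected_goal_difference(xgf, xga) for name, (xgf, xga) in teams.items()
}

# Rank from best projected goal difference to worst.
ranking = sorted(projections, key=projections.get, reverse=True)
for position, name in enumerate(ranking, start=1):
    print(position, name, round(projections[name], 1))
```

Note that a small per-game edge gets magnified over 36 games, which is one reason a handful of early matches can produce some eye-catching projections.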
This is where I have some doubts about the reliability of this method, with some of the numbers looking a little low on the eye test, but we shall see in time. When compiling anything along these lines it is vital to go back and look at a previous season or seasons. For simplicity, we will look back to last season’s final EPL table.
Manchester City topped the goal difference charts with +79 and finished as champions, and in my table above they could finish this year with a +69 GD. Liverpool (4th last year with a +46 GD) are expected to have +40 at the end of this year, so all is looking pretty rosy so far in comparison.
Things get a tad shady when we go a bit further down my table. Watford are expected to end up with a GD of +26 and a possible 4th-place finish, when last year they finished 14th on 41 points with a GD of -20. That is quite a large swing and, I think it is fair to say, unlikely. Bournemouth, with an expected GD of +19 and a final league position of 5th (it should be noted I have placed them above Spurs for alphabetical order purposes only in this table), performed significantly worse last season with a -16 GD and a 44-point finish, so one would assume they are unlikely-ish to finish in the Europa League positions this year.
But is it so inconceivable that Watford could finish above the likes of Arsenal, Spurs and Manchester United? In truth it probably is, but the comparison is interesting: last season The Cherries finished only 2 places and 3 points in front of The Hornets, with a GD advantage of just 4 goals in favor of Eddie Howe’s men. Judging by my table they will be closely matched again this year, only higher up the table. However unlikely that may be, this is a useful comparison when identifying which teams in the league are consistently going to be your closest rivals for points throughout the season. I’ve only compared one season to another, one of which is predicted, but the trend is already (slightly) evident.
The bottom four look a fair bit more ‘believable’ in regards to their final league positions, but I still think this table is somewhat skewed given Cardiff and Fulham are playing in a different league after being promoted last year, so their data is less reliable, even though Neil Warnock’s side look nailed on to find it tough going getting out of the relegation places.
Now that we have identified a kind of mini trend, with two teams scoring and conceding at similar rates and ending up with similar goal differences and final league positions come the end of the season, I want to look at my alternative method of calculating a points finish.
xP (Expected Points)
I will start with the methodology again, to give a brief gist of what to expect and some clarity.
I will use Burnley as a quick example. Using an adapted Poisson distribution calculation, we input Burnley’s xG per game in the first column and their xGA per game in the second. This spits out the expected points per game they should achieve, which we then multiply by the remaining games (36) to give us Burnley’s expected points finish. From there we can assign them a final league position. This is similar to what betting agencies would use in a Monte Carlo type calculation. Simple, right? My math symbols are poor at best so I will leave them to smarter people than me, but the methodology is the same regardless. On to the table.
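To make the idea a bit more concrete, here is a minimal Python sketch of one common way to turn xG and xGA into expected points. It is not necessarily the exact ‘adapted’ calculation I use. It assumes goals for and against follow independent Poisson distributions, it cuts the scorelines off at 10 goals each, and the function names and the input figures at the bottom are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the average is lam."""
    return lam ** k * exp(-lam) / factorial(k)


def expected_points_per_game(xg_for, xg_against, max_goals=10):
    """Sum scoreline probabilities into win/draw chances, then weight by 3 and 1 points."""
    p_win = 0.0
    p_draw = 0.0
    for goals_for in range(max_goals + 1):
        for goals_against in range(max_goals + 1):
            p = poisson_pmf(goals_for, xg_for) * poisson_pmf(goals_against, xg_against)
            if goals_for > goals_against:
                p_win += p
            elif goals_for == goals_against:
                p_draw += p
    return 3 * p_win + p_draw


def expected_points_finish(xg_for, xg_against, games_remaining=36):
    """Expected points per game multiplied by the games remaining."""
    return expected_points_per_game(xg_for, xg_against) * games_remaining


# Hypothetical per-game inputs, for illustration only.
print(round(expected_points_per_game(1.1, 1.4), 2))
print(round(expected_points_finish(1.1, 1.4), 1))
```

A Monte Carlo version would simulate thousands of individual scorelines instead of summing the probabilities directly, but the two approaches should land in roughly the same place.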
Although this is my preferred method of calculating a team’s expected points finish, like any predictive model it is never going to be 100% perfect, as Manchester City’s 103 xP finish in my table may show… but hang on a minute, didn’t they finish on 100 points in winning the title last year? So perhaps I am being too harsh on the model, myself and Pep Guardiola’s rampant City squad. And as with the Bournemouth/Watford goal difference situation over the last two seasons, we see a bit of consistency edging in, as we should of course if the workings are correct.
Throughout the process of compiling these two sets of calculations my mind swung back and forth over which was the better method, and it’s clear to see why. There are many similarities between the two tables. The top three are the same and the top five in general look very familiar; 7th to 11th are fairly similar, as are the bottom four.
One team sticks out above the rest though. Wolverhampton Wanderers.
Nuno Espírito Santo has carried on the form and style of play that saw his Wolves team comfortably go up as champions of the Championship last season with their infamous 3-4-3 shape. Is it possible that they could continue as they have started this season and end up in 4th spot come the end of the season? If we go by the goal difference table, most definitely not, as they would finish 7 places lower in 11th. Something to ponder, whichever way your opinion might sway.
Arsenal are starting to come round to Unai Emery’s way of thinking, so I would expect them to finish higher than the predicted 8th. My confidence in that happening isn’t helped by the fact that in both tables they are only a place or two apart, and it would also, obviously, depend a fair bit on Bournemouth, Watford and Wolves faltering, which they may well end up doing. In my opinion, though, a top four or even top five finish for the Gunners is not looking good either way.
How many teams will be involved in the fight for the last (in my opinion) of the automatic Champions League spots after City, Liverpool and Chelsea remains to be seen, but The Gunners have a fight on their hands from some unlikely teams at this point in the season, and the end result will be interesting come the 19th of May next year.
Take it or leave it (depending on who you support)
So there we have it. Make of it what you will and take from it what you want, but you will probably want to disregard all of it if you are a Huddersfield Town, Cardiff City or Newcastle United fan, and you may want to think about crossing your fingers and toes if you’re a supporter of the teams in the red halves of Manchester and North London. Only kidding (maybe).