Expected points & goal difference- How accurate can predicting a final EPL league table be?

Points per game is boring

Trying to predict where a team will finish at the end of the season is always going to be tricky. In most people’s eyes it is near impossible, and in some people’s opinions pointless. However, there are ways in which we can get reasonably close. Given it’s the international break (yawn!) I thought this was a good time to publish what I have been working on.

Goals per game, goals against per game and points per game have all been used in the past to project the final league positions of clubs across various leagues. But the future is nigh. Expected goals gave us a better measure of a team’s performance, not only in a single game but also across a whole season, and with it we gained a more reliable way of assessing how a league table may look at the end of the season.

I am going to be using two metrics: expected goal difference and xP (expected points). I believe both are sound choices, as they have been used in previously published articles and are fairly accurate. For expected goal difference I will take the difference between xG and xGA per game and multiply it by the remaining games to get goal difference totals and league position finishes. For the xP component I will use a more intricate method detailed below.

For this article I will be focusing on the English Premier League and in a few days publish another article covering the Championship and Leagues 1 and 2.

Expected Goals (make a) Difference

Goal difference is probably not the first thing people think of when trying to predict a final league table. Work has been published in the past using simple goal difference, and it has been shown to be a good indicator of the longer term sustainability of a team’s results. In this article I will put forward the case for what I believe are the benefits of using this measure, but also explain why it may not be the best indicator.

So let’s start with the methodology behind the goal difference tables, using an example from a previous season.

The best teams earn the most points, and it’s fair to say they will end up scoring the most goals and, in theory, conceding the fewest over the course of the season. But will this dictate their final league position? In the mind’s eye, of course, but is it the same in reality?

In an excellent, albeit brief, article published in 2012 on 5addedminutes.com, Newcastle were used as an example of this. Rather than searching through seasons and seasons’ worth of data for something similar, I will use them as an example too.

The EPL does not seem to change much in competitiveness from year to year, and it is often said that there are three ‘mini leagues’ within it. In the 2011-12 season Newcastle United finished 5th in the table on 65 points with a goal difference of +5. Normally that goal difference would equate to 55 points and only be good enough for around 7th or 8th position, so in theory they overachieved by 10 points, with eight wins by a single goal and a trio of big defeats along the way. Slender one goal victories like these can inflate a team’s final league position and, although results fluctuate, this may in part explain the Magpies’ final position in the table. Of course, there is no reason to believe that winning by such small margins is sustainable. The following year Newcastle finished in 16th position with a -23 goal difference and 41 points, which suggests it certainly wasn’t a sustainable way to play over the course of a season.

The model above is merely using standard goal difference, so what happens if we use expected goal difference to produce an ‘alternate’ final goal difference table?

The graphic below shows the EPL table at the end of the season when we apply this model.

EPL xG diff
EPL 18/19 final league table using Goal difference as the measure.

The method for this is simply: (xGF per game - xGA per game) × 36 (games remaining) = final goal difference. Then we can assign a final league position based on each club’s goal difference.
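For anyone who wants to try the goal difference method themselves, here is a minimal sketch of the calculation in Python. The team names and per-game figures are made-up placeholders rather than my actual data, and the function names are purely illustrative. Ties are broken alphabetically, which is an assumption that mirrors how I ordered level teams in my table.

```python
# Sketch of the expected goal difference projection described above.
# All team names and per-game figures are made-up placeholders, not real data.

REMAINING_GAMES = 36  # games remaining, as stated in the article


def projected_goal_difference(xgf_per_game, xga_per_game, games=REMAINING_GAMES):
    """Per-game xG difference scaled up over the given number of games."""
    return (xgf_per_game - xga_per_game) * games


def rank_by_goal_difference(teams):
    """Return (team, projected GD) pairs sorted best first.

    Ties on goal difference are broken alphabetically by team name
    (an assumption for this sketch).
    """
    projected = [
        (name, projected_goal_difference(xgf, xga))
        for name, (xgf, xga) in teams.items()
    ]
    return sorted(projected, key=lambda row: (-row[1], row[0]))


# Hypothetical inputs: team -> (xGF per game, xGA per game)
example_teams = {
    "Team A": (2.0, 0.5),
    "Team B": (1.5, 1.0),
    "Team C": (1.0, 1.5),
}

for position, (name, gd) in enumerate(rank_by_goal_difference(example_teams), start=1):
    print(f"{position}. {name}: {gd:+.0f}")
```

Swapping the placeholder dictionary for real per-game xG figures is all that is needed to rebuild a table like the one above.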

This is where I have some doubts about the reliability of this method, with some of the numbers a little low on the eye test, but we shall see in time. When compiling anything along these lines it is vital to go back and look at a previous season or seasons. For simplicity, we will look back to last season’s final EPL table.

Manchester City topped the goal difference charts with a +79 goal difference and finished as champions, and in my table above they could finish this year with a +69 GD. Liverpool (4th last year with +46 GD) are expected to have +40 at the end of this year, so all looking pretty rosy so far in comparison.

Things get a tad shady when we go a bit further down my table. Watford are expected to end up with a GD of +26 and a possible 4th place finish, when last year they finished 14th on 41 points with a GD of -20. That is quite a large margin and, I think it is fair to say, unlikely. Bournemouth, with an expected GD of +19 and a final league position of 5th (it should be noted I have placed them above Spurs for alphabetical order purposes only in this table), performed significantly worse last season with a -16 GD and a finish of 44 points. One would assume they are unlikely-ish to finish in the Europa League positions this year.

Watford- Can Troy Deeney and co beat the usual suspects into a top 5 finish?

But is it so inconceivable that Watford could finish above the likes of Arsenal, Spurs and Manchester United? In truth it probably is, but the comparison is interesting: last season The Cherries finished only 2 places and 3 points in front of The Hornets, with a goal difference just 4 better in favour of Eddie Howe’s men. Judging by my table they will be closely matched again this year, only higher up the table. However unlikely that may be, this is a useful comparison when identifying which teams in the league are consistently going to be your closest rivals in terms of gaining points throughout the season. I’ve only compared one season to another, one of which is predicted, but the trend is already (slightly) evident.

The bottom four look a fair bit more ‘believable’ in regards to their final league positions, but I still think this table is somewhat skewed given Cardiff and Fulham are playing in a different league after being promoted last year, so their data is less reliable, even though Neil Warnock’s side look nailed on to find it tough going getting out of the relegation places.

Now that we have identified a kind of mini trend, with two teams scoring and conceding at similar rates and ending up with similar goal differences and final league positions, I want to look at my alternative method of calculating a points finish.

xP (Expected Points)

I will start with the methodology again to briefly get a gist of what to expect and to give some clarity.

I will use Burnley as a quick example. Using an adapted Poisson distribution calculation, we input Burnley’s xG per game in the first column and their xGA per game in the second. This spits out the expected points per game they should achieve, which we then multiply by the remaining games (36) to give us Burnley’s expected points finish. From there we can assign them a final league position. This is similar to the Monte Carlo type calculations betting agencies would use. Simple, right? My math symbols are poor at best so I will leave them up to smarter people than me, but the methodology is the same regardless. On to the table.
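To give a flavour of how the xP step can work, here is a minimal sketch in Python. It is not my adapted calculation: it assumes the plain textbook version, where goals for and goals against follow two independent Poisson distributions with the per-game xG and xGA as their rates. The function names, the cap of 10 goals per side and any inputs you feed it are illustrative assumptions.

```python
import math

REMAINING_GAMES = 36  # games remaining, as stated in the article
MAX_GOALS = 10        # assumption: scorelines above 10 goals per side are ignored


def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals arrive at the given rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def expected_points_per_game(xg_for, xg_against, max_goals=MAX_GOALS):
    """Expected points per game from independent Poisson scoring.

    Sums the probability of every scoreline up to max_goals per side,
    then awards 3 points for a win and 1 for a draw.
    """
    p_win = 0.0
    p_draw = 0.0
    for scored in range(max_goals + 1):
        for conceded in range(max_goals + 1):
            p = poisson_pmf(scored, xg_for) * poisson_pmf(conceded, xg_against)
            if scored > conceded:
                p_win += p
            elif scored == conceded:
                p_draw += p
    return 3 * p_win + p_draw


def projected_points(xg_for, xg_against, games=REMAINING_GAMES):
    """Expected points per game scaled up over the given number of games."""
    return expected_points_per_game(xg_for, xg_against) * games


# Hypothetical inputs, not Burnley's real numbers.
print(round(projected_points(1.2, 1.4), 1))
```

Summing over scorelines like this gives the same long run answer a Monte Carlo simulation of many matches would settle on, without the random noise.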

EPL xP finish 1819

Although this is my preferred method of calculating a team’s expected points finish, like any predictive model it is never going to be 100% perfect, as Manchester City’s 103 xP finish in my table may show… but hang on a minute, didn’t they finish on 100 points in winning the title last year? So perhaps I am being too harsh on the model, myself and Pep Guardiola’s rampant City squad. And as with the Bournemouth/Watford goal difference situation over the last two seasons, we see a bit of consistency edging in, as we should of course if the workings are correct.

Throughout the process of compiling these two sets of equations my mind swung back and forth on which was the better method, and it’s clear to see why. There are many similarities when comparing the two tables. The top three are the same and the top five in general look very familiar, 7th to 11th are fairly similar, as are the bottom four.

Wolves manager Nuno Espírito Santo

One team sticks out above the rest though. Wolverhampton Wanderers.

Nuno Espírito Santo has carried on the form and style of play that saw his Wolves team comfortably go up as Championship winners last season with their trademark 3-4-3 shape. Is it possible that they could continue as they have started this season and end up in 4th spot come the end of the season? If we go by the goal difference table, most definitely not, as they would finish 7 places lower in 11th. Something to ponder, whichever way your opinion might sway.

Arsenal are starting to come round to Unai Emery’s way of thinking, so I would expect them to finish higher than the predicted 8th. My confidence in that happening isn’t helped by the fact that in both tables they are only a place or two apart. It would also, obviously, depend a fair bit on Bournemouth, Watford and Wolves faltering, which they may well end up doing of course. In my opinion, though, a top four or even top five finish for the Gunners is not looking good either way.

Unai Emery’s Arsenal side are showing signs of improvement but can they finish high enough come the end of the season? 

How many teams will be involved in the fight for what is (in my opinion) the last automatic Champions League spot after City, Liverpool and Chelsea remains to be seen, but The Gunners have a fight on their hands from some unlikely teams at this point in the season, and the end result will be interesting come the 12th of May next year.

Take it or leave it (depending on who you support)

So there we have it. Make of it what you will and take from it what you want, but you will probably want to disregard all of it if you are a Huddersfield Town, Cardiff City or Newcastle United fan, and you may want to think about crossing your fingers and toes if you support the teams in the red halves of Manchester and North London. Only kidding (maybe).

Gareth Cooper

GC Analytics

Newcastle United v Chelsea: Did Sarri’s team make hard work of getting the win on Sunday?

Hazard after scoring his penalty.

Watching the first 10 minutes of the game at the Sports Direct Arena on Sunday, it was clear to us all that the tone was set for the rest of the game. Rafa Benitez had clearly set his team up not to lose, but afterwards claimed he also wanted to try to win the game. The latter didn’t appear to be the case at all.

Chelsea coach Maurizio Sarri also remarked that Newcastle were very compact, which made his side’s efforts all the harder. But were his tactics enough to really hurt Newcastle’s back five?

Formations and tactics

Chelsea, though, made hard work of their 1-2 win and almost played into Newcastle’s hands. With 73% of the possession compared to United’s 27%, Chelsea dominated the ball but huffed and puffed and were frustrated by Newcastle and the 5-4-1 shape they had employed.

Newcastle and Chelsea line ups

Sarri again went with his much favoured 4-3-3 set up with the excellent Jorginho at the base of the midfield three. N’Golo Kanté and Mateo Kovačić were stationed either side of the Italian international and had clear licence to play as close to the penalty area as possible, with what seemed minimal defensive responsibilities. Eden Hazard was given a free role across the final third. This is possible because Marcos Alonso plays as high as possible from left back, effectively taking up Hazard’s position on the left hand side, with Kovačić able to move inside but also supporting on that side. Pedro likes to keep the width, made possible by the fact César Azpilicueta isn’t in the same mould as Alonso and tends to play a more withdrawn full back role. Alvaro Morata seems almost redundant at times in this set up, and this probably needs to be addressed if he is to be more involved in games.

Sarri watches on from the sideline

This Chelsea side is very well balanced, with a clear focus on attacking down the left hand side. In the first 20 minutes of the game Chelsea attacked down Newcastle’s right (Chelsea’s left) 62% of the time, compared to 28% centrally and only 10% down Newcastle’s left. A clear focus indeed.

Chelsea’s attacking shape

N’Golo Kanté- suited to the role?

In the graphic above we can see how Chelsea group two players in the inside left and inside right positions when the ball is with Hazard. Alonso is not as far forward in this graphic, but for the majority of the game he would easily be on the back of Newcastle’s Yedlin.

N’Golo Kanté has a new role in this side, favouring a more attacking mindset, and given his boundless energy he is still able to fulfil his defensive duties. He tends to hang around on the edge of the box, and when he does Kovačić and Jorginho are more reserved in their attacking intent, especially the latter (more on him later). But is this what suits Kanté the most? In my opinion he isn’t effective enough in this role and his qualities are wasted in the attacking phase.

In the second half Sarri realised that Pedro wasn’t in the game and so decided to drop Kanté back slightly to enable the Spaniard to fill these spaces on the edge of the box. The problem was that Azpilicueta was very reluctant to fully overlap like his fellow countryman Alonso on the opposite side, probably knowing that Pedro has very little enthusiasm when it comes to defending.

Kovačić linked very well with Alonso and Hazard on the left hand side during the game with some slick interchanging of passes to get Alonso into crossing positions.

Jorginho- The pass master

In the past I have championed the idea of seeing Jorginho in the Premier League, and although I thought he might fit in with Guardiola’s 4-3-3 system at Manchester City, I now see why Chelsea brought him in from Napoli. He fits in perfectly at Chelsea. His defensive qualities are at times questionable, and he is no Fernandinho in terms of tackling etc, but if he gets Kanté beside him, in a 4-2-3-1/4-3-3 hybrid maybe, he has the ability to control and dictate games with his passing, especially his progressive passing into Hazard and co, something that Fernandinho severely lacks.

My pass map above is a carbon copy of Jorginho’s pass maps from his time at Napoli under Sarri. The similarity of his central positioning in the vicinity of the centre circle is uncanny and, as his dot size shows, he completed 155 passes in Sunday’s game.

Newcastle’s Mo Diamé did his utmost to stifle the Italian’s movements but to no avail. Although Jorginho was only third behind Hazard and Pedro in regards to xGChain (chance involvement), he really is going to be key to Chelsea’s progression this season.

Could it have been easier?

Newcastle v Chelsea xG timeline

Judging by my xG timeline above we can see Chelsea really applied more pressure starting around the 61st minute, and on 76 minutes Eden Hazard duly dispatched his penalty after a clumsy foul on Marcos Alonso by Newcastle defender Schär.

Up until half-time Chelsea made hard work of breaking The Magpies down, and this theme continued into the second half until the period mentioned above.

Referring back to the pass map, I would expect Chelsea to have more width across the pitch from Azpilicueta to Alonso (yes, I know I gave reasons in regards to the former earlier on in the article). Pedro started the game wide, but as the game went on he became a tad narrower and, as Kanté retreated, even more so.

Given Newcastle had five across the back, I get why Sarri employed the tactic of trying to play a narrower shape than usual. But Chelsea slightly played into Newcastle’s hands in playing this way, partly because there was barely any width provided due to the issues with Pedro and Azpilicueta, but also because Hazard kept coming in off the attacking wide left position.

Newcastle had enough personnel in the full back and wide areas to cope and forced Chelsea inside, a ploy that worked pretty well until Yedlin put the ball into his own net on 87 minutes to hand Chelsea the 3 points.

Gareth Cooper

GC Analytics