Wednesday, 18 September 2013

Luck in football part 2. Can you have a lucky season?


"It evens itself out over a season and that will never change. You get breaks here and there. Every club gets good breaks, bad breaks."

Sir Alex Ferguson


Does it though? Does luck even itself out across a season? This is a follow-up to my post a couple of weeks ago, looking at how much luck there is in a single English Premier League result. The obvious next step is to try to extend that analysis to a thirty-eight-game season and see whether, over a larger number of games, most of the random chance in football then disappears.

Before we dive into the analysis, it's useful to think about what level of luck might feel right for an average team, in terms of the number of points that team finishes with, compared to how many points they 'should' get. Plus or minus a point across a season? That obviously could happen - you sneak one extra draw, or rattle the bar in the 90th minute at 1-0 down just once in the season and there's your extra point either way. One lucky won or lost game is also pretty easy to envisage, or maybe even two lucky wins or losses. Three lucky wins and nine points? For me that's within the bounds of possibility, but starting to feel more unlikely.

From a statistical point of view, thirty eight games per team isn't all that many, so some level of randomness is definitely going to creep in. The challenge is to work out how much randomness there is and whether it matters in the grand scheme of things: whether the league table is random enough that the best team won't always win the title, or that an unlucky team of mid-table standard can get relegated.

Working out random chance across a whole season is more difficult than working it out for a single game against the hypothetical 'average opponent' that we saw in my last blog post. In response to that previous post, a few people asked how you'd set up a team's chances against a specific opponent, rather than a generalised 'average' one, which illustrates one of the key problems. If you could predict goal scoring and concession rates against a specific opponent, then you could predict the final score. Then you'd be able to beat the bookies and make a lot of money. Which is hard. In essence, this is what my prediction model tries to do, and that model is too complex to form a base for this analysis.

For this post, I'm going to assume that bookmakers' odds are a fair representation of each team's chances in a game and use that as the basis to simulate a season. You can argue with that approach, but I have a suspicion it's going to cause fewer arguments than any other results prediction method that I might use.

At least everybody can see where these numbers have come from and, as an added defence for this method, if the bookies' odds were consistently wrong across a lot of teams, across the whole season, they'd be losing a fortune.


"There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know."

Donald Rumsfeld


OK, that's probably not very helpful. The reason for the quote is that the bookies' odds aren't perfect and it is possible to consistently predict better than they do, which might reduce the amount of 'luck' (can we call it random variation instead?) that we're about to measure. There are also all of the factors about a game that neither we nor the bookies know. If a manager plays his best striker - who's nursing an injury - and loses due to a poor performance, that's not bad luck, it's a bad call. But if we didn't know about the injury then it won't have been priced into the market odds.

What really matters for this post isn't that we have brilliant odds for each individual game, but that we have a fair representation of what a season looks like as a whole, so that we can run simulations. It's not about the individual teams, it's about having a realistic spread of probabilities across a season's 380 games and for this, betting odds should do a good job.

I'll come back to the definition of luck and the implications of using betting odds at the end of this post, but let's get stuck into some numbers. Here are Arsenal's odds for each game last season, taken from Bet365 and re-based to remove the bookmaker's margin so they sum to 100%. (data from www.football-data.co.uk/englandm.php)
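For anyone who wants to try the re-basing step themselves, here is a minimal sketch. It assumes the odds are in decimal format and that the margin is stripped by simple proportional scaling; the post doesn't depend on that exact method and others exist. The function name and the example odds are made up for illustration and are not taken from the Bet365 data.

```python
def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal odds for one game into probabilities that sum to 1.

    Assumption: the bookmaker's margin is removed by proportional scaling.
    """
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    overround = sum(raw)  # exceeds 1.0 by the bookmaker's margin
    return [p / overround for p in raw]


# Hypothetical odds, for illustration only.
p_home, p_draw, p_away = implied_probabilities(1.80, 3.60, 4.50)
print(round(p_home, 3), round(p_draw, 3), round(p_away, 3))
```

Proportional scaling treats the margin as if it were spread evenly across all three outcomes, which is the simplest assumption rather than necessarily the most accurate one.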


If you run that fixture list ten thousand times with those odds, on average Arsenal will finish with seventy points. They scored seventy three points in 2012/13, so on that (huge!) sample of one season at least, the odds-based method seems sensible.
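Running a fixture list ten thousand times can be sketched roughly as follows. This is an illustration rather than the exact code behind the numbers in this post: it assumes each game is an independent draw from its win/draw/loss probabilities, and the fixture list below is invented (nineteen identical home games and nineteen identical away games), not Arsenal's real odds.

```python
import random


def simulate_season(fixtures, rng):
    """Return one simulated points total for a single team.

    fixtures: list of (p_win, p_draw, p_loss) tuples, one per game.
    Assumption: games are independent of each other.
    """
    points = 0
    for p_win, p_draw, _p_loss in fixtures:
        u = rng.random()
        if u < p_win:
            points += 3          # win
        elif u < p_win + p_draw:
            points += 1          # draw
    return points


def simulate_many(fixtures, n_seasons=10_000, seed=0):
    """Simulate the same fixture list n_seasons times."""
    rng = random.Random(seed)
    return [simulate_season(fixtures, rng) for _ in range(n_seasons)]


# Hypothetical probabilities: 19 home games and 19 away games.
fixtures = [(0.60, 0.23, 0.17)] * 19 + [(0.42, 0.27, 0.31)] * 19
totals = simulate_many(fixtures)
print(sum(totals) / len(totals))  # average points across simulated seasons
```

Fixing the seed makes a run repeatable, which helps when comparing teams against each other.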

We get a distribution of points for Arsenal using our ten thousand simulated seasons, which looks like this.



Sometimes Arsenal will score fewer than seventy points and sometimes they'll score more, purely through random variation, or 'luck'. The standard deviation of Arsenal's final total is 7.5 points, which means that although the average is seventy, in a typical season Arsenal will finish roughly seven points better or worse than that. In 66% of seasons, their final total would fall between sixty five and seventy nine points.
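Summary figures like these can be pulled out of a list of simulated totals along the following lines. This is a sketch under assumptions: the team is hypothetical, with the same made-up win/draw/loss probabilities in all thirty eight games, and the forty-point threshold is just a parameter.

```python
import random
import statistics


def summarise(totals, threshold=40):
    """Summarise a list of simulated season points totals."""
    mean = statistics.fmean(totals)
    sd = statistics.pstdev(totals)
    n = len(totals)
    within = sum(mean - sd <= t <= mean + sd for t in totals) / n
    below = sum(t < threshold for t in totals) / n
    return {
        "mean": mean,
        "sd": sd,
        "within_one_sd": within,    # share of seasons within one SD of the mean
        "below_threshold": below,   # share of seasons under the threshold
    }


# Hypothetical team: identical probabilities in every game.
rng = random.Random(1)
probs = (0.50, 0.26, 0.24)  # win, draw, loss
totals = [sum(rng.choices((3, 1, 0), weights=probs, k=38)) for _ in range(10_000)]
print(summarise(totals))
```

Because points totals are whole numbers, the share of seasons inside one standard deviation won't land exactly on the textbook figure for a normal distribution.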

In one simulation in ten thousand, last season's Arsenal squad is ridiculously unlucky and gets relegated with fewer than the magic forty points. This might sound far fetched until you consider that it's a one in ten thousand chance and there have only been 126 seasons of professional football in England, in total, ever. The chances of relegation happening to last season's Arsenal squad are vanishingly small.

Running the same ten thousand season simulation exercise for each team in last year's Premier League, gives you the following points distributions.



We've got a clear top two, a fairly well ordered top seven including Everton and then feasibly any team from eighth downwards could be relegated. Ouch.

We can translate those points totals into finishing positions for each team in each of the ten thousand simulated seasons, to get a likelihood of achieving different positions. Some of the randomness starts to disappear here, because for a team like Newcastle or Fulham to be relegated doesn't just need them to be unlucky. Other teams below them need to be lucky too.
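To get finishing positions, every match has to be simulated once for both teams together, so that one side's win is the other's loss. The sketch below is an illustration and not the code used for this post. It assumes level points are broken at random (there are no goals in an odds-only simulation) and uses a made-up three-team league with invented 'strength' numbers.

```python
import random
from collections import Counter


def simulate_table(matches, teams, rng):
    """Simulate every match once and return teams ordered from first to last."""
    points = {t: 0 for t in teams}
    for home, away, p_home, p_draw, _p_away in matches:
        u = rng.random()
        if u < p_home:
            points[home] += 3
        elif u < p_home + p_draw:
            points[home] += 1
            points[away] += 1
        else:
            points[away] += 3
    # Assumption: teams level on points are separated at random.
    return sorted(teams, key=lambda t: (-points[t], rng.random()))


def position_probabilities(matches, teams, n_seasons=10_000, seed=0):
    """Estimate each team's chance of finishing in each position."""
    rng = random.Random(seed)
    counts = {t: Counter() for t in teams}
    for _ in range(n_seasons):
        for pos, team in enumerate(simulate_table(matches, teams, rng), start=1):
            counts[team][pos] += 1
    return {
        t: {pos: c / n_seasons for pos, c in sorted(counts[t].items())}
        for t in teams
    }


# Hypothetical three-team league, everyone plays everyone home and away.
teams = ["A", "B", "C"]
strength = {"A": 3.0, "B": 2.0, "C": 1.0}  # invented numbers
matches = []
for h in teams:
    for a in teams:
        if h != a:
            p_draw = 0.25
            p_home = (1 - p_draw) * strength[h] / (strength[h] + strength[a])
            matches.append((h, a, p_home, p_draw, 1 - p_draw - p_home))

for team, probs in position_probabilities(matches, teams).items():
    print(team, probs)
```

With real data, the `matches` list would hold all 380 fixtures with their margin-free odds.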



And because I know you're going to want the raw percentages for those...



Were Newcastle unlucky to finish 16th last season? I'll leave that as a rhetorical question.

Picking on Liverpool, a team with their 2012/13 chances in each game (and please note how careful I'm being with my words here; not Liverpool, but a hypothetical team whose chances were exactly represented by the odds Bet365 gave Liverpool last season) would win the league by 'luck' one year in twenty (5%).

The analysis has turned up one more result and it's a result I found quite surprising. Before running any numbers, I'd hypothesised in an email exchange with @SimonGleave that good teams and bad teams would have less randomness in their points total, with average quality teams seeing the most random variation. The reasoning for this crudely being that good teams win a lot and bad teams get beaten a lot and both of those things reduce the space for luck to play a role.

I was wrong.

Here are the standard deviations of season points totals for each team in 2012/13.


Everybody scores plus or minus seven points. Weird.

What that does mean in effect, though, is that teams higher up the table have less random chance as a proportion of their total points than teams lower down, since seven is obviously a much larger proportion of forty points than of ninety points.



We have a standard deviation 'luck' (I still don't like that word) measure varying from 9% of Man City and Man U's most likely points totals, through to 20% of Reading's.

My hypothesis about why seven appears to be the magic number for all teams is that every team has a number of peers - teams they're similar to and will share points with - and a number of teams that they're either much better or much worse than. This gives every team a similar number of games with fairly uncertain results, where points will be shared, compared with fairly certain ones, whether that's fairly certain to win or to lose.
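One way to poke at this without running any simulations: if games are independent, the variance of a season total is just the sum of the per-game variances, and each game's variance follows directly from its win and draw probabilities. The sketch below is my own illustration with three invented team profiles, each facing the same probabilities in every game, which real fixture lists never do. It doesn't prove the hypothesis, it only shows the calculation.

```python
import math


def game_variance(p_win, p_draw):
    """Variance of the points (3, 1 or 0) earned from a single game."""
    mean = 3 * p_win + p_draw
    mean_of_squares = 9 * p_win + p_draw
    return mean_of_squares - mean ** 2


def season_sd(fixtures):
    """Standard deviation of a season total, assuming independent games.

    fixtures: list of (p_win, p_draw) tuples, one per game.
    """
    return math.sqrt(sum(game_variance(pw, pd) for pw, pd in fixtures))


# Hypothetical profiles: the same (p_win, p_draw) in all 38 games.
profiles = {
    "strong": (0.70, 0.20),
    "middling": (0.35, 0.30),
    "weak": (0.10, 0.20),
}
for name, (pw, pd) in profiles.items():
    print(name, round(season_sd([(pw, pd)] * 38), 1))
```

Even across profiles this different, the season standard deviation only moves between roughly six and eight points, because a single game's variance can't get very large or, unless the result is close to certain, very small.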

I've linked to this piece of work before, but it's reassuring how close this result is to the key finding of plus or minus eight points in a post by James Grayson, which kicked off this whole thought process for me. My initial reaction to that number was that it felt too big, but now I'm coming to a very similar conclusion.

I do think we should be treating these numbers as a maximum level of random variation though, because in reality teams will react to their league position and try to change their chances. Teams with more financial resources will be able to react to an unlucky first half of the season by signing players and improving their odds in the second half.

Better predictions than bookmakers manage could also reduce the amount of measured luck, because we'd be more certain about which team should win each game, reducing the level of random variation in results. As I said earlier, the bookies aren't perfect (just annoyingly good) so this should definitely play a role in reducing the true standard deviation below seven.

Finally we're also back to the core question of what is luck? I'm not at all sure when most people say 'luck' that they intuitively mean 'random variation'. It's more nuanced than that. Steve Fenn (@SoccerStatHunt) tweeted a nice definition yesterday:



The key word here being 'unearned'. This got me thinking that you're lucky if:


So what's 'consistently'? I think this goes to the heart of what we'd intuitively define as lucky. In the simulations above, Manchester United had a 36% chance of winning the title, just about equal with Manchester City. If either of those teams wins, are they lucky? They've got the best chances of any team, but 36% isn't huge - it leaves a 64% chance that some other team wins it, which is much more likely.

By those percentages alone, you'd always need luck to win the title.

If you win a single game that you had a 49% chance of winning, were you lucky? After all, there was slightly more chance that you wouldn't win it.

What we mean by luck seems to be lurking somewhere in an area of 'beating a team that played significantly better than we did'. For me, 'lucky' has an intrinsic element of fairness in its definition. Lucky wins are unfair. Things shouldn't have happened that way. Somebody else deserved the win.

That's why luck is so hard to define - because it's subjective. Your definition and mine could well be different. As a neutral, I'd say a non-league team with a 5% chance of beating a Premier League opponent in the FA Cup third round deserves everything they manage to achieve. A fan of the losing Premier League team, who's going to take a pasting in work on Monday, would probably call them lucky bastards.

What we have shown here is that bookmakers' odds suggest we're watching a league where, in any one single season, each team's points will swing plus or minus seven from their most likely total. That could well be enough to relegate an undeserving team. Of course, that's undeserving depending on whether you support that particular team and depending on where you, personally, draw the percentage line on 'unlucky'.

2 comments:

Jörg Seidel said...

stdev is equal, but I expect skewness isn't.

hufton said...

Nice post, I wrote something similar here, but using market season points data rather than historical results