Thursday, 13 February 2014

How can an attacking team get close enough to expect a goal?

There's been some great work done in football analytics recently, looking at a team's scoring chances from different positions on the pitch, which has led to the calculation of various Expected Goals (ExpG) metrics. However it's calculated, in essence ExpG gives a player's chance of scoring from a shot, given his position on the pitch. Add up the probabilities for a group of shots and you can work out how many goals a team 'should' have scored from them. Have a look at Statsbomb if you'd like to read up on what's been available up to now.
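
As a quick illustration of the "add up the probabilities" idea, here is a minimal Python sketch. The shot probabilities are made-up values for illustration only, and `expected_goals` is a hypothetical helper name, not part of any published ExpG model.

```python
def expected_goals(shot_probabilities):
    """Sum per-shot scoring probabilities to get a team's expected goals."""
    return sum(shot_probabilities)


# Hypothetical scoring probabilities for five shots (illustrative only).
shots = [0.30, 0.08, 0.05, 0.12, 0.03]

# Five shots, but only around 0.58 goals 'should' have come from them.
print(round(expected_goals(shots), 2))
```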

I've managed to assemble a decent-sized database of pass and shot locations from the first half of the 2013-14 Premier League season and wanted to see if I could take Expected Goals a step further. As an indicator of shot success, Expected Goals typically paints a picture of the penalty area with the six-yard box as a hotspot, growing colder the further you move from goal. To a certain extent, its outputs are relatively obvious: if you shoot from closer in, you have a higher chance of scoring, and shots from further out are less likely to be converted.

That's not to say Expected Goals isn't a useful metric - far from it - but it doesn't do a great deal for our understanding of how to create goals. We can quantify how much better it is to shoot from closer to the goal, but how do you get closer to the goal in the first place? If your attacks break down trying to reach the shot conversion hotspot, should you even try to get there, or just take your chances from range?

A couple of days ago, I tweeted an image of pass completion data, which we'll be building on in this post.

Pass success rate by destination

The image shows the probability of completing a pass into different areas of the pitch. We're not worried about where the ball is coming from for the moment, but are looking at the chances of passes into different areas being successful.

It's clear to see how - playing from left to right - passing accuracy starts to break down in the opposition half and then drops dramatically at the boundaries of their penalty area.

Even with half a season's worth of passes and shots, we're going to struggle with the number of data points available as this analysis progresses, so let's coarsen that first image by merging its cells into some larger pitch areas.
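
A rough sketch of that merging step, in Python, might look like the following. It assumes each pass is stored as an `(end_x, end_y, completed)` tuple on an arbitrary coordinate scale; the tuple layout, the function names and the `cell_size` parameter are all assumptions for illustration, not the format of the actual dataset.

```python
from collections import defaultdict


def cell_for(x, y, cell_size):
    """Map a pitch coordinate to a (column, row) grid cell."""
    return (int(x // cell_size), int(y // cell_size))


def pass_success_by_destination(passes, cell_size):
    """Return the completion rate of passes aimed into each grid cell.

    passes: iterable of (end_x, end_y, completed) tuples (assumed layout).
    A larger cell_size merges more of the pitch into each cell.
    """
    attempts = defaultdict(int)
    completed = defaultdict(int)
    for end_x, end_y, ok in passes:
        cell = cell_for(end_x, end_y, cell_size)
        attempts[cell] += 1
        if ok:
            completed[cell] += 1
    return {cell: completed[cell] / attempts[cell] for cell in attempts}
```

Calling the same function with a larger `cell_size` gives fewer, bigger areas, each backed by more passes.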

Pass success rate by destination

We now have a picture of how difficult it is to pass into each area of a football pitch. What about shots?

From the same dataset, here's an average player's probability of scoring with shots from different pitch locations. Penalties are excluded and I've hidden squares with fewer than twenty shots to clean the data up a little.
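
For anyone wanting to reproduce this kind of chart, a minimal Python sketch is below. The twenty-shot threshold and the penalty exclusion come from the description above; the record layout (dicts with `cell`, `goal` and `penalty` keys) and the function name are assumptions made for the example.

```python
from collections import defaultdict

MIN_SHOTS = 20  # squares with fewer shots than this are hidden


def conversion_by_cell(shots, min_shots=MIN_SHOTS):
    """Return goals / shots per grid cell, excluding penalties.

    shots: iterable of dicts with 'cell', 'goal' and 'penalty' keys
    (an assumed layout). Cells with too few shots are left out.
    """
    tallies = defaultdict(lambda: [0, 0])  # cell -> [goals, shots]
    for shot in shots:
        if shot["penalty"]:
            continue
        tally = tallies[shot["cell"]]
        tally[1] += 1
        if shot["goal"]:
            tally[0] += 1
    return {
        cell: goals / total
        for cell, (goals, total) in tallies.items()
        if total >= min_shots
    }
```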

Shot conversion rate by shot location

As a manager, you're on the horns of a dilemma. Scoring probability climbs to over 30% in the centre of the six yard box, but your chances of passing the ball into that location are slim.

What if we combine the two visualisations?

Pass success rate multiplied by scoring probability gives an indication of the likely success of an attacking strategy. Pass to an easier area outside the box and shoot from there? Or attempt to work the ball closer, at the risk of losing possession?
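
The combination itself is just a cell-by-cell product. Here is a small Python sketch; the zone names and rates are hypothetical numbers chosen to show the trade-off, not values read from the charts.

```python
def combined_value(pass_success, shot_conversion):
    """Multiply pass success by shot conversion for each shared cell.

    Cells missing from either input (for example, squares hidden for
    lack of data) are dropped from the result.
    """
    shared = pass_success.keys() & shot_conversion.keys()
    return {cell: pass_success[cell] * shot_conversion[cell] for cell in shared}


# Hypothetical rates for two zones (illustrative only).
pass_success = {"six_yard_box": 0.08, "edge_of_area": 0.45}
shot_conversion = {"six_yard_box": 0.30, "edge_of_area": 0.06}

for zone, value in sorted(combined_value(pass_success, shot_conversion).items()):
    # Roughly 2.7% for the edge of the area and 2.4% for the six-yard box.
    print(zone, round(value, 3))
```

With these made-up inputs, a hard pass into a high-conversion zone and an easy pass into a low-conversion zone end up worth about the same.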

Pass success probability * shot conversion rate

It turns out to be far from a clear-cut choice. There's a relatively large area, stretching from the edge of the six-yard box to well outside the penalty area, where the combined chance of passing the ball in and then scoring from there is quite evenly balanced at 2-3%. It's not as simple as 'closer to the goal is better', and the balance in any one game almost certainly depends on the passing quality of the individual teams and how well their opponents defend.

If we box out that 2-3% conversion area, we can move the analysis on another step.

Pass success probability * shot conversion rate

How should a team attempt to move the ball into that boxed-out shooting zone? There are three broad choices: directly, from the direction of the centre circle; diagonally; or from the wings.

David Moyes has come in for a lot of criticism this week following Manchester United's draw with Fulham, where his players hit over eighty crosses in ninety minutes. We should be able to show here whether crossing, or a direct approach, is the more successful strategy.

Probability of achieving a successful pass into shooting area

Note that I've changed the colour scale on the above image to peak at 75% rather than 100%, since the average success rate of these passes is lower than when considering the whole pitch. Squares are only shown if they've been the origin of at least twenty passes.
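
In code terms, this view flips the earlier grouping: keep only passes aimed into the shooting zone, then group them by where they started. The Python sketch below assumes passes are `(start_x, start_y, end_x, end_y, completed)` tuples, that an end location is recorded even for unsuccessful passes, and that the zone is a simple rectangle. All of those, and the function names, are assumptions for illustration.

```python
from collections import defaultdict


def in_zone(x, y, zone):
    """zone is (x_min, x_max, y_min, y_max), an assumed rectangular area."""
    x_min, x_max, y_min, y_max = zone
    return x_min <= x <= x_max and y_min <= y <= y_max


def success_into_zone_by_origin(passes, zone, cell_size, min_passes=20):
    """Completion rate of passes into `zone`, grouped by origin cell.

    passes: iterable of (start_x, start_y, end_x, end_y, completed).
    Origin cells with fewer than min_passes attempts are left out.
    """
    tallies = defaultdict(lambda: [0, 0])  # origin cell -> [completed, attempts]
    for start_x, start_y, end_x, end_y, ok in passes:
        if not in_zone(end_x, end_y, zone):
            continue
        cell = (int(start_x // cell_size), int(start_y // cell_size))
        tallies[cell][1] += 1
        if ok:
            tallies[cell][0] += 1
    return {
        cell: done / attempts
        for cell, (done, attempts) in tallies.items()
        if attempts >= min_passes
    }
```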

Once you move beyond the eighteen-yard line, pass success probability drops off quickly. Touchline crosses from a classic 'chalk on his boots' winger have success rates as low as 30%. Other things being equal, the best chance of passing the ball into our key zone comes from a direct or diagonal move.

If you're thinking "but that's not fair, most of the passes included here will be targeted at locations outside the box", then you're right. Let's tighten up our key shooting zone, to a central area of the eighteen yard box surrounding the penalty spot.

Probability of achieving a successful pass into close shooting area

Still want to hit crosses all day?

The probability of a pass from the wings finding a team mate in the shooting zone is 30-40%, while moving through the central area has a success rate of 40-50%.

This isn't the end of the story, but it's where I'll stop for now. There are many more factors to be considered, including the absolute volume of passes and the fact that a successful pass isn't the same as creating a shooting chance. This analysis does provide a base to work from, though, and one that I'd like to extend next to different types of teams.

Ultimately, I hope that this type of analysis could answer questions such as...

Should teams with worse passing shoot more often from long range? And vice versa, where is the optimal shooting area for a team that passes with a very high success rate?

How do optimal strategies change, based on specific opponents?

(using significantly more data) Can we identify hotspots where passes into the shooting zone have higher success rates? Versus specific opponents? When specific defenders are on the pitch?

Eventually, I believe an approach like this might be able to identify defensive weaknesses in a specific team and optimal attack strategies for their opponents.

1 comment:

Anonymous said...

Excellent work. Can you also incorporate the probability of being dispossessed into the analysis? Certainly attacking via diagonal into the center of the box is ideal from a passing and shot conversion perspective. I do wonder if the higher likelihood of being dispossessed might mute the advantage a little. Would be interesting to see.