Friday, 26 April 2013

Football model: Predictions for 27 April

OK, so last week was a bit of a shambles. Three results out of ten and a loss at the bookies. Damn.

But that means we're due a win this week, right? I think that's how they said probability works at school.

Here are the predictions for this weekend - a little early because I'm off to drink a few cervezas in Madrid (and tour the Bernabeu! Brilliant!) Any late changes to the starting line-ups that are on Fantasy Football Scout, we'll just have to live with.

If you're betting (I still have faith...) then:

Manchester City v West Ham United - Home win
Everton v Fulham - Home win
Southampton v West Bromwich Albion - Away win
Stoke City v Norwich City - Home win
Wigan Athletic v Tottenham Hotspur - Away win
Newcastle United v Liverpool - Away win
Reading v Queens Park Rangers - Away win
Chelsea v Swansea City - Home win
Arsenal v Manchester United - Away win
Aston Villa v Sunderland - Draw

Saturday, 20 April 2013

Football model: Predictions for 20 April

Quick post to log the predictions for this week. There's always one that sticks out and this week it's Newcastle! There's no way that's a sensible percentage, but the model uses past shooting frequency and conversion stats and Newcastle's away numbers for the starting line-up on Fantasy Football Scout are less than impressive...

One of the improvements I'm working on is to input better data when we have limited experience (e.g. what shot conversion rate do you give a player who's never scored?) and that should go some way to moderating extreme results like this.
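As a rough illustration of the limited-experience problem, here is a minimal Python sketch of one common fix: shrinking a player's raw conversion rate towards a league-wide average, so a handful of shots can't produce an extreme figure. This is not necessarily what the model does or will do. The function name, the 10% prior rate and the weight of 20 "prior shots" are all assumptions made up for the example.

```python
def smoothed_conversion(goals, shots, prior_rate=0.10, prior_shots=20):
    """Shrink a raw conversion rate towards a league-wide prior.

    prior_rate and prior_shots are illustrative assumptions,
    not values taken from the football model.
    """
    return (goals + prior_rate * prior_shots) / (shots + prior_shots)


# A player with 0 goals from 5 shots is no longer rated at exactly 0%.
print(round(smoothed_conversion(0, 5), 3))     # 0.08

# A player with no shots at all simply gets the prior.
print(round(smoothed_conversion(0, 0), 3))     # 0.1

# A player with plenty of shots stays close to the raw rate (15/100 = 0.15).
print(round(smoothed_conversion(15, 100), 3))  # 0.142
```

The more shots a player has taken, the less the prior matters, which is the moderating effect you'd want on small samples.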

Here are the percentages:

If you're having a flutter, we've got...

Fulham v Arsenal - Draw
Norwich v Reading - Home win
QPR v Stoke - Draw
Sunderland v Everton - Away win
Swansea v Southampton - Home win
West Brom v Newcastle - Home win
West Ham v Wigan - Home win
Tottenham v Man City - Away win
Liverpool v Chelsea - Home win
Man U v Villa - Home win

Wednesday, 17 April 2013

Do you want to see adverts? It will soon be optional, and not just on the web.

I've written before about Adblock, a little plugin for your web browser that exorcises advertising from the internet. I've also wondered aloud why many more people don't use it.

I think Adblock is amazing. I run it on my home PC, my work PC and my tablet. No internet ads, ever. Sometimes, when Google search ads might be useful - if I need a plumber, say - I'll turn Adblock off for a while, but it's never off for very long.

Is that in conflict with my job, working in an advertising agency? I don't believe so, no. If adverts are annoying, then people will avoid them. It's our responsibility as a marketing agency to work out how to get your company's message through the clutter and around the avoidance, to people just like me. Pretending the wider world doesn't avoid ads, and trying to set some kind of example, would be much worse than skipping them ourselves.

Most people don't choose to block adverts on the internet, even after they know that it's possible and very, very easy to set up. Why not? It's got to be either that most people don't really dislike ads that much and so can't be bothered to avoid them (maybe even actively like them!), or that the little step of setting up the blocker is too much hassle.

Either way, we only seem to have about 5% of people avoiding ads on the web. Let's imagine that it will climb from there, but never become a majority. Blockmetrics estimates that a higher proportion of traffic (up to 30%) comes from browsers that block ads, but that follows, as the savvy internet users who tend to install browser plugins will also spend more time on the web.

I don't know that ad blocking will never come to be the majority approach, but it makes some sense; if everybody thought ad blocking was that great, then they'd all be doing it by now.

Advertising revenue pays for a lot of content, so we've got a group of people - including me - free riding on other people's ad viewing. I get to read newspaper sites just like you do, but my copies have no ads. If you see ads on the internet, then you're paying for my content via your ad views and clicks. I do appreciate it, really.

So what's next?

TV advertising has so far been relatively untouched by advert avoidance. It's very possible to avoid ads on TV using a box like Tivo, but you have to record programmes and watch them behind the 'Live' feed, so that you can fast forward through breaks. A few people do that, but most don't.

Two recent developments may be about to change all this though.

First, the Dish Hopper is having a legal ding-dong with TV companies in the US, over a Tivo-style box which will recognise ad breaks and skip over them automatically. You still have to be watching a recording of the programme, but you don't need to fast forward - the box finds the ad breaks, misses them out when playing back recordings and for the viewer, ads just vanish.

That gave me an idea.

If you're watching TV live, you can't fast forward into the future to get past an ad break, but why do you have to watch the ads? You could watch something else. You could watch YouTube.

What if your TV recognised the start of an ad break and just swapped in YouTube videos that you might like for a few minutes, then dropped you back to live TV as the ad break ends?

Yeah, fine, YouTube's got ads in it too. But Vimeo hasn't. What about a gallery of inspiring Flickr images for three minutes, accompanied by a Spotify track from one of your playlists? It's a damn good idea.

Some bright sparks have already had that idea. They've built it. Yes it looks a bit amateur, but first versions of tech usually do and if this one doesn't work properly, then there'll be another one along soon that will.

Crucially, we're not talking about a whole DVR here, it's just a little piece of wifi kit that sits between your digital receiver and your TV, detects when ads are running and swaps the feed to the TV for something else until they stop. It will be very, very difficult to prevent people from using one of these.

At the extreme you could make them illegal, but how would you know I've got one?

If you're a commercial TV company, this should be absolutely terrifying. Now that TVs are effectively computers and new components based on equipment like the Raspberry Pi are so cheap, we can customise them. That means people who are determined to avoid ads are going to be able to do it, just like they can on the web.

Just like on the internet, it won't be everybody, but I'm willing to bet that in the not too distant future, a significant number of high value, technically minded people will choose to remove themselves completely from interruption-based TV advertising audiences.

Advertising as a whole will continue onwards, of course it will, it always does, but we're heading for some big changes. The last bastion of unavoidable, interruption based, highly impactful advertising, may be about to fall.

Friday, 12 April 2013

Football Model: Quick update and predictions for 13-17 April

The sun's shining (yes, even oop North) and the pub beckons, but first a quick update and predictions for this weekend.

We did pretty well last week, calling the usual average of 50% of results correctly and winning £23.35 at the bookies, on £10 stakes for each game.

That puts the running total at + £66.68 in the five weeks since I've been calling bets in public.
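For anyone checking the sums, a quick Python tally of the weekly results reported in these posts. Amounts are held in pence to avoid floating-point rounding, and the "6-8 April" label is my shorthand for last week's fixtures.

```python
# Weekly profit/loss in pence on £10 stakes per game, as reported in the posts.
weekly_pence = {
    "2 March": 2630,
    "9 March": 330,
    "16 March": -350,
    "30 March": 1723,
    "6-8 April": 2335,
}

running = 0
for week, pence in weekly_pence.items():
    running += pence
    print(f"{week:>9}: {pence / 100:+7.2f}   running total {running / 100:+7.2f}")
```

The final running total comes out at +66.68.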

I'm probably heading for a fall, but no turning back now...

Here are this week's percentages, out to next Wednesday's games (might rerun these if the starting line-ups change). Watch that I've now swapped the order to list Home / Draw / Away like everybody else does.

Picks for betting...

Arsenal v Norwich - Home win (it seems very sure about this one!)
Villa v Fulham - Draw
Everton v QPR - Home Win
Reading v Liverpool - Away Win
Southampton v West Ham - Home Win
Newcastle v Sunderland - Home Win
Stoke v Man U - Away Win
Arsenal v Everton - Home Win
Man City v Wigan - Home Win
West Ham v Man U - Away Win
Fulham v Chelsea - Away Win

I'm off to the pub to spend last week's winnings. Enjoy the weekend!

Wednesday, 10 April 2013

The truth might be counter-intuitive, but DFA's attribution model is wrong

We've been doing a lot of attribution modelling at MediaCom North in the past twelve months. For the uninitiated, that means modelling the paths that people take to arriving at your website, via adverts that you're running on different websites.

Imagine somebody sees one of your banner ads, then searches for you on Google and clicks on a paid search ad, then arrives on your website and buys something. Which of the two adverts that they saw, do you credit with that sale?

Originally, I came at this problem from scratch - as an econometrician, rather than a web specialist - and have arrived at a different answer to a lot of the solutions that you'll see reported on industry sites.

The classic approach and the one taken in DoubleClick's (DFA's) new attribution tool, is to say "well we have one sale and a lot of adverts that might have driven it. What proportion of that one sale do we credit to each ad?"

By doing this you end up with a sale that gets split into pieces - maybe 50% to the paid search ad and 50% to the display ad. Maybe a larger proportion to the paid search ad, depending on your model. You can choose from six different ways of splitting up your online sales, within Google's DFA tool.

This method has one massive advantage. When you finish and add up all of your pieces of sales that have been allocated to different ads, you'll get back to the same total as you see on your final sales report.

And it has one massive disadvantage. It's wrong.

For a start and before we get to why it's wrong, giving people six options is downright unhelpful. There is a right answer to the question of attribution and letting people pick their own method is an analyst's equivalent of throwing up their hands and saying "I dunno, could be anything. Pick an answer you like."

Now why is the approach wrong?

To be fair, this isn't an online only problem. In most marketing analysis, we like to pretend that one sale comes from one marketing channel, because it makes life easier. If you've ever seen a nice neat list of return on investment numbers for different marketing campaigns, then this is effectively what you're looking at. Add up the individual contributions of each marketing campaign and you get total sales from marketing. Simple.

Marketing campaigns in the real world don't work like this, even though we like to pretend that they do. In the real world, people see lots of our ads and it needs lots of ads to persuade somebody to respond. Attribution analysis is specifically trying to solve the question of how (online) marketing campaigns work together, rather than individually, but by splitting up a single sale, we're not answering that question.

Take our example again of the customer who saw a display ad, then searched on Google, then bought something. Assume for the moment that they only decided to search for us, because they'd already seen the display ad. We must believe that sort of thing happens, or we wouldn't bother with attribution analysis in the first place.

What happens if you take away just the display ad? That person doesn't ever see the display ad, so they don't do a search and they don't buy our product. We lose one sale.

And if you take away the search ad? The person sees the display ad, searches on Google and doesn't find us. We lose one sale.

1+1 = 2

No, we don't lose two sales in total. That would be ridiculous...

This is why, in real life, you can't add up the individual return on investment to each marketing channel. You'll get to a number that's bigger than your total sales.

But it's still the right approach.

When you use attribution to ask "what does this display advert add to my sales?", you're asking the wrong question. The right question - the one we just worked through above - is:

"What would happen to my sales if I took this display advert away?"

You'd lose a sale. Not half a sale, or 25% of a sale. A whole one.
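The worked example can be written out as a tiny Python sketch. It's a toy with a single customer journey, and it bakes in the same assumption as the example: the sale only happens if both ads are seen. The function and variable names are made up for illustration.

```python
def sales(display_on, search_on):
    """One journey: display ad, then search ad, then purchase.

    Assumption from the example: the sale needs BOTH ads.
    """
    return 1 if (display_on and search_on) else 0


baseline = sales(True, True)

# Counterfactual question: how many sales are lost if one ad is removed?
lost_without_display = baseline - sales(False, True)
lost_without_search = baseline - sales(True, False)

# Fractional split, e.g. 50/50 credit for the single sale.
split_credit = {"display": 0.5, "search": 0.5}

print(lost_without_display, lost_without_search)    # 1 1
print(lost_without_display + lost_without_search)   # 2
print(sum(split_credit.values()))                   # 1.0
```

The counterfactual losses add up to two against one actual sale, while the split credits add back to exactly one. The split is tidy, but it tells you each ad is worth half a sale when removing either would cost you the whole thing.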

And at this point you can see that this isn't just an academic argument. Your return on investment numbers to individual ads, using the dividing-up-sales method in Google's DFA tool, are too low. You will potentially remove campaigns that are actually working. This is important.

The only way to do attribution analysis is to ask how many sales you would have lost if an advert wasn't running. This is the question that was asked and is the whole point of attribution analysis. To do it any other way is to pretend that you're measuring how different adverts work together, while actually measuring them individually.

In technical language, your attribution model must be multiplicative, not linear. You're going to need regression analysis on cookie-level data to build it and you're only going to get one answer, not six.
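To make "multiplicative" a bit more concrete, here's a minimal Python sketch. It is not the model we've built. The base rate and uplift factors are invented purely for illustration, and a real version would estimate them by regression on cookie-level data.

```python
# Hypothetical numbers, chosen only for illustration.
BASE_RATE = 0.01       # conversion rate with no ads seen
DISPLAY_UPLIFT = 1.5   # multiplier when the display ad is seen
SEARCH_UPLIFT = 3.0    # multiplier when the search ad is seen


def multiplicative_rate(display, search):
    """Conversion rate when each ad multiplies the rate, rather than adding to it."""
    rate = BASE_RATE
    if display:
        rate *= DISPLAY_UPLIFT
    if search:
        rate *= SEARCH_UPLIFT
    return rate


def display_effect(search):
    """Conversion rate lost by removing display, given whether search is running."""
    return multiplicative_rate(True, search) - multiplicative_rate(False, search)


print(round(display_effect(search=True), 4))   # 0.015
print(round(display_effect(search=False), 4))  # 0.005
```

In this toy, display is worth three times as much when search is running as when it isn't. A linear (additive) model would give display the same effect in both cases, and so can't represent adverts working together.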

Still with me?

Next step...

Yes, it makes communicating results to clients harder.

It also makes it harder to forecast the impacts of changes to your campaigns. If you don't run search ads, your display ROI will drop. If you don't run display ads, your search ROI will drop. This is the essence of attribution modelling.

You need a forecasting tool, that can take what you've discovered from attribution modelling and project changes in overall marketing ROI, from a change to one campaign. This is what we've been building and what we're continuing to develop. It's been difficult, but very rewarding, and is now in use on live campaigns - it works.

I'm sure Google knows all this, but has chosen not to implement it, because it's more difficult to build and quite a lot more difficult for end users to understand. What you need to understand though, is that any attribution method which splits up a single sale, will encourage you to low-ball your return on investment numbers. And we can all agree, that's definitely a bad thing.

Monday, 8 April 2013

Football model: Under the hood

I was writing a proposal for a client last week and remembering how important it can be to show some of the mechanics underneath your answers. Not to explain everything that's going on, but to share some screenshots and explanations as evidence that you aren't just making all this stuff up as you go along. After all, you could drive your car quite happily without ever seeing what makes it go, but it's definitely reassuring to see a big shiny engine when you pop the bonnet open.

All most people have seen of the football model so far, is some percentages that get spat out at the far end. My first post about football simulation explained the basics of the model, but what am I actually up to, that makes these numbers happen?

Who else gets excited by screenshots of spreadsheets? Just me then? Ah well, here they are anyway.

Step one, is a list of the weekend's games and predicted starting line-ups. I get these from Fantasy Football Scout and week to week, input team changes by hand. I'd really like to automate this bit because it's a pain - it doesn't take ages to set up, but being manual means if I'm not at a computer, the model can't run itself.

This list of fixtures and players gets read in by the simulator (Visual Basic for the moment, if you're wondering) so that it can simulate virtual games.

Next, we need stats for each player, so that the simulator knows how each of them performs in real life. These stats come from EPL Index and give us a database describing each player's decision making, successes and failures in real games so far this season.

For each team, that looks something like this one for Southampton.

Yeah, I've missed out the column headers. Sorry about that, but this is turning into something I've invested a lot of time in! You can probably work some of them out if you're determined to...

These stats get pulled into the simulator and then it's ready to run a virtual game.

Or actually, to run 1000 virtual games and tell us what the result was in each of them, so we can find the percentage chance of either team winning, or of a draw.
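The "run it 1000 times and count" idea looks roughly like this in Python. It's a heavily simplified sketch and not the real simulator, which is written in Visual Basic and plays out individual events for each player. Here each team just takes a fixed number of shots with a flat conversion rate, and all the team numbers are made up.

```python
import random


def simulate_match(home, away, rng):
    """Play one virtual game: each shot is scored with the team's conversion rate."""
    home_goals = sum(rng.random() < home["conversion"] for _ in range(home["shots"]))
    away_goals = sum(rng.random() < away["conversion"] for _ in range(away["shots"]))
    return home_goals, away_goals


def outcome_percentages(home, away, n_games=1000, seed=0):
    """Simulate n_games and return the percentage of home wins, draws and away wins."""
    rng = random.Random(seed)
    counts = {"home": 0, "draw": 0, "away": 0}
    for _ in range(n_games):
        home_goals, away_goals = simulate_match(home, away, rng)
        if home_goals > away_goals:
            counts["home"] += 1
        elif home_goals < away_goals:
            counts["away"] += 1
        else:
            counts["draw"] += 1
    return {result: 100 * n / n_games for result, n in counts.items()}


# Made-up team numbers, purely for illustration.
home_team = {"shots": 14, "conversion": 0.11}
away_team = {"shots": 10, "conversion": 0.09}

percentages = outcome_percentages(home_team, away_team)
print(percentages)
```

Fixing the seed makes a run repeatable, which helps when checking whether a change to the model actually changed the answer rather than just the dice.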

Who wants to see what a footballer looks like inside The Matrix?

Probably prettier in real life. He's got a good engine though.

And now we're ready to run the weekend games. I press go and this happens.

If it's just running one weekend's games then I'll read Twitter for a bit. If it's simulating a whole season then the laptop gets some alone time and we come back later. It's playing through each fixture 1000 times, with around 800-1000 events per game.

At the top end, that's a million events to get one simulated result. 380 games in a season and so when I do a large run to assess whether changes to the model have improved its performance, we're simulating 380m individual events. Definitely gives me time to fit in a cup of tea.

And finally, out come the percentages that I've been posting for the past couple of months.

So now you know, it really is a proper model. One that I've spent far too long building.

And it works...

Thursday, 4 April 2013

Football Sim: Early predictions for 6/7/8 April 2013

Let's start this one with a quick recap because I haven't done a review for a couple of weeks. Last week wasn't bad. The week before the international break was the model's first loss, but only a small one.

Not enough detail? OK, let's round up the betting at least.

Since I've been calling bets on Wallpapering Fog before the games and playing £10 stakes every game, we've had:

2 March:    + £26.30
9 March:    + £3.30
16 March:  - £3.50
30 March:  + £17.23

Total:   + £43.33

Which as a newbie who'd only ever placed one football bet before February this year, I reckon is not too bad! (Sean Devine didn't score against Man U by the way. Who knew?) I've been tweeting a bit today about projections for what wins would have been if the model had been in place last season too and they're quite promising. Here if you're interested.

Predictions are early this week because I plan to be up a Lake District mountain with a paraglider on Saturday and Sunday. Can't wait, but football first. Here they are then...

Two notable ones this week. The model's more confident than the market odds that the red half of Manchester will be happy on Monday night and it also thinks that West Brom will beat Arsenal.

The Arsenal one's quite interesting. Giroud's away stats are less than impressive and it's really not helping Arsenal's chances of scoring, but the model's also not sure what to do with Rosicky (assuming he starts - line-ups from here) because he hasn't played much this season. I artificially gave him some very healthy shooting stats though, re-ran the model and Arsenal still lost with the chances being a little closer, so I'm going with it as a prediction.

Here are the calls for betting:

West Brom v Arsenal - Home Win
Stoke v Villa - Home Win
Chelsea v Sunderland - Home Win
Spurs v Everton - Home Win
Newcastle v Fulham - Home Win
Liverpool v West Ham - Home Win
Man U v Man City - Home Win
Norwich v Swansea - Draw
QPR v Wigan - Draw
Reading v Southampton - Away Win

By the way, I know I list Home / Away / Draw in the table above and nobody else does it in that order. It's even started annoying me now, so I'll change it for next week!