Interesting article in MultiChannel Merchant about sourcing sales across catalog and online using fractional allocation models. I’m pretty sure “allocation” and “attribution” are really different concepts, though they seem to be used interchangeably right now. Let’s just say that, from reading the article, allocation sounds more like a gut-feel exercise, while attribution, in my experience, implies the use of a mathematical model of some kind.
I know Kevin rails against a lot of the so-called matchback analysis done in catalog and I have to agree with him; that practice is a whole lot more like allocation than attribution in my book, particularly when it is pretty easy to measure the real source from a lift-in-demand perspective by using control groups. Take a random sample of the catalog target group, exclude it from the mailing, and compare the purchase behavior of this group with that of the customers who do get the catalog, over some time period. That should give you an idea of what the incremental (not cannibalized) demand from the catalog is – just look at gross sales per customer. We did this at HSN for every promotion, since the TV was “always on” and creating demand by itself.
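To make the mechanics concrete, here is a minimal sketch in Python of the holdout idea – the customer IDs, the holdout rate, and the gross-sales figures are all hypothetical, and a real test would also check that the difference between groups is statistically significant:

```python
import random

def split_holdout(target_group, holdout_rate=0.10, seed=42):
    """Randomly exclude a sample of the catalog target group from the mailing."""
    rng = random.Random(seed)
    mailed, holdout = [], []
    for customer_id in target_group:
        (holdout if rng.random() < holdout_rate else mailed).append(customer_id)
    return mailed, holdout

def incremental_demand(gross_sales, mailed, holdout):
    """Gross sales per customer in the mailed group minus the holdout group,
    measured over the same post-mailing window; the difference is the
    incremental (not cannibalized) demand created by the catalog."""
    per_customer = lambda ids: sum(gross_sales.get(i, 0.0) for i in ids) / len(ids)
    return per_customer(mailed) - per_customer(holdout)
```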
So does a web site.
Just because someone was mailed a catalog and then at some point later on ordered from a web site does not mean they ordered because they received the catalog; heck, you don’t even know for sure that they received the catalog – as anyone who has used seeded lists knows. And just because someone was exposed to an ad online doesn’t mean the ad had anything to do with a subsequent online order – even if you believe in view-through.
Anyway, I see lots of people doing what I would call allocation rather than attribution in the web analytics space, and when Jacques Warren asked me about this topic the other day, I decided it might make a good post.
You have to understand this same discussion has been going on for at least 25 years in offline marketing, so there is a ton of history in terms of best practices and real experience behind the approach many folks favor. And there is a twist to the online version I don’t think many folks are considering. So for what it’s worth, here’s my take…
For most folks, the simplest and most reliable way to attribute demand is to choose either first campaign or last campaign and stick to it. The words simplest and reliable were chosen very specifically. For the very few folks who have the right people, the right tools, and the right data, it is possible to build mathematically precise attribution models. The word precise was also chosen specifically. I will go into more detail on these choices below after some background.
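If it helps, here is a minimal sketch of what “pick first or last campaign and stick to it” looks like, assuming you already have the ordered list of campaign touches for a converting visitor (the campaign labels are made up):

```python
def attribute(touchpoints, rule="last"):
    """Give 100% of the credit for an order to either the first or the last
    campaign the visitor touched, depending on the rule you commit to."""
    if not touchpoints:
        return None
    return touchpoints[0] if rule == "first" else touchpoints[-1]

# The same order, credited two different ways (hypothetical campaign labels)
path = ["ppc", "organic_search", "banner"]
print(attribute(path, rule="first"))  # -> "ppc"
print(attribute(path, rule="last"))   # -> "banner"
```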
Choosing first or last campaign for attribution is not ignoring the effects of other campaigns; it simply recognizes that you cannot measure those effects accurately, and that creating an “allocation model” to fill the gap will be an exercise in navel gazing.
Unfortunately, a lot of this kind of thing goes on in web analytics – instead of admitting something can’t be measured accurately, folks substitute a “model” which is worse than admitting the accuracy problem, because now you are saying you have a “measurement” when you don’t. People sit around with a web analytics report, and say, “Well, the visitor saw the PPC ad, then they did an organic search, then they saw a banner, so we will give 1/3 of the sales credit to each” or worse, “we will allocate the credit for sales based on what we spend on each exposure”.
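For reference, here is roughly what those allocation schemes look like written as code – the even split and the spend-weighted split – using the same hypothetical campaign labels as above; the point is not that you should do this, but that the numbers it produces are essentially arbitrary:

```python
def even_split_allocation(touchpoints):
    """The 'give 1/3 of the sales credit to each exposure' style of allocation."""
    share = 1.0 / len(touchpoints)
    credit = {}
    for t in touchpoints:
        credit[t] = credit.get(t, 0.0) + share
    return credit

def spend_weighted_allocation(touchpoints, spend):
    """Allocate credit in proportion to what was spent on each exposure --
    which conveniently hands credit to whatever you already fund most heavily."""
    total = sum(spend[t] for t in touchpoints)
    return {t: spend[t] / total for t in touchpoints}

path = ["ppc", "organic_search", "banner"]
print(even_split_allocation(path))  # each exposure gets ~0.33 of the credit
print(spend_weighted_allocation(path, {"ppc": 500, "organic_search": 0, "banner": 250}))
```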
This approach is worse than having no model at all, because I often see these models used improperly – for example, to “justify budget”: if you allocate a share of responsibility for the outcome to PPC, then you get to keep a budget that would otherwise be “optimized” away. A similar argument is made by a few of the folks in the MultiChannel Merchant article above to justify catalog spend.
This is nuts, in my opinion.
I believe the core analytical culture problem at work here (if you are interested) is this:
Difference between Accuracy and Precision
http://en.wikipedia.org/wiki/Accuracy
I’d argue that, given a choice, it’s more important to be precise than accurate – reproducibility is more important (especially to management) than getting the exact number right. Reproducibility is, after all, at the core of the scientific method, isn’t it? If you can’t repeat the test and get the same results, you can’t validate the hypothesis.
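A tiny made-up illustration of the difference: one series of “measurements” is precise but biased, the other averages out near the true value but bounces all over the place.

```python
from statistics import mean, pstdev

def bias_and_spread(measurements, true_value):
    """Accuracy ~ how far the average sits from the true value (bias);
    precision ~ how tightly the measurements cluster (spread)."""
    return mean(measurements) - true_value, pstdev(measurements)

true_sales = 100.0
precise_but_off    = [80.1, 79.8, 80.2, 80.0]    # reproducible, but ~20 low every time
accurate_but_noisy = [70.0, 130.0, 95.0, 105.0]  # averages out right, all over the place

print(bias_and_spread(precise_but_off, true_sales))     # bias ~ -20, spread ~ 0.15
print(bias_and_spread(accurate_but_noisy, true_sales))  # bias ~ 0, spread ~ 21.5
```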
And given the data stream web analytics folks are working with – among the dirtiest data around in terms of accuracy – why would people spend so much time trying to build an “accurate” model? Better to be precise – always using first campaign or last campaign – than to create the illusion of accuracy with an allocation model that is largely made up out of thin air.
When I make the statement above, I’m excluding the case where a team of Ph.D.-level statisticians, with the best tools and properly scrubbed data, is developing the model, though I suspect only a handful of the companies doing these models actually fit that description. For the vast majority of companies, the principle of Occam’s Razor rules here; what I want is reliability and stability: every time I do X, I get Y – even if I don’t know exactly (accurately) how I get Y from X.
Ask yourself if that level of accuracy really matters – if every time I put in $1 I get back $3, over and over, does it really matter exactly how that happens?
Whether to use first or last campaign is a matter of philosophy / culture and not one of measurement. If you believe that, in general, the visitor / customer mindset is created by exposure to or interaction with the first campaign, and that without this favorable context none of the subsequent campaigns would be very effective, then use first campaign.
This is generally my view and the view of many offline direct marketing folks I know. Here is why. The real “leverage” in acquisition campaigns is the first campaign – the first campaign has the hardest job, if you will – so if you are going to optimize, the biggest bang for the buck is in optimizing first campaign, where if you get it wrong, all the rest of the campaigns are negatively affected. This is the “leverage” part of the idea; on any campaign other than first, you can’t make a statement like this. So it follows that every campaign should be optimized as “first campaign”, since you don’t normally control which campaign will be seen first.
Some believe that the sale or visit would not have occurred if the last campaign was not effective, and all other campaigns are just “prep” for that campaign to be successful. Perhaps true, but it doesn’t fit my model of the world – unless you know that first campaign sucks. If you know that, then why wouldn’t you fix it or kill it, for heaven’s sake?
All of the above said, if you have the chops, the data, and the tools, you can produce attribution models that will provide direction on “weighting” the effect of different campaigns. These “marketing mix” models are used all the time offline, and are usually the product of high-level statistical modeling. By the way, they’re not generally accurate, but they are precise. I do X, I get Y.
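For a flavor of the regression idea underneath these models, here is a deliberately toy sketch – every channel name and number is invented, and real marketing-mix models layer in ad-stock / carryover, seasonality, diminishing returns, and plenty more:

```python
import numpy as np

# Weekly sales regressed on spend by channel; channel names and every
# number here are invented purely for illustration.
spend = np.array([
    # catalog, ppc, banner spend per week ($000s, hypothetical)
    [10, 5, 2], [12, 5, 2], [8, 6, 3], [15, 4, 1],
    [9, 7, 2], [11, 6, 3], [14, 5, 2], [10, 6, 2],
], dtype=float)
sales = np.array([52, 58, 47, 64, 51, 56, 63, 53], dtype=float)

X = np.column_stack([np.ones(len(sales)), spend])   # intercept + channel spend
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)    # ordinary least squares
baseline, weights = coef[0], coef[1:]
print("baseline weekly demand:", round(baseline, 1))
print("estimated sales per unit of spend, by channel:", np.round(weights, 2))
```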
You can produce a similar kind of information through very tight testing using control groups, but that’s not much help for acquisition because you usually can’t get your hands on a good control group. So for acquisition you are left with trying to sync time periods and doing sequential or layered testing.
For example, in June we are going to kill all the AdSense advertising and see what happens to our AdWords advertising – what happens to impressions, CTR, conversion, etc. Then in July we will kick AdSense on again and see what happens to the same variables, along with tracking as best we can any overlapping exposures.
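Here is a minimal sketch of how you might summarize that OFF / ON comparison; the impression, click, and order counts are hypothetical:

```python
def period_metrics(impressions, clicks, orders):
    """Summarize one test period for the AdWords campaigns being watched."""
    return {
        "impressions": impressions,
        "ctr": clicks / impressions,
        "conversion": orders / clicks,
    }

def spread(off_period, on_period):
    """Difference in each metric between the AdSense-ON and AdSense-OFF periods;
    a healthy, repeatable spread is the directional signal you are after."""
    return {k: on_period[k] - off_period[k] for k in off_period}

june_off = period_metrics(impressions=100_000, clicks=2_000, orders=80)  # AdSense off
july_on  = period_metrics(impressions=115_000, clicks=2_500, orders=95)  # AdSense back on
print(spread(june_off, july_on))
```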
Then given this info, we decide about allocation using the human brain and database marketing experience.
This approach is not accurate, but I’d rather be precise and “directionally right” than accurate and absolutely wrong, if you know what I mean. This test approach should give you directional results if executed correctly – the spread between the AdSense OFF / ON test results should be healthy, and you should be able to repeat the test result with some consistency.
Bottom line – it doesn’t really matter exactly what is happening, does it? Do you need an accurate accounting of the individual effects of each campaign in a multiple campaign sequence? No. What you need is a precise (reliable and reproducible) way to understand the final outcome of the marketing mix.
Even if you think you have an accurate accounting of the various campaign contributions, what makes you think you can get that from data as dirty as web data? Despite the attempt at accuracy, all you have to do is think through cookies, multiple computers, systems issues, and web architecture itself to understand that after all that work, you still don’t have an accurate result.
Hopefully it is at least more precise than simply using first campaign.
Thoughts from you on this topic? I know there are at least two “marketing mix” folks on the feed…