Marketing Attribution Models

Interesting article in MultiChannel Merchant about sourcing sales across catalog and online using fractional allocation models.  I’m pretty sure “allocation” and “attribution” are really different concepts, though they seem to be used interchangeably right now.  Let’s just say from reading the article allocation sounds more like a gut feel thing and attribution, from my experience, implies the use of a mathematical model of some kind.

I know Kevin rails against a lot of the so-called matchback analysis done in catalog and I have to agree with him; that practice is a whole lot more like allocation than attribution in my book, particularly when it is pretty easy to measure the real source from a lift in demand perspective by using control groups.  Take a random sample of the catalog target group, exclude it from the mailing, and compare the purchase behavior in this group with those customers who get the catalog over some time period.  That should give you an idea of what the incremental (not cannibalized) demand from catalog is – just look at gross sales per customer.  We did this at HSN for every promotion, since the TV was “always on” and creating demand by itself.
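To make the control group arithmetic concrete, here is a minimal sketch in Python.  The sales figures and the function name are made up for illustration; the only idea taken from the paragraph above is comparing gross sales per customer between the mailed group and a randomly excluded holdout group.

```python
def sales_per_customer(sales):
    """Gross sales per customer for one group."""
    return sum(sales) / len(sales)

def incremental_demand_per_customer(mailed_sales, holdout_sales):
    """Mailed-group sales per customer minus holdout-group sales per customer."""
    return sales_per_customer(mailed_sales) - sales_per_customer(holdout_sales)

# Hypothetical gross sales per customer over the same measurement window.
mailed = [0, 0, 40, 0, 120, 0, 0, 60, 0, 30]   # received the catalog
holdout = [0, 0, 0, 0, 90, 0, 0, 45, 0, 15]    # randomly excluded from the mailing

lift = incremental_demand_per_customer(mailed, holdout)
print(f"Incremental demand per customer: ${lift:.2f}")
```

The holdout group still buys, because the web site (or the TV) is creating demand by itself; only the difference between the two groups can be credited to the catalog.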

So does a web site.

Just because someone was mailed a catalog and then at some point later on ordered from a web site does not mean they ordered because they received the catalog; heck, you don’t even know for sure if they even received the catalog – as anyone who has used seeded lists knows.  And just because someone was exposed to an ad online doesn’t mean the ad had anything to do with a subsequent online order – even if you believe in view-through.

Anyway, I see lots of people doing what I would call allocation rather than attribution in the web analytics space, and when Jacques Warren asked me about this topic the other day, I decided it might make a good post.

You have to understand this same discussion has been going on at least 25 years already in offline, so there is a ton of history in terms of best practices and real experience behind the approach many folks favor.  And there is a twist to the online version I don’t think many folks are considering.  So for what it’s worth, here’s my take…

For most folks, the simplest and most reliable way to attribute demand is to choose either first campaign or last campaign and stick to it.  The words simplest and reliable were chosen very specifically.  For the very few folks who have the right people, the right tools, and the right data, it is possible to build mathematically precise attribution models.  The word precise was also chosen specifically.   I will go into more detail on these choices below after some background.

Choosing first or last campaign for attribution is not ignoring the effects of other campaigns, but simply recognizes you cannot measure these effects accurately, and to create any “allocation model” will be an exercise in navel gazing.
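As a rough illustration of how simple a fixed rule is, here is a sketch in Python.  The order data, campaign names, and the attribute function are hypothetical; the point is only that each order's full value goes to one campaign by the same rule every time.

```python
from collections import defaultdict

def attribute(orders, rule="first"):
    """Credit each order's full sales value to one campaign, by a fixed rule.

    orders: list of (touches, sales) pairs, with touches in time order.
    rule:   "first" or "last".
    """
    if rule not in ("first", "last"):
        raise ValueError("rule must be 'first' or 'last'")
    credit = defaultdict(float)
    for touches, sales in orders:
        if not touches:
            # No campaign exposure on record for this order.
            credit["(no campaign)"] += sales
            continue
        campaign = touches[0] if rule == "first" else touches[-1]
        credit[campaign] += sales
    return dict(credit)

# Hypothetical orders: campaign exposures in time order, then sales value.
orders = [
    (["ppc", "organic", "banner"], 90.0),
    (["banner", "ppc"], 60.0),
    (["organic"], 30.0),
]

first = attribute(orders, "first")
last = attribute(orders, "last")
print(first)
print(last)
```

Either rule gives the same answer every time it is run on the same data, and total credit always equals total sales, which is more than can be said for most hand-built allocation schemes.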

Unfortunately, a lot of this kind of thing goes on in web analytics – instead of admitting something can’t be measured accurately, folks substitute a “model” which is worse than admitting the accuracy problem, because now you are saying you have a “measurement” when you don’t.  People sit around with a web analytics report, and say, “Well, the visitor saw the PPC ad, then they did an organic search, then they saw a banner, so we will give 1/3 of the sales credit to each” or worse, “we will allocate the credit for sales based on what we spend on each exposure”.

This approach is worse than having no model at all, because I often see these models used improperly, for example to “justify budget” – if you allocate a share of responsibility for outcome to PPC, then you get to keep a budget that would otherwise be “optimized” away.  A similar argument is being made by a few of the folks in the MultiChannel Merchant article above to justify catalog spend.

This is nuts, in my opinion.

I believe the core analytical culture problem at work here (if you are interested) is this:

Difference between Accuracy and Precision

I’d argue that given a choice, it’s more important to be precise than accurate – reproducibility is more important (especially to management) than getting the exact number right.  Reproducibility is, after all, at the core of the scientific testing method, isn’t it?  If you can’t repeat the test and get the same results, you don’t have a valid hypothesis.
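If the distinction seems abstract, a small sketch in Python may help.  All the numbers are invented, and the “true value” is something you would never actually know in practice; it is only here so the two ideas can be separated.  Accuracy is how far the average reading sits from the truth (bias), and precision is how much repeated readings scatter (spread).

```python
from statistics import mean, pstdev

TRUE_VALUE = 100.0  # hypothetical "real" figure, unknowable in practice

# Five repeated readings from each method (made-up numbers).
consistent_rule = [118.0, 120.0, 122.0, 119.0, 121.0]  # off target, but repeatable
ad_hoc_model = [60.0, 140.0, 95.0, 130.0, 75.0]        # right on average, but erratic

def summarize(readings):
    """Return (bias, spread): distance of the mean from truth, and scatter."""
    bias = mean(readings) - TRUE_VALUE
    spread = pstdev(readings)
    return bias, spread

rule_bias, rule_spread = summarize(consistent_rule)
model_bias, model_spread = summarize(ad_hoc_model)

print(f"consistent rule: bias {rule_bias:+.1f}, spread {rule_spread:.1f}")
print(f"ad hoc model:    bias {model_bias:+.1f}, spread {model_spread:.1f}")
```

The consistent rule is wrong by about the same amount every time, so you can still manage against it.  The erratic one averages out to the truth, but no single reading can be trusted.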

And given the data stream web analytics folks are working with – among the dirtiest data around in terms of accuracy – then why would people spend so much time trying to build an “accurate” model?  Better to be precise – always using first campaign or last campaign – than to create the illusion of accuracy with an allocation model that is largely made up from thin air.

When I make the statement above, I’m excluding a team of Ph.D. level statisticians with the best tools and data scrubbing developing the model, though I suspect only a handful of companies doing these models actually fit that description.  For the vast majority of companies, the principle of Occam’s Razor rules here; what I want is reliability and stability; every time I do X, I get Y – even if I don’t know exactly (accurately) how I get Y from X.

Ask yourself if that level of accuracy really matters – if every time I put in $1 I get back $3, over and over, does it matter specifically and totally accurately exactly how that happens?

Whether to use first or last campaign is a matter of philosophy / culture and not one of measurement.  If you believe that in general, the visitor / customer mindset is created by exposure to or interaction with the first campaign, and that without this favorable context none of the subsequent campaigns would be very effective, then use first campaign.

This is generally my view and the view of many offline direct marketing folks I know.  Here is why.  The real “leverage” in acquisition campaigns is the first campaign – the first campaign has the hardest job, if you will – so if you are going to optimize, the biggest bang for the buck is in optimizing first campaign, where if you get it wrong, all the rest of the campaigns are negatively affected.  This is the “leverage” part of the idea; on any campaign other than first, you can’t make a statement like this.  So it follows that every campaign should be optimized as “first campaign”, since you don’t normally control which campaign will be seen first.

Some believe that the sale or visit would not have occurred if the last campaign was not effective, and all other campaigns are just “prep” for that campaign to be successful.  Perhaps true, but it doesn’t fit my model of the world – unless you know that first campaign sucks.  If you know that, then why wouldn’t you fix it or kill it, for heaven’s sake?

All of the above said, if you have the chops, the data, and the tools, you can produce attribution models that will provide direction on “weighting” the effect of different campaigns.  These “marketing mix” models are used all the time offline, and are usually the product of high level statistical models.   By the way, they’re not generally accurate, but they are precise.  I do X, I get Y.

You can produce a similar kind of information through very tight testing using control groups, but that’s not much help for acquisition because you usually can’t get your hands on a good control group.  So for acquisition you are left with trying to synch time periods and doing sequential or layered testing.

For example, in June we are going to kill all the AdSense advertising and see what happens to our AdWords advertising – what happens to impressions, CTR, conversion, etc.  Then in July we will kick AdSense on again and see what happens to the same variables, along with tracking as best we can any overlapping exposures.

Then given this info, we decide about allocation using the human brain and database marketing experience.

This approach is not accurate, but I’d rather be precise and “directionally right” than accurate and absolutely wrong, if you know what I mean.  This test approach should give you directional results if executed correctly – the spread for the AdSense OFF / ON test results should be healthy, and you should be able to repeat the test result with some consistency.
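Here is a sketch of how you might read such a test, in Python.  The conversion counts are hypothetical, and the on_off_lift function is just a name for the obvious calculation; what matters is checking that the spread is healthy and that repeats point the same way.

```python
def on_off_lift(off_value, on_value):
    """Relative change in a metric when the second channel is switched back on."""
    return (on_value - off_value) / off_value

# Hypothetical AdWords conversions per month: (AdSense OFF, AdSense ON),
# with the OFF / ON test repeated three times.
tests = [(400, 500), (380, 470), (420, 520)]

lifts = [on_off_lift(off, on) for off, on in tests]

# "Directionally right": every repeat moves the same way.
same_direction = all(x > 0 for x in lifts) or all(x < 0 for x in lifts)

# Rough consistency check: gap between the largest and smallest lift.
lift_range = max(lifts) - min(lifts)

for (off, on), x in zip(tests, lifts):
    print(f"OFF {off} -> ON {on}: lift {x:+.1%}")
print("same direction every time:", same_direction)
print(f"range across repeats: {lift_range:.1%}")
```

This says nothing about exactly why the lift happens, and the months are not perfectly comparable, so treat the output as direction for a human decision rather than a measurement.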

Bottom line – it doesn’t really matter exactly what is happening, does it?  Do you need an accurate accounting of the individual effects of each campaign in a multiple campaign sequence?  No.  What you need is a precise (reliable and reproducible) way to understand the final outcome of the marketing mix.

Even if you think you can build an accurate accounting of the various campaign contributions, what makes you think you can get there with data as dirty as web data?  Despite the attempt at accuracy, all you have to do is think through cookies, multiple computers, systems issues, and web architecture itself to understand that after all that work, you still don’t have an accurate result.

Hopefully it is more precise than simply using first campaign.

Thoughts from you on this topic?  I know there are at least two “marketing mix” folks on the feed…

12 thoughts on “Marketing Attribution Models”

  1. :)

    I had a CEO recently call me about this very topic. He understood what you are writing about, he was looking for a reliable and reproducible way to understand the final outcome of the marketing mix. It was odd to hear a CEO

    And you’re right … this topic has been floated around since the 70s, in retail. Vendors looking to maintain their share of business, and catalogers hoping to validate the effectiveness of paper are jumping on the fractional allocation bandwagon.

  2. Hi Jim,

    I’m a marketing mix analysis guy (without a PhD it has to be said) and I think what we’re really looking at here is a case of risk reduction.

    Is it more risky to attribute sales using some arbitrary system or better to use what clean data there is to create a precise “model” from the data. I’d say the latter with a caveat which is that models need to be validated and the measurement error acknowledged.

    The good news is that this is possible whereas with attribution it will never be possible. However the bad news is that if “measurements” are unreliable then it takes a very logical mind to acknowledge the potential for forecast error.

    One final point on attribution – I’ve rarely seen it done where three teams get together and give each other a third of the sales for three activities. Usually, they each try to claim as big a chunk as possible with the result that more sales are claimed than exist on the books.

  3. Marketing Mix Modeling involves breaking up of sales volume into various components, and analyzing spend on each of them to calculate ROI from each of these components. With so many marketing channels in hand, particularly on web, it becomes too complex to analyze each channel. On web, one can extract massive amount of unstructured data, but then to extract meaningful information, you need to deploy analytical data models. Using predictive analytics, analysts can forecast sales through each of these channels and optimize the marketing spends to gain maximum value.

    In the last 10 years many Fortune 500 companies such as P&G, Kraft, Coca-Cola and Pepsi have made Marketing Mix Modeling an integral part of their marketing planning.

  4. It seems to me that most of the discussion on this blog is focused on “silos”…web analytics, sales force automation, crm analytics, etc, which is all good…..but only marketing mix modeling addresses the “whole MROI enchilada”. Sure, there are some limits to how many variables you can put in a model, but you would be surprised how detailed it can be.
    It can address offline marketing, online marketing, direct marketing, promotions, etc, all in one swoop. These silo tools are great for drilling down into the detail that mix modeling doesn’t do, but none of these tools address the whole marketing investment enchilada.

    The problem with mix modeling now is that most of it is in CPG, but many other businesses can’t do it, and don’t get it. When they do, look out!

  5. Perhaps the CPG folks are so into it because that’s all they can really do?

    When you have actual customer behavior data, and are acting on individuals and not “the masses”, the need for mix modeling is substantially reduced, IMHO. Goes hand in hand with whether you are a big spender on untrackable media. If so, then have at it!

  6. Hi Jim, Brilliant post – and lovely to hear common sense applied to the subject.

    It seems to me that so many people are asking how we should model attribution effectively, few – other than your good self – are asking whether we should try.

    As you so rightly say, just because a particular ad shows up in a user’s path to conversion, that doesn’t mean that ad had a positive effect, negative effect or no effect whatsoever – particularly given the cookie-flooding practices so common and the general issue of, for example, banner blindness.

    However, with all that said, your last comment – about the ability to follow just one user’s path to conversion – points us to what makes online tracking technologies (like our own brilliant TagMan) so interesting. Online we do not need to plan against a defined (and fictitious) ‘audience’ – we can take an individual conversion and work backwards. Do this enough times and common sense tells us what’s happening.

    Then we can test our theories and THEN we can make decisions of allocation and – if you so choose – attribution, though in my view allocation is just the ultimate form of attribution since, if I don’t allocate you any spend, you stand no chance of being attributed with any sales.

    Like I say, great post.


    Philip Buxton
    Marketing director

  7. I still think allocation is a sucker’s bet – you are allowing people to choose winners rather than the math. The result is “justification”, as opposed to analysis, and while some might say this result is better than nothing, I’m not sure the kinds of distortions that can occur make this a true statement.

  8. Comment on Allocation vs attribution.
    “I’m pretty sure “allocation” and “attribution” are really different concepts, though they seem to be used interchangeably right now. Let’s just say from reading the article allocation sounds more like a gut feel thing and attribution, from my experience, implies the use of a mathematical model of some kind. ”

    From my experience in the industry for the last 11yrs Allocation is the Model and attribution is the real-time application of credit for the action to the session or series of events during a period of time defined by the allocation model.

    I apply a global allocation model to my system and attribute the credit value as it happens to the event. Allocation: most recent / expires 30 days / attribute credit to the event within my 30-day period

    Just my two cents — but the assumption is correct in that not many people get this and not many use the terms correctly.

  9. Great comments to original post on an old problem which has recently arisen from the internet display camp.

    Same old problem, not of measurement vs practicality: Turf protection and the budget $$$ associated with silo’d channels. Multi-channel campaign measurement & allocation has traditionally been a touchy issue in any environment (advertiser or vendor space) where competing spend tactics all desire to lay claim to the revenue recognition. So, instead of a realistic value proposition to attempt to accurately associate comparative value, it usually results in a zero sum game to trump the other guy’s revenue momentum.

    Yesterday, it was TV & direct mail. Now it’s display & paid search. Same arguments in both cases.

    The real business question you adeptly imply in your post, “why spend limited capital & scarce resources attempting to optimize an illusive & deceptively futile measurement problem?”

    Doesn’t it make far better practical sense, to put that same level of energy into simply improving each of those separate silos to their maximum potential – especially for large branded advertisers who have fragmented channel responsibility themselves?

  10. DSTM, have to agree. One way to get at some level of mix modeling without needing a whole lot of tools / experience is simply to kill all the media except one, optimize that, then add the next, optimize that mix, etc. Typically I would want to start at the “bottom” of the AIDAS funnel, where folks are closest to Action, then add higher levels all the way up to Awareness. Online, this probably means you start with Search, then add BT Overlay, then add straight Display, then add Offline, looking for incremental performance with each additional layer. This is essentially what I call the Marketing Bands Model.

    Often I hear people say this can’t be done because of media silos etc. but there must be *someone* in charge of all Marketing efforts who can make this happen. Just takes some substantial guts.

  11. Hi Mr. Jim Novo, I like the discussions here; so thought I will provide 2.1(!) cents view.

    Regarding “accuracy or precision” – the Wikipedia article makes a big issue of lot of terminologies; accuracy is simply % of accurately classified obs (the tabulation they give is for binary classification). So total predicted in the diagonal of the “confusion matrix” / total sample.

    The question of trade off between accuracy and precision is legendary; however the graphical example seems to give the impression all problems are continuous targets; if it is a question of classification, where the decision is whether one hit the target or not, then the graphics does not do justice to the problem.

    I like to kindle thoughts; this is not a retort! Great group…keep going.

    I have some comments on attribution analysis; comes later.
