Category Archives: Analytics Education

More Tips on Evaluating Research

To continue with this previous post…other things to look for when evaluating research:

Discontinuous Sample – I don’t know if there is a scientific word for this (experts, go ahead and comment if so), but what I am referring to here is the idea of setting out the parameters of a sample and then sneaking in a subset of the sample where the original parameters are no longer true.  This is extremely popular in press about research.

Example:  A statement is made at the beginning of the press release regarding the population surveyed.  Then, without blinking an eye, they start to talk about the participants, leaving you to believe the composition of participants reflects the original population.  In most cases, this is nuts, especially when you are talking about sending an e-mail to 8000 customers and 100 answer the survey. 

Sometimes it works the other way: they will slip in something like, “50% of the participants said the main focus of their business was an e-commerce site”, which does not in any way imply that 50% of the population (4000 of 8000) are in the e-commerce business.  If, on the other hand, you knew what percent of the 8000 were in the e-commerce business, you could get some feeling for whether the participant group of 100 was biased towards e-commerce or not.
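
If you do happen to know the population share, the check can be roughed out in a few lines.  This is only a sketch: the 20% population figure is made up for illustration (the press release in this example never gives one), the function name is hypothetical, and it uses a simple normal approximation rather than anything a statistician would sign off on.

```python
import math

def composition_z_score(respondents_in_segment, respondents_total, population_share):
    """How many standard errors the respondent share sits from the population share.

    Uses a normal approximation to the binomial; assumes respondents would be
    a simple random draw from the population if there were no response bias.
    """
    observed_share = respondents_in_segment / respondents_total
    std_err = math.sqrt(population_share * (1 - population_share) / respondents_total)
    return (observed_share - population_share) / std_err

# 50 of 100 respondents are e-commerce; ASSUMED 20% of the 8,000 customers are.
z = composition_z_score(50, 100, 0.20)
print(round(z, 2))  # a large value (well beyond 2 or 3) points to a skewed respondent group
```

A z-score near zero would suggest the respondents look like the list they came from on this one dimension; a large one says the 100 who answered are not a miniature version of the 8000.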

Especially in press releases, watch out for these closely-worded and often intentional sleights of hand describing the actual segments of participants.  They are often written using language that can be defended as a “misunderstanding” and often you can find the true composition of participants in the source documentation to prove your point. 

The response to your digging and questioning of the company putting out the research will likely be something like, “the press misunderstood the study”, but at least you will know what the real definitions of the segments are.

Get the Questions – if a piece of research really seems to be important to your company and you are considering purchasing it, make sure the full report contains all the research questions.

I can’t tell you how many times I have matched up the survey data with the sequencing and language of the questions and found bias built right into the survey.  Creating (and administering, for that matter) survey questions and sequencing them is a scientific endeavor all by itself.  There are known pitfalls and ways to do it correctly, and people who do research for a living understand all of this.  It’s very easy to get this part of the exercise wrong and it can fundamentally affect the survey results.

So, in summary, go ahead and “do research” by e-mailing customers or popping up questionnaires, or read about research in the press, but realize there is a whole lot more going on in statistically significant, actionable research than meets the eye, and most of the stuff you read in the press is nothing more than a Focus Group.

Not that there is anything inherently wrong with a Focus Group, as long as you realize that is what you have.

Research for Press Release

I think one of the reasons “research” has become so lax in design and execution is this idea of doing research to drive a press release and news coverage. Reliable, actionable research is expensive, and if all you really want to do is gin out a bunch of press, why be scientific about it?  Why pay for rigor?  After all, your company is not going to use the research to take action, it’s research for press release.

So here are a few less scientific but more specific ideas to keep in mind when looking at a press release / news story about the latest “research”, ranked in order of saving your time.  In other words, if you run into a problem with the research at a certain level, don’t bother to look down to the next level – you’re done with your assessment.

Press about Research is Not Research – it’s really a mistake to make any kind of important decision on research without seeing the original source documentation.  For lots of reasons, the press accounts of research output can be selectively blind to the facts of the study. 

If there is no way to access the source research document, I would simply ignore the press account of the research.  Trust me, if the subject / company really had the goods on the topic, they would make the research document available – why wouldn’t they?  Then if / when you get to the research source document, run the numbers a bit for yourself to see if they square with the press reports.  If not, you still may learn something – just not what the press report on the research was telling you!

Source of Sample – make sure you understand where the sample came from, and assess the reliability of that source.  Avoid trusting any source where survey participants are “paid to play”.  This PTP “research” is often called a Focus Group and though you can learn something in terms of language and feelings and so forth from a Focus Group, I would never make a strategic decision based on a non-scientific exercise like a Focus Group. 

Go ahead and howl about this last statement, Marketers.  I’m not going to argue the fine points of it here, but those who wish to post on this topic either way, go ahead.  Please be Less Scientific or More Specific than usual, depending on whether you are a Scientist or a Marketer. 

For a very topical and probably to some folks quite important example of this “source” problem, see Poor Study Results Drive Ad Research Foundation Initiative.  If you want a focus group, do a focus group.  But don’t refer to it as “research” in a scientific way.

Size of Sample – there certainly is a lot of discussion about sample sizes and statistical significance and so forth in web analytics now that those folks have started to enter the more advanced worlds of test design.  Does it surprise you the same holds true for research?  Shouldn’t, it’s just math (I can feel the stat folks shudder.  Take it easy, relax).

Without going all math on this, let’s say someone does a survey of their customers.  The survey was “e-mailed to 8,000 customers” and they get 100 responses to the survey.   I don’t need to calculate anything to understand the sample is probably not representative of the whole, especially given the methodology of “e-mailed our customers”.  Not that a sample of 100 on 8000 is bad, but the way it was sourced is questionable.

What you want to see is something more like “we took a random sample of our customers and 100 interviews were conducted”.  It’s the math thing again.  Responders, by definition, are a biased sample, probably more of a focus group.  This statement is not always true, but is true often enough that you want to verify the responders are representative.  Again, check the research documentation.

OK Jim, so how can political surveys be accurate when they only use 300 or so folks to represent millions of households?  The answer is simple.  They don’t email a bunch of customers or pop-up surveys on a web site.  They design and execute their research according to established scientific principles.  Stated another way, they know exactly and specifically who they are talking to.  That’s because they want the research to be precise and predictive.

How do you know when a survey has been designed and executed properly?  Typically, a confidence interval is stated, as in “results have a margin of error of ±5%”.  This generally means you can trust the design and execution of the survey because you can’t get this information without a truly scientific design (Note to self, watch for “fake confidence level info” to be included with future “research for press release” reporting).
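
For the curious, the arithmetic behind a margin of error is short.  The sketch below uses the standard textbook formula for a simple random sample, assuming a 95% confidence level (z = 1.96) and the worst-case 50/50 split; those defaults and the function name are my assumptions, not numbers from any particular study.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    Defaults assume 95% confidence (z = 1.96) and the worst case p = 0.5.
    Only meaningful if the sample really was drawn at random.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (100, 300, 1000):
    print(n, f"{margin_of_error(n):.1%}")
# 100 9.8%
# 300 5.7%
# 1000 3.1%
```

Notice the size of the population never appears in the formula (ignoring the finite population correction, which matters little when the sample is a small fraction of the whole).  That is why 300 or so properly sampled people can stand in for millions of households, and also why the number is meaningless for 100 self-selected responders: the formula assumes a random draw.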

More rules for interpreting research

Will Work for Data

But will do a sub-optimal job…

Trying to catch up on what is going on in the analytics blogosphere, and it seems like I’m seeing a common thread – we’re getting much better at analyzing customer data, but whoever is in charge of Turning Customer Data into Profits is not quite with the program yet. 

Based on my experience, and assuming the people responsible are Marketing folks, the challenge to solving this problem often lies in understanding the difference between executing against behavioral data and executing against data about “characteristics” like demographics.

Marketing is not always about buying mass media, yet most Marketing people have never had to create and execute a campaign using behavioral data against a behavioral Objective.  So they do what they have always done – they create campaigns based on characteristics – and then execute against behavioral objectives using behavioral data.

This is a recipe for sub-optimal performance.  It’s like buying a car with a high performance engine then putting the cheapest gas in it you can find and never getting a tune up.  Sure, the car will run, but it’s not going to run very well, and you sure are not going to win any races with the competition.  Provided, of course, they don’t treat their car the same way.

For example, Ron is commenting on weak segmentation practices and lack of understanding the new customer experience in banking.  He is absolutely right.  Segmenting by “number of products” is often a static characteristic; segmenting by “change in number of products” is behavioral and many times more profitable.  As for new customer experience, the initial experience defines a customer’s “view” of the company and I don’t think I have to explain the importance of that.
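
To see the difference between the two kinds of segment, here is a tiny sketch with made-up data.  The customer IDs, product counts, and segment labels are all hypothetical; the only point is that two customers with the same static count can land in very different behavioral segments.

```python
# Hypothetical product counts per customer at two points in time.
products_last_year = {"A": 3, "B": 3, "C": 1}
products_now = {"A": 3, "B": 1, "C": 3}

def change_segment(before, after):
    """Label a customer by the direction of change in number of products."""
    if after > before:
        return "growing"
    if after < before:
        return "shrinking"
    return "stable"

segments = {
    customer: change_segment(products_last_year[customer], products_now[customer])
    for customer in products_now
}
print(segments)  # {'A': 'stable', 'B': 'shrinking', 'C': 'growing'}
```

Customers A and C both hold 3 products today, so a static “number of products” segment treats them the same.  The change-based view separates the one who is standing still from the one who is expanding the relationship, and flags B as the one drifting away.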

Kevin is bemoaning the lack of temporal segmentation and use of appropriate creative for this segmentation by many e-mail folks.  He is absolutely right.  You want to speak to the customer based on their level of engagement with the company, not in terms of static perceptions.

Avinash perceives a problem coming down the road with behavioral targeting, that is, while the machine is smart, the results are only as good as the content you feed the engine.  Absolutely right.  If you run campaigns designed around static demographics on a behavioral platform you have created a way to “efficiently target crap to your customers”.

Is anybody listening?  If the message is not clear, try this:

Most Marketers are looking to drive “behavior” of some kind – even the Brand folks, who simply have a longer time horizon.  If behavior is the outcome you want, the campaigns must be created around “when”, “what”, and “why”, not “who”.  “When”, “what”, and “why” are behavioral ideas, “who” is a static characteristic (like a demographic) that probably has nothing to do with past or future behavior.

I know, you have probably been told segmenting by demographics is the way to go, or read so somewhere.  Was the source talking about buying media or data-driven marketing?

Sure, if you don’t have any behavior – when buying TV for example – then you go with what you can get.  Some segmentation is always better than none at all.  But if you have behavior, then using demographics to drive campaign segmentation is going to be sub-optimal.

Static characteristics like age and income do not predict behavior.  Behavior is in motion; it changes over time.  You can’t take a static characteristic and expect it to do a very good job predicting behavior because behavior changes over time.  Behavior predicts behavior.

The fact I am a 48 year old male predicts nothing about my behavior.  These characteristics are simply a proxy for buying media against me more efficiently; they really mean nothing when you cross the line into using data sets with actual behavior in them.  The fact I stopped visiting / posting / purchasing or that I am in the top 10% for writing reviews is much more powerful.

When addressing behavioral segments, first ask When?  When did I stop visiting / posting / purchasing?  Over what time period am I in the top 10%?  Am I still in the top 10%?
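
As a rough illustration of asking When?, the sketch below flags customers who have stopped purchasing and checks whether any of them are also top reviewers.  Everything in it is hypothetical: the records, the fixed reference date, and the 90-day cutoff for “stopped” are placeholders you would replace with your own data and business rules.

```python
from datetime import date

TODAY = date(2024, 6, 30)   # fixed reference date so the example is reproducible
LAPSED_AFTER_DAYS = 90      # assumed cutoff for "stopped purchasing"

# Hypothetical records: (customer, last purchase date, reviews written in the period)
activity = [
    ("c01", date(2024, 6, 25), 14),
    ("c02", date(2024, 1, 10), 40),
    ("c03", date(2024, 6, 1), 2),
    ("c04", date(2024, 5, 20), 0),
    ("c05", date(2023, 12, 5), 1),
    ("c06", date(2024, 6, 28), 5),
    ("c07", date(2024, 4, 15), 3),
    ("c08", date(2024, 6, 10), 0),
    ("c09", date(2024, 2, 2), 7),
    ("c10", date(2024, 6, 29), 9),
]

# When did they stop?  Anyone with no purchase inside the cutoff window.
lapsed = {
    customer
    for customer, last_purchase, _ in activity
    if (TODAY - last_purchase).days > LAPSED_AFTER_DAYS
}

# Who is in the top 10% for writing reviews over this period?
ranked = sorted(activity, key=lambda record: record[2], reverse=True)
top_count = max(1, len(ranked) // 10)
top_reviewers = {customer for customer, _, _ in ranked[:top_count]}

print(sorted(lapsed))                  # ['c02', 'c05', 'c09']
print(sorted(top_reviewers & lapsed))  # ['c02']
```

The overlap is the interesting part: c02 is the most prolific reviewer in the period and has also stopped purchasing, which is exactly the kind of customer worth asking What? and Why? about next.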

Then ask, What?  What events led up to my behavior?  What campaign did I come in from, salesperson did I talk to, products did I buy, areas of the site did I visit?  What has happened to me?

Then, understanding my experience, ask Why am I behaving like I am? Then knowing Why (or more likely, making an educated guess), can you think of a message that is going to change my behavior?

Now you are ready to design and execute a campaign that will blow the socks off of anything you can do by knowing I am a 48 year old male, because you can directly address me with a message that is more relevant to me.

Marketers, please take the time to think about “when”, “what”, and “why” in campaign design and execution if using behavioral data, and forget about “who”.  You will be glad you did.

Analysts, have you ever run into this problem?  Rich evidence of a behavioral “edge” you might have that is ignored in the creation and execution of the campaign?

P.S.  The glad you did link above shows what you can learn by looking at behavioral segments as opposed to demographics.  All the folks in this test are in the same demographic segment, with a 10% overall response rate to a 20% discount offer – better response than any other demographic segment.  But they sure had different levels of profitability, based on behavior. The more engaged they were – as measured by time since last purchase – the less profitable they were for this campaign.  And you can predict this result, because it will happen every time you use the same behavioral segmentation and offer, with slight variations possible across demographic segments.