Several questions came in on the ability of surveys to predict actual behavior, covered in the post Measuring the $$ Value of Customer Experience (see 2. Data with Surveys). My advice is this: if you are interested in taking action on survey results, make sure to survey specific visitors / people with known behavior if possible, then track subjects over time to see if there is a linkage between survey response and actual behavior. You should do this at least the first time out for any new type of survey you launch.
Why? Many times, you will find segments don’t behave the way they say they will. In fact, I have seen quite a few cases where people do the opposite of what the survey implied. This happens particularly frequently with best customers – the specific people you most want to please with modifications to product or process. So this is important stuff.
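To make the “track subjects over time” advice concrete, here is a minimal, entirely hypothetical sketch. It assumes you can join survey responses to later observed behavior by a customer ID; all the data, segment names, and field names below are invented for illustration, not taken from any real survey:

```python
# Hypothetical sketch: tie survey responses to observed behavior over time.
# All customer IDs, segments, and retention flags below are fabricated.

from collections import defaultdict

# Survey responses keyed by customer ID ("promoter" = said they'd stay / recommend)
survey = {"c1": "promoter", "c2": "promoter", "c3": "detractor", "c4": "detractor"}

# Actual retention observed some months later (True = customer still active)
retained = {"c1": True, "c2": False, "c3": True, "c4": False}

def retention_by_segment(survey, retained):
    """Compare stated intent (survey segment) with actual retention rate."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [retained, total]
    for cust, segment in survey.items():
        counts[segment][1] += 1
        if retained.get(cust, False):
            counts[segment][0] += 1
    return {seg: kept / total for seg, (kept, total) in counts.items()}

rates = retention_by_segment(survey, retained)
print(rates)  # in this fake data, promoters and detractors retain at the same rate
```

In this toy data the “promoters” retain no better than the “detractors” – exactly the say-versus-do gap described above. With real data you would, of course, want proper sample sizes and a significance test before drawing conclusions.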
You’ve Got Data!
Turns out there’s a new academic (meaning no ax to grind) research study out addressing this area, and it’s especially interesting because the topic of study is the ability of customer feedback metrics to predict customer retention. You know, Net Promoter Score, Customer Effort Score and so forth, as well as standard customer satisfaction measures like top-2-box.
The authors find the ability of any one of these metrics to predict customer retention varies dramatically by industry. In other words, you might want to verify the approach / metric you are using by tying survey response to actual retention behavior over time.
Now, understand social science research is rarely perfect. We’re not talking about chemistry here in terms of measuring outcomes, and culture can affect behavior. But when you see data like this over and over under many different circumstances, it’s prudent to at least think critically about whether your Cust Sat, NPS or CES score means what you think it means. Following logically, you might consider if the customer experience work you are doing is as effective or (gasp) as profitable as you think it is.
Please note before reading the following: I am not saying these study results apply to your business, or that you should abandon customer feedback research. What I am saying (repeating, actually) is when you do survey work, you really should confirm “what they say” is “what they do” over time if you want to use the survey results to make important decisions. This study doesn’t answer all the questions surrounding this topic, but it does provide a third-party, scientific view in step with my own experience-based opinions.
If you’re an analyst, you really should read the study, if for no other reason than to get exposed to an authentic science-based consumer research project. Tip: You don’t have to understand the math; it’s peer reviewed, so the math is solid. Shortcut: read the Introduction, Methodology (to learn thought process / about proper techniques), and Summary. Hat tip to Ron Shevlin for the source; if you are in banking be sure to read his review here.
The research team looked at the ability of 5 different customer feedback metrics to predict retention across 18 different industries, so the work does not suffer from the “unique circumstance bias” so often seen in online feedback research. Plus, this makes it tough to argue about possible industry bias in the outcome. Here’s my summary of significant ideas in this work:
1. The ability of these different feedback metrics to predict customer retention varies a LOT by industry, e.g. a method that works well for airlines may not work well for online shopping. Makes sense, yes? Just think about the difference in what’s important to you when shopping at a supermarket versus deciding whether to keep your phone carrier.
2. Performance of the same metric can be different depending on whether “retention” means company level or customer level. In other words, the customer experience doesn’t always “roll up” to the company experience. This is a less intuitive idea than #1, and best understood by reading the study itself. If I had to summarize, the competitive landscape of the industry has a lot of influence on this effect; this also seems quite logical to me.
3. This comment is not so much about the study itself but the practical application of statistical significance. In this study, the gap between any one feedback metric being “statistically significant” and being “the best” is, um… quite significant.
For example, at the customer level, the best performing top 2-box metric was statistically significant in 10 of 18 industries, but best performing in only 4 of 18 industries (online booking, online shopping, drugstores and banks). The CES metric was significant in only 1 industry (banking) and best performing in none.
I’ve made this point in relation to highly automated A/B testing before, where often we hear cries of victory based on statistical significance. Yes, the outcome is significant. No, this does not mean the chosen combination is the best you can do; significant does not mean best, even among versions you are testing! See local maximum, digital-centric here.
Said another way, just because your NPS score is a statistically significant predictor of retention does not mean it’s the best metric for you, as can be seen clearly in the study tables (see pg 29 – 30).
Most academic studies end with a list of areas the authors think should be studied further as a result of their work, and this one is no different. To assist with pushing the research direction forward, the authors often toss out “wish we had this” ideas and point out how the study could be made even better. Pretty refreshing when compared to the “research” coming out of many agencies and vendors, which is always positioned as the last word on the topic and very rarely provides complete documentation on how the study was done.
You can (and many will) nitpick this type of academic study to death, but before you do, remember this: the people doing the study gave you the ammo to do the nitpicking up front. When was the last time an agency or vendor did the same with their research?
Questions on this study? Have you seen these kinds of disparities in your own work when using customer feedback to drive process improvement efforts? Ever made exactly the changes suggested by surveys only to experience zero (or negative) effects on retention?