Category Archives: Analytics Education

KFIs: Key Forecast Indicators

As I said in my presentation at the eMetrics / Marketing Optimization Summit, if you want to get C-Level people to start paying attention to web analytics, you have to get into the business of predicting / forecasting.  Let’s face it, KPIs are about the past, right?  You don’t know “Performance” until it has already happened.

But C-folks don’t really care much about what has already happened, because they can’t do anything about it.  What they really want to know is what you think will happen.  For example, ideas like “sales pipeline” – a forecast.  If you start forecasting – and you are right – you will get attention from the C-folks pronto.  The web is a great forecasting tool because it’s so frictionless; it tends to provide tangible signals before many other parts of the business.

So: Do you have any KFIs – Key Forecast Indicators?

I have one for the Lab Store, and it tripped about 2 months ago.  It’s the Unwanted Exotic Index (UEI).

As part of the Lab Store, we run a moderated board where people who want to give up exotic pets can post the availability, and people looking for exotic pets can post requests.  Typically, the ratio of people giving them up to people wanting them is about 0.25 – for every post looking to give an exotic up, there are 4 posts looking to adopt.

A couple of months ago, this ratio started popping higher.  A couple of weeks ago it hit 1.25 – for every 5 posts looking to give up an exotic, there were 4 posts looking to adopt.  The last time something like this happened was prior to the mini-recession of 2004, when the Unwanted Exotic Index tagged 1.0 for a short time.  After that, our sales got soft about 2 – 3 months later.
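The UEI itself is trivial to compute – it’s just a ratio of post counts per period.  A minimal sketch (the post counts are hypothetical, chosen to match the numbers above):

```python
def uei(give_up_posts, adopt_posts):
    """Unwanted Exotic Index: 'give up' posts divided by 'adopt' posts."""
    return give_up_posts / adopt_posts

baseline = uei(1, 4)   # 0.25 - the typical state of the board
current = uei(5, 4)    # 1.25 - the level that tripped the indicator
print(current > baseline)  # True: supply of unwanted exotics outrunning demand
```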

Why is the UEI predictive?  Let’s go through the logic – my logic, anyway!

Keeping certain types of exotic animals can be a strain on a family, both from a time and money perspective.  They can be high maintenance.  On the margin, as the economy gets tougher and people look to manage household budgets, these pets can get some scrutiny – particularly if kids have lost interest or gone off to college.  So more go up for adoption.  At the same time, requests to adopt fall, as families who might have considered an exotic pet put the “owning decision” on hold.  Taken together, these decisions cause the UEI to spike higher.  Both giving up and deciding not to own exotic pets affect Lab Store revenues “expected” in the future.  So the UEI ends up being predictive of future demand.

Makes sense to me.

Now, I’m a pretty good student of macroeconomics and pay attention to many economic indicators, especially predictive ones like the ECRI’s US Weekly Leading Index.  If you’re an analyst, you should too; economic indicators provide context for any analysis you might have to do, and clients often want to understand the impact of these external issues on their business.

As far as the Lab Store specifically, I don’t usually pay much attention to the macroeconomic cycles.  The pet business tends to be insensitive to the economic cycle; people don’t stop caring for pets as the economy wobbles up and down.  That’s why it’s such a good business – if you can find a niche.  So I don’t get too concerned when I see these predictive macroeconomic indexes forecasting a slowing economy.

However, what we have here with our Unwanted Exotic Index is a confirmation of the broader economic forecasting tools that is specific to our exotic pet business.  That makes me sit up and take notice!  Looks like our business is setting up for a repeat of the 2004 slowdown – the last time the UEI spiked like this.  Why is this important?  Because I can do something with this knowledge.  I can re-allocate and re-prioritize based on this knowledge.  For example, I can move from a “grow bigger” to a “grow smarter” mode.

And please note: this KFI has nothing to do with traffic or sales on the web site; traffic and sales are “rear view”.  By the time you see the sales slow down it will be too late to do anything about it.  And that’s why the C-folks don’t care much about web analytics reports.  

You could track an index like the UEI with a web analytics tool, but you’d have to come up with the idea first.  My point is you will probably have to look outside the usual “rear view” metrics to find one with forecasting ability.  I caution you not to substitute a “survey” for a predictive model; people’s opinions are a notoriously lagging indicator.  You’ll be up to your ears in the slowdown before people start turning bearish.

So: Do you have any KFIs – Key Forecast Indicators?  Tell us about them.

If you don’t have any KFIs, now is the time to start looking for them.  What can you see now that predicts what will happen in the future?  Think about the business, think about the data sources, and put together a bunch of different ideas.  Track them back a couple of years and post them monthly going forward.  You’re bound to find something predictive.  Perhaps something about posting, like the UEI – recommendations / comments as a percent of visitors, or something like that.

If you’re stuck, start with a simple “engagement” idea – the percent of visitors / members / customers who visited / logged in / bought in the past 90 days.  If this percentage is falling, so will your business over the next 3 – 6 months.  If your business has a lot of seasonality in it, look to year-over-year comps of the same metric.
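A sketch of that engagement KFI, assuming you have each customer’s last activity date (the function and data below are mine, not from any particular analytics tool):

```python
from datetime import date, timedelta

def pct_active(last_activity_dates, as_of, window_days=90):
    """Share of customers who visited / logged in / bought in the window."""
    cutoff = as_of - timedelta(days=window_days)
    return sum(1 for d in last_activity_dates if d >= cutoff) / len(last_activity_dates)

# Hypothetical data: track this monthly, and compare year-over-year
# if the business is seasonal.
dates = [date(2007, 9, 15), date(2007, 8, 20), date(2007, 5, 1), date(2006, 12, 1)]
print(pct_active(dates, as_of=date(2007, 10, 1)))  # 0.5
```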

If you’ve never played this game before, you won’t have proof your KFIs work until after the business is in the soup, but you’ll be ready with accurate and actionable KFIs the next time around!


What’s the Frequency?

The following is from the October 2007 Drilling Down Newsletter.  Got a question about Customer Measurement, Management, Valuation, Retention, Loyalty, Defection?  Just ask your question.  Also, feel free to leave a comment. 

Want to see the answers to previous questions?  The pre-blog newsletter archives are here, “Best Article” reviews here.

Q:  I ordered your book and have been looking at it as I have a client who wants me to do some RFM reporting for them.

A:  Well, thanks for that!

Q:  They are an online shoe shop that also sends out catalogues via the mail at present.  They have order history going back to 2005 for clients and believe that by doing an RFM analysis they can work out which customers are dead and should be dropped, etc.  I understand Recency and have done this.

A:  OK, that’s a great start…

Q:  But on frequency there appears to be lots of conflicting information – one book I read says you should do it over a time period as an average and others do it over the entire lifecycle of a client.

A:  You can do it either way; the ultimate answer, of course, is to test both ways and see which works better for this client.

Q:  Based on the client base and the fact that the catalogues are seasonal, my client reckons a customer may make a purchase decision every 6 months.  My client is concerned that if I go by total purchases, someone who was really buying lots, say, two years ago but now buys nothing could appear high up in the Frequency ranking compared to a newer buyer who has bought a few pairs – who would actually be a better client, as they’re more Recent.  Do I make sense or am I totally wrong?

A:  You absolutely make sense.  If you are scoring with RFM, though, since the “R” comes first, in the case above the “newer buyer who has bought a few pairs” customer will get a higher score than the “buying lots say two years ago but now buys nothing” customer.

So in terms of score, RFM self-adjusts for this case.  The “Recent average” modification you are talking about just makes this adjustment more severe.  Other than testing whether the “Recent average” or “Lifetime” Frequency method is better for this client, let’s think about it for a minute and see what we get.

The Recent average Frequency approach basically enhances the Recency component of the RFM model by downgrading Frequency behavior further in the past.  Given the model already has a strong Recency component, this “flattens” the model and makes it more of a “sure thing” – the more Recent folks get even higher scores.

What you trade off for this emphasis on more recent customers is the chance to reactivate lapsed Best customers who could purchase if approached.  In other words, the “LifeTime Frequency” version is a bit riskier, but it also has more long-term financial reward.  Follow?
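To make the “R comes first” point concrete, here is a minimal sketch of score assembly, assuming 1 – 5 quintile scores where higher is better; the digit-position weighting is illustrative, not the only way to build an RFM score:

```python
def rfm_score(r, f, m):
    """Concatenate quintile scores so Recency dominates, then Frequency, then Monetary."""
    return r * 100 + f * 10 + m

lapsed_heavy = rfm_score(r=1, f=5, m=5)   # bought lots, but two years ago -> 155
recent_light = rfm_score(r=5, f=2, m=2)   # a few recent pairs -> 522
print(recent_light > lapsed_heavy)        # True: the model self-adjusts
```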

So then we think about the customer.  It sounds like the “make a purchase decision every 6 months” idea is a guess as opposed to analysis.  You could go to the database and get an answer to this question – what is the average time between purchases (Latency), say, for heavy, medium, and light buyers?  That would give you some idea of a Recency threshold for each group, where mailing customers lapsed longer than this threshold gets increasingly risky, and you could use this threshold to choose the time period for your Frequency analysis.
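That Latency pull is a simple calculation; sketched in code with made-up purchase dates:

```python
from datetime import date
from statistics import mean

def latency_days(purchase_dates):
    """Average days between consecutive purchases for one customer."""
    ds = sorted(purchase_dates)
    gaps = [(b - a).days for a, b in zip(ds, ds[1:])]
    return mean(gaps) if gaps else None

# Hypothetical heavy buyer: averages a purchase every 45 days, so mailings
# to heavy buyers lapsed well past ~45 days get increasingly risky.
heavy = [date(2007, 1, 1), date(2007, 2, 15), date(2007, 4, 1)]
print(latency_days(heavy))  # 45
```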

Also, we have the fact these buyers are (I’m guessing) primarily online generated.  This means they probably have shorter LifeCycles than catalog-generated buyers, which would argue for downplaying Frequency that occurred before the average threshold found above and elevating Recency.

So here is what I would do.  Given the client is already pre-disposed to the “Recent Frequency” filter on the RFM model, that this filter will generally lower financial risk, and that these buyers were online generated, go with the filter for your scoring.

Then, after the scoring, if you find you will in fact exclude High Frequency / non-Recent buyers, take the best of that excluded group – Highest Frequency / Most Recent – and drop them a test mailing to make sure fiddling with the RFM model / filtering this way isn’t leaving money on the table.

If possible, you might check this lapsed Frequent group before mailing for reasons why they stopped buying – is there a common category or manufacturer purchased, did they have service problems, etc. – to further refine the list and creative.  Keep the segment small but load it up if you can; throw “the book” at them – free shipping, etc.

And see what happens.  If you get minimal response, then you know they’re dead.

The bottom line is this: all models are general statements about behavior that benefit from being tweaked based on knowledge of the target groups.  That’s why there are so many “versions” of RFM out there – people twist and adapt the basic model to fit known traits in the target populations, or to better fit their business model.

Since it’s early in the game for you folks and due to the online nature of the customer generation, it’s worth being cautious.  At the same time, you want to make sure you don’t leave any knowledge (or money!) on the table.  So you drop a little test to the “Distant Frequents” that is “loaded” up / precisely targeted and if you get nothing, then you have your answer as to which version of the model is likely to work better.

Short story: I could not convince management at Home Shopping Network that a certain customer segment they were wasting a lot of resources on – namely brand name buyers of small electronics like radar detectors – was really worth very little to the company.  So I came up with an (unapproved) test that would cost very little money but prove the point. 

I took a small random sample of these folks and sent them a $100 coupon – no restrictions, good on anything. I kept the quantity down so if redemption was huge, I would not cause major financial damage.

With this coupon, the population could buy any of about 50% of the items we showed on the network completely free, except for shipping and handling.

Not one response.

End of management discussion on value of this segment.

If you can, drop a small test out to those Distant Frequents and see what you get.  They might surprise you…

Good luck!



Speaking Schedule, WAA Projects, etc.

It’s been a ruthless couple of weeks, with tons of Web Analytics Association work on top of the usual client / Lab Store stuff.  Why do the folks in the pet supply industry change packaging and labeling going into the holiday season?  That’s nuts, if you ask me, unless you think all your customers are offline stores – which I guess most of them are.  Still, there’s a large enough mail order pet business out there you would think the suppliers would catch a clue or two.  I have plenty to do during the holiday season without having to re-write copy and re-shoot photography…

Anyway, the weeks that were.  First was a WAA Webcast on Money, Jobs and Education: How to Advance Your Career and Find Business Opportunities (site registration required, but you don’t have to be a WAA member) to get ready for and execute.

And there was the ongoing wrestling match to establish a framework for higher educational institutions to create course offerings in Web Analytics, leveraging the course content the Web Analytics Association has developed.  Very tricky stuff dealing with these Higher Ed folks, but we think we have it figured out.  The WAA’s first partner in this area will be the University of California at Irvine – not a bad start, methinks.

Then of course, it’s Conference season.  I’m going to be on a “Measuring Engagement” panel at WebTrends Engage October 8-10.  The following week is of course the eMetrics Marketing Optimization Summit, where I will be doing a conference presentation in the Behavioral Targeting Track and then sitting on a no-holds-barred “Guru Panel” with Avinash Kaushik and Bryan Eisenberg immediately after.

Part of getting ready for the Summit this year was a review of the WAA BaseCamp teaching materials, a pretty substantial piece of work all by itself.  We’ve done some tweaking based on comments from students in previous classes.

Unfortunately, I have to split the Summit right after the Guru panel for the Direct Marketing Association Conference in Chicago, so if you’re going to eMetrics and you are looking to chat with me, make sure you hit me up before my presentation Tues at 1:30 PM (I will be there Sunday 10/14 @ 4 PM for the WAA meeting). 

At the DMA, I’ll be doing a presentation with fellow web analytics blogger Alan Rimm-Kaufman in the Retention & Loyalty Marketing Track called Smart Marketing: Advanced Multichannel Acquisition and Retention Economics.  Control groups, predictive models, oh boy.

The next day, I’ll still be in Chicago doing a real “stretch event” at the invitation of Professor Philippe Ravanas of Columbia College Chicago for The Chicago Community Trust.  Nine (9!) non-profit arts groups are battling for grant money to help execute their marketing plans, and yours truly is going to vet those plans and teach donor / membership marketing in a live format – with all nine institutions exposing their guts to me and each other –  in real time!  Budgets, response rates, web sites, direct mail, newspaper, radio, database marketing, it’s all on the table.

Should be a real kick – if I survive the format, that is.  As a musician, I have always had a great interest in arts / donor marketing and this will be a great opportunity to interact directly with the folks in the trenches.

So, I apologize for the lack of posts the past couple of weeks as we now join our regularly scheduled life (in progress).


More Tips on Evaluating Research

To continue with this previous post…other things to look for when evaluating research:

Discontinuous Sample – I don’t know if there is a scientific word for this (experts, go ahead and comment if so), but what I am referring to here is the idea of setting out the parameters of a sample and then sneaking in a subset of the sample where the original parameters are no longer true.  This is extremely popular in press about research.

Example:  A statement is made at the beginning of the press release regarding the population surveyed.  Then, without blinking an eye, they start to talk about the participants, leaving you to believe the composition of participants reflects the original population.  In most cases, this is nuts, especially when you are talking about sending an e-mail to 8000 customers and 100 answer the survey. 

Sometimes it works the other way: they will slip in something like, “50% of the participants said the main focus of their business was an e-commerce site”, which does not in any way imply that 50% of the population (4000 of 8000) are in the e-commerce business.  Similarly, if you knew what percent of the 8000 were in the e-commerce business, you could get some feeling for whether the participant group of 100 was biased towards e-commerce or not.

Especially in press releases, watch out for these closely-worded and often intentional sleights of hand describing the actual segments of participants.  They are often written using language that can be defended as a “misunderstanding”, and often you can find the true composition of participants in the source documentation to prove your point.

The response to your digging and questioning of the company putting out the research will likely be something like, “the press misunderstood the study”, but at least you will know what the real definitions of the segments are.

Get the Questions – if a piece of research really seems to be important to your company and you are considering purchasing it, make sure the full report contains all the research questions.

I can’t tell you how many times I have matched up the survey data with the sequencing and language of the questions and found bias built right into the survey.  Creating (and administering, for that matter) survey questions and sequencing them is a scientific endeavor all by itself.  There are known pitfalls and ways to do it correctly, and people who do research for a living understand all of this.  It’s very easy to get this part of the exercise wrong and it can fundamentally affect the survey results.

So, in summary, go ahead and “do research” by e-mailing customers or popping up questionnaires, or read about research in the press, but realize there is a whole lot more going on in statistically significant, actionable research than meets the eye, and most of the stuff you read in the press is nothing more than a Focus Group.

Not that there is anything inherently wrong with a Focus Group, as long as you realize that is what you have.


Research for Press Release

I think one of the reasons “research” has become so lax in design and execution is this idea of doing research to drive a press release and news coverage.  Reliable, actionable research is expensive, and if all you really want to do is gin out a bunch of press, why be scientific about it?  Why pay for rigor?  After all, your company is not going to use the research to take action, it’s research for press release.

So here are a few less scientific but more specific ideas to keep in mind when looking at a press release / news story about the latest “research”, ranked in order of saving your time.  In other words, if you run into a problem with the research at a certain level, don’t bother to look down to the next level – you’re done with your assessment.

Press about Research is Not Research – it’s really a mistake to make any kind of important decision on research without seeing the original source documentation.  For lots of reasons, the press accounts of research output can be selectively blind to the facts of the study. 

If there is no way to access the source research document, I would simply ignore the press account of the research.  Trust me, if the subject / company really had the goods on the topic, they would make the research document available – why wouldn’t they?  Then if / when you get to the research source document, run the numbers a bit for yourself to see if they square with the press reports.  If not, you still may learn something – just not what the press report on the research was telling you!

Source of Sample – make sure you understand where the sample came from, and assess the reliability of that source.  Avoid trusting any source where survey participants are “paid to play”.  This PTP “research” is often called a Focus Group and though you can learn something in terms of language and feelings and so forth from a Focus Group, I would never make a strategic decision based on a non-scientific exercise like a Focus Group. 

Go ahead and howl about this last statement, Marketers.  I’m not going to argue the fine points of it here, but those who wish to post on this topic either way, go ahead.  Please be Less Scientific or More Specific than usual, depending on whether you are a Scientist or a Marketer.

For a very topical and probably to some folks quite important example of this “source” problem, see Poor Study Results Drive Ad Research Foundation Initiative.  If you want a focus group, do a focus group.  But don’t refer to it as “research” in a scientific way.

Size of Sample – there certainly is a lot of discussion about sample sizes and statistical significance and so forth in web analytics now that those folks have started to enter the more advanced worlds of test design.  Does it surprise you the same holds true for research?  It shouldn’t; it’s just math (I can feel the stat folks shudder.  Take it easy, relax).

Without going all math on this, let’s say someone does a survey of their customers.  The survey was “e-mailed to 8,000 customers” and they got 100 responses to the survey.  I don’t need to calculate anything to understand the sample is probably not representative of the whole, especially given the methodology of “e-mailed our customers”.  Not that a sample of 100 out of 8,000 is bad, but the way it was sourced is questionable.

What you want to see is something more like “we took a random sample of our customers and 100 interviews were conducted”.  It’s the math thing again.  Responders, by definition, are a biased sample, probably more of a focus group.  This statement is not always true, but is true often enough that you want to verify the responders are representative.  Again, check the research documentation.

OK Jim, so how can political surveys be accurate when they only use 300 or so folks to represent millions of households?  The answer is simple.  They don’t email a bunch of customers or pop-up surveys on a web site.  They design and execute their research according to established scientific principles.  Stated another way, they know exactly and specifically who they are talking to.  That’s because they want the research to be precise and predictive.

How do you know when a survey has been designed and executed properly?  Typically, a margin of error is stated, as in “results have a margin of error of ±5%”.  This generally means you can trust the design and execution of the survey, because you can’t get this information without a truly scientific design (Note to self: watch for “fake confidence level info” to be included with future “research for press release” reporting).
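As a rough sanity check on quoted margins of error, the standard approximation for a proportion from a truly random sample is z * sqrt(p(1-p)/n), worst case at p = 0.5:

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a random sample of n."""
    return z * sqrt(p * (1 - p) / n)

print(round(margin_of_error(300), 3))  # 0.057 -> about the +-5% quoted for ~300-person polls
print(round(margin_of_error(100), 3))  # 0.098 -> roughly +-10% for 100 responders (if random!)
```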

More rules for interpreting research


Will Work for Data

But will do a sub-optimal job…

Trying to catch up on what is going on in the analytics blogosphere, and it seems like I’m seeing a common thread – we’re getting much better at analyzing customer data, but whoever is in charge of Turning Customer Data into Profits is not quite with the program yet. 

Based on my experience, and assuming the people responsible are Marketing folks, the challenge to solving this problem often lies in understanding the difference between executing against behavioral data and executing against data about “characteristics” like demographics.

Marketing is not always about buying mass media, yet most Marketing people have never had to create and execute a campaign using behavioral data against a behavioral Objective.  So they do what they have always done – they create campaigns based on characteristics – and then execute against behavioral objectives using behavioral data.

This is a recipe for sub-optimal performance.  It’s like buying a car with a high performance engine then putting the cheapest gas in it you can find and never getting a tune up.  Sure, the car will run, but it’s not going to run very well, and you sure are not going to win any races with the competition.  Provided, of course, they don’t treat their car the same way.

For example, Ron is commenting on weak segmentation practices and lack of understanding the new customer experience in banking.  He is absolutely right.  Segmenting by “number of products” is often a static characteristic; segmenting by “change in number of products” is behavioral and many times more profitable.  As for new customer experience, the initial experience defines a customer’s “view” of the company and I don’t think I have to explain the importance of that.
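The static-versus-behavioral distinction is easy to see in code; a sketch with hypothetical product counts for two bank customers:

```python
# Two customers who look identical in the static "number of products" segment,
# but very different once you segment on the *change* in that number.
customers = {
    "A": {"products_last_q": 3, "products_this_q": 3},  # stable
    "B": {"products_last_q": 5, "products_this_q": 3},  # shedding products - likely defector
}

for cid, c in customers.items():
    static_seg = c["products_this_q"]
    change = c["products_this_q"] - c["products_last_q"]
    print(cid, static_seg, change)  # A: 3 0 / B: 3 -2
```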

Kevin is bemoaning the lack of temporal segmentation and use of appropriate creative for this segmentation by many e-mail folks.  He is absolutely right.  You want to speak to the customer based on their level of engagement with the company, not in terms of static perceptions.

Avinash perceives a problem coming down the road with behavioral targeting, that is, while the machine is smart, the results are only as good as the content you feed the engine.  Absolutely right.  If you run campaigns designed around static demographics on a behavioral platform you have created a way to “efficiently target crap to your customers”.

Is anybody listening?  If the message is not clear, try this:

Most Marketers are looking to drive “behavior” of some kind – even the Brand folks, who simply have a longer time horizon.  If behavior is the outcome you want, the campaigns must be created around “when”, “what”, and “why”, not “who”.  “When”, “what”, and “why” are behavioral ideas, “who” is a static characteristic (like a demographic) that probably has nothing to do with past or future behavior.

I know, you have probably been told segmenting by demographics is the way to go, or read so somewhere.  Was the source talking about buying media or data-driven marketing?

Sure, if you don’t have any behavior – when buying TV for example – then you go with what you can get.  Some segmentation is always better than none at all.  But if you have behavior, then using demographics to drive campaign segmentation is going to be sub-optimal.

Static characteristics like age and income do not predict behavior.  Behavior is in motion; it changes over time.  You can’t take a static characteristic and expect it to do a very good job predicting behavior because behavior changes over time.  Behavior predicts behavior.

The fact I am a 48 year old male predicts nothing about my behavior.  These characteristics are simply a proxy for buying media against me more efficiently; they really mean nothing when you cross the line into using data sets with actual behavior in them.  The fact I stopped visiting / posting / purchasing or that I am in the top 10% for writing reviews is much more powerful.

When addressing behavioral segments, first ask When?  When did I stop visiting / posting / purchasing?  Over what time period am I in the top 10%?  Am I still in the top 10%?

Then ask, What?  What events led up to my behavior?  What campaign did I come in from, salesperson did I talk to, products did I buy, areas of the site did I visit?  What has happened to me?

Then, understanding my experience, ask Why am I behaving like I am?  Then knowing Why (or more likely, making an educated guess), can you think of a message that is going to change my behavior?

Now you are ready to design and execute a campaign that will blow the socks off of anything you can do by knowing I am a 48 year old male, because you can directly address me with a message that is more relevant to me.

Marketers, please take the time to think about “when”, “what”, and “why” in campaign design and execution if using behavioral data, and forget about “who”.  You will be glad you did.

Analysts, have you ever run into this problem?  Rich evidence of a behavioral “edge” you might have that is ignored in the creation and execution of the campaign?

P.S.  The glad you did link above shows what you can learn by looking at behavioral segments as opposed to demographics.  All the folks in this test are in the same demographic segment, with a 10% overall response rate to a 20% discount offer – better response than any other demographic segment.  But they sure had different levels of profitability, based on behavior.  The more engaged they were – as measured by time since last purchase – the less profitable they were for this campaign.  And you can predict this result, because it will happen every time you use the same behavioral segmentation and offer, with slight variations possible across demographic segments.


Aberdeen on Web Analytics Education

John Lovett at Aberdeen has produced a review of the educational opportunities out there for folks interested in learning web analytics.  It’s a wide ranging piece covering everything from the Yahoo Group to the various agencies to the WAA courses to the Master of Science in Analytics from NC State.  John says:

“Web analytics usage has reached mainstream status with 82% adoption among companies surveyed recently by Aberdeen.  However, a vast range of maturity exists regarding analytics process, data analysis and corporate understanding of web metrics.  A fundamental impediment precluding many companies from building a successful analytics program is a lack of skilled employees required to manage, distribute and analyze web analytics.”

He addresses this situation in two parts:

Vendor sponsored programs and consultants, blogs, and guru sessions

Community forums, industry associations, and academic programs

These are unlocked research reports, no charge to view. 

The NC State effort is quite interesting; they are taking the “blended approach” I feel is where we are headed.  Data is data, behavior is behavior, and many of the offline analytical disciplines have a lot to offer the folks in web analytics.  We’re already seeing web analytics job postings with phrases like “strong knowledge of SAS and SPSS highly desirable” meaning employers are looking for cross-platform, cross-tool, cross-channel analysts.

The folks with this cross-knowledge set who can also “speak business” are going to be a very hot commodity going forward.  Fortunately, most web analysts already “speak business”, it’s part of the WA culture – and speaking business is the hard part for most analytical minds.  Like I said, the data is data, the behavior is behavior – and the tools are just tools.  Web analytics is patient zero, infecting the corporation with a proper analytical culture.

If you’re a web analyst and are offered a chance to do SAS / SPSS / Business Objects / etc. training, I would jump on it.

Thanks John / Aberdeen for a great “Sector Insight” piece of research.


Live Web Analytics Knowledge Events

WAA BaseCamp and Gurus of Online Marketing Optimization Tour

I’ll be giving an all day workshop on Web Analytics for Site Optimization as part of the WAA BaseCamp series in Los Angeles on 7/23 and Chicago on 8/22.  More details, other courses and cities for this series are here.

The BaseCamps are built on the course material I produced with help from many others for the Web Analytics Association.  This effort resulted in the 100% online Award of Achievement in Web Analytics offered by the University of British Columbia.  The Award of Achievement is four courses with 96 hours of content, so you’re not going to get all of that content in a one day event.  You will get a great “flyover” of all the material in one of the courses in a day long BaseCamp Session – plus the fact it’s live and interactive with the Instructor and peers in the class.

The Gurus of Online Marketing Optimization Tour is also a very interactive presentation-plus-Q&A event put together in conjunction with the WAA BaseCamp courses.  I’ll be one of the Gurus on the panel in Los Angeles 7/24, Boston 8/21, and Chicago 8/23.  This should be a lot of fun, and maybe even a bit of a wrestling match in some cases, with fellow gurus Eisenberg, Peterson, Sterne, and Veesenmeyer.

More info here, hope to see you there!


What Data Mining Can and Can’t Do, Part 2

The previous post was about what data mining is good for and what it is not good for, and how to use data mining properly for Marketing efforts.  This post further explains this concept in response to comments received.

Detecting credit fraud, especially with a data set as huge as the one at MCI, is a perfect application for data mining: classification, as in “this is fraud, this is not”.  These are not predictions; they are classifications based on a certain type of behavior that has already occurred.  As long as what a Marketer is really trying to accomplish is classification, then data mining is a great tool.  If you are trying to predict behavior, not so good.
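To make the distinction concrete, here is a minimal sketch of classification in this sense: labeling behavior that has already occurred.  The rules, thresholds, and transaction fields below are all invented for illustration, not an actual fraud model.

```python
# Classification: label behavior that has ALREADY happened.
# Rules, thresholds, and field names are hypothetical examples.

def classify_fraud(txn):
    """Return 'fraud' or 'ok' for an observed transaction."""
    if txn["amount"] > 5000 and txn["country"] != txn["home_country"]:
        return "fraud"
    if txn["attempts_last_hour"] > 10:
        return "fraud"
    return "ok"

transactions = [
    {"amount": 9000, "country": "RU", "home_country": "US", "attempts_last_hour": 1},
    {"amount": 40, "country": "US", "home_country": "US", "attempts_last_hour": 2},
]

labels = [classify_fraud(t) for t in transactions]
print(labels)  # ['fraud', 'ok']
```

Note the classifier only ever looks backward at events already on record; nothing in it says anything about what a customer will do next.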

I agree that data mining’s “real potential is to call attention to things for further investigation,” as long as the classification will be actionable, but oftentimes it is not.  There is a great deal of confusion about just what data mining can and cannot do, and I’m just trying to bring some clarity to this issue for Marketing folks.

Bottom line: classifying people into “buckets” is not particularly helpful without some end result to act on as a result of having people in these buckets.  Ask yourself: if I know that people differ in a certain way, what will I do with that information, how will I act on it? 

The most common mistake in this area is thinking demographics in some way predict behavior.  Demographics are not predictive, they are merely suggestive, yet many marketers cling to demos because that’s what they grew up with.  And then the analysts jump right in and say, “We can segment this population by demographics using data mining!” and you’re right off down the rat hole.  Then the Marketers create programs with an Objective of influencing behavior based on this demographic segmentation and wonder why they don’t work.

I certainly don’t have a problem with using “models” in general to solve Business and Marketing problems – that’s what I do for a living. 

What I do have a problem with is the tendency to throw brute force machine learning technology at Marketing problems that ultimately can’t be solved using that particular approach.  It’s a waste of time and money.  Paula, I think this is an area similar to your question: “If this is the answer, what was the question?”

Said another way, detecting a behavior and predicting one are very different Objectives, and a lot of what you want to do in Marketing is prediction, not detection; it’s a “when” question, not a “who” question.  Often in Marketing, by the time you know “who”, it’s too late to do anything about it.  So Marketers need to know the probability of, the propensity to, not a classification of  “who” after something happens. 
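As a sketch of what “propensity” means in practice, here is a crude smoothed purchase rate per customer; the customer histories, prior, and weights are invented for illustration, and a real propensity model would of course be richer than this.

```python
# A minimal propensity sketch: estimate each customer's probability of
# buying next period from behavior observed so far (histories invented).

purchase_history = {            # customer -> did they buy, month by month
    "cust_a": [1, 0, 1, 1, 0, 1],
    "cust_b": [0, 0, 1, 0, 0, 0],
}

def propensity(history, prior=0.5, prior_weight=2):
    """Smoothed purchase rate: a crude 'probability of buying next month'."""
    n = len(history)
    return (sum(history) + prior * prior_weight) / (n + prior_weight)

scores = {c: propensity(h) for c, h in purchase_history.items()}
print(scores)  # e.g. cust_a scores higher than cust_b
```

The point is that the output is a forward-looking probability you can act on before anything happens, rather than a “who” label assigned after the fact.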

On the flip side, if I have a prediction or propensity already, and then you want to tell me “who” they are with data mining, that’s fine, provided that information will make any difference.  And here we get to the crux of my comment: knowing who after I have the propensity usually does not make any difference at all.  On this point I am sure there will be a lot of disagreement, but I urge anybody who disagrees to simply test the hypothesis.  Show me that the time, money, and effort spent on finding out “who” created enough economic value to pay off the investment, that it created incremental profit beyond the profit generated by simply understanding the propensity all by itself.

More data is not the answer; only the right data is required.  Huge numbers of models are not the answer either; just because I can segment doesn’t mean that segmentation is worth anything.  Data / model output can be sorted into must know, good to know, nice to know, and who cares?  Machine learning technologies seem to drive much more “who cares” than “must know” output, and people end up drowning in irrelevant noise.  This is not a fault of the technology, but of applying it improperly.

For most Marketing needs, data mining is like “crop dusting with the SST”, to quote a former CEO I worked for.  Discovering a Marketing problem is typically the easy part and doesn’t require data mining; taking the right action to solve the problem is where the difficulty lies and machine learning is not going to provide that answer, despite many people hoping or believing it is true.

Of course, the inability of many Marketers to understand and communicate the actual problem they are trying to solve, and / or the inability of many technology people to turn those requirements into an actionable solution, is a different story that we won’t begin to address in this forum.  To the extent either one is responsible for the misapplication of a certain technology to solving a problem, oh well, where have we heard that before.

I hope I explained my position more clearly this time!


***** What Data Mining Can and Can’t Do

Timing, Counting, & Choice.  “Most real-world business problems are just some combination of those building blocks jammed together” – Peter Fader

Over at CIO Insight we have this very practical article on Data Mining by Fader.  What it’s good for, what it’s not good for.  If you have wondered how you might use this tool, especially if you are a Marketer, you should read this article. 

I say the article is practical because even though there are many ways to create mathematical models of customer data, if the end result is not something a Marketer can use to actually increase Marketing Productivity, then you really cannot do much with the output.  The models have to create leverage of some kind that can be used to take real world action.  In other words, a model can be “technically correct” but completely useless to a Marketer.

For example, just because you can identify a segment doesn’t mean it is practical or viable to address that segment with a unique marketing treatment.  And just because the segment has unique characteristics doesn’t mean those characteristics create any real marketing opportunity.

Key takeaways for Marketers from this article should be:

1.  Too much data tends to mess up a model.  This is especially true if you try jamming all kinds of demographic crap into a model that is trying to predict behavior.  If you want behavior as an output, use behavioral variables in your models.

2.  Data mining is a great classification tool; it is good at telling you why segments are different.  But in order for this to be useful, you need actionable segments to begin with.  For example, data mining can tell you the demographic differences between people likely to respond versus people not likely to respond – if there is a demographic difference.  But you have to know this “likely to respond” element first.  While we’re on this topic, the same idea holds true for surveys.  If you want the survey output to be actionable, get to known behavioral segments first, then do your surveys of each segment.
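The “behavioral segments first, then profile” idea in takeaway 2 can be sketched in a few lines; every customer record, field name, and threshold below is invented for illustration.

```python
# Sketch of "known behavioral segments first, then profile them":
# group customers by observed behavior, THEN look at demographics
# within each segment. All records are hypothetical.
from collections import Counter

customers = [
    {"id": 1, "orders": 9, "age_band": "35-44"},
    {"id": 2, "orders": 0, "age_band": "18-24"},
    {"id": 3, "orders": 7, "age_band": "35-44"},
    {"id": 4, "orders": 1, "age_band": "18-24"},
]

def behavioral_segment(c):
    """Assign a segment from behavior, not demographics."""
    return "frequent" if c["orders"] >= 5 else "infrequent"

profiles = {}
for c in customers:
    seg = behavioral_segment(c)
    profiles.setdefault(seg, Counter())[c["age_band"]] += 1

print(profiles)  # demographic mix WITHIN each behavioral segment
```

Only after the behavioral split do demographics (or survey answers) get examined, and only as a description of segments that already mean something.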

Often, people use technical tools for the wrong Marketing reasons.  I see this problem coming down the tracks in web analytics: people are getting so wrapped up in the minutiae and the automation of testing that they are missing out on the basic stuff.  Just like the data mining wave got people off track and into the bushes with “collecting all the data so we can mine it”.  But it doesn’t matter how much data you have; the tool does what it does and doesn’t do what it doesn’t do.

Check out the article What Data Mining Can and Can’t Do here.

Any thoughts from the Data Miners out there on this?
