What Data Mining Can and Can’t Do, Part 2

The previous post was about what data mining is good for and what it is not good for, and how to use data mining properly for Marketing efforts.  This post further explains this concept in response to comments received.

Detecting credit fraud, especially with a data set as huge as the one at MCI, is a perfect application for data mining – classification, as in “this is fraud, this is not”.  These are not predictions, they are classifications based on a certain type of behavior that has already occurred.  As long as what a Marketer is really trying to accomplish is classification, then data mining is a great tool.  If you are trying to predict behavior, not so good.

I agree that data mining’s “real potential is to call attention to things for further investigation”, as long as the classification will be actionable, but oftentimes it is not.  There is a great deal of confusion about just what data mining can and cannot do, and I’m just trying to bring some clarity to this issue for Marketing folks.

Bottom line: classifying people into “buckets” is not particularly helpful unless there is some action you can take as a result of having people in these buckets.  Ask yourself: if I know that people differ in a certain way, what will I do with that information, how will I act on it?

The most common mistake in this area is thinking demographics in some way predict behavior.  Demographics are not predictive, they are merely suggestive, yet many marketers cling to demos because that’s what they grew up with.  And then the analysts jump right in and say, “We can segment this population by demographics using data mining!” and you’re right off down the rat hole.  Then the Marketers create programs with an Objective of influencing behavior based on this demographic segmentation and wonder why they don’t work.

I certainly don’t have a problem with using “models” in general to solve Business and Marketing problems – that’s what I do for a living. 

What I do have a problem with is the tendency to throw brute force machine learning technology at Marketing problems that ultimately can’t be solved using that particular approach.  It’s a waste of time and money.  Paula, I think this is an area similar to your: “If this is the answer, what was the question?”

Said another way, detecting a behavior and predicting one are very different Objectives, and a lot of what you want to do in Marketing is prediction, not detection; it’s a “when” question, not a “who” question.  Often in Marketing, by the time you know “who”, it’s too late to do anything about it.  So Marketers need to know the probability of or the propensity toward a behavior, not a classification of “who” after something happens.

On the flip side, if I have a prediction or propensity already, and then you want to tell me “who” they are with data mining, that’s fine, provided that information will make any difference.  And here we get to the crux of my comment: knowing who after I have the propensity usually does not make any difference at all.  On this point I am sure there will be a lot of disagreement, but I urge anybody who disagrees to simply test the hypothesis.  Show me that the time, money, and effort spent on finding out “who” created enough economic value to pay off the investment, that it created incremental profit beyond the profit generated by simply understanding the propensity all by itself.
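For anyone who wants to run that test, here is a rough sketch of the arithmetic in Python.  It assumes a simple split test with two equal-sized groups: one targeted on propensity alone, the other on propensity plus the “who” profiling.  The class, the function, and every number in it are made up for illustration; plug in your own results.

```python
from dataclasses import dataclass


@dataclass
class CampaignArm:
    """Results for one test group. All figures used below are invented."""
    customers: int               # people contacted in this group
    responders: int              # people who responded
    profit_per_response: float   # margin earned on each response
    cost_per_contact: float      # cost to reach one person

    def profit(self) -> float:
        """Response margin minus contact cost for the whole group."""
        revenue_side = self.responders * self.profit_per_response
        cost_side = self.customers * self.cost_per_contact
        return revenue_side - cost_side


def incremental_profit(propensity_only: CampaignArm,
                       propensity_plus_who: CampaignArm) -> float:
    """Profit the 'who' group earned beyond the propensity-only group.

    Assumes both groups are the same size, so profits compare directly.
    """
    if propensity_only.customers != propensity_plus_who.customers:
        raise ValueError("this sketch assumes equal-sized test groups")
    return propensity_plus_who.profit() - propensity_only.profit()


# Hypothetical test results, not real data.
propensity_only = CampaignArm(customers=10_000, responders=300,
                              profit_per_response=40.0, cost_per_contact=0.50)
propensity_plus_who = CampaignArm(customers=10_000, responders=330,
                                  profit_per_response=40.0, cost_per_contact=0.75)

# Hypothetical cost of the "who" analysis (tools, analyst time, extra creative).
who_investment = 15_000.0

incremental = incremental_profit(propensity_only, propensity_plus_who)
net_payoff = incremental - who_investment

print(f"Incremental profit from knowing 'who': {incremental:,.2f}")
print(f"Net payoff after the investment:       {net_payoff:,.2f}")
```

If the net payoff comes out positive on your own numbers, the “who” work earned its keep.  If it does not, the propensity was doing the work all along.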

More data is not the answer; only the right data is required.  Huge numbers of models are not the answer either; just because I can segment doesn’t mean that segmentation is worth anything.  Data / model output can be considered as must know, good to know, nice to know, and who cares?  Machine learning technologies seem to drive much more “who cares” than “must know” output, and people end up drowning in irrelevant noise.  This is not a fault of the technology, but of its improper application.

For most Marketing needs, data mining is like “crop dusting with the SST”, to quote a former CEO I worked for.  Discovering a Marketing problem is typically the easy part and doesn’t require data mining; taking the right action to solve the problem is where the difficulty lies, and machine learning is not going to provide that answer, despite many people hoping or believing it will.

Of course, the inability of many Marketers to understand and communicate the actual problem they are trying to solve, and / or the inability of many technology people to turn those requirements into an actionable solution, is a different story that we won’t begin to address in this forum.  To the extent either one is responsible for the misapplication of a certain technology to solving a problem, oh well, where have we heard that before.

I hope I explained my position more clearly this time!


5 thoughts on “What Data Mining Can and Can’t Do, Part 2”

  1. Hi Jim,
    A great post indeed.

    You have rightly pointed out what data mining can do and what it can’t do. But I find something missing in your post, and that is what exactly marketers need to do and what an analyst needs to do.

    I think the need of the hour is a very strong consultant who is aware of both the analytics and the marketing. Analytic vendors (or in-house analytics teams) need to sell the strategy based on analytic results (analysis + models), and not the raw results, to the marketers. Then half the problem is solved.

    I suggest good Customer Profiling after segmentation and Decision Table after Modeling (models either one or more).

    What do you think?

  2. I think we are both saying the same thing in different ways!

    Data-oriented technology (data mining, web analytics, CRM, etc.) is often sold as a “cure” when it simply provides a path. The need for marketers (or consultants) to understand both what the technology can do and how to use the output is very real.

    This problem is particularly acute when the Marketer has a background in general advertising; they have trouble moving from conceptualizing everything in terms of “who” (demographics) to thinking in terms of “what” (behavior). Creating marketing solutions is very different for each.

    And too often the analyst acts as an “enabler” in this regard…just because something can be analyzed or modeled doesn’t mean the output will be helpful or actionable. Each side needs to push back on the other.

    Problem there is, you can’t push back if you lack the background to do so.

    Hmmm…maybe I should write a book on solving this problem.

    Oh yea, I did!

  3. As a data/analytical consultant to direct marketers, I hate seeing analytics go to ‘waste’. For example, the cool segmentation scheme that was never acted upon–the marketer who realized, after the fact, that while it is cool to find those segments of customers, their budget didn’t include several creative versions or new offer testing to take advantage of the findings…

    A clear goal for each and every analysis must be articulated before any data is prepped. Actionable outcomes (what do we implement as a result of the analysis) must be laid out, budgeted, and agreed upon by the pertinent decision-makers. Then, after the work is done, results that show how the program improved (hopefully) as a result of the analysis must be shared and communicated within the organization.

    If all of the above are done the right way, you’re laying the groundwork for future success. Job security is another side benefit :)

  4. Thanks for the comment Suzanne, agreed!

    And just to circle your comment back, I think I have seen more “wasted analytics” using data mining than any other approach, because folks simply don’t know what it is good for and what it is not good for…
