Millennium Marketing Research®
Tom Schori DBA Millennium Marketing Research®, 808 Ironwood, Normal IL 61761, 309-532-8466

Statistics don't lie, but those who use them. . .

By Thomas R. Schori, Ph.D., and Michael L. Garee, Principals,  Millennium Marketing Research®, 808 E. Ironwood, Normal, IL 61761-5239.  

NOTE: This article was also published in the Sept. 15, 1997, issue of the American Marketing Association's Marketing News.

Like sharks to a feeding frenzy, the world was quickly attracted to the tome "How to Lie with Statistics" when it was first published some years ago. Undoubtedly, at least part of the attraction stemmed from a belief that the unscrupulous use statistics to deceitfully convince people of this or that.

It may well be that some intellectually dishonest individuals do indeed use statistics to intentionally deceive. The far greater problem, however, is posed by those who deceive out of ignorance. These people simply don’t know how to objectively evaluate statistics. Statistical tools, be they descriptive or inferential, are now readily available to the masses, principally through various popular software programs. And the masses have been quick to embrace and use them, sometimes to their own distinct disadvantage, as well as to the disadvantage of those they may influence.

In recent years, the marketing literature has been replete with information and case studies about the relatively low costs associated with retaining customers in contrast to acquiring new ones. Consequently, it is no wonder that there has been a fascination with identifying that which "drives" customer retention, as well there should be.


Some marketers use statistics merely to 'prove' their beliefs


What frequently happens, though, is that a marketer has some a priori belief about what drives customer retention, and then sets out to prove that belief. For example, assume that the marketer believes that customers’ overall satisfaction with the product or service "drives" customer retention. To prove that belief, the marketer may ask 300 customers who’ve just experienced a transaction to rate their overall satisfaction with the organization on a 5-point scale, where "5" equals "greatly above expectations" and "1" equals "greatly below expectations." Then some time later, say a year, the marketer will look to see how many of these 300 are still customers.

Let’s suppose the marketer simply calculated the frequency distribution (a descriptive statistic) shown below:

Overall Satisfaction Category    Rating    Retention Ratio    Customers
Greatly above expectations       5         .90                20
Exceeded expectations            4         .89                180
So-so                            3         .85                75
Below expectations               2         .87                15
Greatly below expectations       1         .80                10
Average                          3.62      .88

From this table, we see that 20 customers rated their overall satisfaction with the product or service as a "5," i.e., it greatly exceeded their expectations. Among those 20 individuals who rated it as a "5," 90% were still customers a year later. Among those rating it as a "4," 89% remained customers. Overall, 88% of the customers were still there a year later.
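The "Average" row of the table can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the averages are weighted by the number of customers in each category and that each retention ratio is the exact share of that group still present a year later; the variable names are illustrative and not from the original study.

```python
# Rows from the table: (rating, retention ratio, number of customers).
rows = [
    (5, 0.90, 20),
    (4, 0.89, 180),
    (3, 0.85, 75),
    (2, 0.87, 15),
    (1, 0.80, 10),
]

total_customers = sum(count for _, _, count in rows)

# Customer-weighted mean satisfaction rating.
mean_rating = sum(rating * count for rating, _, count in rows) / total_customers

# Overall retention: estimated retained customers divided by all customers.
# Assumption: ratio * count approximates the number retained in each group.
overall_retention = sum(ratio * count for _, ratio, count in rows) / total_customers

print(total_customers)               # 300
print(round(mean_rating, 2))         # 3.62
print(round(overall_retention, 2))   # 0.88
```

Under these assumptions the results match the table: 300 customers, an average rating of 3.62, and an overall retention ratio of .88.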

At first blush, one might conclude something like this: "Wow, were I to provide such outstanding service that we greatly exceeded every customer’s expectations, I could increase overall retention by two points, from 88% to 90% and I’d be a hero!" And, without a doubt, that’s precisely what this descriptive statistic (that is, the frequency distribution) appears to show!


'Correlation' isn't necessarily 'causation'


There is a relationship, but just how strong (and significant) is it? Thank heavens for the knowledgeable, professional statistician. He or she merely calculates a simple inferential statistic, a correlation coefficient (r), which portrays the magnitude of a supposed relationship between two variables. While correlation does not imply causation, it does depict the degree to which variations in the scores on one variable correspond to variations in the scores on another variable.

A correlation coefficient (r) ranges in value from 1 to -1. A coefficient of "1" indicates that variations in the score of one variable are perfectly predictive of variations in the score of the other variable, i.e., they are directly related.

In our "satisfaction"-"retention" example, a coefficient of "1" would mean that the higher the "satisfaction" score, the higher the "retention" score. At the other extreme, a coefficient of "-1" also indicates that the two variables are perfectly predictive of one another, but in an inverse manner. That is, and again using our "satisfaction"-"retention" example, the lower the "satisfaction" score, the higher the "retention" score.

As the correlation coefficient approaches zero, the degree of relationship between the two variables also approaches zero, i.e., the relationship becomes meaningless. A correlation coefficient of zero simply means that scores on one variable are in no way predictive of scores on the other.
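To make the three cases concrete, here is a small sketch that computes the Pearson correlation coefficient for three made-up data sets: one directly related, one inversely related, and one with no linear relationship. The function name `pearson_r` and all of the numbers are invented for illustration; none of them come from the customer study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

ratings = [1, 2, 3, 4, 5]           # hypothetical scores on one variable
direct = [10, 20, 30, 40, 50]       # rises in lockstep with ratings
inverse = [50, 40, 30, 20, 10]      # falls in lockstep with ratings
unrelated = [30, 10, 50, 10, 30]    # no linear trend across ratings

r_direct = pearson_r(ratings, direct)
r_inverse = pearson_r(ratings, inverse)
r_unrelated = pearson_r(ratings, unrelated)

print(r_direct, r_inverse, r_unrelated)
```

The first pair yields a coefficient of 1, the second a coefficient of -1, and the third a coefficient of zero, matching the three cases described above.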

When our statistician calculates a simple correlation coefficient (technically, a Pearson product-moment correlation coefficient) for the "satisfaction"-"retention" example, he or she finds that r = .06, which of course means that how customers judge overall satisfaction is not very strongly related to customer retention. In fact, he or she then calculates r², which represents the proportion of variation in one variable that can be accounted for by the variation in the other. In this case, r² = .0036. What this simply means is that about .4 of 1% of the variation in customer retention can be accounted for by variation in customer perception of overall satisfaction. Said differently, about 99.6% of the variation in customer retention is not related to overall satisfaction. Disappointing, but nonetheless true!
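The article does not give the customer-level data, but a coefficient close to the reported one can be recovered from the table alone. The sketch below is one plausible reconstruction, not necessarily the calculation the statistician performed. It assumes each customer is scored 1 if retained and 0 if not, and that each category's retention ratio is the exact share retained in that group (so the implied retained counts may be fractional).

```python
from math import sqrt

# Rows from the table: (rating, retention ratio, number of customers).
rows = [
    (5, 0.90, 20),
    (4, 0.89, 180),
    (3, 0.85, 75),
    (2, 0.87, 15),
    (1, 0.80, 10),
]

n = sum(count for _, _, count in rows)

# x = satisfaction rating, y = 1 if the customer was retained, else 0.
mean_x = sum(rating * count for rating, _, count in rows) / n
mean_y = sum(ratio * count for _, ratio, count in rows) / n

# Mean of x*y: only retained customers (y = 1) contribute their rating.
mean_xy = sum(rating * ratio * count for rating, ratio, count in rows) / n
mean_x2 = sum(rating ** 2 * count for rating, _, count in rows) / n

covariance = mean_xy - mean_x * mean_y
var_x = mean_x2 - mean_x ** 2
var_y = mean_y * (1 - mean_y)   # variance of a 0/1 variable

r = covariance / sqrt(var_x * var_y)
r_squared = r ** 2

print(round(r, 2))   # 0.06
print(r_squared)
```

Under these assumptions r rounds to .06, in line with the figure above. Squaring the unrounded value gives a result slightly different from squaring .06, but either way well under 1% of the variation in retention is accounted for by overall satisfaction.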

If the marketer did not have access to the services of a knowledgeable statistician, he or she might well want to trumpet the "fact" that overall satisfaction "drives" customer retention. Not knowing any better, he or she might proclaim, "We must direct all our efforts toward improving overall satisfaction as a means of improving customer retention." Though the marketer is not intentionally deceiving, he or she is directing the efforts of the organization in the wrong direction. Instead, the organization should be looking for a stronger "driver" of customer retention.

While folks may not intentionally use statistics in a deceptive fashion, they can nonetheless unintentionally use them in a fashion which may lead the organization astray, sometimes right into the abyss.