
All customers are not created equal



Article ID:
20011216
Published:
December 2001
Author:
Jon Pinnell

Article Abstract

Customer satisfaction researchers often use statistical methods to infer how "important" various drivers are to overall satisfaction scores or customer loyalty indices. This article discusses latent class analysis, a technique that addresses respondent heterogeneity, which is one of the most significant pitfalls of traditional methods for computing derived importances. These methods provide a deeper understanding of individual differences between customers.

Editor’s note: Jon Pinnell is president/COO of MarketVision Research, Cincinnati.

The notion of derived importance is not a new one. In fact, heated debates over the merits of derived importance versus stated importance have resounded in conference rooms and industry publications alike. All debates aside, customer satisfaction researchers often use statistical methods to infer how “important” various product and service attributes, or drivers, are to overall satisfaction scores or customer loyalty indices.

Importance measures are known by many names and presented in several formats, including importance-performance grids, key driver analyses, and quadrant maps. Techniques used to derive these measures include correlation, linear regression, logistic regression and logit models, to name just a few.

While researchers must be cognizant of the many potential statistical pitfalls of each technique (such as autocorrelated observations or non-normal error distributions), awareness of conceptual pitfalls is even more important. For example, customers may overwhelmingly agree with an attribute solely because of question wording. This may result in a lack of variability and obscure the association between the driver and the response. Similarly, using linear models to describe nonlinear relationships may obscure strong, albeit nonlinear, associations. The list of potential pitfalls is long.

An example
Perhaps the most troubling shortcoming of traditional methods is the assumption that all customers share a similar importance structure. Consider the following case in point:

A researcher wishes to regress an “overall satisfaction” measure on the performance ratings of three attributes (X1, X2 and X3). Suppose there are two populations in the data with known, and different, importance structures.


Table 1: Known Parameters

                β1      β2      β3      Relative Size of Population
Population 1    1.00    0.60    0.30    70%
Population 2    0.30    0.60    1.00    30%


After fitting a multiple linear regression of the overall rating on X1, X2 and X3, we observe the following estimates for β1, β2 and β3:


Table 2: Regression Output

                       β1      β2      β3
Parameter Estimates    0.60    0.65    0.71


The model appears to fit well: the adjusted R-squared is a respectable .75 and all three parameter estimates are significant (p < 0.05). From a model fit standpoint, such regression results would please many analysts.

However, we see that the model suggests that all attributes are, for all practical purposes, of equal importance. A researcher conducting this aggregate level analysis would be unaware that two distinct populations exist and would be likely to incorrectly recommend that all attributes are equally worthy of further investment/improvement.
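The blending effect described above can be illustrated with simulated data. The following Python sketch is not the article's dataset: it assumes independent, standard-normal attribute ratings, an error standard deviation of 0.5, and 5,000 respondents, all of which are illustrative choices. Only the two importance structures and the 70/30 split come from Table 1. Under these assumptions the pooled estimates tend toward the size-weighted average of the two structures (roughly 0.79, 0.60 and 0.51), so they will not reproduce the values in Table 2, but they show the same problem: the pooled coefficients match neither population.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000  # assumed sample size, chosen for illustration

# Known importance structures and population sizes from Table 1
beta_pop1 = np.array([1.00, 0.60, 0.30])
beta_pop2 = np.array([0.30, 0.60, 1.00])

# Assumption: attribute ratings are independent standard-normal scores
X = rng.normal(size=(n, 3))
in_pop1 = rng.random(n) < 0.70  # about 70% of respondents in Population 1

# Each respondent's overall rating uses that respondent's own importance structure
true_beta = np.where(in_pop1[:, None], beta_pop1, beta_pop2)
y = (X * true_beta).sum(axis=1) + rng.normal(scale=0.5, size=n)

# Pooled (aggregate) ordinary least squares with an intercept
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
pooled_beta = coef[1:]

print("Pooled estimates:", np.round(pooled_beta, 2))
print("Population 1:    ", beta_pop1)
print("Population 2:    ", beta_pop2)
```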

This example demonstrates that methods for deriving importances for customers en masse are insensitive to individual differences and thus fail to explain much of the information in the data. Our discussion turns to a set of methods that solves this problem: latent class analysis (LCA). While LCA applies to many modeling techniques, we focus on its application to driver analysis using linear regression.

Latent class analysis
LCA is an iterative technique that identifies market segments while simultaneously estimating separate parameters for those segments. LCA begins by assigning respondents to k arbitrary and deterministic classes, where k is the number of classes in the model. For each class, a separate regression model is estimated. Using maximum likelihood methods, the probability that each respondent belongs to each class is then determined.

Next, another regression model is estimated for each class — this time the data is weighted based on the probability that each respondent belongs to that class. These iterations continue, alternately estimating new weighted class-level models and readjusting the probability of each respondent’s membership in each class. When no class assignments change with an additional iteration, the iterations halt and the final model is reached.
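The iterations described above can be sketched as a small expectation-maximization routine. This is a minimal illustration, not the software used for the article: the function name fit_latent_class_regression is hypothetical, the errors within each class are assumed to be normal, and the loop stops when the log-likelihood no longer improves rather than when class assignments stop changing. The demonstration data are simulated from the Table 1 parameters with assumed standard-normal ratings and an assumed error standard deviation of 0.3.

```python
import numpy as np

def fit_latent_class_regression(X, y, k=2, max_iter=1000, tol=1e-8, seed=0):
    """EM for a k-class mixture of linear regressions (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Arbitrary starting point: each respondent is placed in one class at random.
    resp = np.zeros((n, k))
    resp[np.arange(n), rng.integers(0, k, size=n)] = 1.0
    prev_ll = -np.inf
    for _ in range(max_iter):
        # Class-level models: least squares weighted by membership probability.
        shares = resp.mean(axis=0)
        betas = np.empty((k, p))
        sigmas = np.empty(k)
        for c in range(k):
            w = resp[:, c]
            sw = np.sqrt(w)
            betas[c] = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
            resid = y - X @ betas[c]
            sigmas[c] = max(np.sqrt((w * resid**2).sum() / w.sum()), 1e-6)
        # Membership probabilities, assuming normal errors within each class.
        resid = y[:, None] - X @ betas.T
        log_joint = (np.log(shares) - 0.5 * np.log(2 * np.pi * sigmas**2)
                     - 0.5 * (resid / sigmas) ** 2)
        top = log_joint.max(axis=1, keepdims=True)
        log_norm = top + np.log(np.exp(log_joint - top).sum(axis=1, keepdims=True))
        resp = np.exp(log_joint - log_norm)
        # Stopping rule (an assumption): no further gain in log-likelihood.
        ll = log_norm.sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return betas, resp.mean(axis=0), resp

# Simulated data using the Table 1 parameters (all other settings are assumed).
rng = np.random.default_rng(7)
n = 3000
X_raw = rng.normal(size=(n, 3))
in_pop1 = rng.random(n) < 0.70
true_beta = np.where(in_pop1[:, None], [1.00, 0.60, 0.30], [0.30, 0.60, 1.00])
y = (X_raw * true_beta).sum(axis=1) + rng.normal(scale=0.3, size=n)

design = np.column_stack([np.ones(n), X_raw])  # intercept plus X1, X2, X3
betas, shares, resp = fit_latent_class_regression(design, y, k=2)
for c in range(2):
    print(f"Class {c + 1}: betas={np.round(betas[c, 1:], 2)}, share={shares[c]:.0%}")
```

Class labels are arbitrary, so the class that corresponds to Population 1 may come out first or second. Because the result can depend on the starting assignment, analysts often rerun such a routine from several random starts and keep the solution with the highest log-likelihood.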

Let us turn back to our dataset to demonstrate the power behind this technique.

Again, the researcher wishes to regress the overall ratings on X1, X2 and X3. The population parameters are again known and displayed below:


Table 3: Known Parameters

                β1      β2      β3      Relative Size of Population
Population 1    1.00    0.60    0.30    70%
Population 2    0.30    0.60    1.00    30%


After fitting a two-class latent class regression model to these data, the researcher obtains the following parameter estimates:


Table 4: Regression Output

           β1      β2      β3      Relative Size of Population
Class 1    1.15    0.65    0.34    61%
Class 2    0.12    0.68    1.03    39%


This researcher has located two distinct and identifiable groups in the data. The first group places the highest degree of importance on X1, while the second group places the most importance on X3. The researcher would now correctly conclude that, for a large percentage of the population, efforts to improve the organization’s performance on X1 would yield the most influence on overall satisfaction. Similarly, efforts to improve organizational performance on X3 would impact overall satisfaction for a smaller, but readily identifiable, percentage of the customer population.

Addresses a pitfall
Latent class analysis addresses one of the most significant pitfalls of traditional methods for computing derived importances — respondent heterogeneity. These methods provide a deeper understanding of individual differences between customers and allow the analyst to segment a market without the need to make a priori assumptions about important basis variables for the segmentation. Further, LCA yields a much more accurate decision-making path for enhancing overall satisfaction — not of an “average” customer that may or may not exist — but rather of homogeneous subgroups who share similar importance structures.

