All customers are not created equal

Editor’s note: Jon Pinnell is president/COO of MarketVision Research, Cincinnati.

The notion of derived importance is not a new one. In fact, heated debates over the merits of derived importance versus stated importance have resounded in conference rooms and industry publications alike. All debates aside, customer satisfaction researchers often use statistical methods to infer how “important” various product and service attributes, or drivers, are to overall satisfaction scores or customer loyalty indices.

Importance measures are known by many names and presented in several formats, including importance-performance grids, key driver analyses, and quadrant maps. Techniques used to derive these measures include correlation, linear regression, logistic regression and logit models, to name just a few.

While researchers must be cognizant of the many potential statistical pitfalls of each technique - such as autocorrelated observations or non-normal error distributions - awareness of conceptual pitfalls is even more important. For example, customers may overwhelmingly agree with an attribute solely because of how the question is worded. The resulting lack of variability can obscure the association between the driver and the response. Similarly, using linear models to describe non-linear relationships may obscure strong, albeit non-linear, associations. The list of potential pitfalls is long.
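The non-linearity pitfall can be illustrated with a minimal Python sketch. The data are made up for illustration: a hypothetical driver rating centered on zero and a response that depends on it strongly but symmetrically. The seed, sample size and noise level are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only

# Hypothetical driver rating x and a response y with a U-shaped dependence on x.
x = rng.uniform(-1.0, 1.0, size=5000)
y = x**2 + rng.normal(0.0, 0.05, size=5000)

linear_r = np.corrcoef(x, y)[0, 1]        # linear association between y and x
quadratic_r = np.corrcoef(x**2, y)[0, 1]  # association between y and x squared

print(f"correlation of y with x:   {linear_r:.2f}")
print(f"correlation of y with x^2: {quadratic_r:.2f}")
```

Under these assumptions the linear correlation is close to zero even though the response is almost entirely determined by the driver, so a correlation-based importance measure would rank this driver as unimportant.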

An example

Perhaps the most troubling shortcoming of traditional methods is that they treat all customers as though they share the same importance structure. Consider the following case in point:

A researcher wishes to regress an “overall satisfaction” measure on the performance ratings of three attributes. Suppose there are two populations in the data with known - and different - importance structures.

Table 1: Known Parameters

                β1      β2      β3      Relative Size of Population
Population 1    1.00    0.60    0.30    70%
Population 2    0.30    0.60    1.00    30%

After fitting an ordinary linear regression of the overall rating on X1, X2 and X3, we observe the following estimates for β1, β2 and β3:

Table 2: Regression Output

                       β1      β2      β3
Parameter Estimates    0.60    0.65    0.71

The model appears to fit well: the adjusted R² is a respectable 0.75 and all three parameter estimates are significant (p < 0.05). From a model fit standpoint, such regression results would please many analysts.

However, we see that the model suggests that all attributes are, for all practical purposes, of equal importance. A researcher conducting this aggregate level analysis would be unaware that two distinct populations exist and would be likely to incorrectly recommend that all attributes are equally worthy of further investment/improvement.
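The masking effect can be sketched in Python on simulated data. This is an illustration under stated assumptions, not a reproduction of the author's dataset: the coefficients and population shares come from the known-parameters table, while the independent standard-normal ratings, the noise level, the sample size and the seed are assumptions, so the estimates will not match the regression output reported above.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed
n = 20000                        # assumed sample size

# Known parameters from the article: one row of coefficients per population.
true_betas = np.array([[1.00, 0.60, 0.30],   # Population 1
                       [0.30, 0.60, 1.00]])  # Population 2
shares = np.array([0.70, 0.30])

# Assumptions: independent standard-normal ratings, noise sd of 0.5.
group = rng.choice(2, size=n, p=shares)
X = rng.normal(size=(n, 3))
y = np.einsum("ij,ij->i", X, true_betas[group]) + rng.normal(0.0, 0.5, size=n)

# One pooled regression (with intercept) that ignores population membership.
design = np.column_stack([np.ones(n), X])
pooled, *_ = np.linalg.lstsq(design, y, rcond=None)
pooled_betas = pooled[1:]

# Size-weighted blend of the two importance structures.
blend = shares @ true_betas

print("pooled estimates:   ", np.round(pooled_betas, 2))
print("size-weighted blend:", np.round(blend, 2))
```

With these assumptions the pooled coefficients land near the size-weighted blend of the two structures (0.79, 0.60, 0.51), which describes neither population. How far the blend flattens toward equal importances depends on the data; the point is only that a single set of coefficients cannot represent both groups.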

This example demonstrates that methods for deriving importances for customers en masse are insensitive to individual differences and thus fail to explain much of the information in the data. Our discussion turns to a set of methods that addresses this problem: latent class analysis (LCA). While LCA applies to many modeling techniques, we focus on its application to driver analysis using linear regression.

Latent class analysis

LCA is an iterative technique that identifies market segments while simultaneously estimating separate parameters for those segments. LCA begins by assigning respondents to k arbitrary and deterministic classes, where k is the number of classes in the model. For each class, a separate regression model is estimated. Using maximum likelihood methods, the probability that each respondent belongs to each class is then determined.

Next, another regression model is estimated for each class - this time the data is weighted based on the probability that each respondent belongs to that class. These iterations continue, alternately estimating new weighted class-level models and readjusting the probability of each respondent’s membership in each class. When no class assignments change with an additional iteration, the iterations halt and the final model is reached.
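The iteration described above can be sketched in Python as an expectation-maximization loop. This is a minimal sketch, not the author's implementation or any particular package's algorithm. Several choices are assumptions: normally distributed errors with a separate standard deviation per class, no intercept (ratings treated as centered), a stopping rule based on the change in log-likelihood rather than on class assignments, and several random starts with the best fit kept. All function names are hypothetical.

```python
import numpy as np


def weighted_ols(X, y, w):
    """Weighted least squares; returns coefficients and weighted residual sd."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    resid = y - X @ beta
    sigma = np.sqrt(np.sum(w * resid**2) / np.sum(w))
    return beta, max(sigma, 1e-6)  # floor guards against a collapsed class


def fit_once(X, y, k, rng, max_iter, tol):
    n = len(y)
    # Arbitrary hard starting assignment: each respondent in exactly one class.
    resp = np.zeros((n, k))
    resp[np.arange(n), rng.integers(k, size=n)] = 1.0
    prev_ll = -np.inf
    for _ in range(max_iter):
        # Class-level regressions, weighted by current membership probabilities.
        shares = resp.mean(axis=0)
        fits = [weighted_ols(X, y, resp[:, c]) for c in range(k)]
        # Log of (class share * normal density) per respondent and class;
        # the constant term common to all classes is dropped.
        log_dens = np.column_stack([
            np.log(max(shares[c], 1e-12)) - np.log(s) - 0.5 * ((y - X @ b) / s) ** 2
            for c, (b, s) in enumerate(fits)
        ])
        # Readjust membership probabilities (numerically stable normalization).
        top = log_dens.max(axis=1, keepdims=True)
        log_norm = top[:, 0] + np.log(np.exp(log_dens - top).sum(axis=1))
        resp = np.exp(log_dens - log_norm[:, None])
        ll = log_norm.sum()
        if ll - prev_ll < tol:  # assumed stopping rule: log-likelihood stalls
            break
        prev_ll = ll
    betas = np.array([b for b, _ in fits])
    return ll, betas, shares, resp


def latent_class_regression(X, y, k=2, n_starts=5, max_iter=1000, tol=1e-8, seed=0):
    """Run several random starts and keep the fit with the highest log-likelihood."""
    rng = np.random.default_rng(seed)
    runs = [fit_once(X, y, k, rng, max_iter, tol) for _ in range(n_starts)]
    _, betas, shares, resp = max(runs, key=lambda run: run[0])
    return betas, shares, resp


# Simulated data using the article's known parameters; the rating distribution,
# noise level, sample size and seed are assumptions.
rng = np.random.default_rng(1)
n = 4000
true_betas = np.array([[1.00, 0.60, 0.30], [0.30, 0.60, 1.00]])
group = rng.choice(2, size=n, p=[0.70, 0.30])
X = rng.normal(size=(n, 3))
y = np.einsum("ij,ij->i", X, true_betas[group]) + rng.normal(0.0, 0.2, size=n)

betas, shares, resp = latent_class_regression(X, y, k=2)
order = np.argsort(-betas[:, 0])  # class labels are arbitrary; sort by first coefficient
print(np.round(betas[order], 2), np.round(shares[order], 2))
```

Each row of resp holds one respondent's membership probabilities, so a hard segment assignment, if one is needed, is the column with the largest value in that row.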

Let us turn back to our dataset to demonstrate the power behind this technique.

Again, the researcher wishes to regress the overall ratings on X1, X2 and X3. The population parameters are again known and displayed below:

Table 3: Known Parameters

                β1      β2      β3      Relative Size of Population
Population 1    1.00    0.60    0.30    70%
Population 2    0.30    0.60    1.00    30%

After fitting a two-class latent class regression model to these data, the researcher obtains the following parameter estimates:

Table 4: Regression Output

           β1      β2      β3      Relative Size of Population
Class 1    1.15    0.65    0.34    61%
Class 2    0.12    0.68    1.03    39%

This researcher has located two distinct and identifiable groups in the data. The first group places the highest degree of importance on X1, while the second group places the most importance on X3. The researcher would now correctly conclude that, for a large percentage of the population, efforts to improve the organization’s performance on X1 would have the greatest effect on overall satisfaction. Similarly, efforts to improve organizational performance on X3 would affect overall satisfaction for a smaller, but readily identifiable, percentage of the customer population.

Addresses a pitfall

Latent class analysis addresses one of the most significant pitfalls of traditional methods for computing derived importances - respondent heterogeneity. These methods provide a deeper understanding of individual differences between customers and allow the analyst to segment a market without the need to make a priori assumptions about important basis variables for the segmentation. Further, LCA yields a much more accurate decision-making path for enhancing overall satisfaction - not of an “average” customer that may or may not exist - but rather of homogeneous subgroups who share similar importance structures.