Editor's note: William G. McLauchlan, Ph.D., is principal, McLauchlan & Associates, Cincinnati.

The notion that overall supplier satisfaction can be "modeled" as a function of attribute performance or satisfaction ratings is not new. Those who advocate this type of analysis typically advance a premise along the following lines: if the variance in supplier performance ratings on a given attribute explains variance in overall satisfaction with the suppliers, then the attribute is determinant of satisfaction. On the face of it, the premise is compelling. At the same time, however, it is fraught with both mathematical and philosophical dangers.

Consider the following simple (hypothetical) data set:

Overall Satisfaction   Satisfaction A   Satisfaction B   Satisfaction C
         7                    8                8                7
         8                    9                8                5
         6                    7                8                5
         9                    8                8                4
         4                    4                8                3
         9                    9                8                8

Visual examination of the data indicates that, in general, as ratings on Attribute A increase, so too do the ratings on Overall Satisfaction. Further, while not as striking as the apparent relationship between Attribute A and Overall Satisfaction, ratings on Attribute C also appear to be positively correlated with Overall Satisfaction. Finally, Attribute B is constant (8) regardless of Overall Satisfaction. Calculating the Pearson bivariate correlation coefficients confirms these observations:

                        A      C
Overall Satisfaction   .91    .54
A                             .69
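The coefficients above can be reproduced directly from the hypothetical data set. The following is a sketch using NumPy; the array and function names are mine, not the article's:

```python
import numpy as np

# Hypothetical data set from the article: Overall Satisfaction
# and ratings on Attributes A, B, and C for six respondents.
overall = np.array([7, 8, 6, 9, 4, 9], dtype=float)
a = np.array([8, 9, 7, 8, 4, 9], dtype=float)
b = np.array([8, 8, 8, 8, 8, 8], dtype=float)
c = np.array([7, 5, 5, 4, 3, 8], dtype=float)

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    xd, yd = x - x.mean(), y - y.mean()
    return (xd @ yd) / np.sqrt((xd @ xd) * (yd @ yd))

r_oa = pearson(overall, a)   # about .91
r_oc = pearson(overall, c)   # about .54
r_ac = pearson(a, c)         # about .69
# Note that pearson(overall, b) is undefined: B has zero
# variance, so the denominator is zero.
```

The undefined correlation for Attribute B is the same zero-variance problem that forces B out of the regression later in the article.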

Proponents of correlational or regression-based approaches to satisfaction would argue that Attribute A, and to a lesser extent Attribute C, are determinants of satisfaction. Attribute B would be regarded as inconsequential in the understanding of overall satisfaction.

The correlational analysis can be extended in the following simple regression analysis:

Overall Satisfaction = Constant + (b1)(Attribute A) + (b2)(Attribute C)

(Because Attribute B exhibits linear dependency, the matrix with B included is ill-conditioned. Therefore, Attribute B is dropped from the analysis.)

The following parameters are estimated:

Constant    0.13
b1          1.07
b2         -0.18

Here is where the trouble begins. The negative coefficient associated with Attribute C would suggest that one way to increase Overall Satisfaction would be to strive for lower satisfaction on Attribute C. Clearly this is not the case. The bivariate correlations indicate that Overall Satisfaction and Attribute C are positively correlated. The seeming inconsistency actually results from the relatively high positive correlation between Attributes A and C. When two regressor variables are themselves highly correlated (collinear), the parameter estimates are unstable and therefore, unreliable.
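The parameter estimates, including the counterintuitive negative coefficient on Attribute C, fall out of an ordinary least-squares fit. A minimal sketch with NumPy (names are mine):

```python
import numpy as np

overall = np.array([7, 8, 6, 9, 4, 9], dtype=float)
a = np.array([8, 9, 7, 8, 4, 9], dtype=float)
c = np.array([7, 5, 5, 4, 3, 8], dtype=float)

# Design matrix with an intercept column. Attribute B is excluded:
# its zero variance makes the matrix ill-conditioned.
X = np.column_stack([np.ones_like(a), a, c])
coefs, *_ = np.linalg.lstsq(X, overall, rcond=None)
constant, b1, b2 = coefs   # about 0.13, 1.07, -0.18
```

Note that b2 comes out negative even though C's bivariate correlation with Overall Satisfaction is +.54, illustrating the instability the article describes.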

In a more realistic example where one might have 20 or more regressor variables, the problem (now multicollinearity) is exacerbated. Consider an industrial product category where satisfaction or performance might be measured on such attributes as on-time delivery, product availability, adequate inventory, and compliance with shipping dates. Another group of attributes might be related to pricing policies and value, and so forth. It quickly becomes obvious that in our efforts to quantify satisfaction on as many aspects of the product offering as possible, we are typically measuring the same dimension in several similar and likely correlated ways.
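One common diagnostic for this problem is the variance inflation factor (VIF), which measures how much a coefficient's variance is inflated by a regressor's correlation with the others. A sketch for the two-regressor case (my own illustration, not the article's):

```python
import numpy as np

a = np.array([8, 9, 7, 8, 4, 9], dtype=float)
c = np.array([7, 5, 5, 4, 3, 8], dtype=float)

# With two regressors, the VIF reduces to 1 / (1 - r^2),
# where r is the correlation between A and C.
r_ac = np.corrcoef(a, c)[0, 1]
vif = 1.0 / (1.0 - r_ac**2)   # about 1.9 here
```

Even in this toy example the VIF is nearly 2; in a real battery of 20+ overlapping attributes, VIFs far higher than this are common, and the coefficient instability grows accordingly.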

To handle these types of situations, a principal components analysis is often performed. By reducing the attributes to their underlying and orthogonal or independent constructs, it is possible to avoid problems of multicollinearity. At the same time, however, the regression-based analysis can become even less appealing. Experience tells us that because we usually have many fewer observations per attribute than is demanded for a robust principal components analysis, the solutions that we typically settle on may explain only 50%-60% of the variation in the data. As such, any regression analysis based on component scores is already compromised by the unexplained variation in the components.
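A principal components solution of the kind described can be sketched via the singular value decomposition of the standardized ratings. This is a minimal illustration using only the two correlated attributes from the example; a real study would have many more columns:

```python
import numpy as np

# Standardized attribute ratings (A and C from the example).
a = np.array([8, 9, 7, 8, 4, 9], dtype=float)
c = np.array([7, 5, 5, 4, 3, 8], dtype=float)
X = np.column_stack([a, c])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Principal components via SVD of the standardized data;
# singular values come back sorted largest-first.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / (s**2).sum()   # variance share per component
```

Here the first component absorbs most of the shared A-C variance; in practice, as the article notes, a retained solution often captures only 50%-60% of the variation, and everything discarded is lost to any downstream regression.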

Further complicating the issue is the plain fact that the regression analysis itself will never explain 100% of the variance in the criterion. Typical R2 values for satisfaction regressions are never as high as one would like and are often very low (.20-.40). Worse, the two stages compound: a regression R2 of roughly .30 applied to component scores that themselves capture only 50% of the attribute variance accounts for perhaps 15% of the total variation in the data. The question begged by these kinds of outcomes can be stated quite simply: Is it reasonable to ask management to make decisions based on models that explain only 15% or so of the variation in satisfaction measures using component scores that explain only 50% of the variation in the attribute ratings?

If the mathematical issues discussed thus far were not enough to raise serious concerns about the value of regression-based satisfaction analyses, there are also philosophical issues to consider. Proponents of a regression-based approach to satisfaction often view the results as surrogates for attribute importance measures. In other words, an attribute or component that explains a significant portion of the variation in the criterion is deemed to be important; attributes that are not significant predictors of satisfaction are not important.

The risks in this argument are two-fold. First, as is well known, correlation is not synonymous with causation. To suggest, then, that an organization would improve Overall Satisfaction ratings by improving on attributes that are correlated with the Overall measure is an extremely dangerous proposition.

The second element of risk is a reflection of what Carl Finkbeiner of National Analysts referred to at the recent Sawtooth Software Conference as the "world-as-it-is" perspective presented by satisfaction-based regression analyses. This perspective can be understood by reconsidering the hypothetical data set.

Recall that the ratings on Attribute B were a constant 8 and that the regression-based analyst would regard Attribute B as inconsequential as it relates to Overall Satisfaction. Assume for a moment that stated attribute importance ratings were also collected and that the complete data set was as follows:

Overall Satisfaction   Satisfaction A   Satisfaction B   Satisfaction C
         7                    8                8                7
         8                    9                8                5
         6                    7                8                5
         9                    8                8                4
         4                    4                8                3
         9                    9                8                8

Stated Importance ==>         5                9                7

By using the correlational results as a measure of attribute importance or determinance, in spite of the already discussed dangers of this approach, Attribute A emerges as most important, followed by Attributes C and B. Using stated importance ratings, Attribute B is most important, followed by Attributes C and A.

The fact that satisfaction on Attribute B is high and equivalent across observations should not be construed to mean that Attribute B is unimportant. It simply means that performance is currently satisfactory, and highly so, on an attribute that, on a stated basis, is extremely important. A manufacturer that considers allocating resources away from Attribute B because performance, from a "world-as-it-is" perspective, is currently fine is at serious risk. As soon as performance drops on the now disregarded Attribute B, the variance that was once missing in the ratings data is introduced and correlational or regression-based results would reflect the newfound importance of Attribute B. Unfortunately, it is too late for the manufacturer.
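The trap can be made concrete with a hypothetical follow-up wave of data. The numbers below are my own illustration, not the article's: performance on the once-constant Attribute B has slipped for two respondents, and because B is genuinely important, Overall Satisfaction falls with it:

```python
import numpy as np

# Hypothetical later wave: Attribute B is no longer a constant 8.
# Where B slips, Overall Satisfaction slips with it (illustrative
# numbers only).
b_new       = np.array([8, 8, 4, 8, 3, 8], dtype=float)
overall_new = np.array([7, 8, 3, 9, 2, 9], dtype=float)

# B's correlation with Overall Satisfaction, once undefined,
# is now strongly positive.
r_ob = np.corrcoef(b_new, overall_new)[0, 1]
```

Only after the damage is done does the "world-as-it-is" analysis discover that B matters, which is precisely the article's point about stated importance.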

There are numerous other risks associated with regression-based approaches to satisfaction analyses. For one, missing data can be the curse of any multivariate technique. If a respondent chooses not to rate an organization on one or two attributes, "plugging" the missing values with means is an acceptable procedure. If more than a few ratings are missing, the respondent is lost for further analyses. While there are techniques available for directly factoring correlation matrices, as opposed to the raw data, the results of these analyses can be highly unstable and misleading.
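The mean-plugging procedure mentioned above is straightforward; a minimal sketch (the data are illustrative):

```python
import numpy as np

# One respondent skipped an attribute (recorded as NaN).
# "Plugging" the hole with the column mean of the observed
# ratings keeps the respondent in the analysis.
ratings = np.array([7.0, 5.0, np.nan, 4.0, 3.0, 8.0])
col_mean = np.nanmean(ratings)            # mean of observed values
filled = np.where(np.isnan(ratings), col_mean, ratings)
```

Note that mean imputation shrinks the attribute's variance toward zero, which itself biases correlations downward; that is one reason it is tolerable only for a rating or two per respondent.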

Also, regression-based results can vary widely depending on the way the model is specified. Stepwise regression and forward and backward selection techniques can produce different outcomes when compared to regression procedures that fit a full model. The manner in which the sums of squares are partitioned (Type I versus Type II) will lead to different conclusions as well. Anyone venturing into these areas is well advised to consider the implications of the modeling technique.

In conclusion, regression-based satisfaction analyses are not as straightforward as might appear at first glance. They do not produce causal models, they are subject to problems related to ill-conditioned data (multicollinearity, missing values), they do not do a particularly good job of explaining the variance in the Overall Satisfaction measure, and they reflect a "world-as-it-is" perspective. I believe that we should place greater weight on stated attribute importance ratings and perform importance-performance and GAP analyses. Not everything we do needs to be "modeled."