Editor’s note: Steven Hokanson is president and founder of Pattern Discovery, Inc., a Honolulu research firm.

When a child asks “Why is the sky blue?” there are several answers you could give. One would describe the energy from the sun in the form of visible light and how the atmosphere of the Earth transforms it into the color blue. A second answer would describe the human eye and how it perceives visible light, especially the color blue. Yet a third answer would discuss the evolution of words and the word for blue in different languages. All of these answers would be both valid and informative, depending on how good your explanation is and how bright the child is. As the adult, it is up to you to decide on the appropriate answer.

Similarly, when a client wants to know what drives customer satisfaction (“Why aren’t all my customers satisfied?”), there are also different answers you could give. One answer would be to focus on the client’s product or service, the features and aspects of what the client is providing to customers. A second would focus on the customers: their needs, desires, perceptions and expectations. Yet a third would focus on what is meant by the word satisfaction: its different meanings, or the relationship between expectations and actualization. All of these answers would be good and worthy of detailed analysis and explication. As the market research analyst, it is up to you to decide on the appropriate answer.

Most often what clients want to know is where they should spend their time, energy and money in order to increase their customers’ satisfaction. Analyses that focus on where to expend resources fall into the first category of answers listed above: features and aspects of the product or service. Qualitative research (e.g., focus groups) helps identify the more subtle aspects of the client’s product and service while quantitative research (e.g., closed-ended surveys) measures what percentage of customers rate the client’s offerings at different levels. The quantitative data is usually rich enough to permit assessments of the relationships between overall customer satisfaction and detailed aspects of the product or service.

It is here, in assessing the relationships between overall satisfaction and detailed variables, that I believe there are opportunities for improved analytical methods. With better methods we can give our clients better answers and improve the efficiency and effectiveness of both their marketing and operations.

Background

For the analysis of relationships between variables, what I see most often in the market research literature is regression analysis. Usually this is multiple linear regression, though there is frequent mention of a related methodology, principal components analysis. Both of these methodologies have several major flaws when applied to customer satisfaction, and the limitations of these linear methods have been described in detail in numerous papers, journals, articles and books. However, I feel I need to touch on their primary weaknesses in order to contrast these older methods with the new one I want to introduce for your consideration.

Linear methods are based on several assumptions: the data are numeric (or more precisely, interval), there is no missing data, the relationships between the independent variables and the dependent variable are linear, and there are no (or inconsequential) interrelationships between the independent variables. None of these assumptions is valid for most customer satisfaction data.

Interval data

The measure of customer satisfaction is predominantly ordinal. One example of a commonly used scale is: completely satisfied/very satisfied/satisfied/dissatisfied/not at all satisfied. This scale typically has numeric values running from 5 down to 1 associated with it. However, attaching these numeric labels (i.e., 1, 2, 3, 4, 5) does not mean the data is intervally scaled. The amount of difference between “satisfied” and “very satisfied” might be more or less than the difference between “not at all satisfied” and “dissatisfied.” I have seen no compelling explanation as to why these differences should be assumed equal. Usually the data is assumed to be intervally scaled simply because the linear methods require it.

Missing data

A second major difficulty with using linear methods to analyze customer satisfaction data is missing data. For products, some customers do not purchase all the optional features. For services, some customers do not experience the full range of services. In both cases the customers who do not participate fully leave answers blank on the survey (e.g., “does not apply,” “no opinion”). Linear methods require values to be provided for all answers for all customers in order to perform the analysis. How values for the blank questions are determined is less important than the fact that the values are filled in by the analyst instead of the customer.

Linear relationships

A third major flaw in using linear methods is the assumption of linearity itself. Very few of the independent variables have a linear relationship to the dependent variable. This is especially true at the end points of the scale. Eventually, the customers do not want the service to be any friendlier; the customers don’t care if the product is made even cleaner, brighter, newer, more colorful or better packaged. At the other end of the scale, if the product is dirty, tarnished, old-fashioned, dull colored or poorly packaged, doing any of these things even worse isn’t going to decrease the customers’ satisfaction any further. They have already decided to never buy it again. In actual practice, the relationship between satisfaction with a detailed aspect of a product or service and overall customer satisfaction is usually a curve with several threshold points.

Customer satisfaction changes sharply once certain threshold values are reached. A simple example is how hot to serve coffee. Cold coffee is usually unacceptable and small increases in temperature do not change the customer’s overall satisfaction. At some point the coffee reaches lukewarm and customer satisfaction goes up a notch. At the threshold where it becomes warm (and later hot) satisfaction will increase another notch. Note that increasing the temperature to scalding causes satisfaction to decrease. Basically, the assumption of linearity is wrong for the vast majority of variables because of 1) non-linear jumps at threshold points, 2) the drop-off in sensitivity at end points, and 3) the downward slope at the extreme upper end. Money is the primary exception to this non-linear rule, because people usually measure satisfaction as inversely linear to cost. Almost everything else is non-linear.
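To make the shape of such a curve concrete, here is a toy sketch in Python of a thresholded, non-monotonic relationship, using the coffee example. Every cut point and satisfaction level below is invented purely for illustration; this is not IA itself, just the kind of curve a straight line cannot fit.

# Toy illustration only: a piecewise, non-monotonic satisfaction curve.
# All temperature cut points and satisfaction levels are invented.
def coffee_satisfaction(temp_f):
    if temp_f < 90:      # cold: unacceptable, flat at the bottom of the scale
        return 1
    elif temp_f < 120:   # lukewarm: satisfaction goes up a notch
        return 2
    elif temp_f < 150:   # warm: another notch
        return 3
    elif temp_f < 180:   # hot: the peak
        return 5
    else:                # scalding: satisfaction drops back down
        return 2

# Small changes below a threshold change nothing; crossing one does.
for t in (80, 100, 130, 160, 190):
    print(t, coffee_satisfaction(t))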

Interdependency of detailed variables

In customer satisfaction data many strong relationships exist among the detailed aspects of the client’s product or service. Indeed, it can be argued that the most distinguishing characteristic of customer satisfaction data is the tremendous interaction among the detailed variables (usually referred to as correlation). Though questions about a service representative’s friendliness, helpfulness, courtesy and product knowledge are almost always strongly interrelated, each of these questions concerns a unique aspect of the customer’s experience. To omit any one of them from the questionnaire would leave a hole for which the other questions could not fully compensate. For clarity’s sake, and to reinforce the point of this section, I use the label “detailed” instead of “independent,” as the latter description can be quite misleading. Linear methods go to great lengths to work around the inherent correlation between detailed variables. They expend the effort because a fundamental basis for linear methodologies is that the detailed variables be truly independent. Practitioners of linear methods believe the tools they use work around this problem. However, all of them understand that the high correlation between the detailed variables is a problem they always need to worry about.

Impact! Analysis

Impact! Analysis (IA) was specifically developed for analyzing customer satisfaction surveys. It evolved over a period of six years (1992 to 1998), and we have been using it in its current form for clients over the last seven years (1999 to 2005). The primary benefit of Impact! Analysis is that it measures performance as “percentage of happy customers” and it measures impact as the rate of change in performance. That is, impact is the increase (or decrease) in the percentage of happy customers. We feel these units of measurement are superior to those of correlation and regression coefficients.

IA is performed in four steps and it is reasonable to view each step of the analysis as an assumption. Your judgment as to the utility of IA depends primarily on your evaluation of the reasonableness and validity of each step. In overview, the steps are: 1) reduce all multi-point scales to three-point scales, 2) calculate performance scores for all variables based on the three-point scale, 3) calculate the impact of each detailed variable on the aggregate variable using the three-point scale and the performance scores, and 4) set priorities based on the relative impact of each detailed variable on the aggregate variable. Each of these steps is described more fully below.

Three-point scale

Perhaps the strongest influence on me to use a three-point scale came from Roland Rust of Vanderbilt University. In the early 1990s, I read several papers written by Rust that recommended the use of a three-point scale for customer satisfaction data. IA does not use the scale precisely the way Rust espoused. Instead of his dissatisfied/satisfied/delighted, IA uses the labels low/medium/high. The low, medium, and high labels refer to how a customer rates the product (or service or attribute).

In actual practice, surveys typically use five-, seven-, or 10-point scales. IA doesn’t really care what kind of scale is used on the survey. Given a choice, we recommend a seven-point scale because it provides flexibility. In any event, IA reduces all the scales to low, medium, high. How the values are converted depends on the individual survey. In an industry that is very competitive and where the client is surveying its existing customer base, a five-point scale will usually be converted as 5 = high, 4 = medium, and 1-3 = low. This is because the scale is going to be heavily skewed to the high end. In an industry that is fairly non-competitive, a conversion of 5 = high, 3-4 = medium, and 1-2 = low would be more appropriate. When surveying all potential customers (including those who currently buy from a competitor), a conversion of 4-5 = high, 3 = medium, and 1-2 = low might be best. The primary criteria in choosing how to convert to the three-point scale are: 1) to convert all questions that have the same scale on the questionnaire in the same manner, and 2) to achieve performance scores (described in the next section) in the range of 40 to 70.

We want to be as uniform as possible in converting scales, so that all the questions are treated equally. At the same time, we want performance scores to be in a middle range to leave room for both improvement and deterioration.
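As a concrete sketch of this conversion step, the short Python fragment below maps a five-point scale to low/medium/high. The two mappings mirror the competitive and non-competitive examples above; the function name is mine, and the decision to carry blank answers through untouched (rather than impute them) is my assumption, consistent with the earlier criticism of filled-in values.

# Sketch of the scale-reduction step. Cut points are chosen per survey;
# these two mappings mirror the examples in the text.
FIVE_POINT_COMPETITIVE = {5: "high", 4: "medium", 3: "low", 2: "low", 1: "low"}
FIVE_POINT_NONCOMPETITIVE = {5: "high", 4: "medium", 3: "medium", 2: "low", 1: "low"}

def to_three_point(responses, mapping):
    # Blank answers (None) stay blank instead of being imputed (assumption).
    return [mapping[r] if r is not None else None for r in responses]

raw = [5, 4, 4, 3, 1, None]
print(to_three_point(raw, FIVE_POINT_COMPETITIVE))
# -> ['high', 'medium', 'medium', 'low', 'low', None]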

To complete this section, a discussion of some exceptions to using the three-point scale is warranted. Binary variables are converted to high and low, with no medium responses. Some four-point scales are quite intractable to conversion to three points. For example, “likelihood to buy again,” with a scale of definitely will/probably will/probably will not/definitely will not, is tough to convert if each value has 20 percent or more of the respondents. To handle these difficult cases, a four-point scale is acceptable for performing IA: high/medium high/medium low/low. This four-point scale is avoided if possible because it violates the goal of uniformity.

Performance scores

My motivation for how to calculate performance scores comes from the common practice in market research of measuring performance as the percentage of the customers answering in the “top two boxes.” For example: “75 percent of the customers rated us in the top two boxes” (on a five-point scale). This is a fairly standard industry practice and is easily understood by all levels of management. I have changed this slightly to give full points for high, half points for medium, and no points for low. For example, if 50 percent of the respondents rate a variable high, 40 percent medium, and 10 percent low, the performance score would be 70 (50 + 40/2). The basic philosophy is that high ratings are the goal and that medium ratings are partial success. When using the four-point scale described at the end of the previous section, medium high is worth .67, and medium low is worth .33.
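A minimal sketch of this calculation, with the four-point weights included, might look like the following. The function name and the decision to exclude blank answers from the base are mine, not a prescribed part of IA.

# Performance score: full credit for high, half for medium, none for low.
WEIGHTS_3PT = {"high": 1.0, "medium": 0.5, "low": 0.0}
WEIGHTS_4PT = {"high": 1.0, "medium high": 0.67, "medium low": 0.33, "low": 0.0}

def performance_score(ratings, weights=WEIGHTS_3PT):
    answered = [r for r in ratings if r is not None]  # blanks excluded (assumption)
    return 100.0 * sum(weights[r] for r in answered) / len(answered)

# The worked example from the text: 50% high, 40% medium, 10% low -> 70.
sample = ["high"] * 50 + ["medium"] * 40 + ["low"] * 10
print(performance_score(sample))  # 70.0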

Performance scores are calculated for all variables but they are most important for target (or aggregate) variables. Target variables are most easily described through example: overall satisfaction, likelihood to recommend, likelihood to buy again, value for the money, frequency of purchase, and quantity purchased. All other variables are considered suspect variables. This label is used to convey the meaning that the other variables are suspected of impacting the target variable(s). The analysis will bear out whether they do impact the target variable, and if so, by how much.

Impact! calculation

The previous two steps prepare the ground for calculating the impact of each suspect variable on each target variable. For convenience’s sake, let’s limit this discussion to a specific target variable: overall satisfaction.

The impact value of each suspect variable on the target variable is calculated by splitting the respondents into three groups based on how they answered the suspect variable: low (SL), medium (SM) and high (SH). For each group the performance score for the target variable is calculated. For example, the SL group might have an overall satisfaction score of 35. The SM group might have a score of 60, and the SH group a score of 75. These would be fairly typical numbers and after some reflection you should see why. The customers who are unhappy about the suspect variable are likely to have low overall satisfaction, and conversely for the upper end of the scale. Indeed, what these numbers reflect mathematically is that the suspect and target variable are somewhat correlated.

The impact on overall satisfaction when the suspect variable is improved can now be measured. Moving customers from the SL group to the SM group results in an improvement to overall satisfaction of 25 (60 - 35). Moving customers from SM to SH results in an improvement of 15 (75 - 60). We call the first change “fixing mistakes” and the second change “achieving excellence.” In this example, fixing mistakes (FM) will have a greater impact on overall satisfaction than achieving excellence (AE). IA also calculates an average impact = (FM + AE) / 2. We refer to the average impact as Impact and it is our primary measure of the relative importance of each suspect variable on the target variable.
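Putting the pieces together, here is a sketch of the calculation for one suspect/target pair. It reuses the performance_score function from the previous sketch, and the sample data is engineered to reproduce the worked example above (group scores of 35, 60 and 75). Dropping respondents who left either question blank is my assumption; the article does not spell that out.

def impact(pairs):
    # pairs: one (suspect_rating, target_rating) tuple per respondent,
    # both already reduced to the three-point scale.
    groups = {"low": [], "medium": [], "high": []}
    for suspect, target in pairs:
        if suspect is not None and target is not None:  # drop blanks (assumption)
            groups[suspect].append(target)
    scores = {g: performance_score(r) for g, r in groups.items()}
    fm = scores["medium"] - scores["low"]   # fixing mistakes: SL -> SM
    ae = scores["high"] - scores["medium"]  # achieving excellence: SM -> SH
    return {"FM": fm, "AE": ae, "Impact": (fm + ae) / 2}

# Groups engineered to give target scores of 35, 60 and 75, as in the text.
pairs  = [("low", t) for t in ["high"] * 4 + ["medium"] * 6 + ["low"] * 10]
pairs += [("medium", t) for t in ["high"] * 4 + ["medium"] * 4 + ["low"] * 2]
pairs += [("high", t) for t in ["high"] * 6 + ["medium"] * 3 + ["low"] * 1]
print(impact(pairs))  # {'FM': 25.0, 'AE': 15.0, 'Impact': 20.0}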

Setting priorities

The Impact of each suspect variable is calculated, and then they are all compared. However, instead of setting priorities solely on the basis of Impact, a second criterion should also be considered: the performance score for each suspect variable. To make this discussion more coherent, examine the quadrant graph shown in Figure 1.

The X axis is the performance score and the Y axis is the Impact value. Each point on the graph represents a suspect variable’s performance score and Impact on the target variable. The target variable name is displayed at the top of the figure and the performance score for the target variable is displayed as a vertical line separating the quadrants on the left from those on the right. The average of the Impact values for all the suspect variables is displayed as a horizontal line separating the upper quadrants from the lower ones.

The vertical and horizontal dividing lines create four quadrants, which have been numbered I, II, III and IV. The numbers refer to the upper left, upper right, lower left and lower right, respectively. The numbering scheme reflects IA’s recommended priority. Thus, quadrant I is labeled “critical improvements needed”; quadrant II is labeled “maintenance required”; quadrants III and IV are labeled “lower priority improvements” and “lowest priority.”

The reasoning behind these priorities is that improving a suspect variable that has a high Impact value will produce more improvement in overall satisfaction. This is why quadrants I and II come first. Looking at the suspect variables, it is clear that there is more room for improvement for those with low performance scores. By giving variables in quadrant I a higher priority than those in quadrant II, we are taking advantage of the additional opportunity for large improvements offered by those in quadrant I. Similar logic applies to the ranking of quadrants III and IV.
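The quadrant assignment itself reduces to two comparisons, sketched below. The variable names echo Figure 1, but every coordinate is invented for illustration; only the dividing-line logic follows the description above.

def assign_quadrants(suspects, target_performance):
    # suspects: {name: (performance_score, impact_value)}
    mean_impact = sum(imp for _, imp in suspects.values()) / len(suspects)
    quadrant = {}
    for name, (perf, imp) in suspects.items():
        if imp >= mean_impact and perf < target_performance:
            quadrant[name] = "I: critical improvements needed"
        elif imp >= mean_impact:
            quadrant[name] = "II: maintenance required"
        elif perf < target_performance:
            quadrant[name] = "III: lower priority improvements"
        else:
            quadrant[name] = "IV: lowest priority"
    return quadrant

suspects = {"P5": (45, 22), "P1": (68, 18), "T4": (50, 8), "T1": (72, 6)}
print(assign_quadrants(suspects, target_performance=60))
# P5 -> I, P1 -> II, T4 -> III, T1 -> IV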

Based on Figure 1, our recommendation to the client was to work on P5 as the highest priority (Quadrant I). P1, P7, T6, T2, and T3 should be maintained at their current levels (Quadrant II). If there is any deterioration in the Quadrant II variables, then that will have a marked negative impact on overall satisfaction. The other variables are lower priority. Note the grouping shown in the figure. The perception questions were relevant to all respondents and related to their perception of several attributes of the corporation. The telephone questions were only asked of respondents who had recently placed a phone call to the corporation. The graph shows the similarities and dissimilarities within and between the groups.

New vocabulary

This article has introduced the methodology Impact! Analysis. IA was developed specifically to work within the restrictions imposed by market research data. Because it is different from the linear, polynomial methods commonly used in market research, it requires the reader to learn a new vocabulary, and to take a different view of both the problem domain and the possibilities available for understanding customers. While Impact! Analysis has as its foundation very simple mathematics, the methodology addresses the complexities of market research problems and generates conclusions that are easily understood by all levels of management.