Editor's note: Rich Vondruska, Ph.D., is founder of Research Mentors, a division of Vondruska Associates, Chicago.

The great challenge in dealing with data is to find what amounts to the best portrayal of the conglomeration of numbers at hand. In some regard, the task we face is much like that of Michelangelo confronting a block of marble - to chip away everything that is not David. If this implies that there is much art, as well as science, in dealing with data, then the point has been well made. Regardless of the data itself, there is an inner drive to make sense of it; to derive meaning from an array of observed or experienced events.

Although there are many sophisticated ways to attack a data set, the process always starts with common sense. It has been said that the role of science is to "cut nature at her joints," meaning that there are natural breaks in the data that must be respected. Demographic variables such as sex, educational level and income are standard "joints" in marketing research. Reduction and understanding of the data usually begins by inspecting cross-tabs of key measures by demographics. Unfortunately, it often ends there as well. In many cases, the data is not "mined" to its full potential. Part of the true joy of doing research is to explore the data sets, not merely to dutifully record the outcome. In analyzing data, I have discovered "gems" with major marketing implications. Those of you who have had similar experiences know that cases like these make the whole process of data reduction worthwhile.
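As a minimal sketch of that first step, here is how such a cross-tab might be inspected with pandas. The survey data, respondents and variable names below are invented purely for illustration.

```python
import pandas as pd

# Hypothetical survey responses: each row is one respondent.
survey = pd.DataFrame({
    "sex": ["M", "F", "F", "M", "F", "M", "F", "M"],
    "income": ["low", "high", "high", "low", "low", "high", "high", "low"],
    "prefers_brand_a": [1, 0, 1, 1, 0, 0, 1, 1],
})

# Cross-tab a key measure (brand preference) by a demographic "joint",
# with each row normalized to show shares within that demographic group.
xtab = pd.crosstab(survey["sex"], survey["prefers_brand_a"], normalize="index")
print(xtab)
```

The same call with `income` as the row variable gives the next standard "joint"; the exploration the text urges is simply running and reading many such tables, not stopping at the first one.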

Loosening the Gordian knot

A Gordian knot is symbolic of the process of data reduction. What presents itself as a Gordian knot may still appear inextricable after the first rounds of reduction. With further reduction, however, the knot eventually disappears. So it is in the analysis of data: what seems insurmountable at first becomes a non-problem once the knot disappears.

Sometimes what appears very obvious to a skilled researcher is less than obvious to a corporate manager who must choose to act on the findings. Data is typically sampled from a larger population, and inferences must be made in order to make the leap of faith to implementation. I have even had a person unsophisticated in statistics accuse me of having "tainted data!" The truth is that as long as the sample can be construed by reasonable people as representative, there will always be some uncertainty as to the reliability of the findings.

Decision-making criteria

The accepted scientific criterion for what might be called reasonable doubt is a probability of .05 (i.e., the likelihood that the finding could have been obtained by chance alone is very small - 1 in 20). This stringent criterion is relaxed a good deal in marketing research.

This is due partly to sample sizes, and partly to the fact that marketers rely on what have been called "natural experiments," in which no attempt is made to manipulate independent variables such as the brand of a product consumed. Rather, most marketing surveys are opportunistic, and rely on the fact that people have already chosen certain products.
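The 1-in-20 criterion can be made concrete with a small simulation: how often does pure chance, with no real preference at all, produce a result that looks decisive? The sample size and 60 percent threshold below are invented for illustration.

```python
import random

random.seed(0)

# One "survey" of n respondents who in truth have no preference at all:
# each pick is a fair 50/50 coin flip. Return the observed share.
def simulated_share(n=100):
    return sum(random.random() < 0.5 for _ in range(n)) / n

trials = 10_000
# With n=100 and a true 50/50 split, an observed share of 60% or more
# is roughly the kind of result that clears a one-sided p < .05 bar.
extreme = sum(simulated_share() >= 0.60 for _ in range(trials))
print(f"Chance alone produced a 60%+ share in {extreme / trials:.1%} of trials")
```

The share of such fluke results lands in the low single-digit percentages, which is exactly what "the finding could have been obtained by chance alone 1 time in 20" means in practice.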

The attitude of many corporate managers is what I like to call the Joe Friday attitude - just the facts, ma'am. The plain and simple truth in most of these cases is rarely plain and never simple. One of my university professors used to characterize the difference between fact and opinion in this way: "Whereas paternity is sometimes a matter of opinion, maternity is always a matter of fact." Marketers will always be in the "maternity ward" of the research process. They bear accountability for the actions taken in light of research.

There is an interesting phenomenon that sometimes occurs in marketing research. When a marketing effort is successful, the marketer takes all of the glory, but when a marketing effort is unsuccessful, the research is scrutinized for "taints." This serves to demonstrate the wisdom of the old proverb - "Success has many fathers, but failure is an orphan."

The point, of course, is to emphasize the care that must be taken in the process of data reduction. It is not enough to merely reduce data. One eye must always remain on the practical use of the findings, and the repercussions that may result from them. The value of research depends upon the ability to move from incomplete information to action. Often the best way to do this is to make changes in a test market situation, where the extent of failure (the possible downside) can be contained.

Observing the signals

Companies that do not rely on marketing research can only take a "reaction posture" in the marketplace. The true innovators are willing to take the necessary risks. Just as when one is driving in traffic, research findings provide green lights, red lights and amber lights. Usually the light is amber in marketing research. As long as one remembers that amber means caution, accidents will be minimized.

An analogy from chess may shed light on the problem as well. Most people know that the object of the game is to checkmate the enemy king. However, since both sides have the same objective, the attainment of that goal becomes problematic. To attain the ultimate goal, it is necessary to attain a number of sub-goals, such as controlling the center of the board, before the opponent does. Likewise, analysis of data requires a disciplined, step-by-step procedure.

Dealing with data doubt

Unlike an academic test, there is often no "answer key" that allows us to verify the way in which we construe the facts. So somehow we need to assure ourselves, and our clients, that the data can be used to address real-world business situations. On this score, I have even heard a hierarchical cluster analysis referred to as "Astrology to five decimal places!" The researcher becomes the target of the doubting Thomas: "How do you know that the data says what you claim it says?"

An example is in order here. Imagine yourself delivering a presentation of the findings from a recent survey. Midway through the presentation, the chairman of the board waltzes into the room. He glares at a chart that shows his company dead-last in the industry ratings for customer satisfaction. Then his glare passes from the chart to you. In a rather fierce voice, he demands, "Why should we believe this chart?" Assuming that you first take a sip of water to remove the dry feeling in your mouth, you must then address the issue forthrightly. The non-obvious answer is that data never stands alone. There is always contact with a realm of reality that transcends the chart on the presentation screen.

Is the data consistent with what is already known? Is there any reason to question the methodology of the survey that obtained the information? Does the presenter have a hidden agenda with regard to the information? By casting doubt upon the constellation of findings, the chairman has revealed a new dimension of the role of information in marketing research. Since he is the primary decision maker, he has the ultimate responsibility for accepting or rejecting the information, regardless of the external validity of the information itself. By analogy, the captain of a vessel may obtain a report of an iceberg in the path of the ship, but it is up to him to act upon the information.

Heading off data problems

It must be borne in mind that any set of data, regardless of its reduction level, can be construed in more than one way. At one point in the history of astronomy, there were two separate theories that explained the movement of heavenly bodies. There was the Ptolemaic system in which Earth was the center of the heavens, and the Copernican system in which the sun was at the center. Obviously, there were major metaphysical implications to be drawn from these two perspectives, since they had religious as well as scientific overtones. The same is true with data in marketing research. If a particular piece of information fits the chairman's overall direction for a company, that information will be highlighted. If it does not, that information will be suspect. So, there is a delicate balance between the information itself and the way in which it is implemented. Unwelcome news still sometimes results in the messenger's head being chopped off.

It is instructive to note that the level of precision in the data is of paramount concern in many instances. There is a tension between the amount of data that can be collected and the funds available to collect it. Oftentimes, greater precision is demanded than is possible with the data at hand. In those cases, either decisions must be made using "incomplete" information, or more data must be collected.
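The precision-versus-budget tension has a well-known arithmetic behind it. Using the standard survey approximation for the 95 percent margin of error on a proportion near 50 percent, halving the margin requires quadrupling the sample; the sample sizes below are arbitrary.

```python
import math

# Standard 95% margin of error for an observed proportion near 50%,
# from a simple random sample of size n: 1.96 * sqrt(p * (1 - p) / n).
def margin_of_error(n):
    return 1.96 * math.sqrt(0.5 * 0.5 / n)

for n in (100, 400, 1600):
    print(f"n={n:5d}  margin of error = +/-{margin_of_error(n):.1%}")
```

Since fieldwork cost grows roughly in step with n, each halving of the margin of error costs about four times as much, which is why "more data must be collected" is so often the expensive branch of the choice.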

Statistical techniques such as factor analysis are aimed toward reducing an array of data into understandable factors or dimensions. In work I have done in both the automotive and recreational marine industries, clients have requested "maps" of the marketplace. In those instances, the challenge was to represent the various products within an N-dimensional space. The logic was that products that were in close proximity to each other were close competitors, whereas those that were far apart were not. By factor analyzing measures on attributes such as style, durability, safety, ease of handling and so forth, it was possible to create a two-dimensional "competitive environment" in which the products could be plotted. These maps were so popular that one of the clients requested wallet-sized versions for easy reference. Talk about data reduction!
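A rough sketch of how such a map is produced follows, using principal components as a simpler stand-in for the factor analysis described above. The products and attribute ratings are invented for illustration; the point is only the mechanics of collapsing many attributes into two plottable dimensions.

```python
import numpy as np

# Hypothetical attribute ratings (rows: products, columns: attributes
# such as style, durability, safety, ease of handling).
ratings = np.array([
    [8.0, 6.0, 7.0, 5.0],
    [7.5, 6.5, 7.0, 5.5],
    [3.0, 9.0, 8.0, 4.0],
    [4.0, 8.5, 8.5, 4.5],
    [6.0, 4.0, 3.0, 9.0],
])
products = ["A", "B", "C", "D", "E"]

# Center the data and project onto the first two principal components -
# a PCA stand-in for the factor-analytic map described in the text.
centered = ratings - ratings.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords = centered @ vt[:2].T   # one (x, y) point per product

for name, (x, y) in zip(products, coords):
    print(f"{name}: ({x:+.2f}, {y:+.2f})")
```

Plotting the resulting (x, y) pairs puts similarly rated products (here A and B) close together and dissimilar ones far apart, which is precisely the "close competitors" reading the clients wanted from the wallet-sized maps.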

The point here should be clear - clients do not need voluminous data, except possibly to impress those who are impressed by "data by the pound." What clients truly want are answers, and those can only be achieved by the proper portrayal of the data in a form that is both useful and enlightening.