I agree with many points that Joseph Rydholm made in his February 2013 Trade Talk column, “Are we ready to become scientists?” The article, however, overly emphasizes big data analysis while neglecting the role background knowledge plays in scientific research.
Three potential problems can arise from this unbalanced perspective. (I emphasize “potential” because there are, indeed, many valuable applications of big data analysis when appropriate, theory-guided statistical methods such as experiments are used.)
Overreliance on the behaviorist method: My primary concern is with those organizations for which big data is the dominant form of non-experimental consumer inquiry, because that practice reflects an implausible method of understanding human behavior – behaviorism. Arising from the field of psychology in the early 20th century, behaviorism holds that, because mental states are unobservable, the only valid way to understand and predict human behavior is to study behavior alone. That’s what big-data scientists do if their analyses are based exclusively on non-experimentally derived “marketing-mix variables and sales data,” or what I call behavior-only data.
Biased models: As a consequence, building models using behavior-only data (e.g., scanner-based data) can bias a model’s coefficients, as discussed in Dan Horsky et al.’s article, “Observed and unobserved preference heterogeneity in brand-choice models” (Marketing Science, July-August 2006, pp. 322-35). Using scanner-based data, they report that “the addition of individual-specific brand-preference information [i.e., brand attitude data] significantly improves fit and prediction” and that, without such data, the model (using behavior-only data) produces biased estimates of factors such as brand loyalty and price sensitivity (p. 322).
Ignoring consumer mental states: Behaviorism lost favor in the field of psychology simply because of its implausibility, and it is easy to see why from our perspective as marketing researchers. Different consumer beliefs, intentions and emotions can give rise to the same purchasing behavior. Yet this kind of consumer information is not contained in behavior-only data.
* * *
So what’s the solution? Rydholm briefly touched on this topic when he talked about what scientists might discover in their deep data dives – a “set of hypotheses that can be tested.” From where, however, do these hypotheses come? They partly come from the data, of course. But without appropriate background knowledge, a data “scientist” is nothing more than a data “statistician.”
Background knowledge is one’s total knowledge in all fields relevant to a subject of inquiry. For a marketing data scientist, this background knowledge will, of course, include a solid grounding in statistics and data modeling. But more is needed. A marketing data scientist (or, at minimum, the team that she works with) needs a solid background in the market and products under investigation, in marketing principles and theories, and in other fields that are having a profound effect on marketing knowledge today, such as behavioral economics and neuroscience. Using this background knowledge, a data scientist can then develop more useful hypotheses to test. For example, scanner data may show that Product X, a frozen snack, does not sell well relative to a competitor’s Product Y, a similar snack that is sold in the dairy aisle. Attitudinal research can shed light on why: Consumers perceive the dairy version as healthier because it’s found in the dairy section with other healthy foods vs. the freezer section where high-calorie frozen desserts are located. That kind of insight is lost in scanner-only data.
From another perspective, what I’m arguing for is the recognition that all research methods are tools. Marketing researchers can no more conduct good research using only big data analysis than a carpenter can build a house using just a hammer.
So perhaps instead of asking, “Are we ready to become scientists?” we should be asking, “What kind of scientists should we become?”