One thing doesn’t always lead to another

Editor's note: William Cimarosa is director of quantitative insights at Egg Strategy, with 20 years of experience in quantitative research, design thinking and global insights, having worked across both client-side and vendor-side roles. He holds an M.A. from San Diego State University and a certificate in machine learning business implications from MIT Sloan. Find William on LinkedIn.

The concept tested well. The appeal scores were strong. Consumers said they wanted it, said they would use it, said the features mattered to them. So the organization invested. Development budgets were approved. Timelines were set. Teams were assembled. And then, somewhere between the research debrief and the market launch, something went wrong.

The product underperformed. Adoption lagged. Engagement never materialized the way the data predicted it would. And now, months or years later, a different kind of research begins: the postmortem. The forensic work of understanding why something that tested so well performed so poorly. This is expensive work. It is slow. And it almost always arrives at the same uncomfortable conclusion: Consumers said they wanted one thing but did something else entirely.

By the time you discover the gap between what consumers say and what they actually do, the damage is already done. The budget has been spent. The opportunity cost has compounded. The original research, which seemed so clear at the time, now looks like a map that led somewhere other than where you needed to go.

This is not a flaw in consumers. They are not unreliable witnesses to their own behavior. They are, more accurately, incomplete ones. The machinery of human motivation runs deeper than conscious awareness can access. People do not always know why they do what they do. This is not a failure of honesty. It is a feature of human psychology (and a reason researchers should be deeply suspicious of synthetic response data).

The question, then, is what we do about it.

The limits of asking directly

Traditional concept testing relies heavily on stated preference. We show consumers an idea and ask them to evaluate it. Do you like this? Would you use this? How important is this feature? The logic feels sound. If we want to know what people think, we should ask them.

But importance, it turns out, is not something consumers can accurately self-report. Decades of behavioral science have demonstrated that the attitudes people claim drive their decisions often diverge from the attitudes that actually predict their behavior. A consumer might tell you that "having fun" is the most important reason they engage with a product category. The behavioral data might reveal that "feeling like an expert" is what actually correlates with engagement frequency. Both are true in their own way. But only one helps you build something people will use.

This is the derived importance gap. It is the distance between what consumers say matters and what the data shows actually predicts behavior. And it is far wider than most innovation processes acknowledge.

Closing the gap through behavioral prediction

The alternative to asking consumers what matters is to calculate what matters. But this calculation does not happen in a vacuum. It requires a foundational study, conducted before any innovation testing begins, that establishes the empirical relationship between attitudes and behavior.

This foundational work looks different from traditional concept testing research. It is larger in scope, often involving thousands of respondents rather than hundreds. It measures both attitudinal constructs (motivations, values, psychological needs) and behavioral outcomes (engagement frequency, purchase behavior, product usage) within the same sample. And it applies regression modeling to identify which attitudes genuinely predict which behaviors, with what strength and for which consumer segments.

The output is not a report that sits on a shelf. It is a diagnostic framework: a validated set of motivational items, each linked to specific behavioral outcomes through quantified prediction coefficients. This framework becomes the measurement backbone for every innovation test that follows.

Consider a segmentation study that surveys 3,000 consumers on 50 motivational statements and 30 behavioral measures. The regression analysis reveals that for one segment, category engagement is driven primarily by motivations related to mastery and expertise. For another segment, the same behaviors are driven by social connection and belonging. These are not hypotheses. They are empirical findings, validated through behavioral prediction.
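A minimal sketch in Python may make this kind of segment-level analysis concrete. Everything in it is an assumption for illustration: the data are simulated rather than drawn from any real study, two motivations stand in for the 50 statements, and all names are hypothetical. It fits an ordinary least squares regression within each segment and reads the coefficients as derived importance.

```python
import random

# Hypothetical motivation items; a real battery would be far longer.
MOTIVATIONS = ["mastery", "social_connection"]


def ols(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [a - factor * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]  # [intercept, b1, b2, ...]


def simulate_segment(rng, true_weights, n=300):
    """Simulated respondents (assumption): 1-7 ratings plus noisy engagement."""
    X, y = [], []
    for _ in range(n):
        ratings = [rng.uniform(1, 7) for _ in MOTIVATIONS]
        engagement = sum(w * r for w, r in zip(true_weights, ratings))
        X.append(ratings)
        y.append(engagement + rng.gauss(0, 1))
    return X, y


def derived_importance(X, y):
    """Map each motivation to its regression coefficient (intercept dropped)."""
    return dict(zip(MOTIVATIONS, ols(X, y)[1:]))


rng = random.Random(0)
segments = {
    "segment_a": simulate_segment(rng, [2.0, 0.1]),  # built to be mastery-driven
    "segment_b": simulate_segment(rng, [0.1, 2.0]),  # built to be socially driven
}
drivers = {name: derived_importance(X, y) for name, (X, y) in segments.items()}
top_driver = {name: max(coefs, key=coefs.get) for name, coefs in drivers.items()}

for name, coefs in drivers.items():
    print(name, {m: round(c, 2) for m, c in coefs.items()}, "->", top_driver[name])
```

Because the simulated items share one rating scale, the raw coefficients can be compared directly. A real analysis would more likely use a statistics package, standardize the inputs and check significance and collinearity before treating any coefficient as a driver.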

When you test an innovation concept six months later, you do not start from scratch. You deploy the diagnostic battery that emerged from the foundational study. You measure how well the concept delivers on the specific motivations that predict behavior for each target segment. The foundational investment pays dividends across every subsequent test, providing continuity and cumulative learning rather than isolated snapshots.

Instead of asking consumers whether they like an idea, you ask whether the idea activates the specific psychological drivers that predict engagement. Instead of generic appeal scores, you get diagnostic specificity. Instead of a pass/fail gate, you get a roadmap for refinement.

Experience as the unit of measurement

There is something elegant about this reframing. The motivations that emerge from derived importance analysis are not abstract constructs. They are experiences. Feeling like you are making smarter decisions. Sensing that you are part of something larger than yourself. Experiencing the satisfaction of going deep into something you care about. These are felt states, not demographic boxes or attitudinal segments.

When you test a concept against these derived experiences, you are asking a more honest question than traditional research permits. You are not asking whether the consumer likes the idea in some general sense. You are asking whether the idea delivers on the specific psychological experiences that the data shows predict real-world behavior.

This transforms the diagnostic output. A concept scorecard built on derived importance does not just tell you that an idea underperformed. It tells you which experiences the idea fails to deliver, how much that matters (quantified through the behavioral prediction coefficients) and therefore where iteration should focus. The creative team does not receive vague feedback to "make it more appealing." They receive precise guidance: strengthen the concept's ability to deliver on this particular experience, because this experience is what actually moves behavior.
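One way such a scorecard could be computed is sketched below in Python. The coefficients, delivery ratings and target are invented for illustration, and weighting each shortfall by its coefficient is only one plausible scoring rule, not a method described in the foundational research itself.

```python
# Hypothetical prediction coefficients from a foundational study (assumed values).
coefficients = {"mastery": 0.60, "smart_decisions": 0.25, "belonging": 0.10}

# Hypothetical mean ratings (1-7 scale) of how well one concept delivers each experience.
delivery = {"mastery": 4.0, "smart_decisions": 6.0, "belonging": 3.0}

# Assumed target level a strong concept should reach on every experience.
TARGET = 6.0


def scorecard(coefficients, delivery, target):
    """Rank experiences by shortfall weighted by behavioral importance."""
    rows = []
    for experience, coef in coefficients.items():
        shortfall = max(0.0, target - delivery[experience])
        rows.append((experience, shortfall, coef * shortfall))
    # Largest weighted gap first: this is where iteration should focus.
    return sorted(rows, key=lambda row: row[2], reverse=True)


priorities = scorecard(coefficients, delivery, TARGET)
for experience, shortfall, weighted_gap in priorities:
    print(f"{experience}: shortfall={shortfall:.1f}, weighted gap={weighted_gap:.2f}")
```

In this made-up example, belonging shows the largest raw shortfall, but mastery ranks first because its coefficient is much larger. The weighting turns a list of weak spots into an ordered set of priorities.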

The craft of refinement

There is a tendency in insights work to treat concept testing as a judgment. Good idea or bad idea. Above benchmark or below. But judgment is not the same as understanding. And understanding is what allows iteration.

The derived importance framework treats testing as craft. Each wave of research reveals not just how well an idea performs but specifically where it falls short and by how much. Teams can refine with precision, addressing the gaps that matter most, then retest to confirm improvement. The concept evolves not through intuition alone but through a feedback loop grounded in behavioral validity.

This requires a different relationship between research and creative development. Research becomes less about evaluation and more about guidance. The insights function becomes less of a gatekeeper and more of a collaborator in building something that works.

Looking forward

The derived importance gap is not new. Behavioral scientists have understood for decades that stated preference is an imperfect predictor of action. What has changed is our ability to operationalize this insight at scale. Modern segmentation studies can generate the behavioral prediction models that identify true drivers. Typing tools can assign consumers to segments in real time. Adaptive diagnostics can tailor measurement to stimulus characteristics.

But realizing this potential requires a commitment to foundational research as a strategic asset rather than a one-time project. Organizations that invest in rigorous segmentation and behavioral modeling upfront create a testing framework that compounds in value over time. Each innovation test builds on validated diagnostics rather than reinventing measurement from scratch.

The debrief room will still grow quiet sometimes. Data will still surprise us. But we can at least ensure that when we test ideas, we are measuring the right things. Not what consumers claim drives their behavior but what actually does.

That gap is worth closing.