Understanding consumer behavior with synthetic data
Editor’s note: Mary Rose Walker, Senior Brand Strategist, MDRG, New Orleans, LA.
This is a common problem in market research: traditional methodologies can sometimes miss the System 1, or nonconscious, drivers of behavior. Many market research approaches capture what people say, not always what they feel or do. Add to this the difficulties of meeting project timelines, budgets or scope, and the need for new, emerging methods of research is clear.
Enter virtual personas and synthetic data.
Virtual personas use generative AI to simulate how real people in defined audience segments would respond to questions, visuals, marketing stimuli and more. But unlike traditional methods, virtual personas can model this behavior beyond conscious awareness in ways that can be challenging to access in real consumers. The invisible forces that shape consumer choices can be hard to verbalize, hard to observe and hard to identify even when they’re in action, and therefore hard to study. Virtual personas and synthetic data can “unbury” System 1 processes and reveal much of the hidden processes guiding consumer behavior. This can provide researchers a deeper understanding of the customer, especially in categories like beverages where purchasing decisions can be emotional and habitual.
Virtual personas: A primer
At a technical level, AI-generated virtual personas blend the strengths of multiple disciplines. They are not fictional composites, but algorithmic models grounded in real purchasing and consumer data. These synthetic data platforms use a mixture of sources to simulate how real people think and decide: public market trend sources, social media analyses, behavioral statistics, academic databases. They use machine learning to process massive data sets of real-world behavior, behavioral economics to model how cognitive shortcuts (or heuristics) influence decisions and psychology to replicate emotional nuance and motivation. The AI-powered synthetic personas that emerge can respond to open-ended, behavior-informed questions just as human participants would, with one exception. These personas are immune to the potential biases that traditional participants might have – they can speak to the System 1 processes that are driving their behavior as well as System 2. The result is a qualitative tool that gives researchers access to something traditionally difficult to capture: the subconscious and emotional layer of decision-making.
Importantly, synthetic data platforms aren’t designed to replace traditional methods or human insight. Instead, they supplement them. They offer a faster, lower-cost way to explore new ideas, validate hypotheses and uncover the emotional terrain beneath rational responses. In other words, they act as an early stage lens – helping researchers frame sharper questions before engaging real participants.
Virtual personas and beverage choice motivators
Getting started with synthetic data
MDRG had a goal: Add depth to our internal understanding of consumer beverage choice while validating the emerging technology. In short, how could we quickly and cost-effectively explore the automatic and habitual drivers behind consumer beverage decision patterns? Synthetic data and virtual personas were the right fit.
Here’s how we went about it:
- Using a synthetic data platform, we introduced our topic, specifically calling out our research goals and the behavioral science principles we wanted to explore.
- With this context and our audience specifications (diverse U.S. beverage purchasers between 21-60 years old), the platform generated eight virtual personas representing different beverage consumer segments.
- We then designed targeted questions for the personas aligned with six behavioral science principles to evaluate how each might influence beverage purchase decisions to surface intuitive System 1 responses.
- We further queried the personas using open-ended, behaviorally informed questions to explore broader decision drivers and confirm and expand on findings from the behavioral science analysis.
- Drawing on the results and our own analysis, we created an in-depth report outlining how specific behavioral science principles impact beverage purchases.
What the personas revealed
Each of the eight personas brought to life emotional nuance in beverage decision-making – the “why” behind the “what.” These personas served as the starting point for exploring how different consumer types make decisions, allowing us to dig deeper into their behaviors.

The findings validated behavioral science theories while adding new, intuitive depth.
As we queried these personas, we asked questions based on six behavioral science principles: cue-triggered craving, sensory appeal, choice architecture, social influence, brand loyalty and scarcity.
Actual questions we asked include:
- Have you ever wanted to try a drink just because you saw someone else drinking it? What happened next?
- What catches your attention when you see drink packaging or ads? Do any of these things make you want to try a drink?

In traditional studies, consumers may not be able to answer these accurately – or at all. They may not be aware of the way packaging affects their interest or attention or admit to wanting to try something just because “everyone else is doing it.” Synthetic personas don’t reflect this. Instead, they gave us insight into how each consumer segment behaves in these scenarios. Using these insights, we were able to draw conclusions to answer our original question: What are the automatic and habitual drivers behind consumer beverage decision patterns?

The researcher’s perspective: Conclusions on synthetic data
What did we learn?
Synthetic data and virtual personas are faster and leaner:
The insights were delivered 80% faster and at half the cost of traditional methods. Synthetic personas can also reveal patterns and preferences that can help inform how to design additional research before further investment – saving researchers time as they move on to next steps.
Scientifically rigorous and reliable:
Synthetic data predicts consumer behavior with accuracy comparable to large-scale surveys. It also reduces bias by modeling behavior rather than relying on self-reporting.
Actionable findings with additional opportunity:
The study not only provided us with a solid strategic base but also allowed us to return and query our data as additional questions arose – something that's not an option in traditional market research methods.
Deep, value-based insights:
Through the consumer personas, we learned a lot about the behavioral science principles that are driving beverage choice. Since 95% of decisions happen automatically, starting with synthetic data can lead to stronger strategies built on insights into the System 1 processes that really drive decision making.
What this means for the future of market research
Using virtual personas and synthetic data isn't a replacement for other market research methods or human voices, it's a supplement. It's a faster, sharper starting point that helps research professionals frame smarter hypotheses, design better studies and ultimately deliver more relevant findings.
We set out to understand the underpinnings of consumer beverage choice and came back with some incredible insights – but before we start making recommendations based on that data, it should be validated. Synthetic data acts as early directional insight, testing hypotheses and validating assumptions. The resulting strategy should be pressure tested with real consumers through traditional market research. The advantage here is that subsequent research can be more targeted and nuanced than broader exploratory brand studies.

Synthetic data offers an opportunity to study why people behave the way they do, faster and with greater nuance. The future of research will likely be hybrid: blending human connection with machine learning. When speed or budget is a major concern, it's just one more tool in our toolbelt to help meet client needs.
