In December I moderated an informal video chat among a dozen or so members of the Professional Insights Collaborative (PIC), a corporate researcher-only group that Quirk’s is helping operate, on the topic of synthetic data and digital twins. In our outreach to PIC members ahead of the gathering, I had framed (hopefully without biasing!) the conversation as one that would likely be tinged with a kind of good-humored skepticism, based on my general sense of researchers’ views on the topic drawn from discussions at industry conferences.
I’ve seen iterations of that skepticism many times over the decades, as research vendors and agencies routinely take some new wrinkle on information-gathering (or in the case of synthetic data, information-generating) and exult about it, in true marketing fashion, like it’s the cure for all ills. From online research to big data, DIY to text mining, behavioral economics to AI, it feels like every few years a new elixir comes along. And I can only imagine what it’s been like for the end-client researchers to sit through all the breathless presentations from their existing or prospective supplier partners on why this new thing is THE thing, knowing full well that their budgets are already tight and, in any case, their internal clients are in no hurry to move on from their trackers and their focus groups just yet.
Based on my PIC conversation, it seems that while researchers are curious about synthetic data, no one is particularly eager to embrace it – and there's a growing sense that research agencies are pushing it harder than corporate researchers are demanding it.
The potential benefits are clear enough. When asked what would make synthetic data valuable, participants consistently mentioned being able to get data (granted, of the synthetic ilk) from hard-to-reach audiences and to overcome low response rates, particularly among niche B2B populations or specific customer segments.
But the concerns are substantial. Validity topped the list. PIC member R noted that vendors keep touting 80% matches with real data. “But what about that 20%? That might be the really important 20%,” he said.
PIC member L emphasized the critical importance of understanding the input, since the data coming out will only be as good as what went in. She worried about models trained on older data that might not reflect current realities – particularly problematic in fast-moving tech sectors where “older” means pre-AI, just a couple of years ago.
The naming itself creates problems. PIC member J bluntly stated that “synthetic” means “fake,” making it difficult to sell internally. “The name itself is a turn-off,” she said. PIC member S highlighted another barrier: securing funding to test something that “might or might not work” when budgets for actual research are already tight.
When we discussed presenting synthetic findings to internal audiences, the consensus favored transparency and advance notice – never surprising leadership with methods they haven't approved. “I would tell my stakeholders ahead of time before we even launched the project that, ‘this is what we're doing,’ so they'd already have pre-bought into it and it wasn't a surprise when we deliver the results,” said PIC member M.
Several participants worried about losing the human element with synthetic data. One member emphasized that natural research settings capture valuable spontaneity – the silences, the non-responses, the unexpected phrasings that reveal actual consumer thinking. “We're gonna lose the behavioral insights that come from observing how people naturally express themselves,” she said. PIC member S couldn't imagine replacing authentic customer quotes with responses “attributed to AI” in marketing materials or stakeholder presentations.
The emerging consensus suggested synthetic data might work for quick validation – choosing between message A or B – but not for understanding unmet needs or discovering unexpected insights.
The group's overall assessment? Synthetic data is another tool in the toolbox – not the salvation it's sometimes portrayed to be. PIC member A compared it to big data and social listening: “They turned out to be really important things but not necessarily the game changers that they were touted as when we first heard about them.”
It's the agencies that seem most enthusiastic, perhaps seeing a new revenue stream. But among the client-side researchers on our call? The verdict was clear: We're watching and we're learning, but we're far from convinced.
PIC membership is free to any current corporate or client-side researcher. Head over to https://www.piconline.org/ for more info!