What your data quality report reveals: A framework for smarter survey research 

Editor's note: This article is an automated speech-to-text transcription, edited lightly for clarity. For the full session, please watch the recording.

A recent study found that a third of survey responses are fraudulent, and 70% of that fraud was not caught by data cleaning technology.

Jonathan Goodbread, head of data quality strategy at aytm, presented a framework to enhance data quality during the June 11, 2026, Quirk’s Virtual – Data Quality, Security and Ethics series. He explained the 4P Framework developed by aytm and the need to shift to a discipline-first framework.

Session transcript

Joe Rydholm

Hey everybody. Welcome to our session, “What your data quality report reveals: A framework for smarter survey research.” I'm Quirk's Editor Joe Rydholm, and before we get started, I just wanted to quickly go over the ways you can participate in today's session.

You can use the chat tab to interact with other attendees during the program, and you can use the Q&A tab to submit questions to the presenters, and we'll address them live in a Q&A after the recording.

Our session today is presented by aytm. Enjoy!

Jonathan Goodbread

Hi, thank you for joining today for, “What your data quality report reveals: A framework for smarter survey research.”

My name is Jonathan Goodbread. I am the Head of Data Quality Strategy at aytm. I have 20 years of experience in market research at the intersection of survey methodology and data quality engineering. And I'm so pleased to be talking with all of you today.  

I want to ground this talk in a solid that Green Book did all of us in 2025. They audited 4.1 billion survey attempts, and they shared the data quality information with all of us that came from it. The numbers were jarring.  

A third of those potential survey attempts were fraudulent. 27% were from inattentive respondents and 70% of all that fraud didn't get caught by cleaning processes. So, here's what I want to talk about today. I want to talk about why this matters. 

This problem has technical elements, but it is not a technical problem. It is a business problem with business outcomes and business challenges. And when we're done today, I want you to be able to look at your CMO when they ask, "Can I trust this research?" And we want you to be able to unequivocally say, “Yes, and here is the proof.”  

And here's the data I was just talking about: 4.1 billion survey attempts, 33% fraudulent, 27% inattentive. When you take those two groups and combine them, accounting for overlap, that is half of all B2C traffic that is contaminated. That is a problem. 70% of that contaminated 50% slips through standard data cleaning. So, when you see your clean dataset, it may not be what you think you're looking at. Most of the fraud may still be in it.
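The arithmetic behind these figures can be sketched as follows. This is a minimal illustration, not part of the presentation: the function names are hypothetical, the overlap between the two groups is inferred from the quoted figures (33%, 27% and a combined half) rather than stated by the speaker, and the sketch assumes that cleaning removes only contaminated attempts and that every surviving attempt ends up in the dataset.

```python
from fractions import Fraction


def implied_overlap(fraud, inattentive, combined):
    """Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|."""
    return fraud + inattentive - combined


def contaminated_share_after_cleaning(contaminated, miss_rate):
    """Share of the cleaned dataset that is still contaminated.

    Assumption (not from the talk): cleaning never removes a good respondent.
    """
    undetected = contaminated * miss_rate  # contaminated traffic that survives cleaning
    clean = 1 - contaminated               # traffic that was fine to begin with
    return undetected / (clean + undetected)


# Figures quoted in the talk, as exact fractions to avoid rounding noise.
fraud = Fraction(33, 100)
inattentive = Fraction(27, 100)
combined = Fraction(50, 100)
miss_rate = Fraction(70, 100)

overlap = implied_overlap(fraud, inattentive, combined)
residual = contaminated_share_after_cleaning(combined, miss_rate)

print(f"Implied overlap between the two groups: {float(overlap):.0%}")
print(f"Contaminated share of the cleaned dataset: {float(residual):.1%}")
```

Under those assumptions, the implied overlap is 10% of traffic, and about 41% of a "cleaned" dataset would still be contaminated, which lines up with the warning that a clean dataset may not be what it appears to be.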

So, here's what we'll cover today. 

The real cost of bad data is beyond the cleaning bill. What are the line items that don't appear on invoices and don't appear on fiscal reports?  

Second, aytm’s 4P Framework for great data quality: prevent, protect, purify and prove.

Third, how to spot which of those stages is leaking? Where's the problem? What aren't you catching? And you're going to know that before you talk to a stakeholder with questions.  

Fourth, what to ask your suppliers and what answers to refuse. The enemy of good data quality is provider opacity. Demand transparency.

Here are those four costs I was just talking about. 

First is wrong decisions. Bad data means bad strategy. Contaminated data executed at scale means that you have misread the market and you have made a business decision that has impacts on revenue, reputation and trust. Those wrong decisions powered by wrong data are hugely problematic and have real consequences for business.  

Second, reruns and rework. Have you run a study with poor data quality? Well, you've got to rerun it so that you have the answers you actually need. All of that work, all of that additional sample: you pay for that twice in money and three times in expert time. That is a massive problem for your business.

Third is stakeholder trust. If you quote bad data to a stakeholder and that stakeholder makes a decision based on that poor quality data, you have just lost trust with that stakeholder and it takes a lot more time to gain trust than it does to lose it. So, you need great data quality. Otherwise, your relationships with your stakeholders could be at risk.  

And fourth is decision velocity. Markets demand quick action, and they demand wise action. They demand both at the same time to keep your position on top of that market. And when teams don't trust the data you've given them, they don't make decisions. They ask for more, they ask for more studies, they ask for more justifications and they ask for more insight.  

And all of those are actually fine things to ask for. But when they impact your decision velocity, your ability to react to markets, it's a problem. And remember, all of these costs are a function of the fact that 70% of that fraud slips through cleanup. So a clean n just isn't what you think it is.

Let's talk about how we solve this, how we approach it, how we change our outlook. The change is simple. Data quality is not a filter. It is a discipline.  

If you are filtering first, if you are letting the potential wave of respondents wash over your survey link and think you're removing and cleaning well, you're accepting contamination from the get-go and you're optimizing for removal, but removal isn't catching all of the problems. 70% makes it through. So, we have to move to a discipline-first framework.

We build the conditions where great data comes from our survey instrument, from engaged respondents who want to give us their insights. You've got to act discipline-first. And these two approaches don't cost any different to run. They have the exact same costs, but there are wildly different business outcomes.