Editor's note: Deb Ploskonka is chief data scientist at Cambia Information Group. She can be reached at deb.ploskonka@cambiainfo.com.

An uncomfortable theme is beginning to dominate our industry, at least for those of us who draw sample from online panels.

It’s fraud.

Over time, we have had to shed our naivete around how cheaters cheat and how extensive the problem is. We have learned we must go out of our way with every online project to protect our data, our insights and our clients from falsehoods. The investment we make to ensure clean data has risen dramatically, in proportion to the growth in fraud, and that is what prompted this article.

Figure 1 shows a sampling of Cambia’s online panel toss rates over the past five years, for B2B and B2C studies, from almost a dozen panels. Although individual records may look fine in isolation, when we look across respondents, we see patterns of duplication indicating fraud. From these rates, it appears B2B studies are more attractive to fraudsters, likely due to the higher incentives. Additionally, consider the incidence of fraudsters relative to that of the target audience. If your target audience is extremely narrow, fraudsters who figure out the qualification criteria may fill your quotas faster than genuine respondents.
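To make the incidence point concrete, here is a rough back-of-the-envelope sketch in Python. It is an illustration only, not Cambia's method: the function name, the 5% fraud rate and the assumption that every fraudster passes the screener are hypothetical placeholders, not figures from our studies.

```python
def fraud_share_of_completes(target_incidence, fraud_rate, fraud_pass_rate=1.0):
    """Estimate the share of qualifying completes that come from fraudsters.

    target_incidence: fraction of genuine entrants who truly qualify.
    fraud_rate:       fraction of all survey entrants who are fraudulent
                      (hypothetical input, not a measured value).
    fraud_pass_rate:  fraction of fraudsters who get past the screener;
                      1.0 assumes they have all figured out the criteria.
    """
    genuine_qualifiers = (1 - fraud_rate) * target_incidence
    fraud_qualifiers = fraud_rate * fraud_pass_rate
    total = genuine_qualifiers + fraud_qualifiers
    if total == 0:
        raise ValueError("no one qualifies under these inputs")
    return fraud_qualifiers / total


# Hypothetical scenario: 5% of entrants are fraudulent, and the target
# audience gets progressively narrower.
for incidence in (0.50, 0.10, 0.02):
    share = fraud_share_of_completes(incidence, fraud_rate=0.05)
    print(f"incidence {incidence:.0%}: {share:.0%} of completes are fraudulent")
```

Under these assumed inputs, the fraudulent share of completes climbs steeply as the target audience narrows, even though the number of fraudsters entering the survey never changes.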

Fraud has become so pervasive, and the skills of fraudsters so advanced, that it will take all of us to defeat it: sample suppliers, research agencies, software companies, corporate researchers, end clients and industry organizations such as ESOMAR and the Insights Association.

Why does it matter? Conventional research industry wisdom has been that lower-quality respondents simply add random noise and cancel each other out, softening the findings but not meaningfully changing them.

Whether true or not in the past, this is certainly not true now. Over and over we at Cambia are seein...