New study explores the risks associated with self-claimed sampling methodologies 

Editor’s note: Megan Copas is director, consumer research, at 84.51°, a data science, insights and media firm. 

What would you do if you learned that 75% of the respondents in your research study had not actually purchased the category you were researching? Brands that rely on consumer insights derived from research based on self-claimed sampling methodologies may want to scrutinize those insights.

Research conducted by 84.51° indicates that brands could be making crucial business decisions based on flawed data. The study, which compared behaviorally verified sampling methods with self-claimed sampling methods, revealed a high degree of risk in using insights derived from research that relies on respondents' memories of specific behaviors.

What defines a self-claimed vs. behaviorally verified respondent?

The key difference between self-claimed and behaviorally verified respondents lies in the sampling method for research studies. Self-claimed respondents are selected based on their self-reported information, such as demographics, shopping habits and brand usage, provided during the screening portion of a questionnaire. When a new survey is conducted, these respondents are invited to participate and qualify based on the information they provide in the initial questions of the survey. However, their eligibility is not verified through objective first-party data or behavioral evidence.

Behaviorally verified respondents are selected using actual purchase behavior from loyalty card or other transaction data to ensure they fit the desired shopper profiles, and are then asked in the screener portion of the survey to confirm that they remember this behavior. This approach offers several advantages, such as higher incidence rates, since respondents are pre-qualified based on their actual purchase data. Additionally, this method ensures that insights are gathered from real people rather than bots.
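As a rough illustration of what such pre-qualification involves, the verification step can be sketched as a lookup of each panelist against recent category transactions. The field names, data shapes and three-month window below are hypothetical, not 84.51°'s actual pipeline:

```python
# Hypothetical sketch: qualify only panelists whose loyalty-card
# transactions show a category purchase in the lookback window.
# Field names and the 3-month window are illustrative assumptions.
from datetime import date, timedelta

def verified_buyers(panelists, transactions, category, months=3):
    """Return the panelists with at least one verified purchase in
    the category during the lookback window."""
    cutoff = date.today() - timedelta(days=30 * months)
    recent_buyer_ids = {
        t["panelist_id"]
        for t in transactions
        if t["category"] == category and t["date"] >= cutoff
    }
    return [p for p in panelists if p["id"] in recent_buyer_ids]
```

Only the panelists returned here would be invited to the survey, which is why incidence rates run higher than with self-claimed screening.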

Study methodology

The research involved a sample of 2,700 consumers who had shopped several grocery categories at Kroger stores: 900 behaviorally verified buyers, 900 self-claimed respondents and 900 additional self-claimed respondents for whom we were able to measure the “say versus do” gap by matching them after the survey to their loyalty card data. Three categories were selected based on their range of household engagement and purchase cycles across grocery and health and beauty care (HBC), sample feasibility and the presence of differentiated brands within each portfolio. Three brands per category, nine in total, were included in the study. Data was collected through an online survey conducted in April and May 2023.

Key findings from the research show that when using self-claimed recruiting methods: 

  • Respondents aren’t who you think they are. A majority of self-claimed respondents failed to meet the survey criteria. For instance, 75% of self-claimed respondents stated they purchased a category at Kroger in the past three months, but behavioral data indicated zero sales and units during the same period.
  • Respondents are placed in the wrong research cells. The study also uncovered that 60% of the time self-claimed respondents end up in the wrong heavy/medium/light groups, further skewing the results and undermining the reliability of the findings.
  • You are at risk of making the wrong business decisions. The study revealed that when using self-claimed sampling methods, results across common critical metrics such as purchase intent and ad ratings were inflated by an average of 22% and 14%, respectively, compared to behaviorally verified respondents.
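The second finding, the 60% heavy/medium/light misclassification, amounts to comparing the tier a respondent claims with the tier implied by their verified purchase history. A minimal sketch, with purely hypothetical unit cutoffs and field names:

```python
# Hypothetical sketch of the "say versus do" cell check: compare a
# respondent's claimed buyer tier against the tier computed from
# verified transaction data. Cutoffs below are illustrative only.
def observed_tier(units):
    """Assign a buyer tier from verified units purchased."""
    if units >= 10:
        return "heavy"
    if units >= 4:
        return "medium"
    return "light"

def misclassification_rate(respondents):
    """Share of respondents whose claimed tier disagrees with the
    tier implied by their verified purchases."""
    wrong = sum(
        1 for r in respondents
        if r["claimed_tier"] != observed_tier(r["verified_units"])
    )
    return wrong / len(respondents)
```

A respondent who claims to be a heavy buyer but whose loyalty card shows two units in the period would land in the wrong research cell, which is the kind of mismatch the study quantified.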

Self-claimed respondents are asked to do the impossible   

Asking survey respondents to accurately recall granular details about past spending and units purchased presents inherent risks. Given what we know about the fallibility of memory, it is unrealistic to expect any consumer to reliably recollect nuanced data like units purchased. Additionally, some research segments, such as lapsed households, are too complicated for even the most conscientious consumers to self-report accurately. For example, would you remember if you purchased a product in the past year, but not in the past three months?

However, many research studies still rely wholly on self-claimed respondents to accurately report their historical purchase quantities and frequencies. While easy to collect, unverified recall-based responses should raise red flags about misrepresenting true purchase patterns. For precise insights, research studies must move beyond unaided memory from self-claimed sampling methods and incorporate verification through transactional data and other forms of shopper data.

Additional best practices for improving survey results include: 

  • Request detailed descriptions of survey and respondent quality measures during the RFP process. When seeking proposals from providers, ask them to include a comprehensive description of the quality measures they have in place. This will help you gauge the quality of the respondents and make an informed decision.
  • Ensure relevance and accuracy through verification. Cross-check the information provided by respondents with purchase data or other behavioral records whenever possible.
  • Scrutinize self-claimed research results. When using self-claimed sampling methods, pay close attention to the results and watch out for any signs of inflated metrics or mismatched shopper groups. This will help you identify any potential issues early on and improve the overall quality of your data.
  • Evaluate survey quality checks cautiously and ask about innovative new checks on respondent quality. Be aware that advances in AI technology may enable fraudulent responses to mimic human patterns. Therefore, carefully assess any self-claimed survey quality checks to ensure the authenticity of the data you receive.

Smart approaches to survey sampling  

Respondents should be set up for success in providing brands with accurate and meaningful insights. Expecting self-claimed respondents to rely solely on recall presents risks of distortion that must be weighed. Conversely, verification through shopper data delivers accuracy but requires greater data resources, including expertise in utilizing the available data, designing the appropriate sample frame and the ability to connect with consumers in the behavioral database.

Rather than take an all-or-nothing view, brands should deploy smart approaches to survey sampling, such as requesting detailed protocols from survey providers, monitoring engagement to flag questionable responses, incorporating verification where precision is paramount and scrutinizing self-claimed results for inflated metrics. With careful interrogation of research strengths and limitations, brands can optimize their approach and connect with the real voice of the customer.

Access the full study, Behaviorally verified sampling vs. self-claimed sampling: A study on data quality, effectiveness and accuracy (registration required).