An unwanted impact

Editor’s note: Wally Balden is director, online research, at Maritz Research, St. Louis.

In the November 2007 edition of Quirk’s, Kurt Knapton of e-Rewards and Rick Garlick of Maritz Research penned an article titled, “Catch Me If You Can.” This article focused on the problem of poor-quality panelists and how they can be detected and minimized within a particular online study. The basis of the article was a cooperative study conducted by e-Rewards and Maritz Research that tried to better understand the severity of the problem, how it can be detected and what can be done to minimize the influence of undesirable panelists.

A number of important conclusions were drawn from this cooperative effort:

  • Undesirable respondents are apparent in online research and are a cause for concern.
  • Significant differences were noted in the response patterns between undesirable and desirable respondents.
  • Eliminating poor-quality respondents positively impacts data validity.

Since that article was published, Maritz Research has conducted a number of Maritz Polls to better understand how poor-quality respondents impact data quality. While differences were found in response patterns in the earlier work, this article’s purpose is to provide additional learning on how these differences actually impact the data and, most importantly, the decisions we make based upon that data.

Better understanding

Based on learning from the previous Maritz Poll, our overall objective was to gain a better understanding of the specific response patterns of undesirable respondents – and if they differed significantly from those of desirable respondents. Undesirable respondents were put into three categories:

  • Fraudulent – respondents who intentionally misrepresent themselves in order to qualify for a survey or intentionally provide false or misleading answers.
  • Inattentive – respondents who do not pay attention or do not provide thoughtful answers when completing a survey.
  • Speeders – those who complete an online survey in an unreasonably short period of time.

Since critical business decisions may be at stake, the risk of making a bad decision based on data provided by poor-quality respondents can be significant.

While our initial work helped us better understand the severity of the problem - how we can effectively detect undesirable respondents and what we can do to minimize their influence - we were still left with a number of important questions about the data:

  • Do undesirable respondents provide different response patterns than truthful and engaged respondents?
  • If differences are noted, are they meaningful?
  • Can we identify response patterns that will help us better identify poor quality in future studies?
  • Do we increase our risk of making a poor decision by including poor-quality respondents in the data set?

Debate within the community

The first step in the process was to identify fraudulent or inattentive respondents within the survey by setting a number of overt traps - specific questions embedded in the survey for the sole purpose of catching undesirable behavior. There is debate within the research community about using such overt traps. Some argue that they tip the respondent to our intention and make them more aware they are being watched - which in turn makes cheating behavior harder to catch. While we recognize this situation may occur, we haven’t seen any evidence that the judicious use of trap questions makes the respondent more sensitive to our intentions. As such, we recommend they be employed in each online survey - again, in a judicious manner.

A variety of traps were employed within the Maritz Polls when they were deemed appropriate and within the context of the survey design. Examples of the types of traps employed are as follows:

  • “Red herring” - providing a fake brand or service for awareness, usage and “purchase most often” questions - e.g., Tagrill fuel, Tagrill Airlines, Teen Town.
  • Consistency of answer - fact-based questions asked at the beginning and end of the survey - e.g., asking for zip code at the beginning of the survey and state of residence at the end.
  • Oppositely-worded - pairs of opposite statements; a respondent who agrees with both is flagged - e.g., “I always buy the most expensive item on the menu,” “I always buy the least expensive item on the menu.”
  • Specific instruction - respondent is instructed to check a specific response - e.g., “Please check slightly disagree.” Note: We purposely instructed them to check either the second or fourth item in a five-point scale to guard against incorrectly classifying straightliners as attentive, as straightlining typically involves the use of the end or middle points in the scale.
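
To make the mechanics concrete, the sketch below shows one way trap failures of this kind could be flagged in a respondent-level data file. It is a minimal illustration in Python/pandas; the column names, scale values and pass/fail rules are assumptions for the example, not the actual Maritz Poll questionnaire or scoring rules.

```python
import pandas as pd

# Hypothetical respondent-level data; all column names and values are
# illustrative, not the actual Maritz Poll layout.
df = pd.DataFrame({
    "resp_id": [1, 2, 3],
    "aware_tagrill_airlines": [0, 1, 0],   # awareness of the fake brand (1 = aware)
    "agree_most_expensive": [2, 5, 4],     # "I always buy the most expensive item" (1-5)
    "agree_least_expensive": [4, 5, 2],    # "I always buy the least expensive item" (1-5)
    "instructed_item": [2, 3, 2],          # "Please check slightly disagree" (expected = 2)
})

# Red herring: claiming awareness of a brand that does not exist.
df["fail_red_herring"] = df["aware_tagrill_airlines"] == 1

# Oppositely worded: agreeing (4 or 5) with both of two contradictory statements.
df["fail_opposite"] = (df["agree_most_expensive"] >= 4) & (df["agree_least_expensive"] >= 4)

# Specific instruction: any answer other than the instructed scale point.
df["fail_instruction"] = df["instructed_item"] != 2

# Total trap failures per respondent, for later filtering or profiling.
df["trap_failures"] = df[["fail_red_herring", "fail_opposite", "fail_instruction"]].sum(axis=1)
print(df[["resp_id", "trap_failures"]])
```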

The Maritz Polls, which served as the basis for this analysis, are conducted on a regular basis throughout the year. Topics for each of these surveys varied and included: insurance, automobiles, retail shopping habits, travel, technology, airlines and employee engagement. The sample parameters also varied for these studies, but for the most part we ended up with a sample of 1,000 respondents, split evenly between genders. A variety of online panel suppliers were employed for these polls to determine whether quality varied by panel supplier. Survey length varied from 10 to 20 minutes.

The studies that serve as the basis for this analysis were conducted over a six-month period, typically every four weeks. This worked in our favor since we approached this topic with little empirical evidence and were unsure what to expect. As we completed these studies and analyzed the data we began to introduce new variables into subsequent studies to gain a more complete picture of the data quality situation. Later studies helped us better understand the following:

  • Does placement of a trap have an impact on inattentiveness, i.e., does disengagement increase as the respondent progresses through the survey?
  • Do long grids encourage disengaged behavior?
  • Does length of grid have an impact on inattentiveness?

Being sensitive to the issue of overt vs. covert traps, we also analyzed the summarized data set for covert indicators of poor quality: inconsistent responses, straightlining and speeding.
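
As an illustration of these covert checks, the sketch below flags straightlining and speeding in a small, made-up data set. The zero-variation straightlining rule and the one-third-of-median speeding cutoff are assumptions chosen for the example, not the thresholds used in the Maritz Polls.

```python
import pandas as pd

# Hypothetical completed-interview data; attr_1..attr_5 are 1-5 ratings,
# duration_minutes is the interview length. All values are invented.
df = pd.DataFrame({
    "resp_id": [1, 2, 3, 4],
    "attr_1": [3, 4, 5, 3],
    "attr_2": [3, 5, 4, 3],
    "attr_3": [3, 4, 4, 3],
    "attr_4": [3, 3, 5, 3],
    "attr_5": [3, 4, 4, 3],
    "duration_minutes": [4, 18, 16, 3],
})

rating_cols = [c for c in df.columns if c.startswith("attr_")]

# Straightlining: zero variation across the rating grid.
df["straightliner"] = df[rating_cols].nunique(axis=1) == 1

# Speeding: finishing in less than one third of the median interview length
# (the cutoff is an assumption for this sketch).
df["speeder"] = df["duration_minutes"] < df["duration_minutes"].median() / 3

print(df[["resp_id", "straightliner", "speeder"]])
```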

Failure rate

To begin, let’s review the failure rate we witnessed across the various Maritz Polls.

As shown in Table 1, failure rates were quite high for oppositely-worded and specific-instruction tasks. While this finding was initially surprising, subsequent exploration found this is consistent with other research in this area (e.g., Doxus - “Satisficing Behavior in Online Panelists”).

Also evident in this data was the varying failure rate from study to study. This is attributed to three factors: subject matter of the questionnaire, length of survey and the inclusion of long grids and more complex tasks and exercises. Long questionnaires (20+ minutes) and those containing long grids or exercises had the highest percentage of trap failures.

We also tested a number of additional hypotheses regarding placement of the traps within the survey. Our findings:

  • Failure rates were comparable, whether the trap was at the beginning, middle or end of the survey.
  • Placing the trap within a grid or as a stand-alone question did not impact failure rate.
  • Breaking a long grid into smaller grids did not impact the failure rate for the specific instruction trap.

These additional findings would lead us to believe that a disengaged respondent enters the survey in that state of mind - which conflicts with our earlier finding that failure rates were higher for surveys considered long, boring and more complex. Whether we encourage poor behavior by subjecting panelists to long, complex and boring surveys is a question that requires additional investigation.

Inconsistent responses

We focused additional analysis on disengaged respondents - those who were unable to follow a simple instruction (e.g., “Please check slightly disagree”) or who provided inconsistent responses (“I always buy the most expensive item”; “I always buy the least expensive item”).

In three of the studies (retail, technology and insurance) we experienced high failure rates for these questions. There were commonalities among these studies - all were 20+ minutes long, contained long attribute-rating grids and covered subject matter that was not particularly interesting.

Based on our data analysis from these three studies, we uncovered the following patterns that differentiated disengaged from engaged respondents:

Awareness, usage, purchase

Disengaged respondents exhibited a consistently (and significantly) higher level of awareness, usage and purchase compared to engaged respondents (retail survey). The reason for this is not known, although it could be surmised that providing positive responses to qualification-type questions improves a respondent’s chances of not being terminated.

Attribute ratings

Straightlining was strongly evident among fraudulent/inattentive respondents in all three studies.

When straightlining was evident, fraudulent/disengaged respondents primarily chose the midpoint of the five-point scale.

Significant differences were noted for the vast majority of the attribute ratings. Engaged panelists were more likely to spread their responses among the top three boxes of the five-point scale, while disengaged respondents clustered at the midpoint. This resulted in significantly higher (more positive) attribute rating scores for the engaged sample.
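
The article does not specify the statistical test behind these comparisons; as one option, the sketch below contrasts the rating distributions of the two groups (midpoint share, top-box share) and applies a nonparametric test. The data, group sizes and choice of test are assumptions for illustration only.

```python
import pandas as pd
from scipy.stats import mannwhitneyu

# Hypothetical ratings for a single attribute (1-5 scale), split by an
# engaged/disengaged flag derived from the trap questions. Values are invented.
df = pd.DataFrame({
    "engaged": [True] * 6 + [False] * 6,
    "rating":  [4, 5, 3, 4, 5, 4, 3, 3, 3, 3, 4, 3],
})

# Midpoint share vs. top-box share by group.
df["midpoint"] = df["rating"] == 3
df["top_box"] = df["rating"] >= 4
print(df.groupby("engaged")[["midpoint", "top_box"]].mean())

# Nonparametric comparison of the two rating distributions.
stat, p = mannwhitneyu(df.loc[df["engaged"], "rating"],
                       df.loc[~df["engaged"], "rating"])
print(f"Mann-Whitney U = {stat:.1f}, p = {p:.3f}")
```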

Brand ratings

For questions related to brand relationship, emotion and personality, there were significant differences noted between the two groups. Disengaged respondents were significantly more likely than engaged respondents to connect an emotion or personality to a brand - indicating yes rather than no. Yes was the first response on the display and the easiest for a disengaged respondent to select.

Negative impact

Our ultimate objective was to determine if including data from fraudulent or inattentive respondents would have a negative impact on decision-making. We have numerous examples to choose from to help answer this question.

  • Despite the significant absolute differences in awareness, usage and purchase, we did not see a change in the relative rankings of the brands on these measures.
  • Midpoint straightlining by the disengaged sample led to lower overall attribute ratings - however, the relative rankings of these attributes did not change.
  • Absolute differences noted in the brand ratings (relationship, emotion, personality) did not impact the relative positions of the brands.

These examples indicate that relative rankings were not impacted by the inclusion of poor-quality respondents. If ranking is more important than absolute score, then decision-making would not be impacted. However, this situation would change if objectives or targets were set for the various measures, such as awareness, usage or purchase. In addition, there would be concern if this was an ongoing tracking study. If the balance of undesirable to desirable respondents is not maintained, we would expect to see spurious movement in scores at each wave. The same impact could occur if we became more vigilant in future waves and reduced the number of undesirables in the final sample.
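
One simple way to monitor this risk is to compute the key measures with and without the flagged respondents and confirm that the ordering a decision would rest on does not move. The sketch below shows the idea with invented awareness figures; the brands, numbers and column names are placeholders, not results from these studies.

```python
import pandas as pd

# Hypothetical brand-awareness results under two sample definitions;
# all figures are invented for illustration.
df = pd.DataFrame({
    "brand": ["A", "B", "C", "D"] * 2,
    "sample": ["all_respondents"] * 4 + ["engaged_only"] * 4,
    "aware_pct": [62, 55, 48, 40, 58, 50, 45, 35],
})

# Rank brands within each sample definition and compare the orderings.
ranks = (df.pivot(index="brand", columns="sample", values="aware_pct")
           .rank(ascending=False))
print(ranks)

# If the orderings match, the absolute shift caused by poor-quality
# respondents has not changed a ranking-based decision.
print("Same ordering:", (ranks["all_respondents"] == ranks["engaged_only"]).all())
```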

Behavior is evident

This exercise’s purpose was to gain a better understanding of the issues we face when poor-quality respondents are included in the data set. We conclude the following:

  • Fraudulent and inattentive behavior is evident in studies involving online panels.
  • Fraudulent or inattentive panelists answer differently than truthful or engaged panelists.
  • Straightlining and satisficing behavior are more likely among fraudulent/inattentive respondents.
  • While the relative position of brands/attributes was not impacted by the inclusion of disengaged respondents for these studies, care should be taken if this data is collected as part of an ongoing tracking study.

A word of caution: This was by no means an all-encompassing or complete evaluation of the impact of inattentive or disengaged panelists on data quality. Work by other researchers both supports and refutes these findings. However, the fact that we have uncovered consistent differences in a number of different studies should serve to heighten awareness and raise the caution flag.

Ultimately, we must all be more vigilant in identifying and removing undesirable respondents - or minimizing them - at every step of the research process. This vigilance goes beyond simply setting traps in surveys. We must establish best practices at each step in the research process to ensure we are providing only the highest quality data available. The risks of making a bad decision, based on poor-quality data, are much too great to ignore the issue.