Designed to engage

Editor’s note: Nallan Suresh is senior director, panel analytics, at San Francisco research firm MarketTools Inc. Michael Conklin is chief methodologist in the Minneapolis office of MarketTools Inc.

The impact of questionable online survey respondents on data quality is well-documented. Previous research-on-research by our firm, MarketTools Inc., has shown that fake, duplicate or unengaged respondents compromise data quality. But what about the design of the survey itself, which may affect all respondents, those with good intentions as well as bad?

MarketTools conducted a comprehensive study that examined the effect of survey design on data quality and found that, in order to ensure the quality of research data, researchers must not only remove “bad” respondents from their samples but also design surveys that keep the good respondents engaged.

Are interrelated

Experienced researchers have long assumed that survey design, respondent engagement and data quality are interrelated. For example, it seems obvious that long and complex questionnaires will increase the likelihood of undesirable behaviors such as speeding and survey abandonment, and that data quality will suffer if there is a high percentage of unengaged respondents in the survey sample.

As we sought to understand and quantify the effect of survey design on respondent engagement and data quality, we used our firm’s TrueSample SurveyScore measurements from over 1,500 surveys and 800,000 responses to conduct a two-phase research study. Phase 1 of our research evaluated whether survey design influences the way respondents perceive a survey and how they behave while answering survey questions. Phase 2 of our research examined the effect design variables and engagement measures have on the quality of response data. Simply put, we sought to determine whether poorly designed, overly complex surveys could turn “good” respondents “bad” and thereby reduce data quality. If so, we could help researchers optimize their survey designs to improve overall data quality.

Show the impact

TrueSample SurveyScore is designed to be an objective measure of survey engagement and help show researchers the impact that survey design has on engagement. It is a function of both experiential variables, such as respondents’ rating of the survey-taking experience, and behavioral variables, such as survey abandonment and speeding. To date, MarketTools has collected SurveyScore data for more than 10,000 surveys, with over 2.6 million completes. These surveys span product categories (such as food and beverage, financial, technology, entertainment, health and beauty, health care and travel) and research methods (such as concept screening, line and package optimization, and attitude and usage studies).
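
To make the idea of a composite engagement measure concrete, here is a purely illustrative Python sketch of how experiential and behavioral inputs might be folded into a single score. The actual SurveyScore formula and weights are MarketTools’ own and are not published here; the scale and weights below are invented for illustration only.

```python
import numpy as np

def engagement_composite(survey_rating, abandonment_rate, speeding_rate):
    """Toy composite: higher ratings raise the score; abandonment and speeding lower it."""
    experiential = survey_rating / 10.0                 # assumes a 0-10 rating scale
    behavioral = 1.0 - 0.5 * (abandonment_rate + speeding_rate)
    return 100 * np.mean([experiential, behavioral])    # arbitrary 0-100 scaling

print(engagement_composite(survey_rating=7.8, abandonment_rate=0.12, speeding_rate=0.05))
```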

Our team sought to determine whether certain survey design variables could reliably predict the composite engagement measure of respondent behavior and perception that makes up TrueSample SurveyScore. We built a model to predict engagement using survey design variables and the TrueSample SurveyScore database as inputs. Predictability is an indication that survey design impacts engagement in a consistent way, implying that we could recommend adjustments to the design variables that would minimize adverse effects on engagement. Specifically, we modeled the impact of more than 20 survey design variables (independent variables) that are within the control of survey designers - such as survey length and total word count - on several respondent engagement measures (dependent variables) reflecting the respondents’ perception of the survey and behavior during the survey.
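
As an illustration of this kind of modeling (not the firm’s actual model), the sketch below fits a generic multivariate model that predicts an engagement score from a handful of hypothetical design variables and reports their relative importance. The data are synthetic and the variable names, apart from survey length and word count, are stand-ins.

```python
# Illustrative only: synthetic data, generic random-forest model.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 1500  # roughly the number of surveys cited above

# Hypothetical design variables (the study used 20+; these are stand-ins).
design = {
    "length_minutes": rng.uniform(5, 40, n),
    "total_word_count": rng.uniform(500, 6000, n),
    "questions_per_page": rng.uniform(1, 15, n),
    "open_ended_count": rng.integers(0, 6, n),
}
X = np.column_stack(list(design.values()))

# Synthetic engagement score: longer, denser surveys score lower, plus noise.
y = (100 - 0.8 * design["length_minutes"] - 0.004 * design["total_word_count"]
     - 1.5 * design["questions_per_page"] + rng.normal(0, 5, n))

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, importance in zip(design, model.feature_importances_):
    print(f"{name:>20}: {importance:.2f}")
```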

Clear indication

The research revealed that a multivariate model that captures the complex interaction among design variables is able to predict overall engagement, composed of both experiential and behavioral variables. The fact that the impact of these variables is predictable provides a clear indication that survey design directly influences respondent perception and behavior, i.e., engagement, in a consistent way. This means that survey designers do have some degree of control in improving engagement. This also means that the SurveyScore can be predicted prior to deploying a survey to help guide design modifications.

We uncovered another interesting finding when we examined the influence of particular survey design elements on specific aspects of engagement, such as survey rating or partial rates. While survey length proved to be generally predictive of most respondent engagement measures, there was wide variation in the design variables that were most influential in driving various measures of engagement. For example, for the survey rating measure, one of the most predictive design variables was the elapsed time per page of the survey. For the speeding measure, however, elapsed time per page was not even in the top five most important design variables.

Thus, adjusting just one parameter may not be sufficient to elicit desirable behavior from respondents, nor will it singlehandedly improve their perception of the survey-taking experience. Instead, the findings reveal that engagement is driven by a complex interaction among design variables.

This means that simple survey design guidelines or rules are inadequate for motivating the desired respondent engagement. There is no axiom that applies in all cases, such as, “Surveys that require more than 20 minutes result in poor respondent engagement.” In fact, our researchers uncovered several examples of long surveys that had a higher-than-normal survey rating as well as a lower-than-normal partial rate, which would run contrary to what one would expect if length alone were a deciding variable. Conversely, we found examples of short surveys that had a lower-than-normal survey rating because of the design of other variables.

An effect on quality

With the impact of survey design on respondent engagement established, the research team endeavored to determine whether engagement had an effect on data quality. The TrueSample SurveyScore database allowed us to test this hypothesis. MarketTools fielded three surveys with varying levels of complexity, categorized as moderate, medium and high. We analyzed 1,000 completes for each survey. The experimental surveys had the same series of questions about demographics, products purchased, etc., but differed based on the number of products respondents said they purchased. The level of complexity increased as more products were chosen and more brand attribute questions were displayed. In the moderate category, respondents were asked one question per product. In the medium-complexity category, respondents received 17 brand attribute questions per product. In the high-complexity category, respondents were asked 17 questions for every product chosen, plus additional open-ended questions.

We computed and compared the SurveyScore for the three surveys. Predictably, it dropped precipitously with the higher complexity levels. The medium- and high-complexity surveys received an extremely low score, as shown in Table 1.

Next, we conducted a series of statistical tests to evaluate the effect of respondent engagement on data quality. By conducting different analyses, we were able to examine data quality from various angles for a more comprehensive review. Specifically, we investigated the following.

Will unengaging surveys:

  • Increase the odds of sample bias?
  • Make respondents more apt to answer repeated questions inconsistently?
  • Make respondents more prone to random answer choices?
  • Make respondents more likely to provide inconsistent combinations of answers?
  • Make respondents tend to select “none” as an answer choice?

We examined whether a high abandonment rate could cause bias in completed responses and thereby reduce overall data quality. In other words, as the surveys became more complicated and their SurveyScore dropped, did the makeup of the respondents change and create the potential for biased data?

The answer was yes. As illustrated in the diagram in Figure 1, respondents who completed the medium- or high-complexity surveys were those more tolerant of the increased question load (the more products they selected, the more questions they were asked), which introduced bias in those groups compared to the group of respondents who completed the moderate survey. The graph on the left of Figure 1 shows that as the number of products selected increased - thereby increasing the number of questions to be answered - the partial or abandonment rate grew for the more complicated surveys.

As shown in the graph on the right of Figure 1, of those respondents who did not abandon the survey, the percentage who selected five products was much lower for the medium- and high-complexity surveys than it was for the moderate survey. So, while the underlying sample actually contained a higher percentage of respondents who had purchased five products, many of them did not make it through the survey, resulting in sample bias.
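
The two comparisons behind Figure 1 can be reproduced with a few lines of analysis. The sketch below assumes a hypothetical respondent-level file with the survey version, the number of products selected and a completion flag; the file and column names are ours, not MarketTools’.

```python
import pandas as pd

# responses.csv (hypothetical) holds one row per respondent with columns:
#   survey      - "moderate", "medium" or "high" complexity version
#   n_products  - number of products the respondent said they purchased
#   completed   - True if the respondent finished the survey
df = pd.read_csv("responses.csv")

# Left-hand comparison: abandonment (partial) rate by question load and version.
partial_rate = (
    df.assign(abandoned=~df["completed"])
      .groupby(["survey", "n_products"])["abandoned"]
      .mean()
      .unstack("survey")
)
print(partial_rate)

# Right-hand comparison: among completers, what share selected five products?
completers = df[df["completed"]]
share_five = completers.groupby("survey")["n_products"].apply(lambda s: (s == 5).mean())
print(share_five)
```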

Our research also tested whether the respondents’ ability to answer the same questions consistently during a single survey was a function of the survey’s complexity. We measured the consistency of the responses to questions that were repeated in separate sections of the survey, and we found that recall discrepancies increased as the SurveyScore dropped - evidence that more complicated surveys lead to inconsistent and unreliable responses and lower data quality.
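
In code, the repeated-question check is straightforward. The sketch below assumes hypothetical column names for the two placements of the same question and reports the discrepancy rate by survey version.

```python
import pandas as pd

df = pd.read_csv("responses.csv")  # hypothetical file: one row per completer

# q_brand_first and q_brand_repeat stand in for the two placements of the same
# question; a mismatch between them is counted as a recall discrepancy.
df["discrepancy"] = df["q_brand_first"] != df["q_brand_repeat"]
print(df.groupby("survey")["discrepancy"].mean())
```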

We then measured the consistency of responses across all possible question pairs to develop an inconsistency metric. This metric enabled us to determine if a given selection was random or closer to the expected response. The more unusual this pairing was - meaning the likelihood of its occurrence was low given the incidence of all the other options for these questions - the higher the departure from the expected value and the higher the inconsistency metric. Our finding was that inconsistency increased as the SurveyScore dropped, contributing to lower overall data quality for the more complex surveys (Figure 2).
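
The article does not spell out the exact formula, but one way to build such an inconsistency metric is sketched below: score each respondent by how rare their observed answer combinations are relative to what the answer incidences alone would predict. The file and column names are hypothetical.

```python
# Illustrative inconsistency metric, not the firm's actual formula.
import numpy as np
import pandas as pd
from itertools import combinations

df = pd.read_csv("responses.csv")          # hypothetical file with a "survey" column
question_cols = ["q1", "q2", "q3", "q4"]   # hypothetical question columns

pairs = list(combinations(question_cols, 2))
scores = np.zeros(len(df))
for a, b in pairs:
    # Expected probability of each (a, b) answer combination if answers were independent.
    p_a = df[a].value_counts(normalize=True)
    p_b = df[b].value_counts(normalize=True)
    expected = df[a].map(p_a).to_numpy() * df[b].map(p_b).to_numpy()
    # Rarer combinations contribute more to the inconsistency score.
    scores += -np.log(expected)

df["inconsistency"] = scores / len(pairs)
print(df.groupby("survey")["inconsistency"].mean())
```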

Finally, we sought to determine if surveys with a low SurveyScore caused respondents to lose focus and provide inconsistent or unpredictable responses. To measure the choice predictability of each of the surveys, we used a discrete choice model (DCM) exercise (Figure 3). Specifically, we tried to predict respondents’ product selections on two tasks based on their selections on seven other tasks (DCM sections were identical across all surveys). Respondents were asked to select the one product, if any, they would prefer to buy from each page, and based on their selections on the earlier tasks we tried to predict their choices on the holdout tasks. The respondents could also choose “none” as a response, indicating that they would choose none of the products.

During this exercise, we noticed that the accuracy of the prediction (when the selection of “none” was also included) was 75-79 percent for all surveys, a relatively high prediction rate. However, the model for the medium- and high-complexity surveys gave a much greater emphasis to the “none” selection, meaning that the respondents for these surveys tended to select no product, as opposed to one of the available products. Once we removed the “none” option from our model, the prediction accuracy dropped significantly for the high-complexity survey. In addition, the lower-scoring surveys had more violations of price-selection order, meaning respondents more often failed to prefer a lower unit price over a higher one. The net result: surveys with a low SurveyScore translated to lower predictability and thus to lower data quality.
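
For readers who want to see the shape of the holdout exercise, the following simplified stand-in (not the actual DCM) trains on seven synthetic choice tasks per respondent and predicts the remaining two, scoring accuracy with and without the “none” picks. The data, prices and choice rule are all invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_tasks, n_alts = 9, 4                       # 4 products per page, plus a "none" option
prices = rng.uniform(2.0, 8.0, (900, n_tasks, n_alts))

# Synthetic choices: lower price is preferred; "none" (label n_alts) is chosen
# when every price on the page is high, mimicking a "select nothing" pattern.
utility = -prices + rng.gumbel(size=prices.shape)
choice = np.where(prices.min(axis=2) > 6.5, n_alts, utility.argmax(axis=2))

X_train = prices[:, :7, :].reshape(-1, n_alts)   # seven tasks used for fitting
y_train = choice[:, :7].reshape(-1)
X_test = prices[:, 7:, :].reshape(-1, n_alts)    # two held-out tasks
y_test = choice[:, 7:].reshape(-1)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("accuracy incl. 'none':", (model.predict(X_test) == y_test).mean())

mask = y_test != n_alts                          # score only tasks where a product was chosen
print("accuracy excl. 'none':", (model.predict(X_test[mask]) == y_test[mask]).mean())
```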

Take responsibility

Our conclusion? Researchers must take responsibility for data quality by removing bad respondents and designing surveys that keep good respondents engaged. Research professionals now have evidence that survey design not only influences whether respondents abandon a survey but also impacts the data for those who complete it.

The ability to predict the effect of various survey design variables on respondent engagement will help survey designers maximize engagement to increase the reliability of their data. Researchers no longer have to assume that a long survey will jeopardize the quality of the results, since we have shown that it is possible to compensate for the adverse effects of certain design variables by adjusting others. By using engagement measurement and prediction tools, researchers can see how survey design affects data quality, measure engagement to guide improvements to survey design and optimize that design to enhance the reliability of their results.