Editor’s note: Corinne Maginnis is the executive vice president in charge of online solutions for M/A/R/C Research, Irving, Texas.

As marketing research projects migrate from telephone or field to the Internet, a client’s first question is often, “Will the online sample be representative?”

The first question should really be, “Is the Internet the best environment for the project?” The following factors can help determine whether the project is suited for the Internet:

  • Can the survey be self-administered?

  • Can the information about the product or service be effectively communicated on a computer monitor?

  • Can the target respondent be reached via e-mail?

  • Can members of the cyber-population reflect the client’s target market?

Only after these questions are answered with a confident “yes” should the marketing research firm begin to procure and work an online sample to ensure and maintain a representative frame.

When field data collection migrated to the convenience of shopping malls, researchers learned how to manage a mall sample to be representative enough to meet their needs. Researchers work diligently to make the mall-intercept sample representative by selecting representative markets, placing demographic and usage quotas in each market and spreading the completes evenly across markets.

Online sample can be similarly managed to accomplish optimal representation. Researchers first tap into a variety of sources for e-mail addresses, such as list brokers, online panel companies and a client’s own database of customers and prospects.

After an appropriate source for the online sample is identified, the research agency pulls and manages the sample. However, if a sample is randomly pulled simply to accommodate the target and number of completes needed, the sample will fail the representation test.

The following techniques can be used to ensure a representative sample for every online study.

  • Start with a balanced sample
    Online sample is a pull-type sample. The e-mail invitation to the survey “pulls” people to the interviewing site. The exception to pulling online sample is a “river sample” where respondents are “pushed” to an interview after being screened or intercepted on the Web. (For more details on the types of samples, please see the sidebar.) If invitations are sent to a balanced or stratified sample, the probability of a balanced response is high.

To ensure representation is considered at every step, it is important to instruct the sample provider on how to pull the sample. Most sample providers look to the research supplier for guidance or instructions. Research suppliers can provide specifications, such as stratification parameters. These parameters will vary depending on the target markets, target respondent types and sample universe definitions.
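As an illustration only, the instructions given to a sample provider for a stratified pull might look like the Python sketch below. The function name `stratified_pull`, the `age_group` field, the frame of addresses and the target counts are all hypothetical; a real pull would use whatever stratification parameters the study calls for.

```python
import random
from collections import defaultdict

def stratified_pull(frame, strata_key, targets, seed=0):
    """Randomly draw targets[stratum] records from each stratum of the frame."""
    rng = random.Random(seed)  # fixed seed so the pull can be reproduced
    by_stratum = defaultdict(list)
    for record in frame:
        by_stratum[record[strata_key]].append(record)

    pulled = []
    for stratum, n in targets.items():
        pool = by_stratum.get(stratum, [])
        if len(pool) < n:
            raise ValueError(
                f"stratum {stratum!r} has only {len(pool)} records, need {n}"
            )
        pulled.extend(rng.sample(pool, n))  # sample without replacement
    return pulled

# Hypothetical frame: 1,000 addresses skewed toward the youngest group.
frame = (
    [{"email": f"a{i}@example.com", "age_group": "18-34"} for i in range(600)]
    + [{"email": f"b{i}@example.com", "age_group": "35-54"} for i in range(300)]
    + [{"email": f"c{i}@example.com", "age_group": "55+"} for i in range(100)]
)

# Hypothetical target mix for 100 invitations (not real Census figures).
targets = {"18-34": 30, "35-54": 40, "55+": 30}
sample = stratified_pull(frame, "age_group", targets)
```

A simple random pull of 100 from this frame would be expected to return roughly 60 percent from the youngest group. The stratified pull returns the specified mix regardless of how the source is skewed.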

  • “Quilt” your sample sources
    Multiple sample sources can be quilted together to form a more representative sample frame. Perhaps the client database is the primary sample source, but a non-biased competitive analysis is needed. Quilting a list sample with the client’s database will provide two cells to analyze and compare while allowing a competitive context to be added to the data.

In some cases, the target respondent cannot be fully represented in the online sample. At those times, the study should be conducted using multiple data collection methods. The online and offline samples can then be quilted to mitigate response bias and deliver full representation.

If quilting sample sources is required to achieve a sufficient volume of sample for the study, be aware that online panels are populated using similar methods, so the same person may appear in more than one source. Therefore, a process for removing duplicates from the final data set is necessary.
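One simple way to remove duplicates is sketched below in Python. It assumes that a normalized e-mail address is an adequate respondent identifier, which will not always hold, and the function name `quilt_sources` and the sample addresses are hypothetical.

```python
def quilt_sources(*sources):
    """Merge (source_name, emails) pairs, keeping each address once.

    When an address appears in several sources, the first source listed wins.
    Returns a dict mapping normalized address -> source name.
    """
    seen = {}
    for name, emails in sources:
        for email in emails:
            key = email.strip().lower()  # assumption: e-mail identifies a person
            if key not in seen:
                seen[key] = name
    return seen

# Hypothetical sources: one address appears in both, with different casing.
client_db = ("client", ["Ann@Example.com", "bob@example.com"])
panel = ("panel", ["ann@example.com ", "cara@example.com"])

quilted = quilt_sources(client_db, panel)
```

Listing the client database first keeps overlapping respondents in the client cell, so the two cells remain distinct for analysis.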

  • Use a national representation
    As clients move ongoing tracking studies online, many seek a way to replicate random-digit dialing (RDD) sampling as closely as possible. Most sample sources can be balanced to Census demographics and geographic regions. However, mimicking a Census frame may not be sufficient in attempting to create a random probability sample online.

A nationally representative sample includes a representative sample pull from more than 3,100 counties, proportioned according to population in the four county sizes. Because some online sample suppliers are new to research, teaching the sample vendor how to pull these types of samples takes time and patience. However, the payoff is in the resulting representative data.
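The proportioning step can be sketched as follows. The four labels and population shares are placeholders rather than actual figures, and the largest-remainder rounding is simply one reasonable way to make the cells add up to the total.

```python
def allocate(total, shares):
    """Split `total` completes across groups in proportion to `shares`.

    Uses largest-remainder rounding so the allocations sum exactly to `total`.
    """
    weight_sum = sum(shares.values())
    exact = {g: total * s / weight_sum for g, s in shares.items()}
    alloc = {g: int(x) for g, x in exact.items()}  # round every cell down
    leftover = total - sum(alloc.values())
    # Hand the remaining completes to the cells with the largest fractions.
    by_remainder = sorted(exact, key=lambda g: exact[g] - alloc[g], reverse=True)
    for g in by_remainder[:leftover]:
        alloc[g] += 1
    return alloc

# Placeholder population shares for four county-size groups (not real data).
shares = {"A": 40, "B": 30, "C": 15, "D": 15}
quotas = allocate(1000, shares)
```

The same function can be applied a second time within each county-size group to spread completes across individual counties by population.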

  • Employ propensity scoring
    Many researchers are concerned that online respondents are different from offline respondents. This concern is underscored by the presence of “professional” online panel respondents. Panel respondents are frequently online, and they may also take many online surveys, which could influence the responses they provide. Propensity models can help minimize the impact of both of these issues.

Propensity models are useful with panel or database samples where behaviors and demographics can be modeled. Through the use of propensity models, respondents can be weighted proportionately for the sample pull based upon their Internet behavior, allowing for a more even representation of online/offline behavior.

Similarly, sample pulls can also be adjusted to minimize the impact of frequent responders and non-responders. Propensity scores applied during the sample pull will minimize the need for weighting during data processing.
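To make the idea concrete, the sketch below uses inverse-propensity weighting, a common approach that is not necessarily the model any particular supplier uses. The scores are hypothetical; in practice they would come from a model fitted to panel behavior and demographics.

```python
def inverse_propensity_weights(scores):
    """Return selection weights proportional to 1/score, normalized to sum to 1.

    `scores` maps a panelist (or panelist group) to an estimated propensity,
    for example the likelihood of being a heavy Internet user.
    """
    if any(not 0 < s <= 1 for s in scores.values()):
        raise ValueError("propensity scores must be in the interval (0, 1]")
    raw = {pid: 1.0 / s for pid, s in scores.items()}
    total = sum(raw.values())
    return {pid: w / total for pid, w in raw.items()}

# Hypothetical propensity scores for three groups of panel members.
scores = {"heavy_user": 0.8, "moderate_user": 0.4, "light_user": 0.2}
weights = inverse_propensity_weights(scores)
```

Members with a low propensity receive a higher chance of selection in the pull, which offsets their under-representation in the panel.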

  • Manage the sample to improve representation
    The techniques mentioned above are helpful in getting a representative sample pulled. But after the e-mail invitations are launched, the research supplier must continue to manage the sample to optimize representation.

Quota controls are needed to manage a balanced response. Letting all of the responses come into the data is fine if the client and research agency are only interested in volume/quantity of response. If a representative response is desired, then the research team must manage the response with quota controls and tracking quotas. Conversely, if the objective of the research is to define the profile of the target audience, managing response with quota controls runs counter to that objective.
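A quota control can be as simple as a counter per cell that closes when the target is reached. The class below is a minimal sketch; the name `QuotaTracker` and the two cells are hypothetical, and production survey software typically offers nested and interlocking quotas as well.

```python
class QuotaTracker:
    """Accept completes cell by cell until each cell's quota is filled."""

    def __init__(self, quotas):
        self.quotas = dict(quotas)
        self.counts = {cell: 0 for cell in quotas}

    def accept(self, cell):
        """Count the complete and return True if the cell is open, else False."""
        if cell not in self.quotas or self.counts[cell] >= self.quotas[cell]:
            return False
        self.counts[cell] += 1
        return True

    def open_cells(self):
        """Return the cells that still need completes."""
        return [c for c, q in self.quotas.items() if self.counts[c] < q]

# Hypothetical quotas: two completes per cell.
tracker = QuotaTracker({"female": 2, "male": 2})
results = [tracker.accept(c) for c in ["female", "female", "female", "male"]]
```

In this run the third "female" respondent is turned away because that cell is already full, and `open_cells()` shows that only "male" still needs completes.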

Most telephone RDD samples are managed through contact quotas and daily, weekly or monthly targets. Online contact quotas can be established with pull or push samples. Coupled with quota controls, contact quotas can be worked effectively online to produce a nationally representative sample.

A tracking study can be managed online the same way it would be managed in a phone center, significantly increasing the consistency with previous data collection.

Understand the source

A representative sample can be acquired in an Internet survey if specific techniques are applied. To gain comfort with online research, researchers must understand the source of the sample, how it is pulled and how it is managed during the interviewing process.

ARTICLE SIDEBAR

Sample sources explained

  • River sample (also called Web intercept or Web-screened sample) is a “push” sample. Participants are screened and directed to surveys for which they qualify. This sample source is not reflective of a panel environment and little is known about the respondent except what is learned in the screening.

Generally, river samples have access to millions of potential survey participants. Marketing messages are placed in banners and ads to drive traffic to survey sites. The participants who respond are willing to do surveys to receive awards/points and are, therefore, highly cooperative.

Usage frequency can be controlled during the screening process in some of the river environments. Respondents are coded based on the type/category of survey completed and the timing of the survey so that they can be locked out of future surveys based on the frequency of use rules.

  • List sample is similar to river sample in that little is known about the respondent. These potential respondents have agreed to receive e-mail messages relating to a topic of interest but have not necessarily agreed to participate in research surveys.

These lists typically have low response rates (1-2 percent) and, generally, no mechanisms are in place to control usage frequency.

Research suppliers that use list sample resources can exclude respondents who have participated in and/or completed a similar survey for them in the past by supplying respondent identification for de-duping.

  • Database sample is different from a list in that the member profile information is richer and there may be controls in place for usage frequency. In many cases, database sample is used primarily for marketing purposes, not research. Some databases are positioned as panels but lack the structure and response rate of a true panel. Response rates vary from 5 percent to 15 percent for most database samples.

  • Online panels are built predominantly for online survey research. Several have been developed in recent years: some by research companies, some by sampling companies and others by companies that know little about research or sample.

Panels enjoy the highest response rates (most 15-35 percent with some upwards of 50 percent) and offer the ability to control usage frequency by category, type of study, etc. Panel members have agreed to participate in surveys and are incented by the panel company for their participation.
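The response rates above translate directly into how many invitations a study needs. The rough calculation below uses the low end of each range quoted in this sidebar and a hypothetical target of 500 completes. It assumes every person who responds qualifies and completes, so a real estimate would also have to allow for incidence and drop-off.

```python
def invitations_needed(completes, response_rate_pct):
    """Invitations required to reach `completes` at a whole-percent response rate."""
    if not 0 < response_rate_pct <= 100:
        raise ValueError("response rate must be between 0 and 100 percent")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-completes * 100 // response_rate_pct)

# Low end of the response-rate ranges quoted in this sidebar, in percent.
low_end_rates = {"list": 1, "database": 5, "panel": 15}

# Hypothetical study needing 500 completes.
needed = {src: invitations_needed(500, pct) for src, pct in low_end_rates.items()}
```

Under these assumptions a list sample needs 50,000 invitations to reach 500 completes, a database sample 10,000 and a panel about 3,334.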