Untangling the Web

Editor’s note: Steve Crabtree is corporate editor at The Gallup Organization, a Lincoln, Neb., research firm.

As writer Alphonse Karr once said, “The more things change, the more they are the same.” If Karr had lived in modern America, his oft-quoted insight might easily have been inspired by the World Wide Web. In just five years, the Web has gone from being a relative novelty to changing the way most Americans, and millions of others around the world, live. Practically every human interaction previously conducted in the physical world has been retooled for the online environment: banking, shopping, dating, even sex.

Now that the novelty of the Web’s initial explosion into everyday life is beginning to wane, its fundamental social implications are being explored more thoroughly. Some observers (see, for example, Andrew Shapiro’s new book The Control Revolution) have noted that the new medium has resulted in a social revolution oriented around the individual. They contend that the balance of power has shifted toward ordinary people, because they are no longer as reliant on government, media and big business for the sending and receiving of information.

Empowerment of the individual would seem to make public opinion all the more significant, right? Maybe so, but the rise of the Internet also means that all the problems survey researchers addressed in the 20th century, along with a variety of new ones, must be reexamined in the new context. That process was a key focus of the Gallup Research Center’s 2000 Nebraska Symposium on Survey Research, held in April.

The potential for conducting surveys online is enticing - it’s cheap and fast, and practically anyone with a modem and some Web development software can administer them. But the temptation to jump in without fully exploring the methodological challenges presented by the new medium has resulted in countless bad surveys -- conducted not just by teenagers working out of their parents’ basement, but by otherwise reputable research organizations. “We should be concerned about this because badly done Internet surveys hurt us all,” said Andy Anderson of the University of Massachusetts. “They make market managers leery of commissioning the research and make the public cynical and uncooperative.”

If the symposium was any indication, survey scientists are now beginning to hammer out some of the fundamental problems associated with online polling. Titled “Survey Research: Past, Present and Internet,” the event brought together more than 80 researchers from the academic and business worlds who have been working to understand the theoretical and practical issues involved.

There’s a lot riding on this research. Phone surveys are becoming increasingly problematic, thanks to the growing volume of telemarketing calls and the corresponding increase in call screening by potential respondents. As Don Dillman of Washington State University contended, survey researchers may well come to rely primarily on mail and Internet questionnaires in the 21st century.

But there may also be implications for the way we view democracy itself. As Gallup Poll Editor-in-Chief Frank Newport noted, a growing cadre of people see the Internet as the holy grail of a more direct form of democracy -- a means by which decisions about policy can be made by all citizens, not just their elected representatives (see, for example, www.realdemocracy.com). To some observers, the use of online polling is a test of the degree to which the Internet can be used to gauge consensus among a larger and more far-flung population than ever before.

Pull of the past

The barriers, however, are considerable. It is virtually impossible at this time to construct a viable sampling frame of e-mail addresses. And even if that could be done, Americans in lower socioeconomic strata are less likely to have regular Internet access than are their more well-heeled counterparts, so the degree to which results can be generalized to the larger population is a thorny question. Privacy issues are also important - assurances of confidentiality ring somewhat hollow to many respondents in the largely unregulated online environment. Then there is a whole set of questions regarding the ways in which the context and design of online surveys influence responses.

When facing these challenges, pollsters are bound to experience a strong sense of déjà vu. Many of these problems have been dealt with before, albeit in different form. It’s almost as if the history of survey research has been rewound and must be fast-forwarded through again in cyberspace.

Probability sampling

Questions of statistical representativeness, for example, preoccupied survey practitioners in the first half of the 20th century, highlighted by the lessons of the 1936 and 1948 elections, when sampling flaws contributed to spectacularly embarrassing failures among major pollsters. Such concerns subsided in the 1960s and ’70s as telephone penetration rose to over 96 percent of the U.S. population, so that random phone surveys could be considered representative. But even if a way to randomly sample e-mail addresses becomes available, it will be some time before regular Internet access reaches an acceptable level of coverage. In the meantime, citizens in lower socioeconomic strata are less likely to be able to participate in online surveys. Furthermore, there is a strong self-selection bias to contend with; only those with Internet access who actively choose to participate will do so - there is no trained interviewer tactfully cajoling reluctant respondents to stay on the line.

It’s valid to ask, as James Beniger of the University of Southern California did, whether general representativeness is always necessary. Beniger compared Internet polls to the straw polls of the 1800s, which allowed anyone who showed up to participate. He characterized online studies as a return to “mind-speaking”: those who are available and feel strongly enough to express their opinion do so, and there is little regard for formal representativeness. Only in the last few decades, Beniger noted, has the preoccupation with “mind-reading” come about. Starting with commercial applications, polls became increasingly concerned with gauging not just the opinions of respondents, but what was actually in the aggregate mind of the public. Though most Internet polls are reminiscent of mind-speaking, Beniger argued, there’s no reason to think they are less vital as a form of public expression.

In many cases, especially in the business world, general representativeness may not be a concern. For example, most of The Gallup Organization’s online surveys are conducted either among limited populations for which the membership is completely identifiable (e.g., employee surveys for large corporations), or for clients who are interested in generalizing only to that segment of the population which uses the Internet regularly. As Bill Sukstorf, Gallup’s product engineer for Internet surveys, noted, “Online companies want answers to questions like, ‘Who’s visiting my site?’ ‘What do they think of my site versus my competition’s site?’ ‘How many times do they come here before they make a purchase?’ One of the next big areas for Gallup is doing good e-commerce research.”

Technology

Nevertheless, in light of the mounting problems with telephone polls, the ability to conduct generally projectable public opinion surveys over the Web is highly coveted. Compounding the problem of incomplete coverage is that of inconsistent technology. Browser and bandwidth differences mean that, at least for now, survey practitioners can’t be confident that all their respondents will see the same thing the same way, and be able to complete the survey with comparable speed. Netscape’s browsers might produce a slightly different visual interpretation of the underlying code than Microsoft’s, introducing a possible source of unwanted variation among respondents. Researchers learned to minimize interview-specific variation in phone surveys by carefully designing questionnaires (putting emphasized words in all caps, for example) and by training interviewers to read all questions verbatim. A different approach is required for online surveys, however, as the source of variation rests in the respondent’s computer, and is out of the researcher’s control.

Privacy

Concerns about security, both real and perceived, represent another set of obstacles. Just as telephone pollsters learned the best ways to ensure their respondents’ confidentiality, online researchers must ease respondents’ fears that their information will be intercepted somewhere in the digital ether and somehow used against them. More advanced online security techniques are helping to alleviate such fears, but the issue must still be considered.

Sampling, technology and privacy issues will become less forbidding over time as more and more households go online and new technologies like WebTV and Palm Pilots make it possible to connect to the Web without even owning a computer. What’s more, as the differences between Internet users and non-users become better understood, weighting techniques will grow more sophisticated and capable of compensating for the remaining undercoverage. Indeed, George Terhanian of Harris Interactive described a parallel-methods technique called “propensity score adjustment” which his company uses to “efficiently balance the characteristics beyond demographics that differentiate our online and phone respondents.”
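The details of Harris’ proprietary method were not spelled out at the symposium, but the basic logic of propensity-based weighting can be illustrated with a deliberately simplified, one-variable sketch: estimate how over- or under-represented each type of respondent is in the online sample relative to a reference (phone) sample, then weight online respondents by the inverse of that imbalance. The toy data, the `heavy_user` attribute and the function name below are all hypothetical, and a real adjustment would model propensity from many demographic and attitudinal variables at once.

```python
from collections import Counter

def cell_weights(reference, online, key):
    """Weight each online respondent so that the weighted distribution
    of `key` in the online sample matches the reference (phone) sample.
    This is a one-variable, cell-based illustration of the idea behind
    propensity score adjustment, not Harris' actual procedure."""
    ref_counts = Counter(r[key] for r in reference)
    web_counts = Counter(r[key] for r in online)
    n_ref, n_web = len(reference), len(online)
    # weight = (share of this cell in reference) / (share in online sample)
    return [
        (ref_counts[r[key]] / n_ref) / (web_counts[r[key]] / n_web)
        for r in online
    ]

# Hypothetical toy samples: the phone sample is 50/50 heavy vs. light
# Web users, while the online panel skews 75/25 toward heavy users.
phone = [{"heavy_user": True}] * 2 + [{"heavy_user": False}] * 2
web = [{"heavy_user": True}] * 3 + [{"heavy_user": False}] * 1

weights = cell_weights(phone, web, "heavy_user")
# Heavy users are down-weighted (0.5 / 0.75 = 2/3); the light user is
# up-weighted (0.5 / 0.25 = 2.0), restoring the 50/50 reference balance.
```

After weighting, the online sample’s heavy-user share matches the phone sample’s: 3 respondents at weight 2/3 and 1 at weight 2.0 contribute 2.0 weighted units each. Modeling propensity from variables beyond demographics, as Terhanian described, extends this same idea to characteristics like attitudes and Web behavior.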

Assuming all these technical concerns will be smoothed out eventually, researchers are still left with issues created by the difference between self-administered and interviewer-administered surveys. Self-administered surveys have been around in the form of mail questionnaires for decades, but cognitive psychologists such as Jon Krosnick of Ohio State University have only recently begun to consider how the interface between respondent and computer affects the survey situation differently than that between respondent and interviewer. Krosnick noted that response effects can occur at several different stages of cognitive processing. “Beyond all of these mechanical considerations is the psychology of the respondent,” he said. “To have the questions coming from a computer screen or a television set, as opposed to a human being, might compromise the psychological processes people bring to the task. If so, we may want to rethink what we can accomplish in this mode.”

Dillman, on the other hand, is a leading proponent of self-administered surveys. A Gallup senior scientist and author of the new book Mail and Internet Surveys: The Tailored Design Method, Dillman noted that the Internet is capable of combining all the benefits of computerization currently enjoyed by phone surveys with the cost and convenience advantages of paper questionnaires. But he noted that, unless pollsters learn to craft online questionnaires very carefully to account for browser effects and other vagaries of the online environment, those advantages will be moot.

Dillman also believes that mixed-mode surveys will become dominant in the coming decades, and that successful survey organizations will learn how to integrate the most viable techniques of the past into the shifting social and technological climate of the present. “Survey organizations,” Dillman said, “whether they are in universities like mine, in private-sector organizations or in government organizations, are going to have to change dramatically in some ways in order to do effective surveys as we bring these new technologies online and still use our other technologies where they work.”

Push of the present

Despite these complex issues, the push to develop reliable, broadly generalizable online surveys remains strong. Internet surveys can be done at least as quickly as phone surveys, and much more cheaply. They place less burden on the respondent because they can be completed at any convenient time, day or night. They can employ increasingly sophisticated interfaces and data capture techniques, while introducing none of the human error generated by live interviewers.

But the bottom line is this: Clients of polling organizations want to conduct online surveys. Thanks to ubiquitous tales of booming dot-com startups and the new class of computer-geek millionaire, everyone is hungry for a piece of the virtual pie. Thus, there is currently huge interest among business leaders in all things Web-based. Doug Rivers of InterSurvey contends that the richness of the design options offered by the Web means that, “It’s not just a matter of doing surveys faster and less expensively -- you can do them better.” Whether or not that’s the case, it’s hard to deny that the versatile, user-friendly visual medium appeals to executives who know more about marketing appeal than statistical validity. The result is considerable pressure on survey scientists to find ways to make online methodologies viable.

The use of panels is currently the most common method used by organizations attempting to conduct representative Web-based surveys. At the symposium, Harris’ Terhanian and InterSurvey’s Rivers discussed their companies’ respective panel approaches. Harris maintains data on a massive, randomly selected panel of more than five million Americans, and from that pool selects a quota sample of known Internet users to participate in online surveys. This approach is efficient but potentially non-representative, because only previous Internet users are capable of participating. As noted, these people tend to differ from non-users in meaningful ways. Terhanian discussed Harris’ strategy of “triangulation,” whereby several different methods are used to approach the problem, in hopes that the different biases of each method will cancel each other out.

InterSurvey more directly addresses the representativeness problem by actually hooking respondents up to the Net so they can participate. Randomly selected participants are provided with a WebTV box, ensuring not only that they will all be able to participate, but that they will all do so using identical technology. That allows InterSurvey to explore the potential of the Web as a visual medium; respondents can be shown different versions of television ads using streaming video, for example -- a technique which would otherwise be crippled by inadequate bandwidth for many respondents.

Gallup has taken a different approach, forgoing generally representative online polls until approaches to the sampling frame problem are further developed, and the sampling error of such surveys can be reliably assessed and minimized. But that doesn’t mean Gallup isn’t currently in the business of administering surveys via the Internet. Bob Tortora, Gallup’s chief methodologist, is focusing on ways to maximize response rates through the use of incentives, for example, and by using multiple modes, such as e-mail reminders and follow-up postcards to perfect the flow of communication to and from respondents.

Dillman is another key player in the development of Gallup’s online questionnaires, applying his ideas about tailored design to reduce opportunities for measurement error. Other Gallup associates are investigating related questions. Karen Swift and Julie Kohrell, for example, will soon release a study examining the mode effects of paper vs. phone vs. online surveys, assessing such factors as completion rates and non-response bias.

Prospects for the future

Speculation about what this current work will mean to future survey practitioners was also a part of the April symposium. Anderson, for example, envisioned data-driven neighborhood centers that serve a sweeping variety of entertainment and reference functions for community members, as well as microcomputers unobtrusively distributed throughout households, gathering a broad spectrum of information on different family members in order to conform to their preferences and make their lives easier. Newport spoke of a more finely tuned democracy, which could result from an increased effort to assess people’s opinions.

Though the possibilities offered by online technologies stretch the imagination, some participants also cautioned against losing sight of other considerations for the future. Ohio State’s Krosnick, for example, noted that it would be easy to let a preoccupation with technical issues distract researchers from such basics as the psychological needs of respondents. “Maybe we need to think less about what we do in surveys to design our procedures and think more about how we talk about survey research in the larger social dialogue,” he said. “If Americans were more excited about and convinced of the value of survey data, that every time they answered a survey they were going to have a real effect on something that matters, would response rates and motivation and effort go up?”

Such caveats notwithstanding, events like the April 2000 symposium are charged with the anticipation of future possibilities. There is the general feeling that new capabilities are developing so rapidly that it’s difficult to imagine any problem being insurmountable in the long run. Andy Anderson of the University of Massachusetts may have captured the overall mood of the event best when he concluded that, “Optimistic as it sounds, I think that Internet surveys or their functional equivalent are going to allow us to do much better market research and much better social science, while protecting, rewarding, entertaining and informing our respondents. In the long run, although we may be listening in a different way, we’re truly going to be better able to hear the world speaking.”