Editor's note: Tim Huberty is president of St. Paul-based Huberty Marketing Research.

"Let's test it." Without question, those are the three words that strike the most fear into the hearts of agency account executives, creatives, and researchers. All that heavy upfront investment in account planning ("getting into the soul of the consumer") has just had its credibility torpedoed once again. The scenario further unwinds when the agency finds out that they can't just "run it by a few focus groups." The agency has to turn over its "baby" to the cold-hearted, detached, objective expertise of an outside copytesting research supplier.

After the mandatory protests and brief periods of mourning, the next greatest challenge is to select the "bad guy," the company that will perform the testing. Many an assistant research manager on the agency side or an assistant product manager on the client side has had to find out who expertly performs this sort of atrocity. Usually, this entails determining who's out there and then painstakingly compiling a chart of what is done, how, when, where and for how much.

After reading this article, you'll never have to delegate that task again. Indeed, what I will do here is identify those companies and highlight their strengths and weaknesses, playing the role of detached matchmaker myself.

There are a couple of rules of thumb to keep in mind. First of all, these are the major players. They specialize in copytesting. Typically, they don't do tracking studies and customer satisfaction research. Copytesting is their bread and butter. Of course, there are countless other companies out there that do copytesting. But those generalists also do tracking studies and customer satisfaction research. I mean, I can do copytesting. Of course, I can also perform lobotomies, but haven't had too many takers calling me for that service.

Second, the thing which separates the specialists from the generalists is the fact that the copytesting experts have quantitative norms. Norms are those comparative numbers which indisputably measure how well your ads or campaign actually do stand up against others. Norms are the objective evidence that their systems work. And obviously, those norms are much more credible than the anecdotal, "Yeah, that good-looking woman wearing the silk blouse, sitting in the corner of the third focus group in Minneapolis, sure got upset over the second storyboard."

Third, testing an ad or campaign is not cheap. However, the old axiom of "you get what you pay for" certainly plugs in here. You could probably find some guy who works out of his basement to drive to Huron, S.D., to interview the local yokels about print ads for a regional chain of banks for $2,500 (I've done that!). But with ad production running into the hundreds of thousands of dollars, an amount which itself is dwarfed by the media spending, why would you want to try to save a few pennies on the most important step, i.e., determining if you've got anything worth producing or showing to anybody?

Fourth, the more "help" you can give these specialists, the more opportunity your baby has. Just look at the effectiveness of intense lobbying upon clueless legislators! I very strongly recommend passing on all creative briefs, brand profiles, and strategies to the copytesters beforehand. Help put them in the "proper mood." The more they can come to think like you do, the more likely they are to sympathetically interpret the results with you in mind. Some years ago, a campaign my agency had worked on for many, many months was turned over to one of these testing services. The only thing we (very, very grudgingly) sent to the copytesting service was the required videotape. That's it. Even today, I recall the horrifying screams of rage from the creatives when they found that the ad garnered a negative persuasion score. (In other words, not showing any advertising would have been more effective than showing this ad!)

Finally, I have chosen not to include the "physiological" testing services in this article. I acknowledge some companies have been touting eye tracking, galvanic skin response, voice pitch analysis, and even brainwave analysis for several years. I have tried several of them, but they've never worked.

Years ago during my agency days, I allowed several ads to be eye-tracked. Trouble was, each time I went to the mall to observe, the interviewers could never synchronize the machine to the respondents' pupils. (I pray they were more successful when I wasn't there!) Last fall, I had an individual trying to show me the potential of brainwaves. He enthusiastically told me how well the technique worked on his fellow employees. Problem was, the headpiece "measured" more brainwave activity when it was just sitting on an empty chair.

So, what follows is a brief synopsis of each service. This overview is meant to complement the attached comparison chart. Ultimately, however, both the description and the chart are to help you decide which supplier you actually want to call. Consequently, the most important row on each chart is probably the last one, the one which lists the name, phone number, and e-mail of a contact person at each company. Obviously, that person can provide much more in-depth information about the company - and attempt to "correct" some of my candid observations.

Ameritest

Ameritest is a company which is in love with the science of how - and why - advertising works. Interestingly, the company was founded by a former agency researcher who cut his teeth on the Leo Burnett Company pre-testing system but has made quantum improvements upon those early techniques. Ameritest claims to be a "fast-growing research company with major multi-international clients."

Data collection is done much like some of the other copytesting systems. Respondents are recruited via mall intercepts. They are shown the commercial in a clutter reel of five ads. However, unlike other systems which conduct clutter testing for the sake of clutter testing, Ameritest uses the clutter test as the first indication of brand linkage. This top-of-mind mention is the handle for retrieving recall of the advertising. Ameritest has found that "the top-of-mind measure of brand linkage is more discriminating across commercials and more predictive of in-market results." In addition to brand linkage, attention and motivation are also measured following this initial exposure.

Again, as in other copytesting systems, respondents are shown the commercial a second time. Here, a battery of both open- and closed-ended questions and attitudinal statements provides additional learning. As testament to the sensitivities of its agency origins, Ameritest shuns "report cards" and instead reports results under "What's Working" and "Opportunities for Improvements." The focus of the system is on actionable diagnostics, highlighted by visual communication. Ameritest has a slew of case histories demonstrating how ads with problems were re-edited for improvement.

Finally, respondents are shown the commercial a third time, but this time the ad has been "sliced" in picture-frame segments. Viewing the commercial over a computer, the respondent "clicks through" the degree to which each and every picture is attention-getting - which leads to Flow of Attention scores - and emotion-generating - which leads to Flow of Emotion scores. These two charts literally become a map of what is working within the execution. "It is a diagnostic tool designed to help you understand how well the viewer has processed the visuals in your commercial," the company's literature states.

The "deconstruct" technique is also applied to print advertising. However, this time, the ad is divided into smaller boxes. The respondent is exposed to the ad three more times - in half-second, one-second and four-second increments - and again he or she indicates what individual parts of the ad are attention-getting and emotion-generating.

The ARS Group

The ARS Group of rsc, the quality measurement company, is the largest copytesting organization in the United States. Over the past 25 years, they have tested over 40,000 television commercials! The ARS Group is passionate about copytesting. Their overall philosophical approach is that the advertising process is an investment that can be managed and should ultimately pay out in an identifiable ROI to the advertiser. Moreover, The ARS Group views itself as a "copy management" partner whose goal is to help the advertiser and the agency improve the odds of success.

The ARS Group recruits a nationally representative sample of approximately 800-1,000 men and women to come to a central location. Before viewing two half-hour television pilots, respondents are asked to select the branded products that they would like to receive, should they win the prize drawing. This measurement becomes "Pre-Choice." Prize winners are then drawn, and respondents are shown the television pilots (with the commercials, of course).

After viewing the shows, respondents are asked a series of questions to critique the television material, and a second set of product choices is made. This measurement becomes "Post-Choice." The difference in brand preference between "pre" and "post" is known as the ARS Persuasion metric. This measurement has been validated to actual business results more than any other advertising measurement in the business.
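For readers who like to see the arithmetic, the pre/post comparison can be sketched in a few lines of Python. This is only an illustration, not The ARS Group's actual scoring procedure: the function names, the sample data, and the assumption that the score is a simple percentage-point difference in choice share are all hypothetical.

```python
def choice_share(choices, brand):
    """Percentage of respondents whose selected brand is `brand`."""
    if not choices:
        raise ValueError("no respondents")
    return 100.0 * sum(1 for c in choices if c == brand) / len(choices)


def persuasion_shift(pre_choices, post_choices, brand):
    """Post-choice share minus pre-choice share, in percentage points.

    Assumption: a plain difference in shares. The real metric may be
    adjusted or weighted in ways the article does not describe.
    """
    return choice_share(post_choices, brand) - choice_share(pre_choices, brand)


# Hypothetical data: 100 respondents choosing between brands A and B.
pre = ["A"] * 20 + ["B"] * 80    # before the pilots are shown
post = ["A"] * 26 + ["B"] * 74   # after the pilots (and the ad for A)

shift = persuasion_shift(pre, post, "A")     # 26% - 20%
backfire = persuasion_shift(post, pre, "A")  # same data reversed: 20% - 26%
```

A positive shift means more respondents chose the brand after exposure than before. A negative one, as in the reversed example, is the kind of result that produces the screams of rage described earlier.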

Finally, after exposure to the television material, a sample of respondents is called back to obtain the ARS Related Recall and Key Message Communication measurements. A validated diagnostics profile is provided along with the ARS Persuasion score.

The ARS Group offers a complete line of services as part of their "Best Practice Approach" to advertising development and management. Measurement is applied at pivotal stages to ensure success. These services help advertisers make decisions in such diverse areas as the selection of a selling proposition, projection of wearout effects (using proprietary outlook planning software), the monitoring of competitive activity, and the evaluation of storyboards and concepts (using the Interactive Diagnostics service).

Diagnostic Research

Diagnostic Research (DR) believes that, "Advertising research should focus on the integration of copy and execution in order to provide insight about the relevancy of the creative idea." DR is philosophically opposed to copytesting systems which are strictly evaluative (single numbers-oriented) since they believe that "the best advertising research technique must provide a comprehensive assessment across a variety of performance areas as a means of optimizing the advertising."

DR always recommends testing in some rough/pre-finished format as it affords the best opportunity to test multiple ideas and apply insights when they can be most useful. However, many of their clients test in finished form, and apply the learning to future executions. For both TV commercials and print ads, qualified respondents are recruited via a camouflaged screener in mall locations and escorted to the interview location, where a one-on-one in-depth interview is conducted. For TV tests, respondents are exposed to a clutter of seven television commercials, all in the same level of finish as the test commercial. The test commercial is always in fourth position. Immediately after this first exposure, brand name recall and main idea are gauged. Then, respondents are re-exposed to the test advertising and questions take on a "diagnostic" focus. This begins with questions regarding viewers' comprehension of the advertising (message takeaway, relevance, clarity), then moves on to evaluative measures that address viewers' emotional response to the stimuli (thoughts and feelings, likes/dislikes, credibility, tonality and distinctiveness). Ultimately, the interview focuses on consumers' response to the advertising (brand imagery, personality, opinion and purchase interest).

Print testing follows much the same procedure. Again, respondents are recruited at a mall location. This time, they see the test ad within a clutter of 19 print executions of noncompetitive products. The 20 ads in the clutter portfolio are rotated from respondent to respondent to avoid order bias. There is no editorial content except for the constant filler copy that appears opposite single-page ads in the print portfolio. After that, the interviewing format is essentially the same as for television testing.
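The rotation idea is easy to picture with a short Python sketch. DR does not say how its portfolio is actually rotated, so the cyclic scheme below, along with the function name and ad labels, is purely an assumption for illustration.

```python
def rotated_portfolio(ads, respondent_index):
    """Return the ads in a new order for a given respondent.

    Assumption: a simple cyclic rotation, shifting the starting ad by one
    for each successive respondent. Other schemes (e.g. random shuffles)
    would serve the same purpose.
    """
    if not ads:
        raise ValueError("portfolio is empty")
    k = respondent_index % len(ads)
    return ads[k:] + ads[:k]


# Hypothetical 20-ad portfolio: the test ad plus 19 clutter ads.
ads = [f"ad{i}" for i in range(1, 21)]

first = rotated_portfolio(ads, 0)   # original order
fourth = rotated_portfolio(ads, 3)  # starts three ads further along
```

Across any 20 consecutive respondents, every ad appears once in every position, so no single ad benefits from always being seen first or suffers from always being seen last.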

Ipsos-ASI

Ipsos-ASI's copy testing portfolio includes multiple products marketed under the brand name "Next." Ipsos-ASI believes "Next represents the most comprehensive set of pre-testing tools in the industry."

The flagship product is Ipsos-ASI Next*TV, a copy test system that measures television advertising's impact among both general and targeted adult audiences. The Ipsos-ASI Next*TV system exposes test advertising to consumers in their homes embedded in a 30-minute TV program under the guise of television program research. The method also includes a forced exposure to the ad at the end of the interview for in-depth diagnostic assessment.

Respondents are pre-recruited via telephone and asked to participate in a TV program evaluation. The program is held constant to control content effects. Test materials are mailed to consumers, including instructions and a patented, self-erasing videotape. At a specified time, consumers view the program. The test ad appears in the program, providing a natural exposure to the advertising. The next day respondents are called back. After the Recall interview, they participate in the diagnostic part of the study.

Ipsos-ASI Next*Print is the in-home, in-magazine ad testing system. This system's comprehensive measures and testing environment parallel Next*TV in terms of general approach, sample, and key measures. Tests are conducted in current issues of general distribution magazines (such as People, Newsweek, or Better Homes and Gardens) that are purchased prior to newsstand distribution. Consumers are recruited to evaluate a magazine, in which ads have been "tipped-in." Consumers read the magazine and the advertising in their homes, providing a natural exposure to the ad. Their responses are collected one day after exposure. Finally, a forced exposure is also provided for diagnostic questions.

Other members of the Next family include Next*Print, Next*Kids, and ASÍ Es (Hispanic pre-test methodology), along with Next*Idea and Next*Print Express for early-stage ideas.

Mapes and Ross

Mapes and Ross (MR) provides two complementary copy testing services. For both television commercials and print ads, MR offers Natural Exposure, which "tests advertising in the most realistic environment available" and CoreSearch, "a forced-exposure, immediate response method." Both systems can accommodate various production stages, including reels of all animatics, all photomatics, and all finished commercials.

Natural Exposure determines how one's message is perceived under "realistic conditions." Either 150 or 200 respondents are recruited in geographically dispersed markets. For print, they are recruited via door-to-door interviewing and asked to read and react to the current issue of a magazine; for television, they are recruited by telephone and asked to watch a prime-time television program. At the time of recruitment, pre-exposure levels of preference for the client's category and five other categories are taken. All respondents are recontacted by telephone the next day and then interrogated about the advertising. This allows MR to compute Day-after Preference Change, Day-after Recall and Idea Communication.

CoreSearch resorts to good old-fashioned mall intercepts. One hundred respondents (n = 100) are recruited off the mall and taken to a "secure location." Clients have the option of including both clutter exposure as well as a subsequent re-exposure to just their own advertisements. Following ad exposure, respondents are asked intensive diagnostic questions. CoreSearch is designed "to provide a strong and varied diagnostic assessment of how respondents perceive and react to a message at the time the communication takes place."

Both Natural Exposure and CoreSearch can be modified to include customized questions to address specific concerns. In addition, MR offers a proprietary approach, Profile, to determine the impact an ad has on a brand's image, and EquiMax, a proprietary method for evaluating the effect of advertising on brand loyalty. Finally, MR has applied their techniques to other media, including radio, newspapers, newsletters, etc.

Millward Brown

Because Millward Brown (MB) recognizes that copy research must do several different jobs, they have a variety of products, all of which are very diagnostic in nature. MB evaluates both animatic and finished ads using their LINK copy testing system. MB is adamant that "LINK is not a one-number system." Test elements combine to produce strong feedback on how the ad will perform in real life, giving evaluative and diagnostic information on 1) creative breakthrough, 2) brand linkage, 3) communication, and 4) how the ad accomplished what it did. "In this way the LINK copy test provides pass/fail sorts of insights with deeply diagnostic feedback on possible improvements for the ad," the company says.

MB's TV LINK is a central location test conducted in malls across the country. A total of 150 interviews are conducted with a standard questionnaire. Respondents are shown a clutter reel of four commercials, in any stage of production from animatic through produced execution. A "practice ad" is shown and "practice questions" are asked so that respondents get comfortable. The test ad is then shown again. After the second exposure, the full interview is conducted. Finally, the respondent sees the ad a third time and then completes the "interest trace."

MB's Print LINK methodology consists of a customized sample of 100 respondents. The ads are placed in a portfolio designed to replicate the type of environment where they will ultimately be seen. Tests are typically conducted on a national basis (which translates into eight sites across four geographic regions). The interview lasts 20 minutes, during which time a myriad of information is collected, ranging from the ad's ability to "pull the reader in" to its "relationship with the brand."

MB has been around for over 20 years and obviously knows its stuff. They preach that, "Branded memorability is the key." Branded memorability is "different from traditional 'recall' or 'persuasion' measures," since "it has been correlated with short-term and longer-term sales effects using statistical, sales allocation modeling."

RoperASW

Until being acquired last September and then later merging with a third company, RoperASW was known as Roper Starch. Roper Starch had been testing advertising for so long that its name has evolved into a verb. Starch invented print ad testing in 1923, a heritage which has caused many a product manager to instruct his or her agency to "Starch" an ad. This would consist of a through-the-book interview with issue readers, in which respondents were asked whether or not they had seen or read each ad. Then, three degrees of reading were recorded: 1) noted, 2) associated, and 3) read most. They still do this.

But now RoperASW has evolved into bigger and better things. Unlike other services, RoperASW believes that television commercials should be tested monadically, to "determine respondents' top-of-mind response, which can only be achieved in a non-clutter environment." Their pretest methodology, ADD+IMPACT, is designed to go beyond "win" or "lose" test results.

For both television and print testing, respondents are recruited to a central location. The actual interview itself is a one-on-one, semi-structured in-depth interview with many open-ended questions. These responses are tape-recorded to ensure complete fidelity, and complete transcripts are provided as part of the final report. The interview concludes with an in-depth, self-completed questionnaire relating to attitudes and feelings about both the creative and the advertising product.

ADD+IMPACT uses a modified norm which compares test results with those of effective ads. The thinking behind this approach is that the most important questions that an advertiser wants answered are: Will my ad attract attention and hold the audience's attention? And will it increase/maintain brand use?

While norms are available by ad type, country, and product type, this system downplays the use of traditional norms because an ad may perform considerably above a given norm, but still not be effective (particularly if the norm is low to begin with). This "effective norm" does not change by product category, medium, or country.

MSW Group

Finally, there's at least one other company out there, the MSW Group (formerly McCollum Spielman Worldwide). According to A.B. Blankenship and George Edward Breen in State of the Art Marketing Research, 400 respondents are recruited to a central location under the guise of reviewing a proposed television show. Respondents answer questions about brand/product usage before viewing a half-hour variety show (including a station-break clutter sequence of commercials with the test commercial). Then questions are asked about reaction to the show and unaided brand recall and copy recall. Next, the test commercials are shown alone, followed by specific questioning about a "market basket choice" or a constant-sum question where the respondent allocates a given number of points to one or more brands.

At least, I think that's the way it works. The MSW Group never returned my phone calls.

The next generation

What I have presented is an overview of the major players specializing in testing advertising copy. Interestingly, a few hinted that they are currently investigating "other venues." The one which came up most frequently is testing over the Internet. A few companies have even tried it, but most are not yet ready to "go public." I personally feel that's where things are headed, given the ongoing problems with quality control in mall intercepts and/or the challenge of luring people out of their homes to review proposed television shows with a bunch of strangers. Maybe Internet copytesting is a methodology which just needs more championing from the client side. Perhaps next year I'll be writing the same review of major players who perform copytesting over the Internet!

Table 1