A definite impact

Editor’s note: Ann L. Breese is marketing research director for Seattle-based Starbucks Coffee Company. Donald E. Bruzzone is president, Bruzzone Research Company, Alameda, Calif. This article is adapted from a presentation made before the ARF/ESOMAR Worldwide Audience Measurement conference in June 2003.

In the summer of 2002, Starbucks wanted to answer this question: Does Starbucks’ out-of-home media (such as billboards, kiosk ads and vehicle wraps) reach and affect people as efficiently as Starbucks’ investments in television, radio and print advertising? As our research showed, the answer was yes: out-of-home was at least as efficient as the other media.

To manage advertising effectively - and make the right decisions on how much and what kinds - you need to know how advertising is really performing. To do this, Starbucks uses a recognition-based tracking program,[1] which we feel avoids the elements of risk inherent in alternative methods:

Dissimilar audience measures: Traffic by the site is good to know, but how many of those people actually notice your billboard? How does the value of a million people driving by your billboard compare to the value of a million people sitting in front of a TV set while your commercial is playing? We know it is different, but how different? And as everyone is aware, the accuracy with which the industry measures either audience is open to some debate.

Overlap: A second advertising medium makes a lot more sense, and is a lot more valuable, if it reaches people who were not reached by the first medium. The readily available audience data doesn’t tell you how much of what you are buying is duplicated overlap and how much is new, previously unreached audience.

Inaccurate measures of reach and effect: Those who take the first two blind spots seriously will usually turn to tracking studies: custom studies designed to determine how many people were reached and affected by the advertising being run. But what kind of study? Recall-based telephone trackers are still the most popular. But we found that when you ask whether people recall any of your recent advertising, and they say yes, they may well be thinking of advertising you ran last year, or they could even be thinking of your competitor’s advertising. When you show them the advertising instead and ask, “Do you remember seeing THIS before?” you get a massive increase in accuracy.[2] Added to that is a problem that may be unique to firms in Starbucks’ position. The Starbucks brand is so large in this country, and the coffee shops are so widespread, that people expect Starbucks to advertise, and we find substantial numbers saying they recall Starbucks advertising during periods when there has not been any.

Online study

The tracking study conducted by Bruzzone Research was designed to measure the effect of Starbucks advertising during the summer of 2002. The Starbucks name had not been closely associated with summertime drinks. One of the objectives of the summer advertising was to change that.

It was done with a before-and-after study conducted online. The online population was getting close to matching the total population, though it still had a somewhat younger, more upscale skew. That made it a good fit for most Starbucks products. Doing it online enabled us to show each respondent virtually every piece of advertising Starbucks used over the summer. We did it as illustrated in Figure 1. A total of 24 items were shown in the same manner; Figures 2 and 3 give a feeling for the other items in the campaign. For the radio commercials, respondents clicked on a speaker symbol to hear an excerpt. When respondents recognized any of the ads they were asked additional questions. More about that in a moment. But first, how do we use the recognition data?

Recognition is what tells us whether a respondent noticed an advertisement. We use that to see if those who noticed a particular advertisement showed higher levels of awareness, favorable impressions and buying behavior. But we needed to avoid the classic problem of post-only studies: the people who always had the most favorable reactions are also the most likely to notice the advertising. So we surveyed almost 800 individuals across the country both before and after the advertising. That met the need to see whether an actual change had taken place among those who noticed an advertisement. However, a first interview can sensitize people and make them more likely to notice that brand’s advertising in the future. So we also surveyed another 800 people after the advertising who had never been interviewed before. It was the results from this control group that were used to determine how much change had taken place overall in awareness, perceptions and behavior. In short, we used a full experimental design to get the most conclusive evidence possible.
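The logic of that design can be sketched in a few lines of Python. The purchase rates below are purely illustrative, not figures from the study:

```python
# Illustrative numbers only - not data from the Starbucks study.
# Purchase incidence (share reporting a recent purchase) in each cell:
pre_wave = {"recognizers": 0.20, "non_recognizers": 0.15}   # panel, before the ads ran
post_wave = {"recognizers": 0.28, "non_recognizers": 0.16}  # same panel, after the ads ran
fresh_control = 0.19  # never-interviewed respondents surveyed after the campaign

# A post-only study would credit the advertising with the whole recognizer gap,
# even though recognizers already bought more before the campaign:
post_only_gap = post_wave["recognizers"] - post_wave["non_recognizers"]

# The before/after panel isolates the change among recognizers, net of the
# change among non-recognizers (a difference-in-differences):
lift = ((post_wave["recognizers"] - pre_wave["recognizers"])
        - (post_wave["non_recognizers"] - pre_wave["non_recognizers"]))

# The fresh control group guards against sensitization: overall change in
# awareness, perceptions and behavior is read from fresh_control, not the panel.
print(f"post-only gap: {post_only_gap:.2f}")
print(f"net lift among recognizers: {lift:.2f}")
```

With these made-up numbers the post-only comparison would overstate the effect (0.12) relative to the net change among recognizers (0.07), which is exactly the bias the before/after waves are there to remove.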

As for response rates, 1,453 of the first-wave respondents were invited to participate in the second wave and 53 percent did. Another 4,708 invitations were sent to a cross-section of the online population drawn from Survey Sampling’s SurveySpot panel. Eighteen percent completed the survey, providing 851 replies from people who had not been surveyed before. The response rate was well within the range normally encountered in this type of online research. Validation of this approach is included in the references describing the four years of parallel studies Bruzzone Research conducted in its annual testing of all Super Bowl commercials.[3]

What we learned when respondents were asked additional questions about the advertisements they recognized is illustrated in Figure 4.[4] For one set of products, it compares the words respondents checked to describe the best-remembered and the least-remembered advertisement. It provides a picture of the strengths and weaknesses that account for the difference in attention-getting power. The advertising shown for this set of products was masked: anything that could identify the advertiser was retouched, bleeped or removed so we could ask, “Do you remember who that was for?” That, too, revealed wide differences. These diagnostics provide creative feedback on the approaches to use again in the future - and the ones to avoid.

The impact of the advertising was measured at a variety of levels ranging from changes in awareness and improvements in reactions to the number that bought the product in the latest period. For calculating return on investment in the most meaningful way, there is no better base than the number of respondents buying the product. So for the decisive measure of effectiveness, we looked to see if the number of recent buyers was higher among those who recognized the ad. Different amounts were spent to run each of the 24 items, and some of them were run in different areas. To cover that we used two key measures for evaluating all of them:

  • The number reached (per 1,000 respondents) for each cent spent (per capita).
  • Additional buyers (per 1,000 respondents) for each cent spent (per capita).

The key point to be made here is that these measures are directly comparable for all media - including out-of-home (OOH). People either recognized the advertising or they didn’t. And among those who did, there was either a higher number reporting recent purchases or there was not.
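As a rough illustration of how those two measures work, here is a short Python sketch. The billboard figures are hypothetical, not values from Figure 5:

```python
# Hypothetical figures - not values from the study.

def reach_per_cent(recognizers_per_1000, cents_per_capita):
    """Number reached (per 1,000 respondents) for each cent spent (per capita)."""
    return recognizers_per_1000 / cents_per_capita

def additional_buyers_per_cent(buy_rate_recognizers, buy_rate_others,
                               recognizers_per_1000, cents_per_capita):
    """Additional buyers (per 1,000 respondents) for each cent spent (per capita):
    the extra purchase incidence among recognizers, scaled by how many of
    every 1,000 respondents recognized the item."""
    extra_rate = max(buy_rate_recognizers - buy_rate_others, 0.0)
    return extra_rate * recognizers_per_1000 / cents_per_capita

# Suppose a billboard was recognized by 300 of every 1,000 respondents,
# recognizers bought at 25% vs. 20% for everyone else, and it cost
# 2 cents per capita to run:
print(f"reached per cent: {reach_per_cent(300, 2.0):.0f}")
print(f"additional buyers per cent: {additional_buyers_per_cent(0.25, 0.20, 300, 2.0):.1f}")
```

Because both measures are denominated in respondents per 1,000 and cents per capita, any billboard, magazine ad, radio spot or commercial can be placed on the same chart, which is what makes the cross-media rankings in Figures 5 and 6 possible.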

Figure 5 shows both measures for each of the 24 items. They are ranked by the number of additional buyers. The long bars show the number reached was much greater than the number affected. Comparisons of the number affected - the short bars - are critical, so Figure 6 shows the same data plotted on a log scale that magnifies the short bars and compresses the long ones.

The chart shows there were wide differences in the return on the investment made in each piece of advertising. How did out-of-home fare in this comparison? It is shown in the red bars on the chart. Three of the OOH items in the test bunched up near the top. The rest spread out down the bottom half of the chart. The blue bars show where the eight magazine ads fell. The seven radio commercials that were tested are green, and the light yellow bars show where the three TV commercials fell. The van is the brown bar near the bottom. OOH was competitive. No other medium was clearly better.

Ignore at your own risk

At this point we had established that OOH and all the other media could be measured on the same terms - additional buyers produced and the cost of producing them - and that OOH compared quite favorably.

However, these charts may be conveying an even more important message. Whatever medium you are in, the most effective execution reached and affected far more people than the least effective execution. This evidence shows that you ignore differences in the quality of the creative at your own risk. We found those differences to be of a magnitude that could completely offset the differences we found between media.

Figure 7 shows another, simpler way of comparing the efficiency of the media in reaching people. When we added together all the times respondents reported recognizing Starbucks advertising, the pie chart on the right shows how the media compared in getting noticed - their share of the items recognized. The pie chart on the left shows how the amount spent on each medium compared. Comparing the two shows OOH was the most efficient at reaching people: it accounted for 27 percent of the advertising recognized, but only 16 percent of the cost.
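That comparison amounts to a simple efficiency index - share of advertising recognized divided by share of spend - which can be checked with the figures in the text:

```python
# Using the shares reported above for OOH: 27% of the items recognized,
# 16% of the spend. An index above 1.0 means a medium gets noticed more
# than its share of the budget would predict.
ooh_share_recognized = 0.27
ooh_share_of_spend = 0.16

efficiency_index = ooh_share_recognized / ooh_share_of_spend
print(f"OOH efficiency index: {efficiency_index:.2f}")  # 0.27 / 0.16 = 1.69
```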

Finally, let’s look at some of the information we obtained on a factor the usual audience measures don’t tell you anything about: overlap. Figure 8 shows which did the best job reaching people who were not reached by other media. It has three sets of bars showing which media did the best job reaching those not reached by TV, radio and print respectively. In each case, it was OOH that reached the greatest number of those not reached by other media.
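Unduplicated reach of this kind falls straight out of recognition data. A minimal sketch in Python, with made-up respondent IDs:

```python
# Made-up respondent IDs - purely illustrative.
# Each set holds the IDs of respondents who recognized at least one item
# in that medium:
tv = {1, 2, 3, 4, 5}
ooh = {3, 4, 6, 7, 8, 9}
radio = {5, 6, 10}
print_ads = {2, 7, 11}

# For each medium, count the respondents it reached whom TV did not:
for name, reached in [("OOH", ooh), ("radio", radio), ("print", print_ads)]:
    unique_vs_tv = reached - tv  # set difference: reached here but not by TV
    print(f"{name}: {len(unique_vs_tv)} respondents not reached by TV")
```

Repeating the same set difference against radio and print gives the other two groups of bars in Figure 8.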

Looking at overlap another way, which medium contributes the most when it is added to a mix of the others? Are there differences in synergy? Figure 9 shows one of the ways we measured that, drawing on the large sample that enabled us to look at sub-groups with exposure to various combinations of media. When each medium was added to a mix of the other media - one that included all possible combinations of the remaining media - the one that contributed the most, measured in terms of additional buyers, was OOH. That is another strength of the OOH Starbucks used in the summer of 2002.

The advertising worked

So what did all of this mean to Starbucks and what conclusions can be reached from this type of research? First, it showed that virtually all of the Starbucks advertising worked. When people noticed any of it they ended up buying more of the summer drinks being advertised than people who didn’t notice the advertising. It also showed some of the advertising worked a lot better than other advertising. Here are a few examples of the conclusions we reached by looking at the differences in how the advertising performed, and the diagnostics we had on that advertising.

  • Reminding people of the appropriateness of Starbucks’ cold drinks during the summer worked well.
  • Simply announcing what products were now available, and where, did not work as well.
  • Simple illustrations of the drinks with palm trees, beaches and blue sky worked well.
  • More complex “what is this?” graphics did not work as well.
  • The depiction of gratification was critical. The results showed specifically what conveyed gratification and what didn’t.
  • Starbucks’ green straws, featured in some advertising, produced mixed results.
  • Limits need to be set on the amount spent on a single execution, and the results helped show where to set the limits. We found a number of cases where spending more did not produce more buyers.

In summary, we not only found we could directly compare the ROI of out-of-home with other media, we also found many of the factors driving the effectiveness of advertising in other media were having the same effects on our out-of-home advertising.

Footnotes and references

[1] For more on advertising tracking studies at Starbucks, see:

Breese, Ann L. and Donald E. Bruzzone. “Brand Extensions: Their Impact On Overall Brand Equity.” The Experts Report on...October 18, 2000, ARF Week Of Workshops (Proceedings).

[2] Studies and articles documenting the limitations of recall:

Haley and Baldinger. “ARF Copy Research Validity Project.” Journal of Advertising Research, April/May 1991.

Lodish, Leonard, et al. “How TV Advertising Works: A Meta-Analysis of 389 Real World Split Cable TV Experiments.” Journal of Marketing Research, May 1995.

Gibson, Lawrence L. “If The Question Is Copy Testing, The Answer Is Not Recall.” Journal of Advertising Research, February/March 1983.

Studies and articles documenting the superiority of recognition:

Singh, Rothschild and Churchill. “Recognition Versus Recall as Measures of Television Commercial Forgetting.” Journal of Marketing, February 1988.

Krugman, Herbert E. “Memory Without Recall, Exposure Without Perception.” Journal of Advertising Research, August 1977.

Krugman, Herbert E. “Low Recall and High Recognition of Advertising.” Journal of Advertising Research, February/March 1986.

Zielske, Hugh. “Does Day-After-Recall Penalize ‘Feeling’ Ads?” Journal of Advertising Research, February/March 1982.

Schaefer, Wolfgang. “Recognition Reconsidered.” Marketing and Research Today (ESOMAR). May 1995.

[3] For background on the consistency of recognition-based tracking conducted online with the forms of interviewing used earlier, see the following on the results of parallel testing of Super Bowl commercials for three years:

Bruzzone, Donald E. “Track the Effects of Advertising Better, Faster and Cheaper Online.” Quirk’s Marketing Research Review, July 2000. (www.quirks.com)

Bruzzone, Donald E. “Tracking Super Bowl Commercials Online.” A talk at the Advertising Research Foundation Week of Workshops in Chicago, October 30, 2001.

Bruzzone, Donald E. “How to Keep Respondents Interested in Long Online Surveys.” A talk at the IIR Conference on Web-Based Surveys in San Francisco, June 21, 2002.

[4] For background on the battery of diagnostic questions that were used, and what they can show about the performance of advertising, see:

Aaker, David A. and Donald E. Bruzzone. “Perceptions of Prime Time TV Advertising.” Journal of Advertising Research, October 1981.

Aaker, David A. and Donald E. Bruzzone. “Causes of Irritation in Television Advertising.” Journal of Marketing, 49, 47-57.

Stayman, Douglas, David A. Aaker and Donald E. Bruzzone. “Types of Commercials Broadcast in Prime Time: 1976-1986.” Journal of Advertising Research, June 1989.

Biel, Alexander L. and Carol A. Bridgwater. “Attributes of Likable TV Commercials.” Journal of Advertising Research, June/July 1990.

Wells, William, John Burnett, Sandra Moriarty. Advertising Principles and Practice: Second Edition. Pages 605-613. New York: Simon and Schuster, 1992.

Bruzzone, Donald E. and R. Paul Shellenberg. “Super Bowl Advertising: What Really Works?” Quirk’s Marketing Research Review, March 1996. (www.quirks.com)

Bruzzone, Donald E. and Debora J. Tallyn. “Linking Tracking to Pretesting with an Advertising Response Model.” Journal of Advertising Research, June 1997.

Horovitz, Bruce. “Super Bowl Myths, Realities.” USA Today, January 5, 1998.

Bruzzone, Donald E. and Lizabeth L. Reyer. “Using Recognition-Based Tracking to Compare the ROI of Print, Radio And TV.” Quirk’s Marketing Research Review, March 1999. (www.quirks.com)

Shimp, Terence A. Advertising Promotion: Fifth Edition. Pages 481-484. Fort Worth: The Dryden Press, 2000.

Rosen, Dan and Don Bruzzone. “All the Right Moves: Advertising Movies During the Super Bowl.” Quirk’s Marketing Research Review, April 2001. (www.quirks.com)

Bruzzone, Donald E. “Why Tracking Should Replace GRPs.” A talk at the Advertising Research Foundation Annual Convention in New York, March 6, 2001.

Papazian, Ed. “How Consumers Respond to Commercials: Positive/Negative Evaluations and the Heavy Viewer Effect, Past and Present” and “How TV Commercial Awareness and Attribute Ratings Vary By Product Class.” TV Dimensions. Pages 344-355. Media Dynamics, Inc. 2003.