Making sense of the chatter

Editor’s note: Thomas Malkin is president of GeeYee, a Chicago social media analysis firm.

For some researchers, information generated by social media has become a worthwhile source of complementary data. When the directional insights from social media are integrated with the representative opinions garnered from traditional marketing research, they give full voice to the needs, wants and viewpoints of the customer and can help achieve optimal decision-making. But for many in our industry, the task of making sense of the flood of words generated by the social media outlets is daunting, to say the least.

One way to extract insights from the torrent of consumer opinions is to apply the methodology market researchers use to analyze open-ended survey questions: coding. The coding approach to analyzing unstructured comments involves readers quantifying positive, negative or neutral opinions after they’ve matched the subjects (e.g., “Cell Phone Product X”) in the text with categories or classifications. These categories typically represent consumer passions or issues that drive purchasing decisions (e.g., opinions around apps or multimedia for the cell phone product category).
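As a rough illustration of what coded output looks like once readers have done their work, the sketch below tallies opinions by subject, category and sentiment. The comments, the category names and the tally function are hypothetical and are not drawn from any real study or tool.

```python
from collections import Counter

# Hypothetical hand-coded comments: (subject, category, sentiment).
coded_comments = [
    ("Cell Phone Product X", "apps", "positive"),
    ("Cell Phone Product X", "apps", "negative"),
    ("Cell Phone Product X", "multimedia", "positive"),
    ("Cell Phone Product X", "apps", "positive"),
    ("Cell Phone Product X", "multimedia", "neutral"),
]


def tally(comments):
    """Count opinions for each (subject, category, sentiment) combination."""
    return Counter(comments)


counts = tally(coded_comments)
for (subject, category, sentiment), n in sorted(counts.items()):
    print(f"{subject} | {category} | {sentiment}: {n}")
```

The resulting counts are the raw material for the comparisons discussed in the rest of this article.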

The challenge in adapting coding to social media is that reading through the vast amount of constantly changing data on the Internet is overwhelming and expensive. Technology is thus needed to automate coding so the high volume of historical and daily social media data is not a barrier to obtaining insights from this valuable trove of data.

For automated coding to achieve accuracy, especially in social media where people communicate with familiarity in blogs, forums and social utility sites like Facebook, one approach is to go beyond keyword-based technology and take into account the implicit subject, implicit issue and context-dependent sentiment. An example of an implicit subject is, “This phone has a great battery life,” in which the type of phone isn’t explicitly mentioned. An example of an implicit issue is, “This camera is too large,” in which the category or classification - size - is not explicitly mentioned. And an example of a context-dependent sentiment is, “This battery life is long,” in which “long” makes the opinion positive, whereas “long” means something negative in the opinion, “This movie line is long.”
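To make two of these ideas concrete, here is a minimal sketch of context-dependent sentiment and implicit-issue detection using simple lookup tables. The tables and function names are assumptions made for illustration only; production systems would rely on far richer language models than a handful of dictionary entries.

```python
# Hypothetical lexicon: the polarity of a word depends on the topic it describes.
CONTEXT_LEXICON = {
    ("battery life", "long"): "positive",
    ("movie line", "long"): "negative",
}

# Hypothetical map from descriptive words to the issue they imply.
IMPLIED_ISSUE = {"large": "size", "small": "size", "heavy": "weight"}


def context_sentiment(topic, word):
    """Look up sentiment for a word in the context of a topic."""
    return CONTEXT_LEXICON.get((topic, word), "neutral")


def implied_issue(text):
    """Return the first issue implied by a descriptive word, or None."""
    for word in text.lower().strip(".!?").split():
        if word in IMPLIED_ISSUE:
            return IMPLIED_ISSUE[word]
    return None


print(context_sentiment("battery life", "long"))
print(context_sentiment("movie line", "long"))
print(implied_issue("This camera is too large."))
```

The same word, “long,” is scored differently depending on the topic, which is exactly what a keyword-only approach cannot do.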

This type of rubric allows researchers to compare brands or products to each other based on the issues that resonate with consumers rather than just brand buzz. For example, if Brand X has an 83 percent positive sentiment on the topic of customer satisfaction (the 83 percent representing a ratio of the overall positive and negative opinions), that could very well meet the prior expectations of Brand X. However, if Brand X’s competitors are achieving a positive sentiment on customer satisfaction greater than 87 percent, Brand X now has a decision to make as to whether it wishes to do something about this disparity. Doing a competitive trend analysis over 12 months on customer satisfaction would give direction to Brand X and its competitors as to when each brand experienced more or fewer positive and negative opinions (e.g., by day, week or month) and where in social media such opinions were expressed. The advantage that social media provides over other data sources is that Brand X can efficiently learn why consumers consider its competition to be better on customer satisfaction by simply drilling down to read the actual comments about Brand X or any of its competitors from their original online sources.
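The arithmetic behind such a comparison is simple. The sketch below assumes the percentage is positive opinions divided by the sum of positive and negative opinions, with neutral opinions excluded; the article does not spell out the exact formula, and the opinion counts here are invented for illustration.

```python
def percent_positive(positive, negative):
    """Share of positive opinions among positive and negative ones, in percent.

    Assumes neutral opinions are left out of the ratio.
    """
    total = positive + negative
    if total == 0:
        raise ValueError("no positive or negative opinions to compare")
    return 100.0 * positive / total


# Hypothetical opinion counts on customer satisfaction.
brand_x = percent_positive(positive=830, negative=170)
competitor = percent_positive(positive=880, negative=120)
gap = competitor - brand_x

print(f"Brand X: {brand_x:.0f} percent positive")
print(f"Competitor: {competitor:.0f} percent positive")
print(f"Gap: {gap:.0f} points")
```

Running the same calculation per day, week or month would produce the kind of trend line described above.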

See the correlations

With the availability of social media-generated input, a starting point for decision-making is to get a measurable, story-style read on a product’s or brand’s positioning relative to a competitive set on the issues that drive purchasing decisions. Researchers can now see not only how many opinions have been expressed on three dimensions of data - the subjects, issues and sentiment - but also the correlations among those three variables. In addition, issues or subjects that may not be talked about much, and are thus typically overlooked by decision makers, can become directionally insightful if they are spoken of in the context of or relative to something else.
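One simple way to approximate how the three dimensions relate is to count how often each pair of labels appears in the same opinion. The sketch below uses co-occurrence counts as a stand-in for the statistical correlations mentioned above; it is not the method behind any particular product, and the brands, issues and records are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded opinions across the three dimensions.
opinions = [
    {"subject": "Brand A", "issue": "green initiative", "sentiment": "positive"},
    {"subject": "Brand A", "issue": "taste", "sentiment": "positive"},
    {"subject": "Brand B", "issue": "green initiative", "sentiment": "negative"},
    {"subject": "Brand A", "issue": "green initiative", "sentiment": "positive"},
    {"subject": "Brand C", "issue": "taste", "sentiment": "negative"},
]


def cooccurrence(records):
    """Count how often each pair of labels appears in the same opinion."""
    pairs = Counter()
    for record in records:
        labels = [
            ("subject", record["subject"]),
            ("issue", record["issue"]),
            ("sentiment", record["sentiment"]),
        ]
        # Pairs come out as subject-issue, subject-sentiment, issue-sentiment.
        for first, second in combinations(labels, 2):
            pairs[(first, second)] += 1
    return pairs


pairs = cooccurrence(opinions)
for (first, second), n in pairs.most_common(3):
    print(f"{first[1]} + {second[1]}: {n}")
```

Even a low-volume issue stands out in such a table when most of its mentions cluster around one or two brands.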

Let’s say a beer brand in the craft beer category is trying to win market share by going green. In a 12-month study on the category, the issue or code “green initiative” is not spoken of very much among the several brands in the competitive set. But when opinions on this issue are expressed, they usually appear in the context of a couple of brands in particular, and the two brands have distinctly different volumes of social media. Being able to visualize how the opinions correlate (see Figures 1 and 2 in sidebar) and then drill down to understand the actual story around “green initiative” provides directional insights for any brand in the craft beer category. This is especially true if the brand in question is not being acknowledged for its efforts by consumers across millions of social media outlets. Additionally, insights can be obtained by seeing stories emerge among the other issues measured in that study.

Once a historical read on a product category is obtained, monitoring becomes more impactful. One new form of measurement is the monitoring of changes in social media thought leadership on issues that resonate with consumers before and after an event. Emerging trend analysis and crisis management also become more thoughtful and less reactionary because researchers can get a better read on what course of action to take once they know the issues discussed, how they’re evolving over time and how consumers are talking about those issues (especially in the context of and relative to other issues or subjects). Such granular monitoring can also provide greater insights for communication strategies since opinions are quantified and weighted by source based on discussions on both the subject and issue.

More breadth

Gaining an understanding of product categories based on the voice of the crowd in such a granular manner creates many applications for market researchers. They can benchmark the insights obtained against prior quantitative and qualitative research and see more breadth in the data that may otherwise have been missed. Typically, many more questions than answers are raised, leading to further exploration using traditional marketing research and, one hopes, a clearer and more complete picture of the market segment in question. 

Using social media to track the Tiger Woods saga

For a more detailed example of how a marketing research firm has used social media research, Thomas Malkin spoke to Jon Last, president of Sports and Leisure Research Group, White Plains, N.Y.

How have you used social media in conjunction with your traditional marketing research?

“We tracked the magnitude and tonality of Web conversation and opinion about Tiger Woods over 1,100 disparate and relevant Web sites from January 2009 through mid-March 2010, right after Woods’ public statement in February, to see if it was consistent with our attitudinal survey research that was part of our winter 2010 omnibus study. In the omnibus study, a national sample of nearly 1,000 avid golfers agreed that the rancor regarding the transgressions of golf’s greatest player would dissipate significantly by the summer months. Further, this study suggested that for the most engaged and passionate fans, Tiger’s on-course achievements far outweighed any personal shortcomings.”

What were your findings upon benchmarking the insights obtained from social media with those from your prior attitudinal research?

“As one might expect, the level and tonality of buzz regarding Tiger was at its peak immediately after his November accident. But this chatter quickly and precipitously dropped in the first month of the new year, spiking again, though at nowhere near the level of November, around his mid-February statement. By March, the level of online conversation was back to pre-scandal levels.

“Upon further assessment, we looked at the tonality of the conversations pertaining to those opinions on Tiger. We developed codes that could be used to measure the tonality relative to some of the findings of our prior attitudinal research, selecting ‘admiration,’ ‘apology accepted,’ ‘credibility,’ ‘disappointment,’ ‘doubtful,’ ‘inspiration,’ ‘marketability’ and ‘trustworthiness,’ among others. Our analysis demonstrated that conversations in social media were on par with the conclusions that we drew from the traditional quantitative study.

“From an illustrative standpoint, if you take a look at an issues trend analysis, you’ll see that before the crisis emerged on November 20, 2009, conversations on Tiger’s character were focused on two issues: admiration and inspiration. From November to January, new conversations emerged on the issues of character, namely trustworthiness, marketability and credibility. Conversations around disappointment and doubt, which previously included opinions solely about his golf game, now included those about his character as well. In the second peak period in February, disproportionately fewer conversations were occurring under inspiration and trustworthiness, with new conversations emerging under apology accepted. Thus we were able to use the social media analysis to validate the earlier hypotheses drawn by our quantitative research. Golf fans were beginning to quickly move past the issue!

“Another illustration of the findings can be seen in the heat maps [Figures 1 and 2], both prior to the crisis and at the tail end of it. Prior to November 20 [Figure 1], you can visualize the intensity of conversations on Tiger himself, the issues or pre-determined codes and sentiment by looking at both the size and the color of the circles. [The smaller the circle and the lighter the color, the lower the frequency of conversation; the larger the circle and darker - i.e., green - or hotter the color - i.e., yellow - the higher the frequency of conversation, with the reddish colors showing the highest intensity.] You can also see the statistical correlations of the three data dimensions - subject, issue and sentiment - as indicated by how close the issues are to each other. [The closer they are to each other, the higher the correlation.] Thus, it’s apparent that conversations around admiration and inspiration are closer to positive sentiment and spoken of in the same context relative to other issues before the crisis emerged. As you can see on the second heat map [Figure 2], by March 22, the story was quite different.”

Figure 1: Tiger Woods Social Media Story Before November 20, 2009

Figure 2: Tiger Woods Social Media Story From March 10 to March 22, 2010

What are your conclusions about benchmarking insights from social media with those from your prior attitudinal research?

“While such analysis is not as representative as a well-designed quantitative attitudinal study, it does yield strong directional insights and provides breadth to our earlier findings. So, while I won’t go as far as Ad Age did in suggesting the imminent demise of traditional survey research, I will assert that this capability presents us with a valuable new tool to enhance our understanding of fan sentiment. At the risk of further propagating the sound-bite society, I’d maintain that social media in concert with formal research is a conversation that all market researchers should be paying attention to.”