Analyzing the content of social media data

Article ID:
October 2013, page 74
Ann Veeck

Article Abstract

The author provides a framework for analyzing social media data, drawing comparisons to qualitative data analysis while also outlining how and where social media data requires a specialized approach.

Beyond monitoring

Editor's note: Ann Veeck is a professor of marketing in the Haworth College of Business at Western Michigan University, Kalamazoo, Mich.

Of the many new sources of consumer information that have emerged in the last decade, social media data are among the most potent and game-changing for marketing research. Social media platforms offer a powerful opportunity to gain immediate access to the unfettered opinions of consumers. Many companies are aware of the value of using social media data to gain marketing insights. But the sheer volume of information is daunting. How can businesses tap this source to obtain deep, actionable insights?

A number of excellent programs and services – some free and some commercial – have been developed for the analysis of social media data. Yet, the focus of the vast majority of these tools is to provide summary statistics of the data. Web analytics – for example, word counts, reach, word clouds, volume, sentiment analysis – can provide valuable, up-to-the-minute snapshots of Web content. Still, no algorithm is an adequate replacement for the in-depth analysis of consumer-generated feedback that can be conducted by a skilled analyst with a deep understanding of a brand and its challenges and opportunities.

So, how can analysts move beyond reporting superficial summary data to acquire strong and actionable insights from consumers? Fortunately, a model for the in-depth analysis of social media data already exists: the best practices that research analysts have used for decades to analyze qualitative data. With an understanding of how social media data differ from traditional forms of qualitative data, analysts can apply qualitative data analysis, with modifications, to social media data.

Important differences

While the process for analyzing the content of social media data is similar to that used for qualitative analysis, important differences must be taken into consideration. The following are the steps for analyzing social media data.

Step 1: Develop a problem definition and research objectives

In most research, developing focused research objectives is the most important step. What decisions will be made with this information? This guideline holds particularly true for social media analysis, where a clear direction is needed to make sense of the copious amount of data. Limiting the focus to a defined topic and specific objectives will make the analysis more manageable. Still, to take full advantage of social media data analysis, the research objectives should also allow for an element of discovery. The data may lead to unexpected places.

The following are examples of objectives that social media analysis is particularly suited to address: competitive analysis; product extensions; product strengths and weaknesses; new uses of products; and reactions to advertising and promotions.

Step 2: Identify key search terms

The identification of the proper key search terms is a crucial step to the successful analysis of social media data. The process is often an iterative one, with broader searches being followed by searches using combinations of terms or newly discovered synonyms or tangential phrases. Obvious terms to start a search include the product’s brand name, competitors’ brand names and the product class. More exploratory analyses might investigate activities, events and emotions related to a brand.
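The iterative search described above can be sketched in a few lines of Python. This is only an illustration: the brand and topic terms are hypothetical, and the quoted-phrase query format is an assumption that would need to be adapted to whatever search tool is actually in use.

```python
from itertools import product

def build_queries(brands, topics):
    """Return each brand term alone, then every brand/topic pairing."""
    queries = [f'"{brand}"' for brand in brands]
    queries += [f'"{brand}" "{topic}"' for brand, topic in product(brands, topics)]
    return queries

# Hypothetical terms for illustration only.
brands = ["Acme Coffee", "Rival Roast"]   # own brand and a competitor
topics = ["breakfast", "too bitter"]      # an activity and an emotion-laden phrase

for query in build_queries(brands, topics):
    print(query)
```

As new synonyms or tangential phrases turn up in the results, they can be appended to either list and the queries regenerated for the next round of searching.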

Step 3: Identify social media data sources

The identification of the most useful data sources is another important step in social media data analysis. Online aggregator tools, such as TweetDeck and Scout Labs, can aid in this process. Still, these tools can miss some important types of social media platforms.

Depending on the research objectives, some types of social media sites that can provide consumer-generated data include the following:

  • social network sites (e.g., Facebook),
  • video-sharing sites (e.g., YouTube),
  • photo-sharing sites (e.g., Flickr),
  • product and service review sites (e.g., Yelp),
  • Web-based communities (e.g., Chowhound),
  • blogs (e.g., Gardenista), and 
  • microblogs (e.g., Twitter).

Finding the most current and germane sites is a moving target, since social media-oriented data sources ebb and flow in popularity. While this makes the task of identifying the best sites from which to gather data more difficult, it also means that new forms of exciting and relevant consumer-generated feedback are always emerging and can be uncovered with a bit of persistence.

Step 4: Organize data

Some of the most important consumer-generated data will not necessarily be in the form of text. Photos, videos, artwork, literature and other forms of data might provide new insights into product feedback. As a result, organization of the data should be flexible and allow for diverse forms of media. A number of commercial services (e.g., HootSuite, Radian6) and software (e.g., NVivo) are available to assist in this process, as well as free online tools (e.g., SocialMention, Google Alerts). However, some analysts will prefer to replace or supplement these options with more of a do-it-yourself approach to organizing data to ensure versatility and comprehensiveness. Analysts will also need to decide whether to view the data online, via hard copy or through a combination of paper and electronic sources when conducting the analysis. That choice depends on personal preference and on the extent to which the analysis will involve collaboration among team members.

With the abundance of data available on the Web, and with all the twists and turns that can be encountered in the process of organizing data, it is important to know when to stop seeking new sources. The rule of thumb is that when a saturation point is reached – that is, when little new information is being acquired relative to the effort – it is time to end the searches.
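One way to make the saturation rule of thumb concrete is to track how many previously unseen themes each new batch of data contributes. The sketch below assumes the analyst has already tagged each batch (one source or one search session) with the themes found in it. The 10 percent threshold and the theme names are arbitrary illustrations, not recommended values.

```python
def reached_saturation(batches, min_new_share=0.1):
    """Return the index of the first batch (after the initial one) whose share
    of previously unseen themes falls below min_new_share, or None if no
    batch does."""
    seen = set()
    for index, batch in enumerate(batches):
        themes = set(batch)
        new_themes = themes - seen
        seen |= themes
        # The first batch is skipped because everything in it is new by definition.
        if index > 0 and themes and len(new_themes) / len(themes) < min_new_share:
            return index
    return None

# Hypothetical themes noted in three successive batches of posts.
batches = [
    ["price", "taste"],
    ["taste", "packaging"],
    ["price", "taste", "packaging"],
]
print(reached_saturation(batches))
```

In this example the third batch adds nothing new, so the function flags it as the point at which further searching is unlikely to repay the effort.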

Step 5: Analyze data

Once the social media data have been gathered and organized, the best practices for analyzing social media data are the same as those used for traditional qualitative data. First, the analysts should review the data thoroughly. As with all research, insightful analysis depends on a comprehensive knowledge and understanding of the data. Then the analysts should begin identifying key themes that emerge from the findings: beliefs, ideas, concepts, definitions and behaviors. The data should be coded according to themes, either by hand or via software (e.g., NVivo), and then compared and integrated. To repeat: This step parallels content analysis of traditional types of qualitative data.
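For analysts who code by hand but want a first pass at sorting posts, theme coding can be roughed out with a simple keyword codebook. The sketch below is an assumption-laden illustration, not a substitute for reading the data: the codebook, its keywords and the sample posts are all hypothetical, and plain substring matching will both miss relevant posts and catch irrelevant ones.

```python
from collections import defaultdict

# Hypothetical codebook: each theme maps to keywords that suggest it.
CODEBOOK = {
    "price": ["expensive", "cheap", "price"],
    "taste": ["bitter", "delicious", "flavor"],
}

def code_posts(posts, codebook=CODEBOOK):
    """Group posts under every theme whose keywords appear in the text."""
    coded = defaultdict(list)
    for post in posts:
        text = post.lower()
        for theme, keywords in codebook.items():
            if any(keyword in text for keyword in keywords):
                coded[theme].append(post)
    return dict(coded)

posts = [
    "Love the flavor but it is too expensive.",
    "So bitter!",
    "Fast shipping.",
]
for theme, matches in code_posts(posts).items():
    print(theme, len(matches))
```

A post can land under more than one theme, which makes it easier to compare and integrate themes afterward. Posts that match no theme, such as the third one here, are worth reviewing by hand because they may point to themes the codebook has not yet captured.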

Step 6: Present findings

Following analysis of the data, the findings are presented orally and in writing, using concrete examples and illustrations. Here is where social media data really stand out. Quotes can be presented from Twitter, reviews and blogs, just as verbatim quotes would be used to illustrate findings from focus groups and interviews. But consumer-generated social media data offer much more. Photos found online can illustrate exactly where, when and how a consumer is using a product or service. Consumer-produced videos can demonstrate perceived advantages and disadvantages of products. Even textual quotes praising or criticizing products can be much more colorful when found online, with the opinions offered spontaneously and not prompted by a moderator.

Step 7: Outline limitations

With social media data, outlining the limitations of the data is at least as important as with other research methods, and probably more so. Explicitly stating the problems and gaps encountered when gathering and analyzing the data helps to provide a more complete understanding of the findings.

The following are some of the limitations that are most commonly encountered with social media data:

  • The online consumers are not necessarily demographically representative of the product’s target consumers.
  • Self-selection bias is inherent with social media data.
  • Advocates and detractors can distort online conversations.
  • The demographic and geographic information of the consumers is often not traceable.

Step 8: Strategize

As with all research, the final and most important step of the analysis is to use the findings to develop research-based, actionable recommendations related to the research objectives. Then, based on the project’s results, the next stage of research should be planned.

Challenges and opportunities

Many of the basic steps used for the content analysis of text from structured data collection methods – such as interviews, focus groups, diaries, and managed online communities – can be generalized to social media data. However, social media data is different in a number of fundamental ways, representing both challenges and opportunities for analyses. It is useful to consider these differences.

Overwhelming amount of data. Traditional interviews or focus groups offer a discrete amount of material to organize and present. Social media data, on the other hand, is available in abundance. Often much more social media data related to a topic exists than can be reasonably analyzed. Analysts must place limits, by topics or time periods, on their search efforts.

Unrestricted comments. With focus groups, interviews and even online communities, participants are responding to directed questions. The users of social media state whatever is on their minds. This represents a great opportunity to gain new understandings about consumers’ motives, needs, behaviors and emotions. It also means that the problem definitions and research objectives that researchers identify prior to analysis may miss the mark and require revision.

Much more noise. Because social media data is not generally managed, many, if not most, of the comments that analysts sort through will be useless. For every insightful comment found, there are likely to be numerous useless posts, such as sales pitches (“My friend made $1,200 at home last month…”), empty comments (“So true.” “What he said.” “Yes.”), and non-contextualized obscenities (no examples necessary).
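Some of this noise can be screened out before the close reading begins. The following sketch flags the kinds of posts mentioned above, using a short list of empty comments and one sales-pitch pattern. Both are illustrative assumptions built from the examples in this article; a real project would need to grow its own lists from the data.

```python
import re

# Illustrative lists only, drawn from the examples in the text.
EMPTY_COMMENTS = {"so true", "what he said", "yes"}
SALES_PITCH = re.compile(r"made \$[\d,]+ at home", re.IGNORECASE)

def is_noise(post):
    """Return True for empty comments and work-at-home style sales pitches."""
    normalized = post.strip().strip(".!?").lower()
    return normalized in EMPTY_COMMENTS or bool(SALES_PITCH.search(post))

posts = [
    "My friend made $1,200 at home last month...",
    "So true.",
    "The new lid leaks in my bag.",
]
kept = [post for post in posts if not is_noise(post)]
print(kept)
```

A filter like this only removes the obvious cases. Borderline posts are better left in for the analyst to judge, since an overly aggressive filter can discard the unexpected comments that make social media data valuable.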

Multiple languages. Because social media is on the World Wide Web, relevant comments are frequently posted in multiple languages. As a result, depending on the objectives of the research, it may be beneficial to assemble a multilingual team for targeted projects.

Multiple forms. Consumer-generated data found online can take many different forms. In addition to text, data might appear as video, audio, photos, artwork, slideshows and other formats.

Lacks context. Traditional qualitative methods allow quotes to be identified with specific individuals, providing key information such as gender, age, location and income. It is much more difficult to ascribe demographics to social media quotes. Even if the information can be traced to a user profile, there are no assurances that the profile is factual.

Cannot ignore

Social media allows access to up-to-date, candid consumer insights as never before. Companies seeking to make sound, data-driven decisions cannot ignore social media as a data source. Conducting a content analysis of online consumer-generated data, guided by targeted objectives, can yield actionable strategic recommendations. Used in conjunction with ongoing monitoring of Web analytics, and as a supplement to traditional research methods, social media content analysis can provide new strategic directions for companies. 
