Two ways to visualize 'small data'



Article ID: 20130925-2
Published: August 2013
Author: Michael Lieberman

Article Abstract

Data visualization tools designed for big data can also work well with primary research. This article details two different approaches.

Editor's note: Michael Lieberman is founder and president of Multivariate Solutions, a New York research firm. He can be reached at 646-257-3794 or at michael@mvsolution.com. This article appeared in the September 9, 2013, edition of Quirk's e-newsletter.

As big data has become more accessible, visualization services have proliferated. The result is a set of agile, open-source graphics platforms that can graph consumer transaction data and huge clusters of preferences (e.g., which movies on Netflix a customer might like given past choices) and provide visual representations of social network activity.

This is all terrific news for the marketing research industry.

Today, powerful, free, open-source software packages like the R Project for Statistical Computing give marketing research companies the same ability to mine large client databases that expensive data-mining labs have. Such labs are primarily deployed in the financial, insurance and national security fields.

What that means, simply, is that we can now service our corporate CPG, food service, advertising or political clients with database insights without five-figure outlays for statistical modeling packages.

Advances in automated visualization have also been a boon for visualizing small data (i.e., primary research). Marketing research output has never been easier to display. Below are two simple yet sophisticated examples. The first involves a two-step cluster analysis. The second is a data-mining example that takes existing consumer purchase data and reports which items the client should put on special because they connect to the greatest number of other products.

Latent class segmentation and media strategy

The segmentation visual shown in Figure 1 is the product of a multidimensional, two-step cluster analysis that uses latent class analysis and then K-means to cluster clothing style preferences, shopping behavior, money spent and self-image into three shopper segments. The size of each segment rectangle represents the segment's relative size in the population. The thickness of the directed lines, with labels such as "Most Lucrative," illustrates the profitability and strength of association of each brand with the shopper segments.
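
As a rough illustration of the second (K-means) stage only, here is a minimal Python sketch. The latent class step is not shown, and the shopper profiles, the two variables and the starting centers are all invented for the example; they are not the data or settings behind Figure 1. The segment shares computed at the end correspond to what the rectangle sizes encode.

```python
import math

# Hypothetical shopper profiles: (annual spend in $100s, store visits per month).
# These values are made up for illustration only.
shoppers = [
    (2, 1), (3, 1), (2, 2), (3, 2),   # low spend, infrequent
    (10, 4), (11, 5), (10, 5),        # mid spend
    (25, 9), (27, 10),                # high spend, frequent
]

def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest center (Euclidean distance)
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of the points assigned to it
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centers

# Seed with one shopper from each region; a real run would try several seeds
labels, centers = kmeans(shoppers, [shoppers[0], shoppers[4], shoppers[7]])

# Relative segment sizes, i.e. what the rectangle sizes in the visual stand for
shares = [labels.count(j) / len(labels) for j in range(len(centers))]
for j, share in enumerate(shares):
    print(f"segment {j}: {share:.0%} of shoppers, center {centers[j]}")
```

In practice the variables would be put on a common scale before clustering, since K-means is sensitive to the units of each input.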



Our brand, which we'll call Purple Diamond, can now clearly see its market position vis-à-vis our segments. Purple Diamond's C-suite wants to be highly associated with Wealthy Fashionistas or Prudent Upscale Shoppers. Though these are smaller segments percentage-wise, they are more confident consumers, shop more frequently and, most importantly, spend more.

The graph also reveals that Purple Diamond competes most directly with Gmart and GoJane - bargain brands in the Value Shoppers segment space - and has only a tenuous foothold among the Wealthy Fashionistas. The new advertising strategy that resulted from this graphic calls for Purple Diamond to move closer to Prudent Upscale Shoppers. To move up the high-end value-brand consumer chain, Purple Diamond should be vying for customers from Burberry or Dior.

Supermarket data mining

Data mining-specific project results are most effectively delivered visually. Figure 2 is an example of a data mining output for a major supermarket chain that we'll call Big Food. It wants to know which items it should put on special.



When examining Big Food's data mining visual, there are a few things to keep in mind:

  • Items are clustered by transactions. Items with the same shade tend to be purchased together.
  • The thickness of the red line indicates the number of transactions between its linked items. Heavy red lines indicate a big seller. A dotted connection indicates a weaker item.
  • The size of the oval represents between-item connectivity - not the transaction rates. This, essentially, is what we are looking for: product influencers. 
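
The line thickness described above comes down to counting how many transactions contain both linked items. A minimal Python sketch, assuming each transaction is simply a list of item names; the baskets below are invented and are not Big Food's data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, made up for illustration only.
baskets = [
    ["lettuce", "tomatoes", "turkey breast"],
    ["lettuce", "tomatoes"],
    ["hot dogs", "ketchup", "mustard"],
    ["hot dogs", "mustard"],
    ["lettuce", "turkey breast", "whole-wheat bread"],
]

def co_purchase_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicate items and gives every pair one
        # canonical (alphabetical) order, so (a, b) and (b, a) are not split
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts

edges = co_purchase_counts(baskets)
for (a, b), n in sorted(edges.items()):
    print(f"{a} -- {b}: {n} shared transactions")
```

Each count becomes the weight of one line in the network graph; a heavier count would be drawn as a thicker line.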

Open-source network graphing software (of the kind used to map Twitter, Facebook and hyperlinks) was applied to produce this transaction-network graph. We received the data from Big Food's database and, after some treatment, loaded it into our graphing software. The transactional spheres are part of the graphic calculation, which also includes clustering (we told the software to color the clusters differently) and eigenvector centrality - a measure of a product's influence in the network, which is the statistic we are searching for.
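
The centrality statistic mentioned above, usually called eigenvector centrality, scores an item highly when it is linked to other high-scoring items. The article's graphing software computes it internally; the Python sketch below is only one common way to approximate it (power iteration on the weighted co-purchase network), and the link weights are invented rather than taken from Big Food's data.

```python
import math

# Hypothetical co-purchase counts between item pairs (undirected links).
edges = {
    ("lettuce", "tomatoes"): 3,
    ("lettuce", "turkey breast"): 3,
    ("lettuce", "whole-wheat bread"): 2,
    ("lettuce", "chicken"): 2,
    ("turkey breast", "whole-wheat bread"): 3,
    ("tomatoes", "chicken"): 2,
    ("hot dogs", "ketchup"): 3,
    ("hot dogs", "mustard"): 3,
    ("mustard", "turkey breast"): 1,
}

def eigenvector_centrality(edges, max_iter=500, tol=1e-10):
    """Approximate eigenvector centrality by power iteration."""
    neighbors = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight
    scores = {node: 1.0 for node in neighbors}
    for _ in range(max_iter):
        # Each item keeps its own score and adds the weighted scores of its
        # neighbors; keeping the own score leaves the ranking unchanged but
        # prevents the iteration from oscillating
        updated = {
            node: scores[node] + sum(w * scores[other] for other, w in links.items())
            for node, links in neighbors.items()
        }
        norm = math.sqrt(sum(v * v for v in updated.values()))
        updated = {node: v / norm for node, v in updated.items()}
        change = max(abs(updated[n] - scores[n]) for n in updated)
        scores = updated
        if change < tol:
            break
    return scores

scores = eigenvector_centrality(edges)
ranked = sorted(scores, key=scores.get, reverse=True)
for item in ranked:
    print(f"{item}: {scores[item]:.3f}")
```

In this toy network, hot dogs carry more total link weight than chicken (6 versus 4) yet score lower, because their links stay inside one small cluster.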

Examining the graph, we see that there are four clusters. Peppers are purchased with shrimp; onions with carrots; broccoli with chicken; roast beef with mustard, etc.

The managers at Big Food, though, would like to know which items they need to promote to increase sales overall. Best-selling items are not necessarily the correct answer. Hot dogs, for example, are a popular item but they are connected mainly to ketchup and mustard - not to other item clusters.

We want to report to Big Food the most influential items, which are:

  • lettuce,
  • turkey breast,
  • whole-wheat bread,
  • chicken, and
  • tomatoes.

These items by themselves may not be the biggest sellers but they are the items in our graph that most stimulate the purchase of other items. Simply put, if you sell lettuce and turkey breast, chances are you will raise sales in other clusters, advancing Big Food sales overall.

New visual capabilities

Our Big Food example is quite small. New visual capabilities can graph thousands of items and find the most influential product-connector items among everything sold by Big Food - or by Frito-Lay or Disney or Pfizer. Using emerging open-source technologies, marketing research firms can now take on the same projects as the leaders in predictive analytics.
