Editor's note: Holly Heline Jarrell is chief client services officer of GfK Consumer Experiences North America, New York. Bitsy Bentley is vice president of data visualization for GfK, New York.

The best practices of data integration can be reduced to a single question – one that precedes any thoughts about fusion techniques, dashboards and sample sources: What problem are you trying to solve?

Nothing could be more elemental and yet we frequently encounter very smart marketers and researchers who have been thanklessly tasked with “making sense of all that data” but who were never told why, for whom or with what goal. Marketers are under pressure to do more with less and to use the data the company has on hand whenever possible. But being frugal is a tactic, not a strategy.

The transition to a world of big data represents a paradigm shift for everyone in marketing, especially market researchers. In the past, marketers asked questions, which researchers turned into survey questionnaires and then answered. It was also clear who “owned” the data; the research supplier collected information, wrote a report interpreting the data and handed it off to the researcher client, who might or might not share it with a broad array of internal or external stakeholders.

Today, boatloads of data are accessible without any questions being asked. The formulation of a questionnaire once served as a focusing exercise for all involved but now that focusing must come after the fact. And today, nearly everyone in an organization “owns” (or has access to) some data. To get the big picture of what is going on and where brands should be headed, someone needs to pull all of this information together and make it serve a purpose.

Hence the job of integrating data, a task of huge import that is often handed to researchers who may quickly find themselves out of their depth. Data integration is in many ways a separate science (some might call it a black art) and one that the industry needs to master quickly.

Curious minds are sorely tempted

The allure of big data is undeniable; curious minds – which marketers and researchers tend to possess – are sorely tempted. The desire to wade in and “noodle around” is powerful and healthy in small doses but without a clear business imperative guiding one’s search, the potential for wasted time and irrelevant conclusions is great.

The fact is, most data does not analyze itself – though some big data practitioners would disagree. The wealth of data that is available today becomes most useful when we focus our efforts around problems and issues that matter to our clients. And that is good news for researchers, because they are desperately needed to point all of this data in useful directions. In some respects, this requires some of the same skills researchers use to design a great questionnaire: focusing on the important questions and issues, creating a hierarchy and making sure that the output will be salient.

But the researcher now needs to be as much a curator of information as a creator. Survey research is still an important element of the data mix; it adds color and homes in on attitudinal and emotional questions that rarely can be found in transactional and passive databases. But researchers need to expand their purview and think bigger and broader.

Clearly distinct

We see data integration as clearly distinct from insight integration – a process with which many researchers are familiar. In insight integration, we analyze data sets separately and then integrate the results. For example, a researcher may analyze customer satisfaction tracking data, then obtain context from a consumer confidence index, Dow Jones averages, consumer trends and social media analyses. There is often one central piece of research, which is then “enhanced” with information from other sources.

In data integration, we integrate before we analyze. We act as a curator to choose meaningful data streams that should ultimately produce results to help clients make decisions and act on them.
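
As a rough illustration of "integrate before we analyze," the sketch below joins two small, made-up data streams on a shared customer ID and only then asks a question of the combined records. The field names (customer_id, satisfaction, spend), the values and the threshold of 7 are all hypothetical; a real project would involve far messier matching.

```python
from statistics import mean

# Hypothetical satisfaction survey records, one row per customer.
survey = [
    {"customer_id": 1, "satisfaction": 9},
    {"customer_id": 2, "satisfaction": 4},
    {"customer_id": 3, "satisfaction": 8},
]

# Hypothetical transaction records from a separate system.
sales = [
    {"customer_id": 1, "spend": 120.0},
    {"customer_id": 2, "spend": 35.0},
    {"customer_id": 3, "spend": 90.0},
    {"customer_id": 4, "spend": 60.0},  # bought something, never surveyed
]


def integrate(survey_rows, sales_rows):
    """Join the two streams on customer_id, keeping only matched customers."""
    spend_by_id = {row["customer_id"]: row["spend"] for row in sales_rows}
    return [
        {**row, "spend": spend_by_id[row["customer_id"]]}
        for row in survey_rows
        if row["customer_id"] in spend_by_id
    ]


def mean_spend_by_group(rows, threshold=7):
    """Average spend for satisfied (>= threshold) vs. other customers."""
    satisfied = [r["spend"] for r in rows if r["satisfaction"] >= threshold]
    others = [r["spend"] for r in rows if r["satisfaction"] < threshold]
    return {
        "satisfied": mean(satisfied) if satisfied else None,
        "others": mean(others) if others else None,
    }


combined = integrate(survey, sales)
result = mean_spend_by_group(combined)
```

Insight integration would report average satisfaction and average spend side by side. Joining first lets us ask something neither stream can answer alone: whether satisfied customers spend more.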

So how do we get our heads around a data integration effort? In some ways, the timeline will be very much like a typical research project, with touchstones such as:

  • Defining the business issues. Wisely identifying our clients’ business issues has to be our first priority.
  • Choosing your data – which requires that you first identify all your potential data sources, before selecting the streams that are relevant and needed.
  • Picking your team – cross-functional is the operative word.
  • Pulling it all together – potentially as much an IT challenge as an MR one.
  • Bringing it to life – throughout the organization!

During this process, researchers will find there are some unfamiliar people in the room – folks from operations, IT, finance and elsewhere. These people may be so comfortable with their familiar, fairly niche tasks that referring back to broader business goals will require some guidance. So we can add another “C” descriptor to our new role: first curator and now coach.

To help clients home in on the business needs that should drive data integration, some leading questions may come in handy – inquiries like:

  • How is your company integrating data from different sources right now?
  • Who or what are the main forces driving data integration in your organization?
  • What initiatives will need to be informed by this integration project?

We also must carefully identify the data that the client has in hand, as well as important information that may reside elsewhere or need to be developed from scratch. There may be proprietary trackers and sales data available, plus information from syndicated sources. To that we may choose to add more quant and qual data, social media analysis or other new sources. And as a curator, we’ll need to cull down to only relevant streams of data to keep our effort focused on the business issues – and, frankly, useful.

The researcher may need to learn a new vocabulary to stay afloat in this world of data. Key types of information that will crop up repeatedly are:

  • Structured – files with data of a known format and placement within the file.
  • Unstructured – data with no specific structure applied, often from social media sources.
  • Semi-structured – similar to unstructured but with some minimal tagging and analysis.
  • Synthesized – information in report format, rather than provided as data.
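
For readers who think better in examples, here is a minimal sketch of what each of the four types might look like in practice. Every value is invented for illustration, and the tag and sentiment fields in the semi-structured record are assumptions about how a tagging tool might label a post, not any particular vendor's format.

```python
import csv
import io
import json

# Structured: known columns in known positions (a hypothetical sales extract).
structured = "store_id,week,units\n101,1,250\n102,1,310\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Unstructured: free text with no fields at all (a made-up social media post).
unstructured = "Loved the new menu but the wait was way too long!"

# Semi-structured: the same kind of text, plus a few tags attached to it.
semi_structured = json.loads(
    '{"text": "Loved the new menu but the wait was way too long!",'
    ' "tags": ["menu", "wait time"], "sentiment": "mixed"}'
)

# Synthesized: a finding already written up, with no underlying data attached.
synthesized = "Satisfaction held steady while wait-time complaints rose."

# Only the structured file can be totaled without further preparation.
total_units = sum(int(r["units"]) for r in rows)
```

The practical point is that the structured file can be summed immediately, while the other three need tagging, coding or reading by a person before they can sit alongside it.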

Never have imagined using

As we can see, being a good data curator requires a variety of skills that researchers 20 years ago might never have imagined using. How do we bring data from different software and hard drives into one interface? How do we help the call-center manager articulate his or her business challenges and data needs? How do we make the same project serve the CEO, CMO, CTO, CIO and HR director?

In the end, the value of our work will be judged as much by how we communicate our findings as what we find. That is why communications planning needs to be a key element of any data integration project; once we have identified the key information that different stakeholders need, we have to make sure they actually get it – in a form they can understand and at a frequency that makes sense. A CMO may need to see a variety of metrics only once a month but a restaurant or hotel operator may have to have essential customer satisfaction information delivered in real time or very close to it. These are not separate projects – just different windows on the same data set.
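
The idea of different windows on one data set can be sketched in a few lines. The example below assumes a hypothetical list of satisfaction scores and two made-up views: a monthly average for a CMO and the latest reading for a single location's operator. The locations, dates and scores are illustrative only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical satisfaction scores (1-10) logged per location and date.
scores = [
    {"date": "2024-03-01", "location": "Downtown", "score": 8},
    {"date": "2024-03-15", "location": "Downtown", "score": 6},
    {"date": "2024-03-20", "location": "Airport", "score": 10},
    {"date": "2024-04-02", "location": "Airport", "score": 3},
]


def monthly_view(rows):
    """CMO window: one average score per month across all locations."""
    by_month = defaultdict(list)
    for row in rows:
        by_month[row["date"][:7]].append(row["score"])  # "YYYY-MM" prefix
    return {month: mean(vals) for month, vals in by_month.items()}


def latest_view(rows, location):
    """Operator window: the most recent record for a single location."""
    own = [r for r in rows if r["location"] == location]
    # ISO dates sort correctly as plain strings.
    return max(own, key=lambda r: r["date"]) if own else None


cmo = monthly_view(scores)
operator = latest_view(scores, "Airport")
```

Both functions read the same list; only the cut and the cadence differ, which is what keeps the monthly report and the near-real-time alert from drifting apart.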

When in doubt, simplicity should be the operative word in data communication. Keep it visual, keep it clean. With so much information at our fingertips, the risk of clutter is palpable, especially for researchers who have grown accustomed to looking at PowerPoint charts where all white space has been filled. We have different clients now, with lower tolerance for complexity and powerful reactions to simple shapes and colors; we need to be sure their first glance at our interface or report does not turn them off to our findings in an instant.

Feel more confident

We know that an integration project has been successful when our internal clients can say they feel more confident in their decision-making. If we refer back to our touchstone question – What problem are you trying to solve? – greater confidence tells us that we found the problem and have helped them solve it. Our insights are not just interesting but actionable. When people fear that there are things they should know but do not, they cannot be confident. But when their needs have been heard and met, they feel complete and strong in their decisions.

Another key element of confidence is trust in the information being used. Managing data quality in an integration project can be tricky, because data integration can be, well, downright messy. It is the responsibility of a good data curator to help clients recognize the ambiguities and inconsistencies that come with data integration and to help them understand that this messiness is okay. Nevertheless, we need to make quality control one of our priorities; we need to keep an eye out for inconsistencies, nonsensical findings, obvious glitches and, of course, response rates and the like where survey research is involved. Where serious issues arise, it is up to the curator (aka researcher) to inject a “proceed with caution” caveat, if not sound an alarm outright.
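
A few of these checks are simple enough to automate. The sketch below is one possible starting point, not a standard: the record layout, the 1-10 scale and the 20 percent response-rate floor are all assumptions chosen for illustration, and the function name quality_flags is hypothetical.

```python
def quality_flags(records, min_response_rate=0.2, valid_range=(1, 10)):
    """Return human-readable warnings; the thresholds are illustrative only."""
    flags = []
    low, high = valid_range
    for rec in records:
        source = rec["source"]
        # Nonsensical values: scores outside the scale that was used.
        bad = [s for s in rec["scores"] if not low <= s <= high]
        if bad:
            flags.append(f"{source}: {len(bad)} score(s) outside {low}-{high}")
        # Response rate applies only where survey research is involved.
        invited = rec.get("invited")
        if invited:
            rate = len(rec["scores"]) / invited
            if rate < min_response_rate:
                flags.append(f"{source}: response rate {rate:.0%} is low")
    return flags


# Two made-up data streams feeding the same integration project.
streams = [
    {"source": "tracker", "scores": [7, 8, 11, 6], "invited": 10},
    {"source": "web survey", "scores": [5, 9], "invited": 40},
]

flags = quality_flags(streams)
```

Each flag is a candidate for a "proceed with caution" note; deciding whether it warrants a caveat or an outright alarm remains the curator's call.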

So how do we round out our portrait of the newly minted data curator? In the end, the role requires just a few basic skills:

  • Collaboration – so that you can leverage the skills of others when you are out of your depth.
  • Listening – so you can hear the needs that your efforts must fill.
  • Discernment – so you can separate the useful and reliable data from the superfluous and suspect.
  • Flexibility – so that you can roll with the punches when things change in an instant.

We welcome all of our trusted market researchers to the new world of data curation!