Weighting data: a look at misconceptions and design

Editor’s note: Susan Frede is vice president of research methods and best practices at Cincinnati, Ohio-based market research firm Lightspeed Research. This is an edited version of two posts, titled “Debunking weighting misconceptions” and “Appropriate design and evaluation of a weighting scheme,” that originally appeared here and here.

With the U.S. presidential election in full swing, there has been a lot of talk about the validity of political polls, including how to weight data appropriately. In this post, I will address the most common weighting myths and misconceptions.

A lot of hesitation surrounding weighting data is due to fear based on some misunderstandings:

First, there is a misunderstanding that data manipulation is bad. Weighting does involve data manipulation, but so does setting quotas on survey completes. In both cases the sample is being forced to look a certain way.
Weighting does reduce effective base sizes, but there are ways to minimize this impact. For example, instead of weighting using five age breaks, weight using three age breaks.
Outliers can influence data in weighting. For example, if one respondent spends $5,000 on shoes annually and they receive a high weight, data may be inflated. Spending a little time analyzing outliers before weighting data can help minimize this problem.
Weighting does add some complexity and time to the study. However, weighting is built into data processing programs, which automatically produce diagnostics that make it easy to assess the results and adjust. Spending a little time now can save hours of analytic time later spent trying to make sense of the data. Also, the cost and time for weighting are usually far less than the cost and time involved in a bad business decision.

Weighting is no replacement for appropriate sampling but it can help bring imbalances in line. There are several reasons why weighting can be beneficial:

First, it helps make the sample more representative of the target population.
It can help adjust for differential response rates by weighting up groups who under-respond and weighting down groups who over-respond.
It can also allow for comparisons across samples by driving consistency. This can be important in trackers and even concept and ad tests. For example, if the objective is to pick a winning idea among several and there are differences in the samples, the wrong decision could be made.
Weighting is a cost-effective method compared to others – no additional completes are needed.
Finally, it may be the only proven tool left to assure a representative sample. If everything that has been done at the sampling and fielding phases still results in some imbalances, weighting is the one thing that can be used.

There are two key types of weighting: cell and RIM weighting. Cell weighting uses interlocking variables, so every combination of categories (for example, each age-by-gender cell) needs its own target. Generally, RIM weighting, also known as raking, is preferred over cell weighting. RIM weighting is an iterative process: it adjusts the weights to match the target for one variable, then for another, and so on, repeating until all variables are in line with their respective targets. Since RIM weighting uses non-interlocking variables, there are fewer cells, so more variables can be used. This means a smaller sample size is needed, and the results can be more stable. RIM weighting requires only the distribution of each variable separately; knowledge of the relationship between variables is not required.
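To make the iterative process concrete, here is a minimal sketch of RIM weighting in Python. It is an illustration under stated assumptions, not the routine any particular data processing program uses: the function name `rim_weights`, the sample and the 50/50 targets are all hypothetical, and every target category is assumed to have at least one respondent.

```python
def rim_weights(respondents, targets, max_iter=100, tol=1e-6):
    """Adjust weights one variable at a time until all targets are met.

    respondents: list of dicts, e.g. {"age": "18-44", "gender": "F"}
    targets: dict of variable -> {category: target proportion}
    Assumes every target category appears at least once in the sample.
    """
    n = len(respondents)
    weights = [1.0] * n
    for _ in range(max_iter):
        max_gap = 0.0
        for var, target in targets.items():
            total = sum(weights)
            # Current weighted share of each category of this variable.
            share = {cat: 0.0 for cat in target}
            for r, w in zip(respondents, weights):
                share[r[var]] += w / total
            max_gap = max(max_gap, max(abs(share[c] - target[c]) for c in target))
            # Scale each weight by target share / current share.
            weights = [
                w * target[r[var]] / share[r[var]]
                for r, w in zip(respondents, weights)
            ]
        # Stop once a full pass found every variable already on target.
        if max_gap < tol:
            break
    # Rescale so the weights average 1.0.
    mean_w = sum(weights) / n
    return [w / mean_w for w in weights]


# Hypothetical sample: skews toward younger respondents and women.
sample = (
    [{"age": "18-44", "gender": "M"}] * 3
    + [{"age": "18-44", "gender": "F"}] * 3
    + [{"age": "45+", "gender": "M"}] * 1
    + [{"age": "45+", "gender": "F"}] * 3
)
targets = {
    "age": {"18-44": 0.5, "45+": 0.5},
    "gender": {"M": 0.5, "F": 0.5},
}
weights = rim_weights(sample, targets)
```

Note that only the separate age and gender distributions are supplied as targets. No age-by-gender target is needed, which is the practical advantage over cell weighting described above.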

When considering weighting, it is important to consult a marketing scientist. The marketing scientist can:

Determine if weighting is necessary. In some cases, the data may be skewed on a variable that doesn’t affect key measures, so weighting won’t change the business decisions.
Advise on the appropriateness of weighting small sample sizes. When sample sizes are small there is less stability in the weights.
Raise cautions around extremely unbalanced samples. Weighting is not a good option when imbalances are large. A general rule of thumb is that weighting should not more than double the proportion of a sub-group or cut it by more than half.
Help develop the weighting scheme by identifying the appropriate variables, breaks and target quotas to use. Just as it is important to accurately profile the target population for survey quotas the same is also true for weights. An inaccurate profile will result in weighted data that may not be representative of the target and may lead to incorrect business decisions.
Examine the weighting diagnostics to make sure they meet certain criteria and there is no need for adjustments.
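The double-or-half rule of thumb in the list above can be checked before any weights are computed. The sketch below is one simple way to do that for a single variable; the function names, age breaks and proportions are hypothetical. With RIM weighting, adjustments on several variables compound, so an individual respondent's final weight can still fall outside this range.

```python
def adjustment_factors(sample_share, target_share):
    """Ratio of target proportion to sample proportion for each sub-group."""
    return {g: target_share[g] / sample_share[g] for g in target_share}


def flag_large_adjustments(factors, low=0.5, high=2.0):
    """Sub-groups whose proportion would be more than doubled or halved."""
    return sorted(g for g, f in factors.items() if f < low or f > high)


# Hypothetical age distribution in the sample versus the target population.
sample_share = {"18-34": 0.10, "35-54": 0.40, "55+": 0.50}
target_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

factors = adjustment_factors(sample_share, target_share)
flagged = flag_large_adjustments(factors)
```

In this made-up example the 18-34 group would need to be roughly tripled, which is a sign the imbalance is better fixed with additional sample than with weights.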

Evaluating the weighting scheme

It is extremely important to evaluate the weighting scheme to make sure it is valid and does not violate any statistical rules. To do this, the weighting efficiency is evaluated by comparing the effective base size to the original base size. The effective base size is used in significance testing so that results are less likely to appear significant simply because the weighting has made adjustments to the data. There are no hard and fast rules for weighting efficiency because it always depends on the circumstances of the weighting. However, any time the effective base size is less than 70 percent of the original base size, the weighting should be carefully examined. If necessary, the number of weighting variables or breaks might be reduced to increase the weighting efficiency.
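As a rough illustration, the sketch below computes the effective base size with a widely used approximation (Kish's formula: the squared sum of the weights divided by the sum of the squared weights). Data processing programs may calculate it differently, so treat the formula, the function names and the example weights as assumptions.

```python
def effective_base(weights):
    """Effective base size: (sum of weights)^2 / sum of squared weights."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


def weighting_efficiency(weights):
    """Effective base size as a proportion of the original base size."""
    return effective_base(weights) / len(weights)


# Hypothetical weights: half the sample weighted down, half weighted up.
example_weights = [0.5] * 50 + [1.5] * 50

n_eff = effective_base(example_weights)             # 80.0
efficiency = weighting_efficiency(example_weights)  # 0.8
needs_review = efficiency < 0.70                    # the 70 percent guideline
```

Here 100 respondents behave like about 80 after weighting, which is above the 70 percent guideline. If every weight were 1.0, the effective base would equal the original base.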

In addition to weighting efficiency, it is also important to look at the actual size of the weights. There are several basic rules:

No weights should be above 5.0. If there are just a few weights above 5.0 then the weights should be capped at 5.0. If there are many weights above 5.0 then the weighting scheme needs to be reevaluated.
Respondents with weights of 2.0 or greater should not exceed 10 percent of the original base and/or, when weighted, 30 percent of the effective base.
The average weight for outliers (weights of 2.0 or greater) should not exceed 3.0.
Any weight close to zero (less than .01) suggests there is something wrong with the choice of variables to weight on, and the weighting scheme should be reexamined.
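Most of these rules are simple to check once the weights exist. The following is a minimal sketch with hypothetical function names and weights. It covers the 5.0 cap, the 10 percent share of the original base, the average weight for outliers and the near-zero check. It leaves out the 30 percent rule because the exact calculation depends on how a given program defines the weighted base.

```python
def weight_diagnostics(weights, cap=5.0, high=2.0, tiny=0.01):
    """Summarize the weights against the basic rules listed above."""
    n = len(weights)
    high_w = [w for w in weights if w >= high]
    return {
        # Rule: no weights above 5.0.
        "n_above_cap": sum(1 for w in weights if w > cap),
        # Rule: at most 10 percent of the original base at 2.0 or greater.
        "pct_high": len(high_w) / n,
        # Rule: average weight for outliers should not exceed 3.0.
        "mean_high": sum(high_w) / len(high_w) if high_w else 0.0,
        # Rule: weights below .01 signal a problem with the scheme.
        "n_tiny": sum(1 for w in weights if w < tiny),
    }


def cap_weights(weights, cap=5.0):
    """Cap a few extreme weights at the maximum allowed value."""
    return [min(w, cap) for w in weights]


# Hypothetical set of 20 weights with three outliers.
example = [1.0] * 17 + [2.5, 3.0, 6.0]
report = weight_diagnostics(example)
capped = cap_weights(example)
```

In this made-up example, one weight exceeds 5.0, 15 percent of respondents have weights of 2.0 or greater and the outliers average about 3.8, so the scheme would need another look. Capping also shifts the weighted totals, so the targets should be rechecked afterward.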

After the weighting scheme passes the statistical validity tests mentioned above, the weighted data needs to be examined. First, make sure that the weighting has the desired effect on demographics and habits. Second, key measures should be carefully examined to understand whether absolute and relative results have changed due to the weighting.

Remember, weighting is no replacement for appropriate sampling but it can be extremely beneficial for research by making the sample more representative of the target population, adjusting for varying response rates, comparing across samples, reducing sample costs and assuring a representative sample. Following the rules outlined above will help assure a quality weighting scheme and correct business decisions.