Data Use: Nonparametric tests: sturdy alternatives



Article ID: 20020509
Published: May 2002
Author: William Bailey

Article Abstract

The current economic conditions have affected strategies of consumer research. This article discusses alternative strategies that are often overlooked: nonparametric tests.

Editor's note: William M. Bailey is principal of WMB & Associates, an Orlando, Fla., statistical services firm.

Does this situation sound familiar? "I can't afford the research plan you advise! Is there a way we can do fewer surveys but still get usable and reliable results?" The current economic conditions have affected strategies of consumer research. As a result, more and more clients are trying to find ways to cut costs while still delivering on business objectives.

As market researchers, we tend to focus on crosstabulations that offer paired tests of proportions and generally take the results right to the portion of the final report that details the statistical results. This is not necessarily intended to be a criticism; it's just the way we typically do consumer research. While this works in many cases, this author is finding that clients are asking somewhat different questions: "How do these two products differ in comparison to these other two?" "Is there a difference in opinion by product within gender or age or...?" They also ask, "How do these product features rank as they apply to the respondent's overall opinion of my company?" As you can see, these questions begin to move things beyond the realm of basic data evaluation.

The preferred research plan is to interview enough consumers to make the results statistically reliable at the 90 percent or 95 percent level of confidence with a stated margin of error, e.g., ±5 percentage points. Why? Because that is what we have always done! Depending on how one sets the constraint parameters, this works out to be from 250 to 350 completed interviews at the base level of analysis, and then we work up from there. With this base we can apply standard analysis tools such as paired t-tests, analysis of variance, and factor or regression analysis with reasonable comfort. Further, with a response base of this size there is usually only marginal violation of the implied assumptions: the data approach a normal distribution and homogeneity of variance. But is this always the case? Depending on the response scales used, probably not; there is often some violation that we overlook. I am not suggesting that we have done a bad job; we just haven't done an appropriate job for the data's characteristics.
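As a rough illustration of where interview counts of this size come from, the sketch below applies the textbook sample-size formula for a proportion, n = z² p(1 - p) / e². It is an assumption on my part that this is the kind of calculation behind the figures above; the function name and the worst-case default p = 0.5 are illustrative only.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p=0.5):
    """Completed interviews needed to estimate a proportion within +/- margin.

    p = 0.5 is an assumed worst case; it gives the largest required sample.
    """
    # Two-sided critical value, e.g. about 1.96 for 95 percent confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_90 = sample_size_for_proportion(0.05, 0.90)
n_95 = sample_size_for_proportion(0.05, 0.95)
print(n_90, n_95)  # 271 385
```

With the worst-case p = 0.5 this gives 271 and 385 interviews. A less conservative p or a finite-population adjustment lowers the counts, which is one way "how one sets the constraint parameters" can move the result.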

Back to the statement: "I can't afford the research plan you advise! Is there a way we can do fewer surveys but still get usable and reliable results?" Not to worry. There are alternatives available that are often overlooked. These approaches fall into the general category of sturdy or distribution-free statistics or, more specifically, nonparametric statistics.

Sturdy statistics
Most market researchers automatically use procedures that assume the measurements are drawn from a normal distribution and then proceed to test hypotheses about parameters such as the mean or the variance (usually expressed as the standard deviation, which is the square root of the variance). Useful tests include but are not limited to the Student's t or the Z statistic, various forms of regression analysis, and analysis of variance, all of which help in understanding a study's result and differences between product or control/treatment sets. These tools belong to the class called parametric statistical tests.

While some of these statistical tests do work well even if the assumption of normality is violated, extreme violations of this assumption can affect the interpretation of the results. There are technical reasons behind this, such as the fact that violating the assumption of normality can shift the actual Type I error rate away from its nominal level (a Type I error is the conclusion that the null hypothesis is false when, in fact, it is true), but that is beyond the scope of our intent here.

If an assumption is found to be violated or, as is often the case, the sample size available for the analysis base is small (e.g., under 20 or 30 observations), "traditional" statistical tests become questionable. For these situations there is a collection of tests that do not depend much on the precise shape of the distribution. This class of statistical tests is based on the signs of differences, ranks of measurements, and/or counts of objects falling into categories. Such methods do not rest heavily on the specific parameters of the distribution, and for this reason are called nonparametric or distribution-free tests. They make no assumptions, or less stringent ones, about the distribution from which the numbers were sampled.

However, the term nonparametric is somewhat misleading, since these statistics do in fact deal with parameters such as the median of a distribution or the probability of success p in a binomial distribution. The main advantage of many of the methods described herein is that they defend themselves against outliers, "off normal" distributions, and failures of assumptions. Statisticians use adjectives such as "robust," "resistant" and "sturdy" to describe them.
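A small sketch of what "resistant" means in practice: one extreme value drags the mean a long way but barely moves the median. The ratings below are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

ratings = [6, 7, 7, 8, 8, 9]      # hypothetical scores from six respondents
with_outlier = ratings + [95]     # same data plus one extreme (or miskeyed) value

# How far each summary moves when the single outlier is added.
mean_shift = mean(with_outlier) - mean(ratings)
median_shift = median(with_outlier) - median(ratings)

print(mean(ratings), median(ratings))
print(mean(with_outlier), median(with_outlier))
print(mean_shift, median_shift)
```

Here the mean moves from 7.5 to 20, while the median moves only from 7.5 to 8. Tests built on medians, ranks, or signs inherit this resistance.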

Specifically, and more importantly, sturdy statistical techniques provide test results comparable to those of traditional tests when the samples are from asymmetric or skewed distributions. This is where the term "power," the probability that a test detects a difference that really exists, is usually introduced. While there are transformations available, such as taking logarithms or square roots of the data to bring them more in line with the parametric assumptions, sturdy or distribution-free tests are a worthwhile alternative.
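To make the transformation idea concrete, the sketch below measures skewness before and after taking logarithms. The spending figures are hypothetical, and the skewness helper is a simple population formula written for this illustration, not a library routine.

```python
import math
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: the average cubed z-score (0 for a symmetric sample)."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

spend = [1, 2, 2, 3, 4, 5, 8, 13, 40]        # hypothetical right-skewed data
log_spend = [math.log(v) for v in spend]     # natural-log transform

raw_skew = skewness(spend)
log_skew = skewness(log_spend)
print(round(raw_skew, 2), round(log_skew, 2))
```

The log-transformed values are noticeably less skewed than the raw ones, though still not perfectly symmetric. A rank-based test sidesteps the question of which transformation to choose.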

Further, sturdy statistical methods are useful in cases when the researcher knows nothing about the parameters of the variable of interest in the population (hence the name nonparametric).

A comparison
This section provides a comparison between tests in these two classifications (called parametric and nonparametric in the table) based on some popular study scenarios. It is not meant to be all-inclusive.

Most parametric tests have nonparametric analogues. In other words, nonparametric tests exist for most situations a market analyst commonly encounters: two independent groups, two matched groups, and multiple groups. The primary difference is that the data are no longer interval; instead they are ordinal (or are treated as ordinal). The table summarizes several "crossover" tools. It offers a very simple comparison between several parametric tests and their analogues.

Parametric Tests          Nonparametric Tests
Independent t-Test        Mann-Whitney; Median
Matched Pairs t-Test      Wilcoxon; Sign Test
One-Way ANOVA             Kruskal-Wallis

While nonparametric tests make fewer assumptions regarding the nature of distributions, they are usually less powerful than their parametric counterparts. However, in cases where assumptions are violated and interval data is treated as ordinal, not only are nonparametric tests more proper, they can also be more powerful.

This section highlights the applicability of the nonparametric tests noted above. For more detailed information the reader is directed to a statistical resource, the Internet, or software packages such as (but certainly not limited to) SPSS, SAS, and Prophet. (The author is not endorsing any of these packages, and no rank order is implied.)

  • The Mann-Whitney U test is the most popular of the two-independent-samples tests. It is equivalent to the Wilcoxon rank sum test and the Kruskal-Wallis test for two groups. Mann-Whitney tests whether two sampled populations are equivalent in location. The observations from both groups are combined and ranked, with the average rank assigned in the case of ties. The number of ties should be small relative to the total number of observations. If the populations are identical in location, the ranks should be randomly mixed between the two samples. The number of times a score from Group 1 precedes a score from Group 2 and the number of times a score from Group 2 precedes a score from Group 1 are calculated. The Mann-Whitney U statistic is the smaller of these two numbers.

  • The Median test tests whether two or more independent samples are drawn from populations with the same median using the chi-square statistic. This test should not be used if any cell has an expected frequency less than one, or if more than 20 percent of the cells have expected frequencies less than five.

  • The Wilcoxon test is used with two related variables to test the hypothesis that the two variables have the same distribution. It makes no assumptions about the shapes of the distributions of the two variables. This test takes into account information about the magnitude of differences within pairs and gives more weight to pairs that show large differences than to pairs that show small differences. The test statistic is based on the ranks of the absolute values of the differences between the two variables.

  • The Sign test is designed to test a hypothesis about the location of a population distribution. It is most often used to test the hypothesis about a population median, and often involves the use of matched pairs, for example, before and after data, in which case it tests for a median difference of zero. In many applications, this test is used in place of the one sample t-test when the normality assumption is questionable. It is a less powerful alternative to the Wilcoxon signed ranks test, but does not assume that the population probability distribution is symmetric. This test can also be applied when the observations in a sample of data are ranks; that is, ordinal data rather than direct measurements.

  • The Kruskal-Wallis test is used to test the null hypothesis that "all populations have identical distribution functions" against the alternative hypothesis that "at least two of the samples differ only with respect to location (median), if at all." It is the analogue to the F-test used in analysis of variance. While analysis of variance tests depend on the assumption that all populations under comparison are normally distributed, the Kruskal-Wallis test places no such restriction on the comparison. It is a logical extension of the Wilcoxon-Mann-Whitney test.

  • The Spearman Rank Correlation Coefficient is based on the rank ordering of each variable. It may also be a better indicator than the ordinary (Pearson) correlation that a relationship exists between two variables when the relationship is consistently increasing or decreasing but non-linear.

  • Kendall's tau-b is a measure of association for ordinal or ranked variables that takes ties into account. The sign of the coefficient indicates the direction of the relationship, and its absolute value indicates the strength, with larger absolute values indicating stronger relationships.
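Two of the tests described above are simple enough to sketch directly. The first function counts pairs the way the Mann-Whitney description does; the second computes an exact two-sided sign test from the binomial distribution. The function names, the data, and the convention of counting a tied pair as one half for each group are assumptions made for this illustration, not taken from any particular package.

```python
from math import comb

def mann_whitney_u(group1, group2):
    """Return (U, u1, u2) by directly counting pairs of scores."""
    u1 = 0.0  # times a group1 score precedes (is lower than) a group2 score
    u2 = 0.0  # times a group2 score precedes a group1 score
    for a in group1:
        for b in group2:
            if a < b:
                u1 += 1
            elif b < a:
                u2 += 1
            else:
                # Assumed convention: a tied pair counts half for each group.
                u1 += 0.5
                u2 += 0.5
    return min(u1, u2), u1, u2

def sign_test_p_value(before, after):
    """Exact two-sided sign test for a median difference of zero."""
    # Pairs with no change carry no sign and are dropped.
    diffs = [y - x for x, y in zip(before, after) if y != x]
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # Probability of k or fewer of the rarer sign when each sign has chance 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical ratings of two products from two independent groups.
product_a = [12, 15, 11, 18, 14]
product_b = [16, 21, 19, 17, 22]
u, u_a_first, u_b_first = mann_whitney_u(product_a, product_b)
print(u, u_a_first, u_b_first)   # 2.0 23.0 2.0

# Hypothetical before/after scores from the same eight respondents.
before = [5, 6, 4, 7, 5, 6, 3, 5]
after = [7, 8, 5, 7, 6, 9, 5, 4]
p_value = sign_test_p_value(before, after)
print(p_value)                   # 0.125
```

In practice U would be compared with a critical-value table or converted to a p-value by a statistics package. The sign test result shows the cost of ignoring magnitudes: six of seven non-zero changes are increases, yet with so few pairs the exact p-value is only 0.125.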

Validate, validate, validate
While in most cases we are able to be "traditional," there are alternatives if the situation warrants. Regardless, the analyst has a basic responsibility: validate, validate, validate, and then analyze and interpret with confidence.
