Estimating sample size for a descriptive study in quantitative research



Article ID:
19990603
Published:
June 1999
Author:
Gang Xu

Article Abstract

Sample size often must be calculated in quantitative marketing research, which requires knowing the variable of interest. Using two cases, this article discusses the variable of interest.

Editor’s note: Gang Xu is a senior research consultant in statistics at Brintnall & Nicolini, Inc., a Philadelphia, Pa., health care consulting and marketing research firm.

In quantitative marketing research, we frequently need to calculate the sample size in order to make inferences about the parent population with a given level of confidence. In general, the larger the sample, the more precise the estimate. However, more subjects in a study also lead to a higher cost. Therefore, we need to calculate the minimum number of subjects required for a study.

In calculating the required sample size, we need to know the nature of the variable of interest. Is the quantity being estimated based on a continuous variable (e.g., a mean) or a dichotomous variable (e.g., a proportion)? In a descriptive quantitative research study, the required sample size depends on this characteristic. The following two sections take each case in turn.

A. Variable of interest is a continuous variable

Case study one
A pharmaceutical company is interested in knowing the average weekly working hours of primary care physicians. You, as a researcher, want to be 95 percent confident that the true population mean is within a specified number of hours of the mean you calculate from your sample. For instance, after the survey data are collected, you find that the average weekly working hours in your sample is 60. You want to be 95 percent confident that the population mean is within 10 hours of that figure, that is, 60±10.

Here, the average weekly working hours is the variable of interest. It is a mean. In estimating the sample size, the variability of the data in the parent population needs to be taken into consideration. Assuming that the sample mean is approximately normally distributed, the following formula can be used to calculate the size of the sample:

$$ n = \frac{Z^2 S^2}{d^2} $$

Where:
n is the size of sample;
Z is the z-statistic for the desired level of confidence;
S is the population standard deviation;
d is the half width of the desired interval.

Z is a fixed value set by you, the researcher. When we say "a desired level of confidence," we usually mean one of two levels: 95 percent or 99 percent. Holding other variables constant, a higher level of confidence (e.g., 99 percent) requires a larger sample size than a lower level of confidence (e.g., 95 percent). For a 95 percent confidence level, Z = 1.96, and for a 99 percent confidence level, Z = 2.58. In this example, you have chosen a 95 percent confidence level.
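The two Z values quoted above can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard library; the function name `z_for_confidence` is a hypothetical helper introduced here for illustration, not something from the article.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Return the two-sided standard normal critical value for a confidence level.

    `confidence` is a fraction such as 0.95 (hypothetical helper, for illustration).
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # A two-sided interval leaves (1 - confidence) / 2 in each tail,
    # so we look up the quantile at 1 - (1 - confidence) / 2.
    upper_quantile = 1.0 - (1.0 - confidence) / 2.0
    return NormalDist().inv_cdf(upper_quantile)


for level in (0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_for_confidence(level):.2f}")
```

Rounded to two decimals, this gives 1.96 and 2.58, the values used throughout the article.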

The half width d is also a fixed value that you choose. In simple terms, d can be thought of as a measure of the precision of sample estimates. A narrow interval (say 55 to 65 with a mean of 60) is more precise than a wider one (say 50 to 70). The former requires a larger sample size than the latter. In this example, you have chosen d = 10.

We usually don’t know the population standard deviation (S). However, you may make educated guesses about it and calculate the size of the sample based on the guesses. For instance, you may guess that the population standard deviation is 30 and then the required size of the sample will be:

$$ n = \frac{1.96^2 \times 30^2}{10^2} \approx 34.6 $$

Rounding 34.6 up, you need a sample size of 35 to be 95 percent confident that the true mean of physicians’ weekly working hours is within 10 hours of your sample mean. In other words, you are 95 percent confident that the true population mean lies between 10 hours below and 10 hours above the mean you obtain from the survey of the sample of 35 physicians.
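The calculation for case study one can be sketched in a few lines of Python. The function name `sample_size_for_mean` and its parameter names are assumptions made for this illustration; the inputs (Z = 1.96, a guessed S = 30, d = 10) are the ones used in the text.

```python
import math


def sample_size_for_mean(z: float, s: float, d: float) -> float:
    """Unrounded sample size n = Z^2 * S^2 / d^2 for estimating a mean.

    z: z-statistic for the chosen confidence level
    s: guessed population standard deviation
    d: half width of the desired interval
    (hypothetical helper, for illustration only)
    """
    if s <= 0 or d <= 0:
        raise ValueError("s and d must be positive")
    return (z ** 2) * (s ** 2) / (d ** 2)


raw_n = sample_size_for_mean(z=1.96, s=30, d=10)
# Round up to a whole respondent so the interval is no wider than requested.
needed = math.ceil(raw_n)
print(f"raw n = {raw_n:.1f}, rounded up = {needed}")
```

Because S is only an educated guess, it can be worth rerunning the function with a few plausible values of S before settling on a sample size.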

Note that a higher confidence level requires a larger sample size. In the example above, if you want to increase the confidence level from 95 percent to 99 percent, you substitute 2.58 for 1.96 and would need a sample size of 60. The half width d is inversely related to the sample size: a smaller d (higher precision) requires a larger sample. The variability of the population is positively related to the sample size: a larger S requires a larger sample.

Suppose now we have d = 5, S = 40 and a confidence level of 99 percent. Putting these values into the formula, we find that the required sample size is:

$$ n = \frac{2.58^2 \times 40^2}{5^2} \approx 426 $$

B. Variable of interest is a dichotomous variable

Case study two
A company is interested in knowing the market share of drug X among prescriptions written by primary care physicians for patients with Type I diabetes. You are asked by the company to conduct a survey among primary care physicians to find out how often they prescribe drug X. Based on a pilot study, 10 percent of patients with Type I diabetes were prescribed drug X by primary care physicians. You want to be 95 percent confident that the true population market share of drug X is no more than .05 greater or less than the proportion you estimate from your survey. What is the required sample size?

Here, the proportion of market share of drug X is the variable of interest. It is a dichotomous variable.

The formula for calculating the sample size is:

$$ n = \frac{Z^2 \, p(1-p)}{d^2} $$

Where:

n is the size of sample;
Z is the z-statistic for the desired level of confidence;
p is the estimate of expected proportion with the variable of interest in the population;
d is the half width of the desired interval.

Again, Z = 1.96 for the 95 percent confidence level and 2.58 for the 99 percent confidence level. In the example above, p = .10 and d = .05. Putting these values into the formula, we have a required sample size of:

$$ n = \frac{1.96^2 \times 0.1 \times (1-0.1)}{0.05^2} \approx 138.3 $$

Rounding up, you need 139 physicians in your sample to be 95 percent confident that the true proportion of market share for drug X in the population is within .05 of the proportion you estimate.
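The same calculation for a proportion can be sketched as follows. The function name `sample_size_for_proportion` is a hypothetical helper for illustration; the inputs (Z = 1.96, p = .10 from the pilot study, d = .05) come from case study two, and the function returns the unrounded value.

```python
def sample_size_for_proportion(z: float, p: float, d: float) -> float:
    """Unrounded sample size n = Z^2 * p * (1 - p) / d^2 for estimating a proportion.

    z: z-statistic for the chosen confidence level
    p: expected proportion (here taken from a pilot study)
    d: half width of the desired interval, on the same 0-1 scale as p
    (hypothetical helper, for illustration only)
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if d <= 0:
        raise ValueError("d must be positive")
    return (z ** 2) * p * (1.0 - p) / (d ** 2)


raw_n = sample_size_for_proportion(z=1.96, p=0.10, d=0.05)
print(f"raw n = {raw_n:.1f}")
```

Note that p and d must be expressed on the same scale: entering d as 5 instead of .05 would shrink the result by a factor of 10,000.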

Here, p is the proportion you expect to find for the market share of drug X, set before the survey from a pilot study or prior knowledge. The required sample size increases with p(1-p), which reaches its maximum at p = .5. For that reason, when you have no prior knowledge or assumption about the market share for the drug, you can calculate the sample size based on a worst-case scenario of p = .50. With d again equal to .05:

$$ n = \frac{1.96^2 \times 0.5 \times (1-0.5)}{0.05^2} = 384.16 $$

Rounding up, you thus need 385 physicians in your survey.
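The worst-case argument can be checked numerically by sweeping p across a range of values while holding Z and d fixed. This is an illustrative sketch; the helper name and the grid of candidate proportions are assumptions, not part of the article.

```python
def sample_size_for_proportion(z: float, p: float, d: float) -> float:
    """Unrounded n = Z^2 * p * (1 - p) / d^2 (hypothetical helper)."""
    return (z ** 2) * p * (1.0 - p) / (d ** 2)


# Candidate expected proportions 0.1, 0.2, ..., 0.9 (an arbitrary illustrative grid)
candidates = [i / 10 for i in range(1, 10)]
sizes = {p: sample_size_for_proportion(z=1.96, p=p, d=0.05) for p in candidates}

for p, n in sizes.items():
    print(f"p = {p:.1f}: n = {n:.1f}")

# The proportion that demands the largest sample on this grid
worst_p = max(sizes, key=sizes.get)
print(f"largest n occurs at p = {worst_p}")
```

The required sample size rises as p moves toward .5 from either side and peaks there, which is why p = .50 is the conservative choice when nothing is known about the true proportion.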

It should be noted that, in this article, sample size is calculated for a descriptive study. For studies that involve inferential statistical tests such as the t-test, analysis of variance, correlation or regression, separate estimations of sample size are needed.

Summary

1. For a descriptive study, the calculation of a sample size largely depends on whether the variable of interest is a mean or a proportion.

2. When the variable of interest is a mean, we need to estimate the population standard deviation, whereas the other values in the formula are set by the researcher.

3. When the variable of interest is a proportion, we need to supply an estimate of the expected proportion. A conservative approach is to use 50 percent, which yields the sample size for the worst-case scenario.

For a study that requires inferential statistics, the calculation of a sample size may need to be based on the particular statistical test to be used.
