Estimating sample size for a descriptive study in quantitative research



Article ID: 19990603
Published: June 1999
Author: Gang Xu

Article Abstract

Sample size often must be calculated in quantitative marketing research, which requires knowing the variable of interest. Using two cases, this article discusses the variable of interest.

Editor’s note: Gang Xu is a senior research consultant in statistics at Brintnall & Nicolini, Inc., a Philadelphia, Pa., health care consulting and marketing research firm.

In quantitative marketing research, we frequently need to calculate the sample size in order to make inferences about the parent population with a given level of confidence. In general, the larger the sample size, the more precise your estimate. However, more subjects in the study also lead to a higher cost. Therefore, we need to calculate the minimum number of subjects required for a study.

In calculating the required sample size, we need to know the nature of the variable of interest. Is it a continuous variable, summarized by a mean, or a dichotomous variable, summarized by a proportion? In a descriptive quantitative research study, the sample size formula depends on this characteristic. The following two sections take each type of variable in turn.

A. Variable of interest is a continuous variable

Case study one
A pharmaceutical company is interested in knowing the average weekly working hours of primary care physicians. You, as a researcher, want to be 95 percent confident that the true population mean of the working hours is within a specified number of hours of the mean you estimate from your sample. For instance, after the data are collected from your survey, you find that the average weekly working hours in your sample is 60. You want to be 95 percent confident that the population mean is within 10 hours of that value, that is, 60 ± 10.

Here, the average weekly working hours is the variable of interest. It is a mean. In estimating the sample size, the variability of the data in the parent population needs to be taken into consideration. Assuming that the sampling distribution of the sample mean is approximately normal, the following formula can be used to calculate the size of the sample:

$$ n = \frac{Z^2 S^2}{d^2} $$

Where:
n is the size of sample;
Z is the z-statistic for the desired level of confidence;
S is the population standard deviation;
d is the half width of the desired interval.

Z is a fixed value set by you, the researcher. When we say "a desired level of confidence," we usually refer to one of two levels: 95 percent or 99 percent. Holding other variables constant, a higher level of confidence (e.g., 99 percent) requires a larger sample size than a lower level of confidence (e.g., 95 percent). For a 95 percent confidence level, Z = 1.96, and for a 99 percent confidence level, Z = 2.58. In this example, you have chosen a 95 percent confidence level.
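The two Z values quoted above can be recovered from the standard normal distribution. The following is a minimal Python sketch, assuming a two-sided interval; the function name `z_for_confidence` is illustrative and does not come from the article.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided critical value of the standard normal distribution."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Split the leftover probability equally between the two tails.
    upper_tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - upper_tail)


for level in (0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_for_confidence(level):.2f}")
```

Rounded to two decimals, the results agree with the 1.96 and 2.58 used throughout this article.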

Like Z, d is a fixed value that you choose. In simple terms, d can be thought of as a measure of the precision of sample estimates. A narrow interval (say 55 to 65 with a mean of 60) is more precise than a wider one (say 50 to 70). The former requires a larger sample size than the latter. In this example, you have chosen d = 10.

We usually don’t know the population standard deviation (S). However, you may make educated guesses about it and calculate the size of the sample based on the guesses. For instance, you may guess that the population standard deviation is 30 and then the required size of the sample will be:

$$ n = \frac{1.96^2 \times 30^2}{10^2} = 34.6 $$

Rounding 34.6 up, you need a sample size of 35 to be 95 percent confident that the true mean of physicians’ weekly working hours is within 10 hours of your estimate. In other words, you are 95 percent confident that the true population mean lies between 10 hours below and 10 hours above the mean you obtain from the survey of the sample of 35 physicians.
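The calculation for case study one can be reproduced in a few lines of Python. This is a sketch under the article's own assumptions (Z = 1.96, a guessed S of 30, d = 10); the function names `sample_size_mean` and `half_width` are illustrative. The second function inverts the formula to check the half width that a given sample size actually delivers.

```python
import math


def sample_size_mean(z: float, s: float, d: float) -> float:
    """Unrounded sample size n = Z^2 * S^2 / d^2 for estimating a mean."""
    if s < 0 or d <= 0:
        raise ValueError("s must be non-negative and d must be positive")
    return (z * s / d) ** 2


def half_width(z: float, s: float, n: int) -> float:
    """Half width d = Z * S / sqrt(n) achieved by a sample of size n."""
    return z * s / math.sqrt(n)


n_raw = sample_size_mean(z=1.96, s=30, d=10)
n = math.ceil(n_raw)  # round up so the half width does not exceed d
print(f"unrounded n = {n_raw:.1f}, rounded up n = {n}")
print(f"half width with n = {n}: {half_width(1.96, 30, n):.2f} hours")
```

With 35 physicians, the achieved half width comes out just under the 10 hours requested, which is why the result is rounded up rather than to the nearest integer.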

Note that a higher confidence level would require a larger sample size. In the example above, if you want to increase the confidence level from 95 percent to 99 percent, you substitute 2.58 for 1.96. You would then need a sample size of 60. The half width d is inversely related to the sample size: a smaller value of d (a higher precision) requires a larger sample. The variability of the population is positively related to the sample size, so an increase in the value of S increases the required sample size.

Suppose now we have d = 5, S = 40 and a confidence level of 99 percent. Putting these values into the formula, we find that the required sample size is:

$$ n = \frac{2.58^2 \times 40^2}{5^2} \approx 426 $$

B. Variable of interest is a dichotomous variable

Case study two
A company is interested in knowing the market share of drug X among prescriptions written by primary care physicians for the treatment of patients with Type I diabetes. You are asked by the company to conduct a survey among primary care physicians to find out how often they prescribe drug X. Based on a pilot study, 10 percent of patients with Type I diabetes were prescribed drug X by primary care physicians. You want to be 95 percent confident that the true market share of drug X in the population is no more than .05 above or below the proportion you estimate from your survey. What is the required sample size?

Here, the proportion of market share of drug X is the variable of interest. It is a dichotomous variable.

The formula for calculating the sample size is:

$$ n = \frac{Z^2 \, p(1-p)}{d^2} $$

Where:

n is the size of sample;
Z is the z-statistic for the desired level of confidence;
p is the estimate of expected proportion with the variable of interest in the population;
d is the half width of the desired interval.

Again, Z = 1.96 for the 95 percent confidence level and 2.58 for the 99 percent confidence level. In the example above, p = .10 and d = .05. Putting these values into the formula, we have a required sample size of:

$$ n = \frac{1.96^2 \times .1 \times (1 - .1)}{.05^2} = 138.3 $$

Thus you need about 138 physicians in your sample (139 if you round up, as in case study one) to be 95 percent confident that the true market share of drug X in the population is within .05 of the proportion you estimate.
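The same calculation can be sketched in Python. The function name `sample_size_proportion` is illustrative, and the inputs are the values from case study two (Z = 1.96, p = .10 from the pilot study, d = .05). The sketch prints both the nearest-integer figure used in the text and the figure obtained by always rounding up.

```python
import math


def sample_size_proportion(z: float, p: float, d: float) -> float:
    """Unrounded sample size n = Z^2 * p * (1 - p) / d^2 for a proportion."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if d <= 0:
        raise ValueError("d must be positive")
    return z ** 2 * p * (1 - p) / d ** 2


n_raw = sample_size_proportion(z=1.96, p=0.10, d=0.05)
print(f"unrounded n = {n_raw:.1f}")
print(f"nearest integer = {round(n_raw)}, rounded up = {math.ceil(n_raw)}")
```

Rounding up is the more cautious choice, since a sample one subject short of the unrounded value gives a half width slightly wider than .05.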

Here, p is the proportion you expect to find for the market share of drug X. The required sample size increases with p (1-p), which reaches its maximum when p = .50. For that reason, when you have no prior knowledge or assumption about the market share for that drug, you can calculate the sample size based on a worst-case scenario in which p = .50; d in this case still equals .05:

$$ n = \frac{1.96^2 \times .5 \times (1 - .5)}{.05^2} = 384.16 $$

You thus need 384 physicians in your survey.
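The worst-case claim can be checked numerically by scanning a range of expected proportions. This is a small Python sketch; the grid of candidate values for p and the helper name `sample_size_proportion` are assumptions made for illustration.

```python
def sample_size_proportion(z: float, p: float, d: float) -> float:
    """Unrounded sample size n = Z^2 * p * (1 - p) / d^2 for a proportion."""
    return z ** 2 * p * (1 - p) / d ** 2


# Scan expected proportions from .05 to .95 in steps of .05.
candidates = [i / 20 for i in range(1, 20)]
sizes = {p: sample_size_proportion(1.96, p, 0.05) for p in candidates}

# The proportion that demands the largest sample.
worst_p = max(sizes, key=sizes.get)

for p, n_raw in sizes.items():
    marker = "  <- largest" if p == worst_p else ""
    print(f"p = {p:.2f}  n = {n_raw:6.1f}{marker}")
```

The required sample size rises as p approaches .50 from either side and is symmetric around it, so p = .50 is a safe choice when no pilot estimate is available.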

It should be noted that, in this article, sample size is calculated for a descriptive study. Studies that involve inferential statistical tests such as the t-test, analysis of variance, correlation or regression need separate sample size estimates.

Summary

1. For a descriptive study, the calculation of a sample size largely depends on whether the variable of interest is a mean or a proportion.

2. When the variable of interest is a mean, we need to estimate the population standard deviation, whereas the other values in the formula are fixed.

3. When the variable of interest is a proportion, we need to give an estimate of the expected proportion with such a variable of interest. A conservative approach to this estimate is to give an estimate of 50 percent, meaning that the sample size is estimated in a worst-case scenario.

For a study that requires inferential statistics, the calculation of the sample size should be based on the particular statistical test to be used.
