Confidence Interval

I am looking at the executive summary of a report from a study conducted several years ago under different management. It states two confidence intervals, and I am trying to understand how the confidence levels were calculated.

Confidence level 1:
Email invitations sent to 3803 addresses with 346 completes. They say this represents a 95% confidence level with a 5% margin of error

Confidence level 2:
Email invitations sent to 61,931 addresses with 3126 completes. They say this represents a 99% confidence interval and a 2.2% margin of error

My question is what is the formula they used to calculate the confidence intervals and margins of error?

CI Site

They are both close to being "right," depending on assumptions. Check this site (there are others) for further information.


And probably the biggest assumption is that the responses are randomly distributed -- which isn't terribly likely.

Basic sampling theory alert

Sampling error (frequently referred to as "margin of error") can be estimated ONLY for probability samples of a sample population of interest. E-mail invitations for survey participation rarely (if ever) produce probability samples of any relevant sample population due to validity threats to adequate coverage and to reliable sampling control. (This is one of the major issues threatening the scientific value of web surveys.) So, in practice, surveys done using e-mail invitations virtually never can legitimately claim anything regarding sampling error.

Thus, the person who wrote such claims in the report was very likely making an erroneous and inappropriate claim.

If, however, you are fortunate to have something that approximates a true simple probability sample (as opposed to one arrived at through a multi-stage sampling protocol), the equation for calculating a biased estimate for sampling error is this:

(+/-) sampling error = (+/-) z * SQRT (p * q/n)
z is the z value for any given level of confidence (e.g., z = 1.96 for the 95% confidence level)
p is the observed sample proportion
q = 1 - p
n = number in sample associated with p
(+/-) indicates the positive or negative value of ...

(Note: The term n - 1 may be substituted for n to produce the unbiased estimator.)
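As a rough check, the formula above can be applied to the two sets of completes quoted in the original question. This is only a sketch: it assumes a simple probability sample and the worst case p = 0.5, and the function name margin_of_error is made up for illustration. The z values come from Python's standard library rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """z * sqrt(p * q / n) for a simple probability sample of size n."""
    # Two-sided z value for the chosen confidence level (about 1.96 at 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    q = 1 - p
    return z * sqrt(p * q / n)

# Completes quoted in the question, taken at the worst case p = 0.5
# (an assumption; the report does not say which p was used).
moe_study_1 = margin_of_error(346, confidence=0.95)
moe_study_2 = margin_of_error(3126, confidence=0.99)

print(f"346 completes at 95%:  +/- {moe_study_1:.1%}")
print(f"3126 completes at 99%: +/- {moe_study_2:.1%}")
```

Under these assumptions the results come out near 5.3% and 2.3%, which is close to, but not exactly, the 5% and 2.2% stated in the report.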

You should note that the sampling error depends on the observation itself as well as the confidence level one chooses to use. You also should note that, for any given pair of values of z and n, sampling error will be maximized when p = 50% (i.e., when the product of p and 1-p = 0.25) and will shrink as p moves toward the extremes of 100% and 0%. Therefore, when sampling error is reported, it typically is reported under the assumption that the error is maximized at this 50% proportion.
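To illustrate the point about p = 50%, here is a small sketch that evaluates the same formula over several proportions. The sample size of 400 and the list of proportions are arbitrary values chosen for illustration, and z is fixed at 1.96 (the 95% level).

```python
from math import sqrt

def sampling_error(p, n, z=1.96):
    """z * sqrt(p * q / n), with q = 1 - p."""
    return z * sqrt(p * (1 - p) / n)

n = 400  # hypothetical sample size, for illustration only
proportions = [0.1, 0.3, 0.5, 0.7, 0.9]

# Sampling error at each observed proportion, same z and n throughout.
errors = {p: sampling_error(p, n) for p in proportions}
for p, err in errors.items():
    print(f"p = {p:.0%}: +/- {err:.2%}")

# The proportion with the largest error among those tried.
worst_p = max(errors, key=errors.get)
print(f"largest error at p = {worst_p:.0%}")
```

The error peaks at p = 50% and is symmetric around it, which is why a single "maximum" margin of error is usually the figure quoted.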

By the way, ASSUMING THAT A SIMPLE PROBABILITY SAMPLE IS BEING DISCUSSED, a maximum margin of error of 5% is obtained at the 95% level of confidence with a sample size of approximately 400.
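The "approximately 400" figure can be checked by solving the formula for n, giving n = z^2 * p * q / e^2, where e is the target margin of error. A minimal sketch, again assuming a simple probability sample and p = 0.5; the function name required_sample_size is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(margin, confidence=0.95, p=0.5):
    """Smallest n for which z * sqrt(p * q / n) is at most the target margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Rearranged from margin = z * sqrt(p * q / n); round up to a whole respondent.
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_needed = required_sample_size(0.05, confidence=0.95)
print(n_needed)
```

With the unrounded z value this gives 385, in line with the rule of thumb of roughly 400.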

Jonathan E. Brill, Ph.D.