A confidence interval is a range of values that represents an estimate for an unknown population parameter. Because confidence intervals are calculated directly from observations, they vary from sample to sample. Some confidence intervals contain the true population parameter, while others do not.
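A confidence level gives the percentage of confidence intervals, computed over all possible samples, that are expected to contain the population parameter. Common confidence levels used in statistics are 99%, 95%, and 90%. A 95% confidence level, for example, means that about 95% of the intervals constructed in this way from repeated samples would contain the true population parameter; it does not mean that one particular computed interval contains the parameter with 95% probability.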
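The repeated-sampling meaning of a confidence level can be checked with a small simulation. The sketch below is only illustrative: the population values (mu, sigma), the sample size, the number of trials, and the seed are arbitrary assumptions, and it uses the known-σ interval for the mean, sample mean ± z · σ/√n.

```python
import random
from statistics import NormalDist, fmean


def coverage_rate(mu=10.0, sigma=2.0, n=30, level=0.95, trials=2000, seed=0):
    """Return the fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Two-sided critical value, e.g. about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        # Count the interval as a hit when it covers the true mean.
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"Observed coverage: {rate:.3f}")
```

With enough trials the observed coverage should settle near the chosen level, while any single interval either contains the true mean or does not.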
To estimate a confidence interval for an unknown population parameter, such as the mean, the population mean, μ, is first approximated by using the sample mean, $\bar{x}$, as an estimator:
$\hat{\mu} = \bar{x}$.
When taking several samples, each sample can produce a different value for the sample mean, but most of the sample means should be relatively close to one another. The endpoints for the confidence interval can be determined by considering that the sample mean of observations drawn from a normally distributed population is itself normally distributed, with a standard error of:
Standard Error = $\frac{\sigma}{\sqrt{n}}$,
where σ is the population standard deviation and n is the number of observations in the sample.
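As a quick numerical illustration, the standard error can be computed directly from σ and n. The function name and the input values below are hypothetical, chosen only to show how the standard error shrinks as the sample grows.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return sigma / math.sqrt(n)


se_25 = standard_error(2.0, 25)    # 2.0 / 5
se_100 = standard_error(2.0, 100)  # 2.0 / 10
print(se_25, se_100)
```

Quadrupling the sample size halves the standard error, which is why larger samples give narrower intervals.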
Then, standardizing the sample mean gives:
Z = $\frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.
When standardizing the sample mean, $\bar{x}$, the mean, μ, is subtracted to center the sample mean, and the result is then divided by the standard error to scale the value. The resulting value is a standard score, or z-score, Z, which follows a standard normal distribution.
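The standardization step can be sketched in a few lines. The numbers used here (a sample mean of 10.5, a population mean of 10.0, σ = 2.0, and n = 64) are made up for illustration only.

```python
import math


def z_score(x_bar, mu, sigma, n):
    """Center the sample mean on mu, then scale by the standard error."""
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu) / standard_error


# Hypothetical inputs: the standard error is 2.0 / 8 = 0.25,
# so a difference of 0.5 is two standard errors.
z = z_score(x_bar=10.5, mu=10.0, sigma=2.0, n=64)
print(z)
```

A z-score of 2 says the observed sample mean lies two standard errors above the population mean.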
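With a significance level of α = 0.05 (corresponding to a confidence level of 1 - α = 95%), values -z and z can be determined such that Z falls between them with probability 0.95. These values lead to the lower and upper endpoints of the confidence interval:
$P(-z \le Z \le z) = 1 - \alpha = 0.95$.
The value z is derived from the cumulative distribution function of the standard normal distribution, represented by $\Phi$:
$\Phi(z) = P(Z \le z) = 1 - \frac{\alpha}{2} = 0.975$, so that $z = \Phi^{-1}(0.975) \approx 1.96$.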
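The critical value can also be obtained numerically rather than from a table. The sketch below assumes Python 3.8 or later, where the standard library's statistics.NormalDist provides inv_cdf as the inverse of the standard normal cumulative distribution function; the helper name critical_z is hypothetical.

```python
from statistics import NormalDist


def critical_z(alpha):
    """Two-sided critical value z with P(-z <= Z <= z) = 1 - alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Each tail holds alpha / 2, so z is the (1 - alpha / 2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: z = {critical_z(alpha):.3f}")
```

For α = 0.05 this gives the familiar value of about 1.96; the 90% and 99% levels give about 1.645 and 2.576.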
Substituting this back into the above:
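$P(-1.96 \le Z \le 1.96) = 0.95$,
$P\left(-1.96 \le \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \le 1.96\right) = 0.95$,
$P\left(\bar{x} - 1.96 \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95$.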
From this, the endpoints can be determined:
Lower Endpoint = $\bar{x} - 1.96 \frac{\sigma}{\sqrt{n}}$,
Upper Endpoint = $\bar{x} + 1.96 \frac{\sigma}{\sqrt{n}}$.
Generalizing this, the mean confidence interval can be calculated in the following manner:
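$P\left(\bar{x} - \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$,
where $\Phi^{-1}$ is the inverse of the standard normal cumulative distribution function, $\bar{x}$ is the sample mean, α is the significance level, σ is the standard deviation, n is the sample size, and μ is the mean.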
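The general formula translates into a short function. This is a minimal sketch rather than a definitive implementation: it assumes the population standard deviation σ is known, the function name is hypothetical, and the data values are invented for illustration.

```python
import math
from statistics import NormalDist, fmean


def mean_confidence_interval(sample, sigma, alpha=0.05):
    """Return (lower, upper) endpoints for the mean, assuming sigma is known."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must contain at least one observation")
    x_bar = fmean(sample)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical measurements with an assumed known sigma of 0.5.
data = [9.8, 10.4, 10.1, 9.6, 10.6, 10.0, 9.9, 10.4]
lower, upper = mean_confidence_interval(data, sigma=0.5, alpha=0.05)
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

The interval is centered on the sample mean, and lowering α (for example to 0.01) widens it because the critical value grows.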