Note From the Editor: This is the fourth article in our “Spotlight on Statistics” series, which aims to clarify statistical practices used in research.
The term confidence interval (CI) can be confusing, partly because it differs from the more familiar p value. But with practice and an understanding of what the analysis demonstrates, interpreting the CI gets easier. Type I errors (in which the researcher concludes that a difference exists between the parameter’s value and some other value of interest when none actually exists) typically are controlled by significance levels, traditionally set at a p value of 0.05 or less. This significance level indicates there are 5 chances in 100 that a finding of a difference is incorrect; a calculated p value less than 0.05 indicates fewer than 5 in 100 chances that the finding is incorrect.
Research studies report CIs related to various statistics. Our efforts to obtain data and calculate statistics are an attempt to measure the actual value of the parameter we could determine for certain only by sampling the entire population—generally an impossible task. The CI yields information on how confident we are as to how the parameter’s true value relates to the value of interest.
Let’s say we wish to determine whether the true values of odds ratios and risk ratios differ from 1. A true value of 1 would mean no difference exists in the risk of an occurrence between the groups studied. Similarly, suppose we wish to find out whether the true values of correlations differ from zero; the closer the value is to zero, the weaker the relationship between the two variables. We also may want to know whether the true value of the difference between changes in groups differs from zero.
In each case, we obtain data, calculate a statistic, and ask whether the statistic is statistically significantly different from the relevant value of interest. If being statistically significantly different means having less than a 5% chance of a Type I error, we want to know if the values discussed above (1 for odds and risk ratios, zero for correlations or differences in means) are within the 95% CI.
Interpreting the CI correctly
Many people think a 95% CI indicates a 95% chance that the actual value is between the upper and lower bounds of the interval. But the correct interpretation is that if we were to draw repeated samples from all possible members of the population and calculate a 95% CI for each sample, 95% of those intervals would include the parameter’s actual value.
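The repeated-sampling interpretation can be demonstrated with a small simulation. The sketch below (a simplification, using hypothetical numbers and a normal-approximation CI rather than the exact t-interval) draws many samples from a population with a known mean and counts how many of the resulting 95% CIs actually contain it:

```python
import random
import statistics

def ci_95(sample):
    """Approximate 95% CI for a mean: estimate +/- 1.96 standard errors.
    (A simplification; a t-interval is slightly more exact.)"""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return mean - 1.96 * se, mean + 1.96 * se

random.seed(42)
TRUE_MEAN = 2.0   # hypothetical known population mean
TRIALS = 2000
covered = 0
for _ in range(TRIALS):
    # Draw one random sample of 50 from the population
    sample = [random.gauss(TRUE_MEAN, 4.0) for _ in range(50)]
    lo, hi = ci_95(sample)
    if lo <= TRUE_MEAN <= hi:
        covered += 1

print(f"{covered / TRIALS:.1%} of the intervals contain the true mean")
```

Run repeatedly with different seeds, the fraction hovers near 95%: it is the collection of intervals, not any single interval, that earns the "95%" label.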
Thus, a 95% CI is related to the 5% chance of a Type I error. For example, in the analytical portion of our previous study testing an intervention’s effect on breastfeeding duration among low-income women, our estimate of the actual value of the odds ratio comparing the intervention group with the usual care group for breastfeeding at 6 weeks was 1.72. (See “Grasping the all-important concept of risk” in the January 2011 issue of American Nurse Today.)
In Table 1 (view by clicking the PDF icon above), we show the CI as a combination of the lower and upper bounds. The 95% CI for the odds ratio at 6 weeks has a lower bound of 1.07 and an upper bound of 2.76. For some statistics, the CI generally isn’t symmetrical because of the distribution of the statistic; this is perfectly acceptable.
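The asymmetry arises because a CI for an odds ratio is typically computed on the log scale, where the statistic's distribution is roughly symmetric, and then exponentiated back. A minimal sketch with hypothetical 2×2 counts (not the study's actual data):

```python
import math

# Hypothetical counts (NOT the study's data), chosen only to
# illustrate why the odds-ratio CI is asymmetric around the estimate.
a, b = 40, 35   # intervention group: breastfeeding yes / no
c, d = 25, 50   # usual-care group:   breastfeeding yes / no

odds_ratio = (a * d) / (b * c)
log_or = math.log(odds_ratio)
se = math.sqrt(1/a + 1/b + 1/c + 1/d)   # standard error of log(OR)

# Symmetric bounds on the log scale...
lo, hi = log_or - 1.96 * se, log_or + 1.96 * se
# ...become asymmetric after exponentiating back
print(f"OR = {odds_ratio:.2f}, 95% CI ({math.exp(lo):.2f}, {math.exp(hi):.2f})")
```

The upper bound sits farther from the point estimate than the lower bound, exactly the kind of asymmetry seen in Table 1.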
Given the CI, we concluded that our estimate of the odds ratio is statistically significantly different from 1 (that is, the actual value of the odds ratio isn’t 1) and that less than a 5% chance of a Type I error exists. In other words, because we designed the data collection and statistical calculation to obtain a 95% CI, we concluded there’s a 95% chance that the interval contains the actual odds ratio; if it does, the actual odds ratio isn’t 1. However, a 5% chance exists that the CI does not contain the actual odds ratio; in that case, it might still be 1.
This logic takes some getting used to. There’s not a 95% chance that the estimate is the actual value, nor is there a 95% chance that the actual value isn’t 1. In other words, the actual value either is or is not 1. Here’s the key: The calculated statistic is our best estimate of the actual odds ratio, and a 95% chance exists that the interval (which in this case doesn’t include 1) includes the actual value.
In contrast, the 95% CI for the odds ratio at 24 weeks ranges from 0.69 to 1.87. In this case, the interval includes 1, so we can’t claim, with less than a 5 in 100 chance of error, that the actual value isn’t 1. We conclude that the difference from 1 isn’t statistically significant.
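The decision rule applied to both odds ratios can be written as a one-line check: a result is statistically significant at the 5% level when the 95% CI excludes the relevant null value (1 for odds and risk ratios, zero for correlations or mean differences). Using the two intervals just discussed:

```python
def significant(ci_low, ci_high, null_value):
    """CI-based significance check: significant at the 5% level
    when the 95% CI excludes the null value."""
    return not (ci_low <= null_value <= ci_high)

# Odds-ratio intervals from the article:
print(significant(1.07, 2.76, 1))   # 6 weeks  -> True (significant)
print(significant(0.69, 1.87, 1))   # 24 weeks -> False (not significant)
```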
When outcomes have continuous values
Imagine a different breastfeeding intervention study and all the ways a sample of 50 might be drawn from a target population of 1,000 women. Suppose the new study looks at the change in breastfeeding duration among women given access to a breastfeeding support team for their second child after lacking support for their first. Some women’s breastfeeding duration may increase while others’ may decrease. If we could intervene with all possible women (the hypothetical 1,000 in a local target population), the mean increase might be 2 weeks; that would be the actual population mean. If we instead sample just 50 women at random and calculate a 95% CI, for some combinations of 50 women the CI would contain zero weeks of increase, suggesting no change.
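This scenario, too, can be sketched as a simulation. Under assumed numbers (a hypothetical population of 1,000 whose changes average about 2 weeks with substantial individual variation), we can count how often a random sample of 50 yields a CI that includes zero:

```python
import random
import statistics

random.seed(7)
# Hypothetical finite population: 1,000 women whose changes in
# breastfeeding duration average about 2 weeks (some are negative).
population = [random.gauss(2.0, 5.0) for _ in range(1000)]

def ci_95(sample):
    """Approximate 95% CI for a mean (normal-approximation sketch)."""
    n = len(sample)
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return m - 1.96 * se, m + 1.96 * se

DRAWS = 1000
contains_zero = 0
for _ in range(DRAWS):
    sample = random.sample(population, 50)   # one way to draw 50 of the 1,000
    lo, hi = ci_95(sample)
    if lo <= 0 <= hi:
        contains_zero += 1

print(f"{contains_zero / DRAWS:.0%} of samples give a CI that includes zero")
```

Even though the population mean change is genuinely positive, a noticeable fraction of possible samples would lead us to conclude "no change" — the flip side of sampling uncertainty.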
The next table illustrates two hypothetical groups: women younger than age 30 and women age 30 and older. For the younger group, the CI doesn’t include zero, so the estimate is statistically significantly different from zero. For the older group, the CI does include zero; because 95% of the possible intervals include the actual value, the actual change in this group may be zero, meaning the intervention may have had no effect.
When we calculate a 95% CI, even if the study is conducted correctly and the statistical calculations are correct, we might still draw an incorrect conclusion about whether the parameter’s actual value equals the value of interest. If the CI doesn’t include the actual value (which happens 5% of the time), the actual value could be any other value; in fact, it may be the value of interest.
Minimizing CI width
Researchers design studies to minimize the width of the CI and thereby provide the greatest certainty about the statistic calculated to estimate the parameter of interest. CI width can be reduced by sampling more individuals or by designing an intervention that produces less variation in the outcome.
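The two levers just named can be made concrete. For a mean, the half-width of an approximate 95% CI is about 1.96 times the standard deviation divided by the square root of the sample size, so quadrupling the sample halves the width, as does halving the outcome's variation (figures below are hypothetical):

```python
import math

def ci_half_width(sd, n, z=1.96):
    """Approximate half-width of a 95% CI for a mean:
    z * (standard deviation / sqrt(sample size))."""
    return z * sd / math.sqrt(n)

sd = 5.0   # hypothetical outcome standard deviation, in weeks
for n in (50, 200, 800):
    print(f"n = {n:>3}: half-width \u2248 {ci_half_width(sd, n):.2f} weeks")
# Each fourfold increase in n halves the half-width;
# an intervention with half the sd would do the same at any n.
```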
When a 95% CI for the risk or odds ratio doesn’t include 1, or the CI for the mean difference doesn’t include zero, this is commonly interpreted to mean the new intervention or treatment is successful; as a result, the intervention or treatment may be recommended. Realistically, though, the recommendation also may depend on cost. We can never measure the entire population on all outcomes, so some uncertainty about findings will always exist. The CI is a useful way to quantify that uncertainty.
Visit www.AmericanNurseToday.com for a complete list of references.
Kevin D. Frick is a professor of health policy and management at Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. Renee A. Milligan is a professor of nursing at George Mason University School of Nursing in Fairfax, Virginia. Linda C. Pugh is a professor of nursing at York College of Pennsylvania in York.