Two Statistical Concepts You Should Understand

Confidence intervals

A confidence interval is a range around an estimated score that is likely to contain the true score for the whole population. This is important because nearly all health care quality scores are calculated from a statistical sample, which means that there is some uncertainty about whether the sample reflects the population.

The confidence interval tells you how confident you can be that the score for the sample represents the score for the entire membership or population. For instance, a 99 percent confidence interval means that, if you drew numerous samples and calculated an interval from each one, about 99 percent of those intervals would contain the true population score. A narrower range (e.g., a 90 percent confidence interval) would give you less certainty about the estimate. A 95 percent confidence interval is a standard used in scientific research, because it provides a high degree of certainty that the interval captures the population score. It also tends to be the compromise position between the high degree of certainty desired by providers and the lower confidence level desired by purchasers, who often want to see a more even distribution of scores across the tiers.
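
To make the trade-off between confidence level and interval width concrete, here is a minimal sketch in Python. It assumes the score is a simple rate (a proportion) and uses the normal-approximation formula, which is only one of several ways to compute such an interval. The function name proportion_ci, the 80 percent rate, and the sample size of 400 are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(rate, n, level=0.95):
    """Normal-approximation confidence interval for a sample rate.

    rate  -- observed rate in the sample, between 0 and 1
    n     -- number of people in the sample
    level -- confidence level, e.g. 0.95 for 95 percent
    """
    # Critical value of the standard normal distribution for this level
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Margin of error: critical value times the standard error of the rate
    margin = z * sqrt(rate * (1 - rate) / n)
    return max(0.0, rate - margin), min(1.0, rate + margin)

# Hypothetical example: a rate of 80 percent measured on a sample of 400 members
for level in (0.90, 0.95, 0.99):
    low, high = proportion_ci(0.80, 400, level)
    print(f"{level:.0%} interval: {low:.3f} to {high:.3f}")
```

Running the loop shows the interval widening as the confidence level rises from 90 to 99 percent: more certainty comes at the cost of a less precise range.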

Statistical significance

Statistical significance tells you whether two scores are really different from each other. If one health plan has a mammography rate of 77 percent and another has a rate of 80 percent, is one truly better than the other? A common way to determine statistical significance is to check whether the confidence intervals around the scores overlap; confidence intervals reflect the uncertainty of the estimated rate, which is based on a sample drawn from the larger population. If they do overlap, the scores are treated as comparable. If they don't, you can say they are statistically different. This check is conservative: intervals that overlap slightly can still reflect a statistically significant difference under a formal test.
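
The overlap check can be sketched in a few lines of Python. This is an illustration only: it assumes 95 percent normal-approximation intervals, and the sample sizes (400 and 4,000 members per plan) are hypothetical, since the mammography example does not state them. The helper names proportion_ci and intervals_overlap are likewise made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(rate, n, level=0.95):
    """Normal-approximation confidence interval for a sample rate."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sqrt(rate * (1 - rate) / n)
    return rate - margin, rate + margin

def intervals_overlap(a, b):
    """Return True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

# Mammography rates of 77 and 80 percent, with hypothetical sample sizes
for n in (400, 4000):
    plan_a = proportion_ci(0.77, n)
    plan_b = proportion_ci(0.80, n)
    if intervals_overlap(plan_a, plan_b):
        verdict = "comparable"
    else:
        verdict = "statistically different"
    print(f"n={n}: {plan_a[0]:.3f}-{plan_a[1]:.3f} vs "
          f"{plan_b[0]:.3f}-{plan_b[1]:.3f} -> {verdict}")
```

Under these assumptions, the same 3-point gap is treated as comparable when each plan is measured on 400 members and as statistically different when each is measured on 4,000, because larger samples produce narrower intervals.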

However, statistical significance is not the same as practical significance. If you have enough data points (i.e., if the sample size is large enough), even very small differences may be statistically significant, but that doesn't necessarily mean that they are worth drawing attention to. For example, satisfaction scores often have a small range of variation, maybe 10 points. If you were to try to compare plans based on these scores, you would need to consider whether a very small difference (e.g., 91 percent versus 92 percent) really matters. Will consumers get noticeably better care? Similarly, sponsors have to question whether differences are clinically meaningful. For instance, are consumers likely to experience a substantial difference in outcomes because of minor differences in rates? Maybe yes, maybe no, but you should know that before you present the information to consumers.
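
The effect of sample size on statistical significance can be illustrated with a short Python sketch. Note that it uses a two-proportion z-test, a standard formal test, rather than the interval-overlap check described earlier, and that the sample sizes (500 and 50,000 respondents per plan) and the function name two_proportion_p_value are hypothetical. A p-value below 0.05 is treated here as statistically significant, which is a common convention rather than a rule from this document.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(rate_a, rate_b, n_a, n_b):
    """Two-sided p-value from a two-proportion z-test with a pooled rate."""
    # Pooled rate across both samples, assuming no true difference
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    # Standard error of the difference between the two rates
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = abs(rate_a - rate_b) / se
    return 2 * (1 - NormalDist().cdf(z))

# Satisfaction scores of 91 and 92 percent, with hypothetical sample sizes
for n in (500, 50_000):
    p = two_proportion_p_value(0.91, 0.92, n, n)
    label = "significant" if p < 0.05 else "not significant"
    print(f"n={n} per plan: p-value = {p:.4g} ({label})")
```

With these assumed numbers, the 1-point gap is not significant at 500 respondents per plan but is significant at 50,000. The gap itself has not changed, so the question of whether it matters to consumers still has to be answered separately.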
