4.10:

Chebyshev’s Theorem to Interpret Standard Deviation

JoVE Core
Statistics

Chebyshev's theorem helps interpret the value of a standard deviation. It applies to any dataset with a finite mean and standard deviation, whether its distribution is normal, skewed, or unknown.

In contrast, the empirical rule only applies to normally distributed data.

Consider the dataset of the lifespan of animals in a zoo, with a mean of 13 years and a standard deviation of 1.5 years.

According to Chebyshev's theorem, the proportion of animal ages within K standard deviations of the mean is at least one minus one divided by K squared. Here, K is any number greater than one.

For K equal to two, at least 75 percent of the animals' ages are within two standard deviations of the mean. Similarly, for K equal to three, at least about 89 percent of the animals' ages fall within three standard deviations of the mean.
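As a rough illustration of the zoo example, the following sketch computes the lower bound and the corresponding age range for K equal to two and three. The function names `chebyshev_lower_bound` and `chebyshev_interval` are hypothetical helpers introduced here, not part of any library.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the bound is zero or negative, so it says nothing useful.
        raise ValueError("k must be greater than 1 for an informative bound")
    return 1.0 - 1.0 / k**2


def chebyshev_interval(mean: float, sd: float, k: float) -> tuple:
    """Return the interval (mean - k*sd, mean + k*sd) that the bound refers to."""
    return (mean - k * sd, mean + k * sd)


# Zoo lifespan example from the text: mean 13 years, standard deviation 1.5 years.
mean_age, sd_age = 13.0, 1.5

for k in (2, 3):
    low, high = chebyshev_interval(mean_age, sd_age, k)
    share = chebyshev_lower_bound(k)
    print(f"K={k}: at least {share:.1%} of ages lie between {low} and {high} years")
```

For K equal to two the range is 10 to 16 years, and for K equal to three it is 8.5 to 17.5 years.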

Although Chebyshev's theorem has wide statistical applications, it provides only a lower bound on the proportion of values, and that bound is informative only when K is greater than one. The actual proportion within K standard deviations of the mean may be considerably higher than the bound.

Chebyshev’s theorem, also known as Chebyshev’s Inequality, states that the proportion of values in a dataset that lie within K standard deviations of the mean is at least:

$$1 - \frac{1}{K^2}$$

Here, K is any real number greater than one; it does not need to be an integer. For example, if K is 1.5, at least 56% of the data values lie within 1.5 standard deviations of the mean of the dataset. If K is 2, at least 75% of the data values lie within two standard deviations of the mean, and if K is 3, at least 89% of the data values lie within three standard deviations of the mean.
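These percentages come from substituting each value of K into the expression one minus one over K squared. As a worked check, with results rounded to three decimal places:

$$
1 - \frac{1}{1.5^2} = 1 - \frac{1}{2.25} \approx 0.556, \qquad
1 - \frac{1}{2^2} = 1 - \frac{1}{4} = 0.75, \qquad
1 - \frac{1}{3^2} = 1 - \frac{1}{9} \approx 0.889
$$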

Chebyshev’s theorem bounds both the proportion of data that falls within a given number of standard deviations of the mean (a minimum) and the proportion that falls outside it (a maximum). If K is equal to 2, at least 75% of the data values lie within two standard deviations of the mean, and at most 25% lie more than two standard deviations away from it. It is important to understand that the theorem provides only these bounds and not exact proportions.

One of the advantages of this theorem is that it can be applied to datasets having normal, unknown, or skewed distributions. In contrast, the empirical or three-sigma rule can only be used for datasets with a normal distribution.
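To see the theorem hold on non-normal data, the sketch below compares the bound with the observed proportion in a right-skewed sample. The data are simulated purely for illustration (an exponential sample with an arbitrary fixed seed), and `proportion_within` is a hypothetical helper, not a library function.

```python
import random
import statistics


def proportion_within(data, k):
    """Fraction of values strictly within k population standard deviations of the mean."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return sum(abs(x - mu) < k * sigma for x in data) / len(data)


# Simulated right-skewed data; the seed and sample size are arbitrary assumptions.
rng = random.Random(0)
skewed = [rng.expovariate(1.0) for _ in range(10_000)]

for k in (1.5, 2, 3):
    bound = 1 - 1 / k**2
    observed = proportion_within(skewed, k)
    print(f"K={k}: Chebyshev bound {bound:.3f}, observed {observed:.3f}")
```

The observed proportion should never fall below the bound, and for most datasets it is noticeably higher, which is why the theorem is described as giving a minimum and not an estimate.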