## Sample Variance vs. Population Variance: Bessel’s Correction

Suppose you have a database of items. This database forms the whole population for the statistical operations that follow. If you calculate the mean, variance, and standard deviation of these items, you are computing the *population* mean ($\mu$), the *population* variance ($\sigma^2$), and the *population* standard deviation ($\sigma$).

But if you draw random samples from the population, you are estimating the true statistics from those samples (perhaps because doing the calculations for the whole population is expensive). Statisticians usually use different names and notations for the values calculated from samples, e.g., the *sample* mean ($\bar{x}$).

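The sample variance, when calculated with the same formula as the population variance, is *biased*: the deviations are measured from the sample mean rather than the true mean, so on average it underestimates the population variance. More formally, for independent samples $x_1, \dots, x_n$, its expected value does not equal the population variance:

$$E\left[\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2\right] = \frac{n-1}{n}\,\sigma^2 \neq \sigma^2$$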

To solve this problem, the sample variance is *corrected* by multiplying it by $\frac{n}{n-1}$, or simply by using $n-1$ instead of $n$ when calculating the mean of squared deviations, i.e.:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$$

This value is called the *unbiased sample variance* ($s^2$), for it can be proved that [+]:

$$E\left[s^2\right] = \sigma^2$$
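The claim about expected values can be checked empirically. The following is a minimal sketch, not a proof: it assumes a toy population of the integers 1 to 100 and draws samples with replacement, so that the draws are independent. The helper names (`biased_variance`, `unbiased_variance`, `average_estimates`) are made up for this illustration.

```python
import random


def biased_variance(xs):
    """Mean of squared deviations from the mean, dividing by n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def unbiased_variance(xs):
    """Same sum of squared deviations, but dividing by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def average_estimates(population, sample_size, trials, seed=0):
    """Average both estimators over many random samples."""
    rng = random.Random(seed)
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        # Sampling with replacement keeps the draws independent.
        sample = [rng.choice(population) for _ in range(sample_size)]
        biased_total += biased_variance(sample)
        unbiased_total += unbiased_variance(sample)
    return biased_total / trials, unbiased_total / trials


population = list(range(1, 101))        # assumed toy population
pop_var = biased_variance(population)   # the population variance divides by N
avg_biased, avg_unbiased = average_estimates(population, sample_size=5, trials=20000)

print("population variance:     ", pop_var)
print("average biased estimate: ", avg_biased)
print("average unbiased estimate:", avg_unbiased)
```

With a sample size of 5, the averaged biased estimate should land near 4/5 of the population variance, while the averaged unbiased estimate should land near the population variance itself.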

To distinguish the two in notation, the *biased sample variance* is denoted by $s_n^2$.

Using $n-1$ instead of $n$ in the formula for variance is called *Bessel’s correction*.
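In practice, you rarely need to write either formula by hand. As a small illustration, Python's standard `statistics` module exposes both versions: `pvariance` divides by $n$ and `variance` applies Bessel's correction. The data values below are arbitrary and chosen only for the example.

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # arbitrary example values
n = len(data)

biased = statistics.pvariance(data)   # divides by n
unbiased = statistics.variance(data)  # divides by n - 1 (Bessel's correction)

# The two differ exactly by the factor n / (n - 1).
print(biased, unbiased, biased * n / (n - 1))
```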

## Some Notes about Expected Values

The expected value of a continuous random variable $X$ is given by:

$$E[X] = \int_{-\infty}^{\infty} x\, f(x)\, dx$$

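where $f(x)$ is the probability density function of the random variable $X$. Now the question is: how do we calculate the expected value of a function of $X$, $E[g(X)]$, e.g., $E[X^2]$? Do we need to know the density function of $X^2$? The answer is that we don’t. Whatever function $g$ we apply to $X$, we have:

$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f(x)\, dx$$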

therefore:

$$E[X^2] = \int_{-\infty}^{\infty} x^2 f(x)\, dx$$
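To make this concrete, here is a rough numerical check under an assumed example: $X$ uniform on $[0, 1]$, so its density is 1 on that interval and $E[X^2]$ should come out to $1/3$. The function `expectation` is a hypothetical helper that approximates the integral of $g(x) f(x)$ with the midpoint rule, and the result is compared against a plain average of $x^2$ over random draws, which never touches the density of $X^2$.

```python
import random


def expectation(g, pdf, lo, hi, steps=10_000):
    """Approximate the integral of g(x) * pdf(x) over [lo, hi] (midpoint rule)."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += g(x) * pdf(x)
    return total * width


def uniform_pdf(x):
    """Density of the assumed example distribution: uniform on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


e_x = expectation(lambda x: x, uniform_pdf, 0.0, 1.0)        # E[X]
e_x2 = expectation(lambda x: x ** 2, uniform_pdf, 0.0, 1.0)  # E[X^2], using only f

# Monte Carlo cross-check: average x^2 over draws of X.
rng = random.Random(0)
samples = [rng.random() for _ in range(200_000)]
mc_e_x2 = sum(x ** 2 for x in samples) / len(samples)

print(e_x, e_x2, mc_e_x2)
```

Both estimates of $E[X^2]$ use only the distribution of $X$ itself, which is the point of the formula above.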
