Variance and Fluctuations

Imagine you have a new list of numbers, like this: $4,5,4,3,4,6,4,5$. There are 8 numbers but only 4 distinct values, so the 4 probabilities are $P_3=1/8$, $P_4=4/8$, $P_5=2/8$, and $P_6=1/8$, and of course $P_3+P_4+P_5+P_6=1$. The mean is given by

$\bar{x} = 3\cdot (1/8) + 4\cdot (4/8)+ 5\cdot (2/8) + 6\cdot (1/8)=4.375$

and the square root of the variance (equation $\ref{evar}$), i.e. the standard deviation, is $\sigma=0.86$.
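The probability-weighted mean above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the names `data`, `probabilities`, and `weighted_mean` are illustrative and do not come from the text.

```python
from collections import Counter

data = [4, 5, 4, 3, 4, 6, 4, 5]

# Count how often each distinct value appears, then turn counts into probabilities.
counts = Counter(data)
probabilities = {value: count / len(data) for value, count in counts.items()}

# The mean as a probability-weighted sum over the distinct values.
weighted_mean = sum(value * p for value, p in probabilities.items())

print(probabilities)
print(weighted_mean)
```

The weighted sum runs over only the 4 distinct values, yet it gives the same result as summing all 8 numbers and dividing by 8.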

Now add another point to this distribution, one that is far from the mean, e.g. $x_9=15$, and recalculate the mean and variance. The new mean is $\bar x=5.56$ and the square root of the new variance is $\sigma = 3.44$. The mean changed by a modest amount relative to the old mean ($\sim 27\%$), but $\sigma$ grew by a factor of 4, from $0.86$ to $3.44$. That is an increase of about $300\%$.
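The comparison can be reproduced with a short Python sketch. It assumes the variance is normalized by $1/N$ (not $1/(N-1)$), since that choice reproduces the values $0.86$ and $3.44$ quoted here; the function names `mean` and `std_population` are illustrative.

```python
import math


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def std_population(values):
    """Square root of the variance, assuming 1/N normalization."""
    m = mean(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / len(values))


data = [4, 5, 4, 3, 4, 6, 4, 5]
with_outlier = data + [15]  # add the far-away point x_9 = 15

for label, values in [("original", data), ("with outlier", with_outlier)]:
    print(f"{label}: mean = {mean(values):.3f}, sigma = {std_population(values):.3f}")

# Relative changes caused by the single extra point.
mean_change = (mean(with_outlier) - mean(data)) / mean(data)
sigma_ratio = std_population(with_outlier) / std_population(data)
print(f"mean changed by {mean_change:.0%}, sigma grew by a factor of {sigma_ratio:.1f}")
```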

If we look at the individual points one by one and form the variance using equation $\ref{evar}$, we find the following terms $\sigma^2_i\equiv(x_i-\bar{x})^2$ in the sum:

| Value | 4 | 5 | 4 | 3 | 4 | 6 | 4 | 5 | 15 | $\bar x=5.56$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\sigma^2_i$ | 2.42 | 0.31 | 2.42 | 6.53 | 2.42 | 0.20 | 2.42 | 0.31 | 89.20 | $\sigma = 3.44$ |

As you can see, each of the first 8 terms contributes a small amount to the variance, but nothing like the last term, which is more than 13 times bigger than the next largest term ($6.53$). What this tells you is that the variance is far more sensitive to an "outlier" than the mean is.
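The individual terms can be computed directly. The following is a minimal Python sketch; the variable names are illustrative, and each term is simply $(x_i-\bar{x})^2$ before any normalization by the number of points.

```python
data = [4, 5, 4, 3, 4, 6, 4, 5, 15]

mean = sum(data) / len(data)

# One squared deviation per data point: (x_i - mean)^2.
squared_deviations = [(x - mean) ** 2 for x in data]

for x, term in zip(data, squared_deviations):
    print(f"{x:>2}  {term:6.2f}")

# Compare the outlier's term with the largest of the remaining terms.
outlier_term = squared_deviations[-1]
next_largest = max(squared_deviations[:-1])
print(f"ratio of outlier term to next largest: {outlier_term / next_largest:.1f}")
```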

To judge how significant each contribution to the variance is, we should look at things on the same scale as the data. Each term in the variance involves the square of the difference between the particular value ($x_i$) and the mean. If we instead take the square root of each term, it looks like this:

| Value | 4 | 5 | 4 | 3 | 4 | 6 | 4 | 5 | 15 | $\bar x=5.56$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\sqrt{\sigma^2_i}$ | 1.56 | 0.56 | 1.56 | 2.56 | 1.56 | 0.44 | 1.56 | 0.56 | 9.44 | $\sigma = 3.44$ |

There are many good ways to characterize how much each term contributes to the variance, and that leads us to the concept of "significance". Here's one proposal that is commonly used: take each term that contributes to the variance, $(x_i-\bar{x})^2$, and divide it by the sum of all the terms, $\sum_j(x_j-\bar{x})^2$: $$S_i\equiv \frac{(x_i-\bar{x})^2}{\sum_j(x_j-\bar{x})^2}\label{esig}$$ Each of these ratios is dimensionless and tells you what fraction of the variance comes from that term, and together they sum to 1.0. For our example, you should get the following table for $S_i$:

| Value | 4 | 5 | 4 | 3 | 4 | 6 | 4 | 5 | 15 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $S_i$ | 0.023 | 0.003 | 0.023 | 0.061 | 0.023 | 0.002 | 0.023 | 0.003 | 0.840 |

As you can see, the last point (15) is very significant compared to the other points by this measure: it alone accounts for $84\%$ of the variance. Again, this is just telling you that the variance is very sensitive to outliers, and that sensitivity is what makes it useful.
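The significance in equation $\ref{esig}$ can be sketched in Python as follows. The function name `significance` is an illustrative choice, and the sketch assumes the values are not all identical, since the denominator would then be zero.

```python
def significance(values):
    """Fraction of the total squared deviation contributed by each value.

    Assumes the values are not all equal; otherwise the total is zero.
    """
    m = sum(values) / len(values)
    squared = [(x - m) ** 2 for x in values]
    total = sum(squared)
    return [s / total for s in squared]


data = [4, 5, 4, 3, 4, 6, 4, 5, 15]
S = significance(data)

for x, s in zip(data, S):
    print(f"{x:>2}  {s:.3f}")

print(f"sum of S_i = {sum(S):.3f}")
```

Because the same normalization factor appears in the numerator and the denominator, the result does not depend on whether the variance is defined with $1/N$ or $1/(N-1)$.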

Below, you will see a table that consists of an initial list of some values ($x_i$), the contribution to the variance from each value ($\sigma_i\equiv|x_i-\bar{x}|$), and the mean and variance as given by equations $\ref{emean}$ and $\ref{evar}$. You can use the number widget to enter any integer and see how it changes the mean and variance. Try adding small and large numbers and see how each one moves the mean by a little, but the variance by a lot.

(Interactive table and number widget: displays $\bar{x}$ and $\sqrt{\sigma^2}$ for the current list.)

You can play the same game in mechanics: place an equal mass at each $x_i$ and calculate the moment of inertia about the mean, and you get the variance, up to a factor of the total mass. In mechanics, the first moment gives the center of mass, the second moment gives the moment of inertia, and so on. This is why the mean and variance are called the first and second moments of a distribution.
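To make the correspondence explicit, here is a short sketch. It assumes a unit mass $m_i=1$ at each position $x_i$ and a variance normalized by $1/N$; both are assumptions made for illustration. $$x_{\rm cm}=\frac{\sum_i m_i x_i}{\sum_i m_i}=\frac{1}{N}\sum_i x_i=\bar{x},\qquad I=\sum_i m_i\,(x_i-x_{\rm cm})^2=\sum_i (x_i-\bar{x})^2=N\sigma^2$$ For the nine-point example above, this gives $I\approx 106.2$, which is $9\times 11.80$, so $\sigma=\sqrt{11.80}\approx 3.44$ as before.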