## Understanding the Monster “Variance” part-1

Amrendra Roy

Variance is one of the ways of measuring the variability in a data set. It helps us understand how the data are arranged around the mean. To calculate it, we first need the deviation of each observation in the data set from the mean.

For example, the following is the time I took to reach the office each day during the week, along with the deviation of each observation from the mean time.

The next step is to calculate the average deviation from the mean using the well-known formula for an average: the sum of the deviations divided by the number of observations.

Note that *the sum of all positive deviations = the sum of all negative deviations* (in magnitude), which indicates that the mean is the balance point of the data set: the deviations above it exactly offset those below it. As a result, the sum of all the deviations is zero, so the average deviation is always zero too, and we need some other way to calculate the average deviation about the mean.

To avoid this issue, a very simple idea is used:
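The cancellation is not special to this data set; it follows from the definition of the mean. As a short sketch, writing the observations as $x_1, \dots, x_n$ and their mean as $\bar{x}$ (symbols assumed here for illustration, since the text does not fix a notation):

$$
\sum_{i=1}^{n} (x_i - \bar{x}) \;=\; \sum_{i=1}^{n} x_i - n\bar{x} \;=\; n\bar{x} - n\bar{x} \;=\; 0,
\qquad \text{since } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .
$$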

*Negative number → square of the negative number → positive number → square root of this number → parent number*

Hence the squares of all the deviations are calculated and summed to give the **sum of squares** (simply **SS**) __[1]__. This SS is then divided by the total number of observations to give the average squared deviation, the *variance s² around the mean* __[2]__ (strictly, for a sample the divisor is n − 1, the degrees of freedom, rather than n). The square root of this variance gives the *standard deviation s*, the most common measure of variability.
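The steps above can be sketched in a few lines of Python. This is only an illustration: the commute times are hypothetical values invented for the example (they are not the data from this article), and the variable names are arbitrary. SS is divided by the number of observations, as described in the text.

```python
import math

# Hypothetical commute times in minutes (invented for illustration only).
commute_times = [30, 42, 35, 50, 38]

n = len(commute_times)
mean = sum(commute_times) / n

# Deviation of each observation from the mean; these always sum to zero.
deviations = [x - mean for x in commute_times]

# Squaring removes the sign, so the deviations no longer cancel out.
ss = sum(d ** 2 for d in deviations)  # sum of squares (SS)

variance = ss / n               # SS divided by the number of observations
std_dev = math.sqrt(variance)   # back in the same units as the data

print(f"mean = {mean}, sum of deviations = {sum(deviations)}")
print(f"SS = {ss}, variance = {variance}, standard deviation = {std_dev:.2f}")
```

Taking the square root at the end is what brings the result back to the original units (minutes here), which is why the standard deviation is usually the figure that gets reported.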

What it physically means is that, on average, the data deviate from the mean by 7.42 units, or simply one standard deviation (±1s), in either direction in the given data set.

The discussion above concerns the sample standard deviation, represented by s. For a population, the variance is represented by σ² and the standard deviation by σ.

The sample variance s² is the estimator of the population variance σ². The standard deviation is easier to interpret than the variance because the standard deviation is measured in the same units as the data.
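To see the sample/population distinction in practice, here is a small sketch using Python's standard `statistics` module, which provides `pvariance` (SS divided by n, the population formula) and `variance` (SS divided by n − 1, the usual sample estimator). The data are again hypothetical values made up for the example.

```python
import statistics

# Hypothetical commute times in minutes (invented for illustration only).
sample = [30, 42, 35, 50, 38]

# Population formulas: SS divided by n.
pop_var = statistics.pvariance(sample)
pop_sd = statistics.pstdev(sample)

# Sample formulas: SS divided by n - 1 (the degrees of freedom).
samp_var = statistics.variance(sample)
samp_sd = statistics.stdev(sample)

print(f"population: variance = {pop_var}, sd = {pop_sd:.2f}")
print(f"sample:     variance = {samp_var}, sd = {samp_sd:.2f}")
```

The sample figures come out slightly larger because of the smaller divisor; the gap shrinks as the number of observations grows.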

__[1]__ Popularly known as the sum of squares, this term is widely used in ANOVA and regression analysis.

__[2]__ SS divided by its degrees of freedom gives the mean sum of squares, or MSE; these concepts appear in ANOVA and regression analysis.

**Related articles:**

Why it is so Important to Know the Monster “Variance”? — part-2

You just can’t knock down this Monster “Variance” —- Part-3
