Understanding the Monster “Variance” part-1

    Amrendra Roy


    Variance is one of the ways of quantifying the variability in a data set. It helps us understand how the data are arranged around the mean. To do so, we first need to calculate the deviation of each observation from the mean.

    For example, below are the times I took to reach the office on each day of the week, along with the deviation of each observation from the mean time.

    [Table: daily commute times and their deviations from the mean]

    The next step is to calculate the average deviation from the mean using the well-known formula:

    average deviation = Σ(xᵢ − x̄) / n

    Note that the sum of all positive deviations equals the sum of all negative deviations in magnitude, which shows that the mean acts as the balance point of the data. As a result, the sum of all deviations is zero, so we need some other way to express the average deviation about the mean.
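    A quick numerical sketch of this cancellation (using a small hypothetical set of commute times, since the original table is not reproduced here) could look like this:

```python
# Hypothetical commute times in minutes for five working days
times = [32, 45, 28, 50, 40]

mean = sum(times) / len(times)           # 39.0
deviations = [t - mean for t in times]   # [-7.0, 6.0, -11.0, 11.0, 1.0]

# Positive and negative deviations balance each other out,
# so the total deviation about the mean is zero.
print(sum(deviations))  # 0.0
```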

    To get around this issue, a very simple idea is used:

    negative number → square it → positive number → take the square root → back to the scale of the parent number

    Hence the squares of all the deviations are calculated and summed up to give the sum of squares (simply SS) [1]. This SS is then divided by its degrees of freedom (n − 1 for a sample) to give the variance s² around the mean.[2] The square root of this variance gives the standard deviation s, the most common measure of variability.

    SS = Σ(xᵢ − x̄)²,  s² = SS / (n − 1),  s = √(s²)

    Physically, this means that on average the data deviate about 7.42 units, or one standard deviation (±1s), from the mean in either direction.
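    As a minimal sketch of the whole calculation (reusing the hypothetical times above, so the numbers differ from the 7.42 reported in the post), the sum of squares, variance, and standard deviation can be obtained as follows:

```python
# Hypothetical commute times in minutes (same as before)
times = [32, 45, 28, 50, 40]
n = len(times)
mean = sum(times) / n                      # 39.0

# Sum of squares (SS): squaring makes every deviation positive,
# so they no longer cancel out.
ss = sum((t - mean) ** 2 for t in times)   # 328.0

# Sample variance: SS divided by its degrees of freedom (n - 1)
variance = ss / (n - 1)                    # 82.0

# Standard deviation: the square root brings us back to the original units
std_dev = variance ** 0.5                  # ~9.06 minutes

print(ss, variance, round(std_dev, 2))
```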

    [Figure: spread of the observations around the mean within ±1 standard deviation]

    The discussion above is about the sample standard deviation, represented by s. For the population, the variance is represented by σ² and the standard deviation by σ.

    σ² = Σ(xᵢ − μ)² / N,  σ = √(σ²)

    The sample variance s² is an estimator of the population variance σ². The standard deviation is easier to interpret than the variance because it is measured in the same units as the data.
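    The same distinction can be seen with numpy, where the ddof argument chooses between dividing SS by N (population value σ²) and by n − 1 (sample estimate s²). This is just an illustration with the hypothetical times used earlier, not data from the post:

```python
import numpy as np

times = np.array([32, 45, 28, 50, 40], dtype=float)

# Population variance / standard deviation: SS divided by N (ddof=0, the default)
pop_var = times.var(ddof=0)     # 65.6
pop_std = times.std(ddof=0)     # ~8.10

# Sample variance / standard deviation: SS divided by n - 1 (ddof=1),
# the usual estimator of the population variance
sample_var = times.var(ddof=1)  # 82.0
sample_std = times.std(ddof=1)  # ~9.06

print(pop_var, pop_std, sample_var, sample_std)
```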


     [1] Popularly known as the sum of squares, this is one of the most widely used terms in ANOVA and regression analysis.

    [2] SS divided by its degrees of freedom gives the mean sum of squares (MS); these concepts will appear again in ANOVA and regression analysis.

    Related articles:

    Why it is so Important to Know the Monster “Variance”? — part-2

    You just can’t knock down this Monster “Variance” —- Part-3

    Is this information useful to you?

    Kindly provide your feedback
