## Now it’s important to understand the concept of sigma or the standard deviation

Amrendra Roy

We have seen that we need to restrict the width of the car for a given width of the garage. This is analogous to the width of the process (**voice of the process, VOP**) versus the width of the customer’s specification (**voice of the customer, VOC**).

The width of the process is measured in terms of **standard deviation** denoted by σ (sigma).

*The target of the 6sigma methodology is to reduce this variation (the width of the car) to such an extent that, even by mistake, it does not cross the customer’s specification (that is, it does not hit the wall of the garage).*

Before we work towards reducing the σ, we should know about this monster very well as we will be encountering him at every step during the 6sigma journey.

There are two very important characteristics of any data set: the **location** and the **spread** of the data.

**Location** represents the point in the data set where the data cluster most densely; it is described by the mean and the median.

**Spread** represents the **variability** in the data set: some observations lie above the mean and some lie below it. The **standard deviation σ** measures the average spread of the data from the mean in either direction.

The office arrival times for the last 5 days are given below along with their average; the deviation of each observation from the mean is also captured.

Let’s calculate the average deviation

Note that *the sum of all positive deviations = the sum of all negative deviations* (in magnitude), which indicates that the mean is the balance point of the data.

The sum of all the deviations is therefore zero, so we need some other way to calculate the average deviation about the mean.
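To see this cancellation in numbers, here is a minimal Python sketch. The arrival times are hypothetical values chosen only for illustration (they are not the data from the table above), expressed as minutes after 9:00.

```python
# Hypothetical arrival times in minutes after 9:00 (illustrative only,
# NOT the article's original data).
arrival_minutes = [5, 12, 20, 8, 15]

mean = sum(arrival_minutes) / len(arrival_minutes)

# Deviation of each observation from the mean.
deviations = [x - mean for x in arrival_minutes]

# Positive and negative deviations, totalled separately.
positive_total = sum(d for d in deviations if d > 0)
negative_total = sum(d for d in deviations if d < 0)

print(f"mean                       = {mean}")
print(f"deviations                 = {deviations}")
print(f"sum of positive deviations = {positive_total}")
print(f"sum of negative deviations = {negative_total}")
print(f"sum of all deviations      = {sum(deviations)}")
```

Whatever numbers we put in the list, the positive and negative totals offset each other, so a plain average of the deviations tells us nothing about the spread.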

In order to circumvent the issue, a very simple idea was used

*Square of a negative number → positive number → square root of this number → ± parent number*

Hence the squares of all the deviations are calculated and summed to give the sum of squares (simply SS) __[1]__. This SS is then divided by the total number of observations to give the *variance* (the average squared deviation around the mean) __[2]__. The square root of this variance gives the *standard deviation* *s*, the most common measure of variability.
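The steps above (square, sum, average, square root) can be sketched in Python as follows. The data are hypothetical and will not reproduce the figures quoted in this article. Since the text divides SS by the number of observations while the footnote divides by the degrees of freedom, the sketch shows both versions; which one applies to a given data set is an assumption the reader has to settle.

```python
import math
import statistics

# Hypothetical arrival times in minutes after 9:00 (illustrative only).
arrival_minutes = [5, 12, 20, 8, 15]

n = len(arrival_minutes)
mean = sum(arrival_minutes) / n

# Sum of squares (SS): squared deviations from the mean, added up.
sum_of_squares = sum((x - mean) ** 2 for x in arrival_minutes)

# Variance: SS divided by n (whole population) or by n - 1
# (degrees of freedom, used when the data are a sample).
variance_population = sum_of_squares / n
variance_sample = sum_of_squares / (n - 1)

# Standard deviation: square root of the variance.
sigma_population = math.sqrt(variance_population)
s_sample = math.sqrt(variance_sample)

print(f"SS = {sum_of_squares}")
print(f"population: variance = {variance_population:.2f}, sigma = {sigma_population:.2f}")
print(f"sample:     variance = {variance_sample:.2f}, s     = {s_sample:.2f}")

# Cross-check against the standard library.
print(f"statistics.pstdev = {statistics.pstdev(arrival_minutes):.2f}")
print(f"statistics.stdev  = {statistics.stdev(arrival_minutes):.2f}")
```

The standard-library functions `statistics.pstdev` and `statistics.stdev` correspond to the divide-by-n and divide-by-(n - 1) versions respectively.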

What this typically means is that, on average, the data lie 7.42 units (= 1 standard deviation, ±1σ) from the mean in either direction in the given data set. The mean of the data set is at ZERO standard deviations.

If the process is stable and normally distributed, then the following holds true:

i.e. 99.7% of the observations in the data set would lie within ±3σ of the mean.
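The 99.7% figure can be checked with a short Python sketch. It assumes an ideal normal distribution and uses `statistics.NormalDist` from the standard library; the helper name `coverage_within` is ours, introduced only for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage_within(k):
    """Fraction of a normal population lying within ±k sigma of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k}σ: {coverage_within(k):.2%}")
```

For a real process the percentages will match only as closely as the data actually follow a normal distribution.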

*Now we can understand why we have taken 12σ as the width of the garage and 6σ as the width of the car!*

*The concept of ‘σ’ is the most important concept in understanding 6sigma. If we understand it, we will have no problem understanding the other topics downstream. One important point to note at this moment is that the calculation of σ depends on the type of data, or data distribution, we are handling.*

*The calculation of the mean and σ differs depending on whether we are dealing with a normal distribution, a binomial distribution, a Poisson distribution, etc. The importance of this will become clear when we study the various types of control charts. At that time we just have to remember that “we must calculate mean and σ according to the distribution”.*
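As a rough illustration of how the formulas change with the distribution, the Python sketch below uses the textbook results for the binomial distribution (mean = np, σ = √(np(1 − p))) and the Poisson distribution (mean = λ, σ = √λ). The function names and the numbers fed to them are hypothetical, chosen only for this example.

```python
import math

def binomial_mean_sigma(n_trials, p_success):
    """Mean and sigma of a binomial count: n trials, success probability p."""
    mean = n_trials * p_success
    sigma = math.sqrt(n_trials * p_success * (1 - p_success))
    return mean, sigma

def poisson_mean_sigma(rate):
    """Mean and sigma of a Poisson count with the given average rate."""
    return rate, math.sqrt(rate)

# Hypothetical example: 100 units inspected, 10% chance that a unit is defective.
b_mean, b_sigma = binomial_mean_sigma(n_trials=100, p_success=0.1)
print(f"binomial: mean = {b_mean:.2f}, sigma = {b_sigma:.2f}")

# Hypothetical example: an average of 4 defects per unit.
p_mean, p_sigma = poisson_mean_sigma(rate=4.0)
print(f"poisson:  mean = {p_mean:.2f}, sigma = {p_sigma:.2f}")
```

Note that in both cases σ is derived from the distribution's own parameters rather than from a sum of squared deviations.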

__[1]__ Popularly known as the sum of squares; this term is widely used in ANOVA and regression analysis.

__[2]__ SS divided by its degrees of freedom → mean sum of squares (the mean square; called MSE when computed for the error term). These concepts appear in ANOVA and regression analysis.
