7QC Tools: How to Interpret a Histogram?
Amrendra Roy
Different data sets give histograms of different shapes, as shown below. Always draw the histogram of your data set first, before taking any decision or starting a continuous improvement project.
Symmetrical histogram: This shape suggests that the process is stable and that the data are approximately normally distributed around the mean. If you see this kind of histogram during a six-sigma project, the likely task is to reduce the process variance, which narrows the histogram so that its tails move away from the customer’s specification limits.
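To make the link between variance and specification limits concrete, here is a minimal sketch. It assumes the process really is normally distributed, and the process mean, standard deviations and specification limits are made-up numbers, not taken from any real process. The function name `fraction_out_of_spec` is only illustrative.

```python
from statistics import NormalDist


def fraction_out_of_spec(mean, std_dev, lsl, usl):
    """Expected fraction of output outside [lsl, usl], assuming a normal process."""
    process = NormalDist(mu=mean, sigma=std_dev)
    below = process.cdf(lsl)        # share falling below the lower spec limit
    above = 1.0 - process.cdf(usl)  # share falling above the upper spec limit
    return below + above


# Hypothetical process centred at 100 with specification limits 94 and 106.
wide = fraction_out_of_spec(100.0, 3.0, 94.0, 106.0)    # limits sit at +/- 2 sigma
narrow = fraction_out_of_spec(100.0, 1.5, 94.0, 106.0)  # limits sit at +/- 4 sigma

print(f"std dev 3.0: {wide:.4%} out of spec")
print(f"std dev 1.5: {narrow:.4%} out of spec")
```

Halving the standard deviation leaves the center where it was but pulls the tails well inside the limits, which is the effect of a narrower histogram.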
Left or right skewed histogram: In both of these cases, most of the data is scattered around the median (caution: not the mean!), which is the better measure of central tendency for skewed data. Six-sigma efforts should focus on identifying the causes of the skewness and eliminating them.
Taking the mean as the measure of central tendency in these cases would be misleading, because the mean is affected by the extreme values in the data set.
If you have to work with this type of data, find the reasons for those extreme observations and eliminate them. This is the low-hanging fruit that we must take advantage of.
Let’s take the example of a process whose yield is represented by the following left-skewed histogram.
The median tells us that 50% of the batches have a yield at or above some value (say 85%). Suppose we instead present the mean as the measure of central tendency and tell management that the mean yield of the process is just 80%. Is that the right approach? We could then simply find the reason for the extreme values on the left, remove it, and tell management that applying six sigma has increased the yield to 85%!
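The yield example can be reproduced with a small sketch. The ten batch yields below are invented purely for illustration, chosen so that the median is 85% and the mean is 80%. The cut-off of 70% used to flag the extreme batches is also an assumption.

```python
from statistics import mean, median

# Hypothetical batch yields (%): two poor batches create the long left tail.
yields = [50, 65, 84, 84, 85, 85, 86, 86, 87, 88]

print("all batches:     mean =", mean(yields), " median =", median(yields))

# Assumed cut-off: treat batches below 70% as the extreme observations
# whose cause has been found and eliminated.
typical = [y for y in yields if y >= 70]

print("without extremes: mean =", mean(typical), " median =", median(typical))
```

With these made-up numbers, removing the two low batches moves the mean from 80 to about 85.6, while the median only moves from 85 to 85.5. The median was already describing the typical batch; the apparent gain in the mean comes entirely from the two extreme values.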
Bimodal histogram: The data set contains observations from two different populations. If this appears in a process, assume that either two processes are running or two operators are working differently (which they are not supposed to do).
Note: If the number of classes is small (<5), the bimodal shape may not be visible. Increase the number of classes to 7, 9, 11, etc.; if a bimodal process is running, it will become evident.
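The effect of the number of classes can be sketched as follows. The data set is invented, with two clusters near 4.5 and 7.5, and the helper names `histogram_counts` and `count_peaks` are illustrative rather than from any library. Classes are equal-width over an assumed range of 0 to 12.

```python
def histogram_counts(data, n_bins, lo, hi):
    """Count observations in n_bins equal-width classes spanning [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        index = min(int((x - lo) / width), n_bins - 1)  # x == hi goes in the last class
        counts[index] += 1
    return counts


def count_peaks(counts):
    """Count local maxima; a flat top spanning several classes counts once."""
    collapsed = [c for i, c in enumerate(counts) if i == 0 or c != counts[i - 1]]
    padded = [-1] + collapsed + [-1]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )


# Hypothetical data: value -> number of observations, with modes at 4.5 and 7.5.
value_counts = {0.5: 1, 1.5: 1, 2.5: 2, 3.5: 5, 4.5: 9, 5.5: 4,
                6.5: 4, 7.5: 9, 8.5: 5, 9.5: 2, 10.5: 1, 11.5: 1}
data = [value for value, n in value_counts.items() for _ in range(n)]

for n_bins in (3, 7):
    counts = histogram_counts(data, n_bins, 0.0, 12.0)
    print(n_bins, "classes:", counts, "peaks:", count_peaks(counts))
```

With 3 classes both clusters fall into the middle class and the histogram looks unimodal; with 7 classes the dip between the clusters appears and two peaks are counted.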
Whether you are in management or a six-sigma practitioner, always draw the histogram of your data, or by default take the median as the measure of central tendency, since it is not affected by extreme values. Also check how increasing the number of classes changes the shape of the histogram, as this can reveal deviations from the standard operating procedure.
Lesson learnt: while analyzing your data, always ask:
Where is the center? Which measure of central tendency should I use for my data set, based on the histogram?
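As a rough numerical companion to looking at the histogram, the question can be sketched in code. This is not a substitute for drawing the plot. It uses Pearson's median skewness, 3 × (mean − median) / standard deviation, and the threshold of 0.5 is an arbitrary assumption, as are the function names and both data sets.

```python
from statistics import mean, median, stdev


def pearson_median_skewness(data):
    """Pearson's median skewness: 3 * (mean - median) / sample standard deviation."""
    return 3 * (mean(data) - median(data)) / stdev(data)


def suggest_center(data, threshold=0.5):
    """Suggest the median for clearly skewed data, otherwise the mean.

    The threshold is an assumed rule of thumb, not a standard value.
    """
    if abs(pearson_median_skewness(data)) > threshold:
        return "median", median(data)
    return "mean", mean(data)


# Hypothetical yields (%): one left-skewed set and one roughly symmetrical set.
skewed = [50, 65, 84, 84, 85, 85, 86, 86, 87, 88]
symmetric = [78, 80, 82, 84, 85, 85, 86, 88, 90, 92]

print(suggest_center(skewed))
print(suggest_center(symmetric))
```

For the skewed set the mean sits well below the median, so the sketch suggests the median; for the symmetrical set the two coincide and the mean is a fair summary.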
Kindly do provide feedback for continuous improvement