“p-value”: What the Hell is it?
Amrendra Roy
Whenever we go to the supermarket, say to buy tomatoes, we go to the vegetable section and, merely by looking at them, form a hypothesis in our mind that all the tomatoes must be of good or bad quality. What we are doing is intuitively setting a qualitative limit on the quality, which we can call the theoretical limit. Now we go to the shelf, pick a sample of tomatoes and press them between our fingers to check the hardness; we can take this as the experimental value of the hardness. If this experimental value is better than the theoretical value, we end up buying the tomatoes.
In business decisions, when we want to compare two processes, we have a theoretical limit represented by α and a corresponding experimental value represented by the p-value. If the p-value (experimental or observed value) is found to be less than α (theoretical value), then we conclude that the two processes are different.
You must have understood the “normal distribution” after having gone through so many blogs on this site. Let’s revise what we know about the normal curve:
If a process is stable, it will follow the bell-shaped curve called the normal curve. It means that if we plot all the historical data obtained from a stable process, it will give a symmetrical curve as shown above. The distance from the mean (μ) in either direction is measured in terms of σ, where σ represents the standard deviation (a measure of variation).
The main characteristic of the above curve is the proportion of the population captured between any two σ values. For example, μ±2σ contains about 95% (more precisely 95.45%) of the total population and μ±3σ contains 99.73% of the total population.
The normal curve never touches the x-axis, i.e. it extends from −∞ to +∞. This is very important for understanding the p-value concept. The implication is that there is always a possibility of finding an observation anywhere between −∞ and +∞, or in other words “even a stable process can give a product with a specification anywhere between −∞ and +∞”. But as you move away from the mean, the probability of finding an observation decreases. For example, the total probability of finding an observation beyond μ±2σ is only about 5% (about 2.5% on either side of the normal curve). This probability decreases to about 0.3% (0.27%) beyond μ±3σ.
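The coverage figures quoted above can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper names `coverage_within` and `tail_beyond` are made up for this illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage_within(k):
    """Fraction of a normal population lying within mu ± k*sigma."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

def tail_beyond(k):
    """Fraction lying outside mu ± k*sigma (both tails together)."""
    return 1.0 - coverage_within(k)

for k in (1, 2, 3):
    print(f"within ±{k} sigma: {coverage_within(k):.2%}   "
          f"beyond: {tail_beyond(k):.2%}")
```

Running it shows why "95%" and "5%" are roundings for ±2σ, while 99.73% and roughly 0.3% hold for ±3σ.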
Now, let’s understand this: I am manufacturing something, my process is quite robust and it follows the normal distribution. As noted above, there is always a possibility that the specification of the product can fall anywhere between −∞ and +∞. But I can’t go to my customer and make this statement. The point I want to make here is that there has to be a THRESHOLD DISTANCE (control limits) from the mean (say μ ± xσ) as the acceptance criterion for the product, and any product whose specification falls beyond this threshold limit would be rejected (it will not be shipped to the customer).
In other words, if my process is giving me products whose sample mean falls beyond the threshold limits, then I will assume that my process has deviated from the SOP (standard operating procedure) due to some assignable cause and that the current process is now different from the earlier process! Or, simply, there are two processes running in the plant.
Generally μ±3σ is taken as the threshold limit. This threshold is represented by alpha (α), the acceptable error rate. In the present case (μ ± 3σ), α = 0.27%, usually rounded to 0.3% or 0.003.
From the above point it is clear that, as long as the process is giving me sample means within μ±3σ, we would say that the products are coming from the same process. If the sample mean of a batch of the manufactured product falls beyond μ±3σ, it would represent a different process.
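The threshold check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the historical mean `mu` and standard deviation `sigma` are treated as known, the spread of a batch mean is taken as σ/√n (the standard error for a batch of n items), and all function names and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean

def mean_control_limits(mu, sigma, n, k=3.0):
    """Lower and upper limits for the mean of n items: mu ± k * sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)  # assumed spread of a batch mean
    return mu - k * standard_error, mu + k * standard_error

def batch_is_in_control(batch, mu, sigma, k=3.0):
    """True if the batch mean lies inside the k-sigma limits."""
    lower, upper = mean_control_limits(mu, sigma, len(batch), k)
    return lower <= mean(batch) <= upper

# Hypothetical historical process: mean 50, standard deviation 2.
usual_batch = [49.5, 50.5, 51.0, 49.0]    # mean 50.0
shifted_batch = [54.0, 55.0, 53.5, 54.5]  # mean 54.25

print(mean_control_limits(50.0, 2.0, n=4))
print(batch_is_in_control(usual_batch, 50.0, 2.0))
print(batch_is_in_control(shifted_batch, 50.0, 2.0))
```

The first batch stays inside the limits, so it is treated as coming from the same process; the second falls outside and would be flagged as a different process.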
So far we have defined a theoretical threshold limit called alpha (α). Now consider the sampling distributions of the mean of two processes in the three cases described below.
In case-3, we can confidently say that the two processes are different, as there is minimal overlap between the two sampling distributions of the mean. But what about case-2 and case-1? In these two cases, taking a decision would be difficult because there is (or at least appears to be) significant overlap between the two distributions. In these circumstances we need a statistical tool to assess whether the overlap is significant or not. In other words, a tool is required to ascertain whether the sampling distributions of the mean of the two processes are far enough apart to say that the two processes are different.
In order to do that, we need to collect some data from both processes and then subject them to a statistical test (z-test, t-test, F-test etc.) to check whether the difference between the means of the two processes is significant or not. The significance is assessed by collecting samples from both populations and running the statistical analysis; the result is obtained in the form of a probability called the p-value. The point to be noted here is that the p-value is generated from a statistical test (it is equivalent to an experimental value).
We can say that α is the theoretical threshold limit and the p-value is the experimentally generated value that we compare against it. If the p-value is less than or equal to the theoretical threshold limit α, then we would say the two processes are really different.
When we say the p-value < α, it means that the sampling distribution of mean of the new process is significantly different from the existing process.
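To make the comparison of the p-value with α concrete, here is a minimal sketch of a two-sided, two-sample z-test using only Python's standard library. It assumes the standard deviation of each process is already known (which is what a z-test requires); the sample data, the value σ = 0.5 and the function names are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

def two_sample_z_test(sample_a, sample_b, sigma_a, sigma_b):
    """Two-sided z-test for a difference in means, with known sigmas.

    Returns (z, p_value).
    """
    # Standard error of the difference between the two sample means.
    se = sqrt(sigma_a**2 / len(sample_a) + sigma_b**2 / len(sample_b))
    z = (mean(sample_a) - mean(sample_b)) / se
    # Probability of a |z| at least this large if both means were equal.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

def processes_differ(p_value, alpha=0.05):
    """Decision rule from the text: different if the p-value is below alpha."""
    return p_value < alpha

# Hypothetical measurements from the existing and the new process.
existing = [50.1, 49.8, 50.3, 49.9, 50.0, 50.2, 49.7, 50.0]
new = [50.9, 51.2, 50.8, 51.1, 51.0, 50.7, 51.3, 51.0]

z, p = two_sample_z_test(existing, new, sigma_a=0.5, sigma_b=0.5)
print(f"z = {z:.2f}, p-value = {p:.6f}, different: {processes_differ(p)}")
```

With these made-up numbers the p-value comes out far below 0.05, so the rule declares the processes different. When the standard deviations are not known in advance, a t-test would be the usual choice instead.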
- In general α = 0.05, i.e. we accept a 5% chance of concluding that the two processes are different when they are actually the same.
- If the p-value (experimental or observed value) is < α (theoretical value), then the new process is considered different.
More details will be covered under hypothesis testing.