Let’s assume that we are in the business of fish farming. We want to make sure that the mean weight of all the fish in the pond is 2 kg before we sell them in the market. It is not possible to take all of them out of the pond (we don’t even know how many fish there are!) and weigh each one. So we take out a sample of, say, 25 fish and weigh them. If the mean weight of the 25 fish is close to 2 kg (more precisely, if it lies within 2 ± 0.15 kg), we assume that the lot is ready to be sold in the market. This is hypothesis testing: we are trying to estimate a population parameter (the mean weight of all the fish in the pond) from the mean weight of a sample of 25 fish, with an acceptance criterion in mind (2 ± 0.15 kg).
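The fish-pond check can be sketched in a few lines of Python. This is only an illustration: the 25 weights are made-up numbers, and the function name `lot_is_ready` is hypothetical. Only the 2 kg target and the ± 0.15 kg band come from the example above.

```python
from statistics import mean

TARGET_KG = 2.0      # target mean weight from the example
TOLERANCE_KG = 0.15  # acceptance band from the example (2 ± 0.15 kg)


def lot_is_ready(weights_kg, target=TARGET_KG, tolerance=TOLERANCE_KG):
    """Return True if the sample mean lies within target ± tolerance."""
    sample_mean = mean(weights_kg)
    return abs(sample_mean - target) <= tolerance


# Hypothetical weights (kg) of 25 fish taken out of the pond.
sample = [1.9, 2.1, 2.0, 2.2, 1.8] * 5

print("sample mean:", round(mean(sample), 3))
print("ready to sell:", lot_is_ready(sample))
```

Note that this simple rule only looks at the sample mean. The rest of the article explains how to account for the uncertainty that comes with using a sample.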

In inferential statistics we try to estimate a population parameter by studying a sample drawn from that population. It is not always possible to study the whole population (which would be a census). Take the example of rice being cooked in a restaurant. The entire lot of rice taken for cooking may be considered the population. Let’s further assume that there is a set protocol for cooking and that, after a predetermined cooking time, the chef hypothesizes that the whole lot has been cooked properly. To check this hypothesis, he takes out a few grains of the cooked rice (the sample) and tests them (by pressing them between his fingers). Finally, based on the sample’s result, the chef decides whether the whole lot of rice is cooked or not.

We executed the following steps to estimate the degree of cooking of the entire lot of rice from the small sample drawn from the pot.

- We select the **population** to be studied (the entire lot of rice being cooked).
- The chef follows a cooking protocol and makes a **hypothesis** that the whole lot of rice (population) has been cooked properly → **trying to make an estimate about the population parameter**.
- Then we draw a sample from the pot → **sample**.
- We have a criterion in mind for calling the rice undercooked, properly cooked or overcooked → we set a **threshold limit (confidence interval, CI) within which we assume that the rice is properly cooked**. Less than that, it is undercooked; more than that, it is overcooked.
- We test the sample of cooked rice by pressing the grains between our fingers → **test statistic**.
- The result of the test statistic is compared with the threshold limit set before the experiment, and based on this comparison we decide whether the whole lot of rice is cooked properly or not → **inference about the population parameter.**

It should be reasonably clear from the above discussion that we use hypothesis testing to check, using sample information, whether some claim about a population is true or not. For example:

- The mean height of all the students in the high schools of a given state is 160 cm.
- The mean salary of fresh MBA graduates is $65,000.
- The mean mileage of a particular brand of car is 15 km/liter of gasoline.

All of the above statements are claims about a population parameter that we hypothesize to be true. To test these hypotheses, we take a sample (say, the heights of 100 students selected at random from the high school, or the salaries of 25 students selected at random from the MBA class) and subject it to a statistical test that produces a test statistic (the equivalent of pressing the rice between the fingers), to conclude whether the statement made about the population parameter is true or not.

But before we go any further, it is important to understand that we are using a sample to estimate the population parameter, and that the sample is very small relative to the population. As a result, estimating a population parameter from a sample statistic involves some uncertainty, or error. This is represented by the following equation:

*Population parameter = Sample statistic ± margin of error*

The above equation gives an interval (because of the ± sign) within which the population parameter is expected to lie. This interval is called the confidence interval (CI). *This means every sample drawn from the population would give a different confidence interval!*
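As a rough sketch, the interval "sample statistic ± margin of error" for a mean can be computed as below. Two assumptions are mine and not from the article: the population standard deviation `sigma` is treated as known (so a z-based interval applies), and the weights are invented for illustration. The function name `z_confidence_interval` is also hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_confidence_interval(sample, sigma, confidence=0.95):
    """CI for the population mean, assuming the population SD `sigma` is known."""
    n = len(sample)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z * sigma / sqrt(n)
    centre = mean(sample)
    return centre - margin_of_error, centre + margin_of_error


# Hypothetical weights (kg) of 25 fish; sigma = 0.25 kg is an assumed value.
weights = [1.9, 2.1, 2.0, 2.2, 1.8] * 5
low, high = z_confidence_interval(weights, sigma=0.25)
print(round(low, 3), round(high, 3))
```

A different sample would shift the centre of the interval, which is why every sample produces its own CI.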

Suppose we are trying to estimate the population mean (which is usually unknown) and we draw 100 samples from the population. Those 100 samples would give 100 different CIs, because each sample has a different sample mean and a different margin of error. Now the question arises: “would all 100 CIs contain the population parameter?” As stated earlier, because of sampling error we can never estimate the population parameter exactly; there will always be some degree of error when we estimate it from a sample statistic. Hence, we should be wise enough to accept an inherent error rate before conducting any hypothesis test. Let’s say that if I collect 100 samples from the population and obtain 100 different CIs, I accept that about 5 of them might not contain the population parameter. This accepted error rate is called the α error or type-I error. α represents the **acceptable error** or the **level of significance**, and it has to be fixed before conducting any hypothesis test. Usually it is a management decision. For more detail see “__Is it difficult for you to comprehend confidence interval?__”
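The "100 samples, 100 CIs" thought experiment can be simulated. The sketch below assumes a normally distributed population with a mean of 2.0 and a known standard deviation of 0.25 (both invented values), and the function name `coverage` is hypothetical. Because σ is assumed known here, only the centre of each interval changes from sample to sample; with an estimated σ the margin of error would change too.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def coverage(num_samples=100, n=25, mu=2.0, sigma=0.25, alpha=0.05, seed=0):
    """Fraction of CIs (one per sample) that contain the true mean `mu`."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        centre = mean(sample)
        if centre - margin <= mu <= centre + margin:
            hits += 1
    return hits / num_samples


print("share of CIs containing the true mean:", coverage())
```

With α = 0.05 the share should be close to 0.95 in the long run, although any single batch of 100 samples may give a few more or a few fewer misses than 5.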

Based on the above discussion, a 7-step process is used for hypothesis testing (*note: step 2 is described before step 1 because it helps us write the hypotheses correctly*).

**Step 2: State the Alternate Hypothesis.**

This is denoted by H_{a} and this is the real thing about the population that we want to test. In other words, H_{a} denotes what we want to prove.

**For example:**

- The mean height of all students in the high school is 160 cm.
  - H_{a}: μ ≠ 160 cm
- The mean salary of fresh MBA graduates is $65,000.
  - H_{a}: μ ≠ $65,000
- The mean mileage of a particular brand of car is greater than 15 km/liter of gasoline.
  - H_{a}: μ > 15 km/liter

**Step 1: State the Null Hypothesis.**

This is denoted by H_{o}. We state the null hypothesis as if we were extremely lazy people who don’t want to do any work! For example, if a new gasoline claims an average mileage greater than 15 km/liter, then my null hypothesis would be “it is less than or equal to 15 km/liter”. By doing so, we spare ourselves the pain of testing the new gasoline. We are happy with the status quo!

So the null hypotheses for the above cases are:

- The mean height of all students in the high school is 160 cm.
  - H_{o}: μ = 160 cm
- The mean salary of fresh MBA graduates is $65,000.
  - H_{o}: μ = $65,000
- The mean mileage of a particular brand of car is greater than 15 km/liter of gasoline.
  - H_{o}: μ ≤ 15 km/liter

*Therefore, if you want me to work, first you make an effort to reject the null hypothesis!*

**Step 3: Set α**

As noted earlier, we are using a sample to estimate the population parameter, and the sample is much smaller than the population. Because of this sampling error, any estimate of the population parameter contains some uncertainty. There are two types of error that we can make in hypothesis testing.

The contingency table for the null hypothesis has four cells: two correct decisions and two errors. The first error is rejecting the null hypothesis when it is true (α error); the second is accepting the null hypothesis when it is false (β error). Hence, the acceptable limit for both errors is decided prior to hypothesis testing.

Using the z-transformation or the t-distribution, we determine the critical value (threshold) corresponding to the error α.
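For the z case, the critical value can be read off the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`. The function name `critical_z` and the labels "two-sided", "greater" and "less" are my own choices for illustration. The t-distribution case is not shown because it needs a statistics package outside the standard library.

```python
from statistics import NormalDist


def critical_z(alpha=0.05, alternative="two-sided"):
    """Critical z value for a given α and direction of the alternative hypothesis."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    if alternative == "two-sided":
        # α is split equally between the two tails.
        return NormalDist().inv_cdf(1 - alpha / 2)
    if alternative == "greater":
        # All of α sits in the upper tail.
        return NormalDist().inv_cdf(1 - alpha)
    if alternative == "less":
        # All of α sits in the lower tail.
        return NormalDist().inv_cdf(alpha)
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


print(round(critical_z(0.05, "two-sided"), 3))
print(round(critical_z(0.05, "greater"), 3))
```

For α = 0.05 these are the familiar values of roughly 1.96 (two-sided) and 1.645 (one-sided).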

The level of significance α is the probability of rejecting the null hypothesis when it is true. This is like rejecting a good lot of material by mistake.

β, on the other hand, is the probability of a type-II error, that is, of accepting the null hypothesis when it is false. This is like accepting a bad lot of material by mistake.

Let’s understand the null and alternate hypotheses graphically. Picture the distributions of H_{o} and H_{a} side by side. We also have an error term α, representing a threshold value on the distribution of H_{o}. Now the issue to be resolved is “how can we say that the two distributions represented by H_{o} and H_{a} are different?” This is usually done by measuring the extent of overlap between the two distributions, which we do by measuring the distance between their means (of course, we need to account for the inherent variance in the system). Statistical tools such as the z-test, t-tests, ANOVA, etc. help us conclude whether the two distributions overlap significantly or not.
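To make the overlap idea concrete, here is a hedged sketch that computes β for a one-sided z-test (H_{o}: μ ≤ μ0 against H_{a}: μ > μ0) when the true mean is some value μ1. It assumes a known population standard deviation, and the mileage numbers are invented. The function name `type_ii_error` is hypothetical. The further apart the two means are relative to the spread, the smaller the overlap and the smaller β.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu1, sigma, n, alpha=0.05):
    """β for a one-sided z-test of Ho: μ <= mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)    # threshold on the Ho distribution
    shift = (mu1 - mu0) / (sigma / sqrt(n))   # distance between means in standard errors
    # Probability that the test statistic stays below the threshold under Ha.
    return std_normal.cdf(z_crit - shift)


# Assumed values: mu0 = 15 km/liter, sigma = 2 km/liter, 25 cars tested.
for true_mean in (15.5, 16.0, 17.0):
    beta = type_ii_error(mu0=15, mu1=true_mean, sigma=2, n=25)
    print(true_mean, round(beta, 3))
```

When μ1 equals μ0 the two distributions coincide and β is simply 1 − α.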

**Step 4: Collect the Data**

**Step 5: Calculate a test statistic.**

The *test statistic* is a numerical measure computed from the sample data, which is then compared with the critical value to determine whether or not the null hypothesis should be rejected. Another way is to convert the test statistic into a probability value called the p-value, which is then compared with α to conclude whether the hypothesis made about the population is to be accepted or rejected.
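A minimal sketch of this step for a z-test follows. It assumes the population standard deviation is known, and the sample figures are made up. The function name `z_test` and the `alternative` labels are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alternative="two-sided"):
    """Return (z statistic, p-value) for a one-sample z-test with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf(z)
    if alternative == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)
    elif alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "less":
        p_value = cdf
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, p_value


# Assumed numbers: 25 fish with mean weight 2.1 kg, Ho: μ = 2 kg, sigma = 0.25 kg.
z, p = z_test(sample_mean=2.1, mu0=2.0, sigma=0.25, n=25)
print(round(z, 2), round(p, 4))
```

The z statistic is the distance between the sample mean and the hypothesized mean, measured in standard errors. The p-value is the probability of seeing a statistic at least that extreme if H_{o} were true.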

Also See __“p-value, what the hell is it?”__

__Conceptualizing “Distribution” Will Help You in Understanding Your Problem in a Much Better Way__

__Is it difficult for you to comprehend confidence interval?__

**Step 6: Construct Acceptance / Rejection regions.**

The *critical value* is used as a benchmark, or threshold limit, to determine whether the test statistic is too extreme to be consistent with the null hypothesis.

**Step 7: Based on steps 5 and 6, draw a conclusion about H_{0}.**

The *decision* whether to accept or reject the null hypothesis is based on the following criteria:

- If the **absolute** value of the test statistic exceeds the **absolute** value of the critical value, the null hypothesis is rejected.
- Otherwise, the null hypothesis **fails to be rejected** (or simply, H_{o} is accepted).
- The simplest way is to compare α and the p-value: if the p-value is < α, reject H_{o}.
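The two routes to a decision (critical value and p-value) can be checked against each other. The sketch below does this for a two-sided z-test, assuming a known population standard deviation and invented sample numbers. The function name `decide` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def decide(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test decision by both routes.

    Returns (reject_by_critical_value, reject_by_p_value).
    """
    std_normal = NormalDist()
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    reject_by_critical_value = abs(z) > z_crit
    reject_by_p_value = p_value < alpha
    return reject_by_critical_value, reject_by_p_value


# Assumed numbers: Ho: μ = 2 kg, sigma = 0.25 kg, 25 fish in the sample.
print(decide(sample_mean=2.10, mu0=2.0, sigma=0.25, n=25))  # z = 2.0
print(decide(sample_mean=2.05, mu0=2.0, sigma=0.25, n=25))  # z = 1.0
```

Both routes lead to the same decision because the p-value and the critical value are two views of the same tail area under the H_{o} distribution.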

**Summary:**

The null and alternative hypotheses are competing statements about the population that we test using the sample. Either the null hypothesis (H_{0}) is true or the alternative hypothesis (H_{a}) is true, but not both. Ideally, the hypothesis testing procedure should lead to the acceptance of H_{0} when H_{0} is true and the rejection of H_{0} when H_{a} is true. Unfortunately, correct conclusions are not always possible: because hypothesis tests are based on sample information, we must make provision for the possibility of type-I and type-II errors.