How to provide a realistic range for a CQA during product development to avoid unwanted OOS-2 Case Study
Amrendra Roy
Suppose we are developing 500 mg ibuprofen tablets (actual specification is 497 to 502 mg). A tablet contains many other ingredients along with the 500 mg of active molecule (ibuprofen). Usually these ingredients are mixed and then compressed into tablets. During product development, three batches of 500 tablets each were prepared with blending times of 15, 25 and 35 minutes respectively. A sample of 10 tablets was collected from each batch and analyzed for ibuprofen content. The results are given below.
The regression analysis of the entire data (all three batches) provides a quantitative relationship between blending time and the expected value of ibuprofen content for that blending time.
Expected ibuprofen content (mg) = 493 + 0.242 × blending time (minutes)
Now, we want to estimate the average ibuprofen content of the entire batch of 500 tablets, based on the sample of 10 tablets, for a given mixing time (say 15, 25 and 35 minutes). Let’s calculate the 95% and 99% confidence intervals (CI) for each mixing time.
In reality, we can never know the average ibuprofen content of the entire batch unless we do the analysis of the entire batch, which is not possible.
We can see that the 99% CI is wider than the 95% CI (hope you are clear about what 95% CI means?). The 99% CI for a mixing time of 35 minutes seems to be closer to my desired strength of 497 to 502 mg. Hence, in the development report, I would propose a wider possible assay range of 499.6 to 502.57 mg for a mixing time of 35 minutes with 99% CI.
This means that, if we drew 100 separate samples and calculated a 99% CI from each one, about 99 of those intervals would be expected to contain the population mean.
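The "99 out of 100 intervals" reading can be checked with a small simulation. The sketch below is only illustrative: the true batch mean (501.0 mg) and the standard deviation (1.0 mg) are hypothetical values, not the case-study data, and the standard deviation is assumed to be known so that a simple z-based interval can be used.

```python
import random
from statistics import NormalDist, mean

def ci_coverage(true_mean, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated samples whose CI contains the true mean.

    Assumes normally distributed tablet strengths with a KNOWN sigma,
    so the interval is mean +/- z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = mean(sample)
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials

# Hypothetical population: mean 501.0 mg, sigma 1.0 mg, 10 tablets per sample.
coverage_95 = ci_coverage(501.0, 1.0, n=10, confidence=0.95, trials=2000)
coverage_99 = ci_coverage(501.0, 1.0, n=10, confidence=0.99, trials=2000)
print(f"95% CI coverage: {coverage_95:.3f}")
print(f"99% CI coverage: {coverage_99:.3f}")
```

The two printed fractions should land near 0.95 and 0.99, which is what the confidence level refers to: the long-run share of intervals that capture the true mean, not the share of tablets inside the interval.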
Now, this 99% CI (499.6 to 502.57 mg) is narrower than the specification (497 to 502 mg). Hence, I want to estimate the same kind of interval (like a CI) for a mixing time of, say, 32 minutes (note: we have not conducted any experiments with this mixing time!) to check whether we can meet the specification there itself. We can do this because we have derived the regression equation. What we are doing is predicting an interval for a future batch with a mixing time of 32 minutes. Because we are predicting a future observation, this interval is called the prediction interval of a response for a given value of the process parameter. Prediction intervals are usually wider than the corresponding confidence intervals.
Using the equation discussed earlier, we can calculate the expected mean strength for a mixing time of 32 minutes.
Expected ibuprofen content for a blending time of 32 minutes = 500.74
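The arithmetic behind this figure can be reproduced in a few lines. The sketch reads the regression equation as an intercept of 493 mg plus a slope of 0.242 mg per minute of blending, which is the reading that gives 500.74 mg at 32 minutes; the function and constant names are illustrative only.

```python
# Regression coefficients as quoted in the text (intercept in mg, slope in mg/min).
INTERCEPT_MG = 493.0
SLOPE_MG_PER_MIN = 0.242

def expected_content(blending_minutes):
    """Expected ibuprofen content (mg) for a given blending time (minutes)."""
    return INTERCEPT_MG + SLOPE_MG_PER_MIN * blending_minutes

# The three studied blending times plus the untested 32-minute point.
for minutes in (15, 25, 32, 35):
    print(f"{minutes} min -> {expected_content(minutes):.2f} mg")
```

This gives only the point estimate; the interval around it still needs the residual scatter of the fitted data.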
So far, we have learnt that a CI estimates an interval that will contain the average ibuprofen content of the entire batch (already executed) for a given value of blending time, whereas the prediction interval estimates the interval that would contain the average response of a future batch for a given value of blending time.
In the present context,
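For reference, the textbook forms of the two intervals for a simple linear regression are sketched below. The symbols are not from the case study and are introduced here as assumptions: x_0 is the blending time of interest, n the number of data points used in the fit, s the residual standard deviation, and t the Student-t critical value with n - 2 degrees of freedom.

```latex
\hat{y}_0 = b_0 + b_1 x_0, \qquad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
s^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

\text{CI for the mean response at } x_0:\quad
\hat{y}_0 \pm t_{1-\alpha/2,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}

\text{PI for one future observation at } x_0:\quad
\hat{y}_0 \pm t_{1-\alpha/2,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
```

The extra 1 under the square root is what makes the prediction interval wider than the confidence interval. If the quantity being predicted is the mean of m future observations rather than a single one, that 1 is replaced by 1/m.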
For a blending time of 35 minutes, a 95% CI indicates that the average strength of the entire batch of 500 tablets (population) would be between 499.99 and 502.18.
Whereas a 95% PI predicts that the average strength of the next batch would be between 499.6 and 502.57 for a blending time of 35 minutes.
Now the question is: can we propose either of these intervals (95% CI or 95% PI) as the process control limits?
What I think is, we can’t! These intervals don’t tell me anything about the distribution of the population within the interval. What I mean to say is that we can’t assume that all 500 tablets (the entire batch) would be covered by these intervals; it is only the mean of the entire batch that would fall in the interval.
For me it is necessary to know the future trend of the batches when we transfer the technology for commercialization. We should know not only the interval containing the mean (of any CQA) of the population but also the proportion or percentage of the total population falling in that interval. This will help me determine the expected failure rate in future batches if all CPPs are under control (even a Six Sigma process has a failure rate of 3.4 ppm!). Once I know that, it would help me decide when to start investigating an OOS (once the number of failures crosses the expected failure rate). For this statement, I am assuming that there is no special cause for OOS.
This job is done by the tolerance interval (TI). In general, a TI is reported as follows:
A 95% TI for the tablet strength (Y) containing 99% of the population of the future batches for a blending time of 35 minutes (X).
It means that the TI calculated at a 95% confidence level would encompass 99% of the future batches that will be manufactured with a blending time of 35 minutes. In other words, about 1% of the batches would be expected to fail. Now, I will start investigating an OOS only if there are two or more failures in the next 100 batches (assuming that there are no special causes for OOS and all process parameters are followed religiously).
The TI for the batches at different blending times is given below
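To put numbers on this rule of thumb, the failure count over 100 batches can be treated as a binomial variable. This is a sketch under stated assumptions, not part of the original study: each batch is assumed to fail independently with probability 1%, which is the expected failure rate implied by a TI covering 99% of future batches.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), i.e. k or more failures in n batches."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))

N_BATCHES = 100        # horizon used in the text
FAILURE_RATE = 0.01    # assumed: 1% of batches fall outside the TI

expected_failures = N_BATCHES * FAILURE_RATE
p_two_or_more = prob_at_least(2, N_BATCHES, FAILURE_RATE)

print(f"Expected failures in {N_BATCHES} batches: {expected_failures:.1f}")
print(f"P(two or more failures by chance alone): {p_two_or_more:.3f}")
```

The second printed value shows how often the "two or more failures" trigger would be reached with no special cause present, which may be useful when deciding where to set the investigation threshold.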
Tolerance Interval type: two sided
Confidence level: 95%
Percentage of population to be covered: 99
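A rough idea of how such a two-sided TI can be calculated from a single sample is sketched below. It is not the method used to produce the table: it assumes normally distributed strengths, uses Howe's approximation for the tolerance factor k together with the Wilson-Hilferty approximation for the chi-square quantile (so that only the standard library is needed), and ignores the regression on blending time. The sample mean and standard deviation in the example call are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def chi2_quantile_wh(prob, dof):
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(prob)
    a = 2.0 / (9.0 * dof)
    return dof * (1.0 - a + z * sqrt(a)) ** 3

def tolerance_factor(n, coverage=0.99, confidence=0.95):
    """Approximate two-sided normal tolerance factor k (Howe's method)."""
    dof = n - 1
    z = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    chi2_low = chi2_quantile_wh(1.0 - confidence, dof)  # lower-tail quantile
    return z * sqrt(dof * (1.0 + 1.0 / n) / chi2_low)

def tolerance_interval(sample_mean, sample_sd, n, coverage=0.99, confidence=0.95):
    """Interval mean +/- k * sd for the requested coverage and confidence."""
    k = tolerance_factor(n, coverage, confidence)
    return sample_mean - k * sample_sd, sample_mean + k * sample_sd

k10 = tolerance_factor(10)
# Hypothetical sample of 10 tablets: mean 501.0 mg, standard deviation 0.8 mg.
low, high = tolerance_interval(501.0, 0.8, n=10)
print(f"k for n=10: {k10:.2f}")
print(f"95%/99% tolerance interval: {low:.2f} to {high:.2f} mg")
```

The factor k is much larger than the multiplier used for a CI of the mean, because the TI must cover individual values rather than just the average. It shrinks as the sample size grows.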
The above discussion can be easily understood through the following analogy.
You have to reach the office before 9:30 AM. Now tell me how confident you are about reaching the office exactly between:
9:10 to 9:15 (hmm…, such a narrow range, I am ~90% confident)
9:05 to 9:20 (a-haa.., now I am 95% confident)
9:00 to 9:25 (this is very easy, I am almost 100% confident)
The point to be noted here is that as the width of the time interval increases, your confidence also increases.
It is difficult to estimate the exact arrival time, but we can be reasonably confident that the mean arrival time at the office would be between
Average arrival time over (say) 5 days ± margin of error
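As a worked instance of "average ± margin of error", the sketch below uses five made-up arrival times (minutes after 9:00 AM). The data are hypothetical, and the margin of error is taken as the Student-t critical value times the standard error of the mean, with the two-sided 95% critical value for 4 degrees of freedom (2.776) hard-coded.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% Student-t critical value for n - 1 = 4 degrees of freedom.
T_CRIT_95_DF4 = 2.776

# Hypothetical arrival times over 5 days, in minutes after 9:00 AM.
arrivals = [12, 8, 15, 10, 14]

avg = mean(arrivals)
margin = T_CRIT_95_DF4 * stdev(arrivals) / sqrt(len(arrivals))

print(f"Mean arrival: 9:00 + {avg:.1f} min")
print(f"95% CI: {avg - margin:.1f} to {avg + margin:.1f} min after 9:00")
```

Swapping in a larger critical value (for a higher confidence level) widens the interval, which mirrors the point of the analogy: more confidence costs a wider range.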