The development of a pharmaceutical product is a resource-intensive endeavor that includes pre-formulation studies, stability studies, formulation development and process development studies. In addition to the formulation scientist, there are team members from regulatory affairs, project management, analytical chemistry, quality control and many other groups, depending on the philosophy of the company. The development of a pharmaceutical product, whether generic or new chemical entity (NCE), can take anywhere from six months to a few years. The end result in most cases is a pharmaceutical dosage form that is robust, uniform, stable and can be manufactured reproducibly. But what resources do technical services scientists have at their disposal after the launch of the product? This article details various statistical techniques that may be employed during troubleshooting or as part of a process improvement activity. Also presented is a case study of process optimization that uses the statistical techniques discussed.

Statistics is a branch of mathematics that provides many tools for studying formulations and processes, and that enables us to optimize them while keeping experimentation to a minimum.

It was once said that there are three kinds of lies: lies, damned lies and statistics. This quote is often attributed to Mark Twain. In fact, Twain himself credited it to British Prime Minister Benjamin Disraeli, although the phrase has not been found in Disraeli's own writings. Whoever uttered these now famous (or infamous) words, it is important to keep in mind that as scientists, we bear the responsibility to ensure that our analysis of data remains true and accurate. Another quote that sends a similar, albeit more positive, message is: "It is easy to lie with statistics, but it is easier to lie without them." Having said that, the rest of this article focuses on the useful and productive ways we use statistical tools.

To build a solid foundation for what is discussed later in this article, a few definitions must be spelled out. The raw material of statistics is numbers or, more specifically in this instance, variables. Variables provide points of control for our formulas and processes as well as measurement tools to evaluate that control. There are two main types of variables that we will consider. First are independent variables. These are the tools for our control of formulas and processes, and they are manipulated by the scientist. In our world of solid dosage forms, these variables include blend times, target tablet weights and formulation ingredients, as well as many others. Second are dependent variables. These are the numbers that allow us to measure the control of our system. In our world, examples of these include compaction pressure, blend uniformity and outlet temperature.

So as we set out on our journey as scientists looking for answers, our ultimate goal is to find relationships between independent and dependent variables. This is the goal not only of the technical services scientist and the formulation development scientist, but of any scientist in any field. Albert Einstein spent a significant portion of his life looking for the ultimate relationship between independent and dependent variables in what he called the Unified Field Theory, or what the layperson might call the Theory of Everything. Unfortunately, he never reached his goal, but the search continues.

Magnitude and Reliability

Finding the relationship between independent and dependent variables has two very important components. The first component is called magnitude. It is not sufficient for us to determine that there is a relationship. It is just as important and practical to study the magnitude of that relationship. So we ask ourselves: how large is the effect of the independent variable on our dependent variable? As we develop and optimize formulations and processes, we need to maintain a practical perspective, as the prices of pharmaceutical products are already high enough.

The second component is called reliability. Simply put, this is a measure of the reproducibility of the relationship, or the answer to the question: "How probable is it that a similar relation would be found if the experiment were replicated with other samples drawn from the same population?"

One more definition is critical before we can proceed with our statistical analyses: statistical significance. This term is often expressed as the p-value or (prob>F). The p-value is the probability of observing a relationship (e.g., between variables) or a difference (e.g., between means) at least as large as the one seen in the sample if, in the population from which the sample was drawn, no such relationship or difference exists. In other words, it measures how easily chance alone could have produced the result. The p-value is a decreasing index of the reliability of a result: the smaller the value, the more reliable the observed relationship. By convention, values less than or equal to 0.05 are treated as statistically significant. P-values greater than 0.1 indicate that the data are not sufficient to determine significance.
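To make the definition concrete, here is a minimal sketch of an exact permutation test in Python. It is only one of several ways to obtain a p-value and is not the method used later in this article. The function name `permutation_p_value` and the hardness values are hypothetical, invented for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means.

    Answers: if the group labels meant nothing, how often would a
    relabeling give a mean difference at least as large as the observed one?
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    pooled_sum = sum(pooled)

    at_least_as_extreme = 0
    n_splits = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (pooled_sum - sum_a) / n_b)
        if diff >= observed - 1e-9:  # small tolerance for float rounding
            at_least_as_extreme += 1
        n_splits += 1
    return at_least_as_extreme / n_splits


# Hypothetical tablet hardness values (kP) for two process conditions.
current = [10.1, 10.4, 9.9, 10.6]
modified = [12.0, 12.3, 11.8, 12.5]

p = permutation_p_value(current, modified)
print(f"p = {p:.4f}")  # prints p = 0.0286
```

With four values per group there are 70 possible relabelings, and only two of them separate the groups as completely as the observed data, so the p-value is 2/70, which is below the conventional 0.05 threshold.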

The Tools Available

So we are now ready to look at the tools we have available to us as technical services scientists. One of the most useful tools is Analysis of Variance (ANOVA). There are two main results of using ANOVA for our analysis. First, it allows us to test for significant differences between means. Secondly, it uncovers the main and interaction effects of independent variables (called "factors") on a dependent variable. This second result will play a crucial role when we discuss the next tool that we have at our disposal, which we will cover shortly. When we look at our ANOVA results we need to keep in mind that the significance of a difference between means depends on three things: the size of the difference between group means; the sample size in each group (larger samples give more reliable information, and even small differences in means may be significant if the samples are large enough); and the variance of the dependent variable in each group.
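The three drivers just listed can be seen directly in the F-ratio that one-way ANOVA computes. The sketch below is a bare-bones illustration in Python, not a replacement for a statistics package: the function name `one_way_f` and the batch data are hypothetical, and converting F to a p-value (which requires the F distribution) is left out.

```python
from statistics import mean


def one_way_f(groups):
    """Return the one-way ANOVA F-ratio for a list of groups of observations."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations

    ss_between = 0.0  # driven by the differences between group means
    ss_within = 0.0   # driven by the variance inside each group
    for group in groups:
        group_mean = mean(group)
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in group)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)  # sample size enters here
    return ms_between / ms_within


# Hypothetical hardness values (kP) for three batches.
batches = [[10.0, 11.0, 12.0], [11.0, 12.0, 13.0], [12.0, 13.0, 14.0]]
print(one_way_f(batches))

# Same means and spread, but twice as many tablets per batch.
doubled = [batch * 2 for batch in batches]
print(one_way_f(doubled))
```

The second call returns a larger F-ratio even though the group means and the spread within each group are unchanged, which illustrates why larger samples can make a small difference in means significant.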

Another tool available to us is Design of Experiments (DOE). DOE is a tool that allows us to evaluate a multitude of factors concurrently while minimizing experimentation. This technique is a giant step ahead of traditional experimentation, or what we now call one factor at a time (OFAT). It is based on the ability of ANOVA to detect and evaluate main and interaction effects of independent variables on the dependent variable, also known as the output. It is important that DOE is not confused with trial and error. Trial and error involves the evaluation of one variable at a time. The advantage of experimenting with multiple variables concurrently is that we learn about interaction effects that would be hidden if we only observed one variable at a time. In general, a well-designed DOE will give us very valuable information and can result in the identification of cause-and-effect relationships between variables. It can determine the optimum condition within a system by choosing the ideal parameter setting within a chosen range, and can therefore help justify process and formulation conditions. It can also evaluate the independent variables at low and high levels simultaneously.
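A small numerical sketch shows how an interaction can stay hidden under OFAT. The four response values below are invented for illustration and do not come from the case study; the factor names A and B are placeholders.

```python
# Hypothetical tablet hardness responses (kP) for a two-factor experiment.
# Keys are (level of factor A, level of factor B).
response = {
    ("low", "low"): 10.0,
    ("high", "low"): 12.0,
    ("low", "high"): 11.0,
    ("high", "high"): 17.0,
}

# OFAT: start at (low, low) and change one factor at a time.
baseline = response[("low", "low")]
ofat_effect_a = response[("high", "low")] - baseline
ofat_effect_b = response[("low", "high")] - baseline
# With no (high, high) run, OFAT can only assume the effects add up.
ofat_prediction = baseline + ofat_effect_a + ofat_effect_b

# Factorial: the fourth run shows that the effect of A depends on B.
effect_a_at_low_b = response[("high", "low")] - response[("low", "low")]
effect_a_at_high_b = response[("high", "high")] - response[("low", "high")]
interaction_ab = (effect_a_at_high_b - effect_a_at_low_b) / 2

print("OFAT prediction at (high, high):", ofat_prediction)
print("Measured response at (high, high):", response[("high", "high")])
print("A x B interaction effect:", interaction_ab)
```

In this made-up example the OFAT runs predict 13.0 kP at the high/high condition, while the factorial run measures 17.0 kP. The gap is the interaction that OFAT never sees.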


Fig. 1: Mean Tablet Hardness (by batch)


Evaluating Factors

In the first example, we have a low-dose drug that is manufactured using a straightforward dry blending technique. This process includes sizing, blending, compaction and coating. The tableting specifications include a hardness range of 9.0-14.0 kP and a dissolution specification of not less than 85% released in 45 minutes. A review of the weight uniformity indicates a relative standard deviation (RSD) of less than 1.0%, indicating very good weight control. However, a historical review of the hardness data shows results that are within specification but close to the lower limit (see Figure 1).

The technical services scientist was brought in to evaluate and optimize the process. The goals were three-fold: first, to improve the tablet hardness; second, to minimize any negative impact on weight uniformity; and third, to minimize any effect on the dissolution.

As already noted, the first step is the identification of the variables involved. In this instance the dependent variables were identified as tablet hardness, weight uniformity and dissolution. Next is the identification of the independent variables. These variables were separated into two types, formulation and process. The formulation variables included the lubricant level and the dry binder level. The process variable identified was the blend time in the lubrication step. It is important to note that these were just a few of the variables identified as playing a critical and influential role. It is important to find a balance between good science and practicality. Also critical is the understanding that DOE does not eliminate the need for expertise in formulation. The formulation scientist is the person responsible for ensuring that previously learned knowledge and experience are applied.

Once the three independent factors have been identified, the levels at which the experimentation is to occur are determined. Factor 1, the lubricant (magnesium stearate) level, will be evaluated at two levels: low (0.5%) and high (0.7%). Factor 2, the lubrication blend time, will be evaluated at two levels: low (2 minutes) and high (3 minutes). Factor 3, the dry binder (CMCC) level, will be evaluated at two levels as well: low (10.5%) and high (11.0%). It should be noted that the levels were selected after consultation with the regulatory affairs group, as it was management's goal to optimize the formula within SUPAC guidelines.

Having identified three factors as the critical study factors, the scientist can now design the study protocol using any one of the many commercially available statistics programs. As can be observed from Figure 2, when three factors are being evaluated, only a full factorial study will generate statistically significant data. This full factorial design generates eight experiments.
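The eight experiments are simply every combination of the two levels of each of the three factors (2 x 2 x 2 = 8). As a minimal sketch, and not the output of any particular statistics package, the design can be enumerated in Python as follows. The dictionary keys and the function name `full_factorial` are hypothetical; the levels are those given in the text.

```python
from itertools import product

# Factor levels (low, high) as stated in the text.
factors = {
    "mag_stearate_pct": (0.5, 0.7),
    "blend_time_min": (2, 3),
    "cmcc_pct": (10.5, 11.0),
}


def full_factorial(factors):
    """Return every level combination in standard order.

    Standard order means the first factor changes fastest and the
    last factor changes slowest.
    """
    names = list(factors)
    # product() varies its last argument fastest, so pass the factors
    # in reverse and flip each combination back afterwards.
    reversed_levels = [factors[name] for name in reversed(names)]
    runs = []
    for combo in product(*reversed_levels):
        runs.append(dict(zip(names, reversed(combo))))
    return runs


design = full_factorial(factors)
for std, run in enumerate(design, start=1):
    print(std, run)
```

Each printed row corresponds to one standard-order ("Std") experiment; the order in which the experiments are actually executed is decided separately.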

Once a model has been chosen, the scientist needs to ensure that all steps are taken to minimize bias. This involves randomization: the experiments need to be executed in a random order. Figure 3 is a depiction of how the experimental matrix already described was randomized. Note that there is no specific order in the way any of the three factors are varied. Also important to notice here is that evaluating three factors generated only eight experiments, whereas if one factor at a time were evaluated, 24 experiments would have to be carried out.

Fig. 3: Conducting Experiments in Random Order
(3 Factor Full Factorial. Randomization is Critical!)

Std  Run  Block    Factor 1       Factor 2    Factor 3  Response 1       Response 2      Response 3
                   Mag. Stearate  Blend Time  CMCC      Tablet Hardness  Wt. Uniformity  Dissolution
                   Level (%)      (minutes)   (%)       (kP)             (% RSD)         (% L.C.)
1    1    Block 1  0.50           2.00        10.50
4    2    Block 1  0.70           3.00        10.50
8    3    Block 1  0.70           3.00        11.00
6    4    Block 1  0.70           2.00        11.00
7    5    Block 1  0.50           3.00        11.00
3    6    Block 1  0.50           3.00        10.50
5    7    Block 1  0.50           2.00        11.00
2    8    Block 1  0.70           2.00        10.50
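Commercial DOE software normally produces the randomized run order automatically. As a rough sketch of the idea only, the standard-order experiments can be shuffled in Python as shown below. The function name `randomize_run_order` and the seed value are hypothetical, and the order produced will not match the one in Figure 3.

```python
import random


def randomize_run_order(n_runs, seed=None):
    """Return the standard-order numbers (1-based) in a random execution order."""
    rng = random.Random(seed)  # a fixed seed makes the order reproducible
    std_numbers = list(range(1, n_runs + 1))
    rng.shuffle(std_numbers)
    return std_numbers


# Eight runs for the three-factor, two-level full factorial design.
run_order = randomize_run_order(8, seed=2024)
for run, std in enumerate(run_order, start=1):
    print(f"Run {run}: execute standard-order experiment {std}")
```

Recording the seed alongside the protocol is one way to document how the order was generated while still keeping the choice out of the experimenter's hands.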

The next step in the process is the actual experimentation. All eight experiments are carried out and samples tested. The data is presented in Figure 4.

Fig. 4: Conducting Experiments in Random Order

Std  Run  Block    Factor 1       Factor 2    Factor 3  Response 1       Response 2      Response 3
                   Mag. Stearate  Blend Time  CMCC      Tablet Hardness  Wt. Uniformity  Dissolution
                   Level (%)      (minutes)   (%)       (kP)             (% RSD)         (% L.C.)
1    1    Block 1  0.50           2.00        10.50     13.1             0.81            99
4    2    Block 1  0.70           3.00        10.50     9.80             0.82            95
8    3    Block 1  0.70           3.00        11.00     10.2             0.81            95
6    4    Block 1  0.70           2.00        11.00     10.8             0.89            95
7    5    Block 1  0.50           3.00        11.00     14.2             0.90            98
3    6    Block 1  0.50           3.00        10.50     >12.7            0.79            99
5    7    Block 1  0.50           2.00        11.00     14.8             0.94            99
2    8    Block 1  0.70           2.00        10.50     11.8             0.84            99
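Before turning to the formal analysis, the size of each effect on tablet hardness can be estimated directly from the Figure 4 data as the mean response at the high level minus the mean response at the low level. The sketch below is a descriptive calculation only; it does not reproduce the software's ANOVA output and gives no p-values. The names `runs`, `FACTOR_INDEX` and `effect` are hypothetical, and the ">12.7" entry for Std 3 is assumed to be 12.7.

```python
from math import prod

# Figure 4 data in run order:
# (Mg stearate %, blend time min, CMCC %, tablet hardness kP).
# Assumption: the ">12.7" entry (Std 3, Run 6) is treated as 12.7.
runs = [
    (0.50, 2.0, 10.5, 13.1),
    (0.70, 3.0, 10.5, 9.8),
    (0.70, 3.0, 11.0, 10.2),
    (0.70, 2.0, 11.0, 10.8),
    (0.50, 3.0, 11.0, 14.2),
    (0.50, 3.0, 10.5, 12.7),
    (0.50, 2.0, 11.0, 14.8),
    (0.70, 2.0, 10.5, 11.8),
]
LOW_LEVELS = (0.50, 2.0, 10.5)
FACTOR_INDEX = {"stearate": 0, "blend_time": 1, "cmcc": 2}


def effect(*factor_names):
    """Estimate a main effect (one name) or an interaction effect (several).

    Each factor is coded -1 at its low level and +1 at its high level.
    The effect is the mean hardness of the runs whose coded product is +1
    minus the mean hardness of the runs whose coded product is -1.
    """
    plus, minus = [], []
    for run in runs:
        signs = []
        for name in factor_names:
            i = FACTOR_INDEX[name]
            signs.append(-1 if run[i] == LOW_LEVELS[i] else 1)
        (plus if prod(signs) > 0 else minus).append(run[3])
    return sum(plus) / len(plus) - sum(minus) / len(minus)


for name in FACTOR_INDEX:
    print(f"Main effect of {name}: {effect(name):+.2f} kP")
print(f"stearate x cmcc interaction: {effect('stearate', 'cmcc'):+.2f} kP")
```

Under the stated assumption, raising magnesium stearate from 0.5% to 0.7% lowers mean hardness by about 3.05 kP, while the estimated main effects of blend time (about -0.90 kP) and CMCC (about +0.65 kP) are much smaller. Whether any of these effects is statistically significant is a question for the ANOVA.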

Here is the point where we return to our earlier discussion of statistical tools for data analysis. At this point we have generated data that can be evaluated using ANOVA. This analysis allows us to look for single-factor effects as well as multifactor interactions. Most statistical analysis software packages include ANOVA as a function. Presented below is a typical result of an ANOVA test. In our example, we used Design-Expert version 6 (www.statease.com). The test is conducted for each of the output variables, and each of these analyses includes a p-value, or (prob>F). Based on this value, we can then determine whether our results indicate a significant effect or are just a product of random variability. In our case, the (prob>F) values are:

The conclusions that we can draw from our analysis are as follows. Weight uniformity was not affected by any of our variables. Tablet hardness was affected by the level of magnesium stearate. The effect on dissolution cannot be evaluated from the limited number of experiments conducted. And lastly, tablet hardness was not affected by the level of microcrystalline cellulose.

In general, DOE is a very powerful tool for generating data as well as for evaluating it. The statistical methods for experimental work discussed in this article follow a five-step scheme. First, the independent variables need to be selected. This decision should not be purely scientific; it should involve a business group that evaluates and sets the level of commitment the corporation is making to solve the specific problem, including a specific allocation of financial and workforce resources. Secondly, experimentation must be carried out. It is very important that the experiments are run in random order, because an experimenter will often unwittingly bias the data generated from an experimental protocol. Thirdly, output variables must be measured. Depending on the variables themselves, this can happen during experimentation (for example, when the output is outlet temperature in a coating pan) or after the experiments are carried out (for example, when measuring tablet weight). The fourth step in this process is the data analysis. This is when the experimenter uses ANOVA or other similar statistical tools to analyze the data. Lastly, once all the data have been analyzed and there is an understanding of the process under study, the optimum condition can be identified. It is important to keep in mind that very often, if not always, the first round of experimentation only gets us closer to the solution, and we have to perform more experimentation to get even closer to the optimal condition.

About the Author

Adolfo L. Gomez is currently vice president of scientific affairs at Emerson Resources, Norristown, PA. Emerson Resources specializes in product development and clinical trial manufacturing for the pharmaceutical industry. With over 15 years of experience in the pharmaceutical industry, Mr. Gomez leads a team of scientists that focus on finding the right blend of science and pragmatism in their daily endeavors. Adolfo.gomez@emersonresources.com