Optimizing similarity factor of in vitro drug release profile for development of early stage formulation of drug using linear regression model

The objective of this article is to optimize the similarity factor for immediate release (IR) and modified release (MR) in vitro drug release profiles. The least square method is used to minimize the difference between the empirical data and the fitted regression curve of in vitro IR/MR drug release profiles. The percentage drug release at intermediate timepoints is estimated with a linear curve fit in order to improve the similarity factor f2. In this study a linear regression model is used to analyze the similarity factor f2 for Nitrofurantoin MR Capsules, Venlafaxine HCl MR Tablets and Lurasidone IR Tablets, in order to show the significance of, and the similarity gained by, considering additional intermediate timepoints. This linear regression model may help the pharmaceutical industry to compare IR/MR in vitro drug release profiles, with a few modifications in timepoint selection, to improve the similarity factor f2.


Introduction
The pharmaceutical industry spends large amounts of capital gathering information from both preclinical (animal) studies and clinical (human) trials. Despite this investment, fewer than 14% of the ventures that initiate clinical trials end up being registered as medical products [32]. The high failure rate of drug development partly reflects the complexity and variability of human physiology, which raises the cost of effective therapies. To reduce costs in industry, it is necessary to carry out the correct experiment and to evaluate it with proper quantitative tools [3]; moreover, in the development of a new solid dosage form (tablets, capsules or powder), drug release is required to occur in an effective manner [6]. In vitro dissolution results are used in dosage form development as a guide to optimize the formulation of a new drug product and, where appropriate, to compare the effect of different formulations on the resulting rates of drug release [22,30].
Mathematical modeling is an effective bridge connecting mathematics with many disciplines such as physics, biology, computer science, engineering and pharmacy [31]. Mathematical models are useful tools for the design of pharmaceutical formulations, for the evaluation of drug release processes and, in general, for the production of optimal designs for new systems [13,27]. These models assist scientists not only in observing the dynamics of drug release, but also in saving money and time by supporting the design of more successful experiments [8]. Many approaches are available to calculate drug release similarity, such as model dependent, model independent and statistical methods [4], but it is principally the model independent approach, proposed by Moore and Flanner in 1996, that is adopted by pharmaceutical industries [28]. The model independent approach uses two factors, the difference factor f1 and the similarity factor f2; the values of these two factors are sensitive to the selection of dissolution timepoints [7]. Also, the criterion for deciding the similarity between dissolution profiles is fuzzy [29].
As per the guidelines of the Center for Drug Evaluation and Research (CDER) [9][10][11] and the Human Medicines Evaluation Unit of the European Agency for the Evaluation of Medicinal Products (EMEA) [18], it is necessary to perform an in vitro profile comparison whenever a drug product is formulated. Also, in order to achieve bioequivalence and compliance with pharmacopoeia requirements, continuous monitoring is needed to confirm the safety, consistency, and efficacy of marketed generic drugs [24]. Similarity measurement is a requirement for many pharmaceutical industries in developing new drug applications [15]. Various mathematical methods are available to calculate intermediate values, for example, interpolation for equidistant and non-equidistant points, curve fitting and the least square method in regression analysis [14]. Interpolation methods are mainly used to fit the data exactly with an interpolating polynomial, whereas the least square method is widely used to fit a function to a set of data and to predict numerical values of the variables. The least square method is chosen here over other methods since it minimizes the vertical distances from the data points to the regression line, giving the smallest sum of squared errors [1].
In this paper, a linear regression model is obtained using the least square method. Based on this model, the percentage drug release is evaluated at additional intermediate timepoints for empirical data acquired from reliable sources, in order to study the behavior of the similarity factor f2. Unlike other studies, this research adds more intermediate timepoints to investigate the improvement in the similarity factor f2 for Nitrofurantoin MR Capsules, Venlafaxine HCl MR Tablets and Lurasidone IR Tablets. Regression analysis demonstrates the effect of timepoints on drug release similarity; the analysis is carried out on various in vitro drug release profiles of immediate release and modified release products. In the pharmaceutical industry, mathematical methods require both financial and practical investment; this study therefore suggests that the linear regression model may be preferable for industry to compare different drug release profiles, with a few modifications in timepoint selection, to establish similarity between different data sets.

Preliminaries
Regression Analysis [14]: Regression analysis estimates the values of the dependent variable (y) from the values of the independent variable (x). The regression line is used to perform this estimation:
$$y = a + bx$$
where the y-intercept and the slope of the regression line are represented by a and b respectively.
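As a minimal sketch of how the intercept a and slope b of the regression line can be obtained by least squares, the closed-form estimates for a single independent variable are written out below. The function name `fit_line` and the sample data are illustrative assumptions and are not taken from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx              # slope
    a = y_mean - b * x_mean    # intercept
    return a, b


# Illustrative data only (hypothetical, not from the study)
time_h = [1, 2, 3, 4]
release_pct = [10.0, 20.0, 30.0, 40.0]
a, b = fit_line(time_h, release_pct)
```

Because the illustrative points lie exactly on a line through the origin, the fit returns an intercept of zero and a slope of ten, up to floating-point error.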
Standard deviation [1]: The standard deviation of a set of values is a measure of the amount of variation in the set. A low standard deviation means that the values lie close to the mean.
$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n}}$$
where $\sigma$ is the standard deviation; $y_i$ is the i-th observation of y; $\bar{y}$ is the mean of the data set; n is the number of observations in the data set.

Coefficient of variation (CV) [1]: The coefficient of variation is the ratio of the standard deviation to the mean. It is a useful metric for comparing the degree of variation from one data series to another, even when their means differ substantially. The lower the value of the estimated coefficient of variation, the more precise the estimate.
$$CV = \frac{\sigma}{\mu}$$
where $\sigma$ is the standard deviation and $\mu$ is the mean.
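The two definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the parameter `ddof` is introduced here only to make explicit the choice between dividing by n (`ddof=0`) and by n - 1 (`ddof=1`), since the text does not say which the study uses, and the sample values are hypothetical.

```python
import math


def std_dev(values, ddof=0):
    """Standard deviation; ddof=0 divides by n, ddof=1 by n - 1 (assumed option)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - ddof))


def coefficient_of_variation(values, ddof=0):
    """Ratio of the standard deviation to the mean."""
    mean = sum(values) / len(values)
    return std_dev(values, ddof) / mean


# Hypothetical sample with mean 5
sample = [2, 4, 4, 4, 5, 5, 7, 9]
sd = std_dev(sample)
cv = coefficient_of_variation(sample)
```

The coefficient of variation is returned as a ratio; multiplying by 100 expresses it as a percentage.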

Root mean squared error (RMSE) [1]: The root mean squared error is a measure of how widely the residuals are spread around the regression line. RMSE is always non-negative. In general, a lower RMSE is better, since it indicates smaller deviations of the data from the fitted line.

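Sum of Squared Residuals (SSR) [2]: The sum of the squared differences between the predicted values and the mean of the dependent variable is denoted SSR; this quantity is also commonly known as the regression (explained) sum of squares.
$$SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$
where $\hat{y}_i$ is the predicted value at the i-th observation and $\bar{y}$ is the mean of the dependent variable. The closer SSR is to SST, the better the model fits the data; if SSR is equal to SST, the regression model captures all of the observed variability and the predicted data are accurate.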

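Sum of Squares Error (SSE) [2]: The sum of squares error is the sum of the squared differences between the observed value and the predicted value at each observation. The smaller the error, the greater the predictive power of the regression.
$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
where $y_i$ is the observed value and $\hat{y}_i$ is the predicted value at the i-th observation.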

Sum of Squares Total (SST) [1]: The sum of squares total (SST) measures the squared differences between the data points and their mean value.
$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
where $y_i$ is the i-th observation and $\bar{y}$ is the mean of the data set.
Coefficient of Determination (R square) [14]: The coefficient of determination is used when models are to be compared. The higher the value of R square, the closer the data points lie to the regression line when both are plotted.
$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$
R square lies between 0 and 1, where a higher R square indicates a better model fit.
Adjusted R square [1]: Adjusted R square is better suited to comparing models that have a different number of variables. Unlike R square, which never decreases when a variable is added, its value increases only if an added independent variable explains a substantial amount of variation.
$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$
where the number of observations and the number of independent variables are denoted by n and k respectively.
t-statistic [1]: The t-statistic, also called the test statistic (t), is defined by
$$t = \frac{b}{S.E.}$$
where b is the slope of the regression line and S.E. is the standard error of the slope.
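The fit statistics defined in this section can be computed together for a simple linear regression, as in the sketch below. It uses the standard textbook forms; two choices are assumptions that the text does not confirm: RMSE is taken as the square root of SSE divided by n, and the standard error of the slope uses n - 2 degrees of freedom. The function name `fit_statistics` and the sample data are hypothetical.

```python
import math


def fit_statistics(x, y, k=1):
    """Fit y = a + b*x by least squares and return common fit statistics."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    y_hat = [a + b * xi for xi in x]

    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # observed vs predicted
    ssr = sum((fi - y_mean) ** 2 for fi in y_hat)          # predicted vs mean
    sst = sum((yi - y_mean) ** 2 for yi in y)              # observed vs mean
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    rmse = math.sqrt(sse / n)                              # assumption: divide by n
    se_slope = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)   # assumption: n - 2 d.o.f.
    t_stat = b / se_slope                                  # undefined for a perfect fit
    return {"a": a, "b": b, "SSE": sse, "SSR": ssr, "SST": sst,
            "R2": r2, "adj_R2": adj_r2, "RMSE": rmse, "t": t_stat}


# Hypothetical data, chosen so that the fit is not perfect
stats = fit_statistics([1, 2, 3, 4], [1.0, 3.0, 2.0, 4.0])
```

For a least-squares line with an intercept, SSR and SSE add up to SST, which gives a quick consistency check on the computed values.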

Methodology
The different methods for the assessment of dissolution rate profiles are [19]:

1. Model Dependent Methods: The various models are presented in Table 1.

Table 1 Model dependent methods
Model | Equation
Zero order [5] | $Q_t = Q_0 + K_0 t$
First order [5] | $\ln Q_t = \ln Q_0 + K_1 t$
Higuchi [16] | $Q_t = K_H t^{1/2}$
Hixson-Crowell [17] | $Q_0^{1/3} - Q_t^{1/3} = K_s t$
Korsmeyer-Peppas [21] | $Q_t / Q_\infty = K_k t^n$

2. Model Independent Methods: The model independent measures used to compare drug release profiles are the difference factor (f1), the similarity factor (f2) and the dissolution efficiency (D.E.) [12].

i. Difference factor (f1) [10]: The difference factor is a measurement of the relative error between the two curves.
$$f_1 = \frac{\sum_{j=1}^{n} |R_j - T_j|}{\sum_{j=1}^{n} R_j} \times 100$$

ii. Similarity factor (f2) [10]: The similarity factor is a logarithmic reciprocal square root transformation of the sum of squared errors; it measures the similarity in percentage dissolution between two drug release profiles.
$$f_2 = 50 \log_{10} \left\{ \left[ 1 + \frac{1}{n} \sum_{j=1}^{n} (R_j - T_j)^2 \right]^{-0.5} \times 100 \right\}$$
where n is the number of sample points, $R_j$ is the dissolution value of the reference at the j-th timepoint and $T_j$ is the dissolution value of the test at the j-th timepoint.

iii. Dissolution Efficiency [20]: The dissolution efficiency of a drug dosage form is defined as the area under the dissolution curve up to a certain time t, expressed as a percentage of the area of the rectangle described by 100% dissolution in the same time.
$$D.E. = \frac{\int_0^t y \, dt}{y_{100} \, t} \times 100$$
where y is the percentage of drug dissolved at time t and $y_{100}$ is the maximum percentage of drug release.

Moore and Flanner proposed the two indices f1 and f2 to compare dissolution profiles pairwise [25]. Dissolution profiles are characterized and compared by the dissolution efficiency and the fit factors (f1 and f2) [26]. The factors f1 and f2 offer a simple calculation to measure the similarity between pairs of dissolution profiles [23]. However, the f1 and f2 equations do not take into account the variability or the correlation structure of the data, and the values of f1 and f2 are sensitive to the number of dissolution timepoints [28]. These drawbacks indicate that further research may be needed to assess the effect of data variability. According to the guidelines of the Food and Drug Administration (FDA), f1 should lie in the range 0 to 15 and f2 in the range 50 to 100 [11].
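The three model independent measures can be sketched in Python as follows, using the standard forms of f1, f2 and dissolution efficiency. The function names and the two profiles are hypothetical, and the area under the curve is approximated with the trapezoidal rule, which is an assumption made here rather than a method stated in the text.

```python
import math


def difference_factor(ref, test):
    """f1: sum of absolute differences relative to the reference, in percent."""
    return 100.0 * sum(abs(r - t) for r, t in zip(ref, test)) / sum(ref)


def similarity_factor(ref, test):
    """f2: 50 * log10(100 / sqrt(1 + mean squared difference))."""
    n = len(ref)
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(ref, test)) / n
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq_diff))


def dissolution_efficiency(times, released, y100=100.0):
    """D.E.: trapezoidal area under the curve as a percentage of the rectangle."""
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (released[i] + released[i - 1]) * (times[i] - times[i - 1])
    return 100.0 * area / (y100 * (times[-1] - times[0]))


# Hypothetical profiles that differ by 10 percentage points at every timepoint
reference = [20.0, 40.0, 60.0, 80.0]
test_profile = [30.0, 50.0, 70.0, 90.0]
f1 = difference_factor(reference, test_profile)
f2 = similarity_factor(reference, test_profile)
de = dissolution_efficiency([0, 1, 2], [0.0, 50.0, 100.0])
```

A constant difference of 10 percentage points puts f2 just below 50, which is why 50 is the usual lower limit for similarity; identical profiles give f2 = 100 and f1 = 0.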

Result
Intermediate values were predicted with the linear regression model, and the similarity factor f2 was calculated between two different drug release profiles.

Table 2 compares the two drug release profiles of Nitrofurantoin MR Capsules; the similarity factor between the two formulations improved from 40 to 44.44, so the similarity factor of the predicted data is 11.1% higher than that of the empirical data. Figure 1(a) shows the scatter plot and regression line of Nitrofurantoin MR Capsules for formulation-1 (X vs Y1) and Fig. 1(b) shows the scatter plot and regression line of Nitrofurantoin MR Capsules for formulation-2 (X vs Y2).
[Table 2, partially recovered: rows (X, Y1, Y2) = (1, 18, 35), (2, 31, 49), (3, 43, 60), (4, 52, 70); the remaining rows are not recoverable.]
Table 3 gives the regression statistics of Nitrofurantoin MR Capsules, including the coefficients, that is, the intercept (a) and slope (b) of the regression line y = a + bx. Similar results are observed for the other two IR/MR drug products.

Table 4 compares the two drug release profiles of Venlafaxine HCl MR Tablets; the similarity factor between the two drug products improved from 50.34 to 54.98, an improvement of 9.22% over the empirical data. Figure 2(a) shows the scatter plot and regression line of Venlafaxine HCl MR Tablets for formulation-1 (X vs Y1) and Fig. 2(b) shows the scatter plot and regression line of Venlafaxine HCl MR Tablets for formulation-2 (X vs Y2).

Table 5 compares the two drug release profiles of Lurasidone IR Tablets; the similarity factor f2 between the two drug products improved from 48.87 to 50.65, an improvement of 3.64% over the empirical data. Figure 3(a) shows the scatter plot and regression line of Lurasidone IR Tablets for formulation-1 (X vs Y1) and Fig. 3(b) shows the scatter plot and regression line of Lurasidone IR Tablets for formulation-2 (X vs Y2).
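The procedure described in this section can be sketched as follows: fit a regression line to each formulation, predict the release at intermediate timepoints, and recompute f2 with the added points. The sketch uses only the four complete rows of Table 2 that survive in this text, read as (X, Y1, Y2); that reading, the intermediate timepoints 1.5, 2.5 and 3.5, and all function names are assumptions. The last lines check the reported percentage improvements against the reported f2 values.

```python
import math


def fit_line(x, y):
    """Least-squares intercept a and slope b of y = a + b*x."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx
    return y_mean - b * x_mean, b


def similarity_factor(ref, test):
    """f2 in its standard form."""
    n = len(ref)
    mean_sq_diff = sum((r - t) ** 2 for r, t in zip(ref, test)) / n
    return 50.0 * math.log10(100.0 / math.sqrt(1.0 + mean_sq_diff))


# Surviving rows of Table 2, read as (X, Y1, Y2): an assumption
x = [1, 2, 3, 4]
y1 = [18, 31, 43, 52]
y2 = [35, 49, 60, 70]

a1, b1 = fit_line(x, y1)
a2, b2 = fit_line(x, y2)

# Hypothetical intermediate timepoints; the study's actual choice is not shown
extra_x = [1.5, 2.5, 3.5]
pred1 = [a1 + b1 * t for t in extra_x]
pred2 = [a2 + b2 * t for t in extra_x]

f2_empirical = similarity_factor(y1, y2)
f2_augmented = similarity_factor(y1 + pred1, y2 + pred2)


def percent_improvement(before, after):
    """Relative change of f2 in percent."""
    return 100.0 * (after - before) / before


# f2 values as reported in the text (empirical, predicted)
reported = {
    "Nitrofurantoin MR Capsules": (40.0, 44.44),
    "Venlafaxine HCl MR Tablets": (50.34, 54.98),
    "Lurasidone IR Tablets": (48.87, 50.65),
}
improvements = {name: percent_improvement(*pair) for name, pair in reported.items()}
```

With only four of the rows available, the computed f2 values will not reproduce the figures reported for the full tables; the sketch shows the mechanics of adding predicted points rather than the study's results.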

Conclusion
The similarity factor for immediate and modified release in vitro drug release profiles was optimized in order to examine the significance of timepoints. The least square method was used to minimize the difference between the empirical data and the fitted regression curve of in vitro IR/MR drug release profiles, and a linear relation was established between the data sets. This work estimates the percentage drug release at intermediate timepoints with a linear regression model in order to improve the similarity factor f2, and it was observed that timepoints have a significant effect on the similarity factor in the development of an early stage formulation of a drug product. To analyze this, three drug products were chosen, namely Nitrofurantoin MR Capsules, Venlafaxine HCl MR Tablets and Lurasidone IR Tablets, and improvements of 11.1%, 9.22% and 3.64% respectively were found. The results obtained by numerical simulation indicate that the similarity factor is extremely lenient in concluding similarity between dissolution profiles; this study can therefore be extended using other, more robust methods to achieve more accurate results. In conclusion, the study suggests that industry can use the linear regression model to compare different drug release profiles, with a few modifications in timepoints, to establish similarity between different data sets and improve the similarity factor.