SAS/R code of MSD method for comparing dis­so­lution [Dissolution / BCS / IVIVC]

posted by Shuanghe  – Spain, 2014-02-10 18:05 (3699 d 14:22 ago) – Posting: # 12382
Views: 24,625

Hi all,

This gonna be a really lengthy post with some ready-to-use SAS and R code for comparing dissolution profiles with MSD method based on Tsong et al's article. I had the impression that administrators don't like the lengthy post but at least I didn't find such prohibition from forum policy :-)

Over the years many, including myself, have asked the question in this forum of MSD method for compare dissolution profiles using SAS or other softwares. The most relevant thread I knew is this thread. There are some links for reference articles, SAS macro and DDSolver for Excel (and the mistrust of the last).

There're 3 different approaches used in the SAS macro in Detlew's post but none of them was based on Tsong's article and the macro was not specifically designed for dissolution comparison. (By that I by no means saying Tsong et al's method is the best one, or even the better one. The article is mentioned by many is probably because the authors were all from FDA, so it has a kind of "official" status). While the rest of the calculation is fairly easy, obtaining the 90% CI of Mahalanobis distance by Lagrange multiplier method is pretty mysterious to people like me who lacks deep mathematical understanding and I believe it's the only obstacle for those who asked the "how to" questions.

There is another SAS macro by Connolly et al in the following reference: "SAS IML Code To Calculate an Upper Confidence Limit for Multivariate Statistical Distance"*, which can be accessed via this address. This is a macro specifically written for comparing dissolution profiles. In fact, Tsong's article is the only reference cited at the end (though authors use Newton-Raphson iterations to find the upper confidence bound). The shortcoming is that the macro is lengthy, and while it might be suitable for model-independent MSD approach, it might not easy to modify it for model-dependent approach (comparing Weibull parameters for example).

* If you'd like to play the macro please note that there are some error in the result section where 8 points of data were used. The macro was correct though.

I used to be happy to use the modified version of Connolly et al's code by adding various macro parameters to allow comparing several batches of formulations together which the original macro cannot do. However, I recently come across another article "In vitro dissolution profile comparison"** also by Tsong et al in Enclyclopedia of Biopharmaceutical Statistics where the formulas were described for 90% CI of the Mahalanobis Distance and with the help of those formulas the obstacle of getting 90% CI of Mahalanobis distance disappears. In addition to help to simplify the SAS macro, it even allows a R-novice such as I to write R code to do the job.

So, below are some ready-to-use SAS and R code in case anyone's interested. Data used was taken from Tsong's 1996 article so we can compare results.

** There are some printing error so the formulas listed there were incorrect. you have to delete the n/2 from below the square root.

========== SAS IML ==========
/* firstly read data to create matrix for T & R with each column represent dissolution at time tn. so typically you'll have 12 rows and p columns matrices for T and R. The rest is step-by-step approach which can be very easily converted to a macro when you want. PROC IML need to be running */

/* for those has old version of SAS 9.2 and before you need to define a module for calculation of covariance. SAS 9.3 has internal fuctin COV */

  START COV(m);             
    n = NROW(m);              /* assume no missing values */
    d = m - m[:,];            /* subtract mean to center the data */
    RETURN((d`*d)/(n-1));
  FINISH;


/* nt and nr are number of unit, typically 12, Tsong's article only has 6 unit. p is number of time points, dg is mean % of difference in dissolution between each time points. typically 10. dg=15 was used to compare result with that of the article. it's better to define those number firstly to be easier to modify the code */

nt=6; nr=6; p=8; dg=15;

tbar = (mt[:,])`;             /* mean profile of T */
rbar = (mr[:,])`;             /* mean profile of R */
mdiff = tbar-rbar;            /* mean difference between T&R */
st = Cov(mt);                 /* variance-covariance matrix of T & R */
sr = Cov(mr);
sp = (st+sr)/2;               /* pooled variance-covariance matrix */
spinv = INV(sp);              /* inverse of the pooled var-covar matrix */
dm = SQRT(mdiff`*spinv*mdiff);/* Mahalanobis distance between T and R */
df2 = nt+nr-p-1;              /* degree of freedom for critical F value */
fcrit = FINV(0.9,p,df2);      /* critical F value */
k = nt*nr/(nt+nr)*(nt+nr-p-1)/((nt+nr-2)*p);   /* k factor */
mdg = J(p,1)*dg;              /* create a vector with each element equal to dg */
dm_max = SQRT(mdg`*spinv*mdg);/* Maximum Mahalanobis distance. It's the criteria */
H2 = nt*nr/(nt+nr)*dm**2;     /* Hotelling's T2 */

/* The steps above are pretty easy to follow. The follwoing are taken from the 2nd reference from Enclyclopedia mentioned above.
obtain vector on the boundary of confidence region */

bound1 = mdiff*(1+SQRT(fcrit/(k*mdiff`*spinv*mdiff)));
bound2 = mdiff*(1-SQRT(fcrit/(k*mdiff`*spinv*mdiff)));

/* 90% CI of Mahalanobis distance is obtained by following */
ci1 = SQRT(bound1`*spinv*bound1);
ci2 = SQRT(bound2`*spinv*bound2);

dm_lower = min(ci1,ci2);        /* Lower limit of 90% CI of Mahalanobis Distance */
dm_upper = max(ci1,ci2);        /* Upper limit of 90% CI of Mahalanobis Distance */
IF dm_upper <= dm_max THEN conclusion = 'Yes, Similar';
ELSE conclusion = 'No, Not Similar';


By adding various parameters to convert it in macro and define some global macro variables for titles you can get some pretty final results tables with intermediate results to check with output of DDSolver and the Connolly'macro.

In the last month I had to do profile comparisons of > 100 batches of T and R from different projects and I used DDSolver, Connolly's macro, my macro based on steps above for each comparison. The output are same each time. I stronglly suspect that DDSolver use the above approach so same result doesn't really means anything but Connolly's Newton-Raphson search also give same result everytime. So Detlew, I know you have doubt about DDSolve but it seems it indeed works.

========== R Code ==========
I googled R functions when I needed so hopefully I won't embarrass myself too much by the following naive R code. Since there are so many R gurus here I'm sure someone can make it as a simple function such Dm90CI or sth. like that. :-)


#dissolution data for T & R, format in columns t1 t2 t3 ...
mt <- matrix(
c(19.99, 36.70, 47.77, 55.08, 65.69, 81.37, 92.39, 97.10,
  22.08, 39.29, 49.46, 56.79, 67.22, 82.42, 89.93, 95.62,
  21.93, 38.54, 47.76, 55.14, 65.25, 83.49, 90.19, 95.62,
  22.44, 39.46, 49.72, 58.67, 69.21, 84.93, 94.12, 95.51,
  25.67, 42.35, 52.68, 59.71, 71.51, 86.61, 93.80, 96.70,
  26.37, 41.34, 51.01, 57.75, 69.44, 85.90, 94.45, 98.07),
 nrow=6, ncol=8, byrow=TRUE)

mr <- matrix(
c(42.06, 59.91, 65.58, 71.81, 77.77, 85.67, 93.14, 94.23,
  44.16, 60.18, 67.17, 70.82, 76.11, 83.27, 88.01, 89.59,
  45.63, 55.77, 65.56, 70.50, 76.92, 83.91, 86.83, 90.12,
  48.52, 60.39, 66.51, 73.06, 78.45, 84.99, 88.00, 93.43,
  50.49, 61.82, 69.06, 72.85, 78.99, 86.86, 89.70, 90.79,
  49.77, 62.73, 69.77, 72.88, 80.18, 84.20,   88.88, 90.47),
 nrow=6, ncol=8, byrow=TRUE)


# same as above, firstly define those number together so it's easy to change
nt <- 6
nr <- 6
p  <- 8
dg <- 15

# mean dissolution profile and difference
tbar  <- colMeans(mt)
rbar  <- colMeans(mr)
mdiff <- tbar-rbar

#covariance of T & R and pooled covariance matrix and it's inverse
st <- cov(mt)
sr <- cov(mr)
sp <- (st+sr)/2
spinv <- solve(sp)

# Mahalanobis distance
dm <- sqrt(t(mdiff) %*% spinv %*% mdiff)
# k and fcrit as mentioned in the aerticle
df <- nt + nr -p -1
k <- nr*nr/(nt+nr)*(nt+nr-p-1)/((nt+nr-2)*p)
fcrit <- qf(0.9,p,df)


# create column vector with each element equal to predefined limit dg
mdg <- c(rep(dg, p))
# criteria
dm_max <- sqrt(t(mdg) %*% spinv %*% mdg)

#Hotelling's T^2. I added it just to compare with DDSolver
H2 <- nt*nr/(nt+nr)*dm^2

#vector on the boundary of confidence region
bound1 <- mdiff*(1+sqrt(fcrit/(k*t(mdiff) %*% spinv %*% mdiff)))
bound2 <- mdiff*(1-sqrt(fcrit/(k*t(mdiff) %*% spinv %*% mdiff)))


# 90% CI of Mahalanobis distance
ci1 <- sqrt(t(bound1) %*% spinv %*% bound1)
ci2 <- sqrt(t(bound2) %*% spinv %*% bound2)
dm_lower <- min(ci1, ci2)
dm_upper <- max(ci1, ci2)


if (dm_upper < dm_max) conclusion <- "Similar" else conclusion <- "Not similar"

Again, it should be pretty easy from here to print a final table with result and intermediate result after that.

Happy coding.

All the best,
Shuanghe

Complete thread:

UA Flag
Activity
 Admin contact
22,957 posts in 4,819 threads, 1,636 registered users;
88 visitors (0 registered, 88 guests [including 4 identified bots]).
Forum time: 08:28 CET (Europe/Vienna)

With four parameters I can fit an elephant,
and with five I can make him wiggle his trunk.    John von Neumann

The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz
HTML5