Shuanghe
★★  

Spain,
2014-02-10 18:05
(3699 d 05:02 ago)

Posting: # 12382
Views: 24,624
 

 SAS/R code of MSD method for comparing dis­so­lution [Dissolution / BCS / IVIVC]

Hi all,

This is going to be a really lengthy post with some ready-to-use SAS and R code for comparing dissolution profiles with the MSD method, based on Tsong et al.'s article. I had the impression that the administrators don't like lengthy posts, but at least I couldn't find such a prohibition in the forum policy :-)

Over the years many, including myself, have asked in this forum how to apply the MSD method for comparing dissolution profiles using SAS or other software. The most relevant thread I know is this thread. It contains some links to reference articles, a SAS macro, and DDSolver for Excel (and the mistrust of the last).

There are 3 different approaches used in the SAS macro in Detlew's post, but none of them is based on Tsong's article, and the macro was not specifically designed for dissolution comparison. (By that I am by no means saying that Tsong et al.'s method is the best one, or even a better one. The article is mentioned by many probably because the authors were all from FDA, so it has a kind of "official" status.) While the rest of the calculation is fairly easy, obtaining the 90% CI of the Mahalanobis distance by the Lagrange multiplier method is pretty mysterious to people like me who lack a deep mathematical understanding, and I believe it's the only obstacle for those who asked the "how to" questions.

There is another SAS macro by Connolly et al. in the following reference: "SAS IML Code To Calculate an Upper Confidence Limit for Multivariate Statistical Distance"*, which can be accessed via this address. This is a macro specifically written for comparing dissolution profiles. In fact, Tsong's article is the only reference cited at the end (though the authors use Newton-Raphson iterations to find the upper confidence bound). The shortcoming is that the macro is lengthy, and while it might be suitable for the model-independent MSD approach, it might not be easy to modify for a model-dependent approach (comparing Weibull parameters, for example).

* If you'd like to play with the macro, please note that there are some errors in the results section where 8 points of data were used. The macro itself is correct though.

I used to be happy to use a modified version of Connolly et al.'s code, with various macro parameters added to allow comparing several batches of formulations together, which the original macro cannot do. However, I recently came across another article, "In vitro dissolution profile comparison"**, also by Tsong et al., in the Encyclopedia of Biopharmaceutical Statistics, where formulas for the 90% CI of the Mahalanobis distance are given. With the help of those formulas, the obstacle of getting the 90% CI of the Mahalanobis distance disappears. In addition to simplifying the SAS macro, it even allows an R novice such as myself to write R code to do the job.

So, below is some ready-to-use SAS and R code in case anyone's interested. The data used was taken from Tsong's 1996 article so we can compare results.

** There are some printing errors, so the formulas listed there are incorrect: you have to delete the n/2 from below the square root.

========== SAS IML ==========
/* First read the data to create matrices for T & R with each column representing dissolution at time tn, so typically you'll have 12 rows and p columns for T and R. The rest is a step-by-step approach which can easily be converted to a macro if you want. PROC IML needs to be running */

/* For those with an old version of SAS (9.2 and before) you need to define a module for the calculation of the covariance. SAS 9.3 has the internal function COV */

  START COV(m);             
    n = NROW(m);              /* assume no missing values */
    d = m - m[:,];            /* subtract mean to center the data */
    RETURN((d`*d)/(n-1));
  FINISH;


/* nt and nr are the number of units, typically 12; Tsong's article has only 6 units. p is the number of time points; dg is the mean % difference in dissolution allowed at each time point, typically 10. dg=15 was used to compare results with those of the article. It's better to define those numbers first so the code is easier to modify */

nt=6; nr=6; p=8; dg=15;

tbar = (mt[:,])`;             /* mean profile of T */
rbar = (mr[:,])`;             /* mean profile of R */
mdiff = tbar-rbar;            /* mean difference between T&R */
st = Cov(mt);                 /* variance-covariance matrix of T & R */
sr = Cov(mr);
sp = (st+sr)/2;               /* pooled variance-covariance matrix */
spinv = INV(sp);              /* inverse of the pooled var-covar matrix */
dm = SQRT(mdiff`*spinv*mdiff);/* Mahalanobis distance between T and R */
df2 = nt+nr-p-1;              /* degrees of freedom for critical F value */
fcrit = FINV(0.9,p,df2);      /* critical F value */
k = nt*nr/(nt+nr)*(nt+nr-p-1)/((nt+nr-2)*p);   /* k factor */
mdg = J(p,1)*dg;              /* create a vector with each element equal to dg */
dm_max = SQRT(mdg`*spinv*mdg);/* maximum Mahalanobis distance; this is the criterion */
H2 = nt*nr/(nt+nr)*dm**2;     /* Hotelling's T2 */

/* The steps above are pretty easy to follow. The following is taken from the 2nd reference (the Encyclopedia article) mentioned above:
obtain the vectors on the boundary of the confidence region */

bound1 = mdiff*(1+SQRT(fcrit/(k*mdiff`*spinv*mdiff)));
bound2 = mdiff*(1-SQRT(fcrit/(k*mdiff`*spinv*mdiff)));

/* The 90% CI of the Mahalanobis distance is obtained as follows */
ci1 = SQRT(bound1`*spinv*bound1);
ci2 = SQRT(bound2`*spinv*bound2);

dm_lower = min(ci1,ci2);        /* Lower limit of 90% CI of Mahalanobis Distance */
dm_upper = max(ci1,ci2);        /* Upper limit of 90% CI of Mahalanobis Distance */
IF dm_upper <= dm_max THEN conclusion = 'Yes, Similar';
ELSE conclusion = 'No, Not Similar';


By adding various parameters to convert it into a macro and defining some global macro variables for titles, you can get some pretty final results tables with intermediate results to check against the output of DDSolver and Connolly's macro.

In the last month I had to do profile comparisons of > 100 batches of T and R from different projects, and I used DDSolver, Connolly's macro, and my macro based on the steps above for each comparison. The outputs were the same every time. I strongly suspect that DDSolver uses the above approach, so the same result doesn't really mean anything, but Connolly's Newton-Raphson search also gives the same result every time. So Detlew, I know you have doubts about DDSolver, but it seems it does indeed work.

========== R Code ==========
I googled R functions as I needed them, so hopefully I won't embarrass myself too much with the following naive R code. Since there are so many R gurus here, I'm sure someone can turn it into a simple function such as Dm90CI or sth. like that. :-)


# dissolution data for T & R, one column per time point: t1 t2 t3 ...
mt <- matrix(
c(19.99, 36.70, 47.77, 55.08, 65.69, 81.37, 92.39, 97.10,
  22.08, 39.29, 49.46, 56.79, 67.22, 82.42, 89.93, 95.62,
  21.93, 38.54, 47.76, 55.14, 65.25, 83.49, 90.19, 95.62,
  22.44, 39.46, 49.72, 58.67, 69.21, 84.93, 94.12, 95.51,
  25.67, 42.35, 52.68, 59.71, 71.51, 86.61, 93.80, 96.70,
  26.37, 41.34, 51.01, 57.75, 69.44, 85.90, 94.45, 98.07),
 nrow=6, ncol=8, byrow=TRUE)

mr <- matrix(
c(42.06, 59.91, 65.58, 71.81, 77.77, 85.67, 93.14, 94.23,
  44.16, 60.18, 67.17, 70.82, 76.11, 83.27, 88.01, 89.59,
  45.63, 55.77, 65.56, 70.50, 76.92, 83.91, 86.83, 90.12,
  48.52, 60.39, 66.51, 73.06, 78.45, 84.99, 88.00, 93.43,
  50.49, 61.82, 69.06, 72.85, 78.99, 86.86, 89.70, 90.79,
  49.77, 62.73, 69.77, 72.88, 80.18, 84.20, 88.88, 90.47),
 nrow=6, ncol=8, byrow=TRUE)


# as in the SAS version, define these numbers first so they're easy to change
nt <- 6
nr <- 6
p  <- 8
dg <- 15

# mean dissolution profile and difference
tbar  <- colMeans(mt)
rbar  <- colMeans(mr)
mdiff <- tbar-rbar

# covariance of T & R, pooled covariance matrix, and its inverse
st <- cov(mt)
sr <- cov(mr)
sp <- (st+sr)/2
spinv <- solve(sp)

# Mahalanobis distance
dm <- sqrt(t(mdiff) %*% spinv %*% mdiff)
# k and fcrit as mentioned in the article
df <- nt + nr - p - 1
k  <- nt*nr/(nt+nr)*(nt+nr-p-1)/((nt+nr-2)*p)
fcrit <- qf(0.9,p,df)


# create column vector with each element equal to predefined limit dg
mdg <- c(rep(dg, p))
# criteria
dm_max <- sqrt(t(mdg) %*% spinv %*% mdg)

#Hotelling's T^2. I added it just to compare with DDSolver
H2 <- nt*nr/(nt+nr)*dm^2

#vector on the boundary of confidence region
bound1 <- mdiff*(1+sqrt(fcrit/(k*t(mdiff) %*% spinv %*% mdiff)))
bound2 <- mdiff*(1-sqrt(fcrit/(k*t(mdiff) %*% spinv %*% mdiff)))


# 90% CI of Mahalanobis distance
ci1 <- sqrt(t(bound1) %*% spinv %*% bound1)
ci2 <- sqrt(t(bound2) %*% spinv %*% bound2)
dm_lower <- min(ci1, ci2)
dm_upper <- max(ci1, ci2)


if (dm_upper < dm_max) conclusion <- "Similar" else conclusion <- "Not similar"

Again, it should be pretty easy from here to print a final table with the results and intermediate results.
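For yet another cross-check, the steps above translate almost line by line into Python with numpy. The following is a rough sketch of my own, not part of the original macros; scipy is assumed only for the F quantile. With dg = 15 it should reproduce the SAS limits reported further down this thread ((19.65, 33.32) with dm_max 29.14).

```python
import numpy as np
from scipy.stats import f as f_dist  # assumed available, only for the F quantile

# Tsong's data as above (rows = units, columns = time points)
mt = np.array([
    [19.99, 36.70, 47.77, 55.08, 65.69, 81.37, 92.39, 97.10],
    [22.08, 39.29, 49.46, 56.79, 67.22, 82.42, 89.93, 95.62],
    [21.93, 38.54, 47.76, 55.14, 65.25, 83.49, 90.19, 95.62],
    [22.44, 39.46, 49.72, 58.67, 69.21, 84.93, 94.12, 95.51],
    [25.67, 42.35, 52.68, 59.71, 71.51, 86.61, 93.80, 96.70],
    [26.37, 41.34, 51.01, 57.75, 69.44, 85.90, 94.45, 98.07]])
mr = np.array([
    [42.06, 59.91, 65.58, 71.81, 77.77, 85.67, 93.14, 94.23],
    [44.16, 60.18, 67.17, 70.82, 76.11, 83.27, 88.01, 89.59],
    [45.63, 55.77, 65.56, 70.50, 76.92, 83.91, 86.83, 90.12],
    [48.52, 60.39, 66.51, 73.06, 78.45, 84.99, 88.00, 93.43],
    [50.49, 61.82, 69.06, 72.85, 78.99, 86.86, 89.70, 90.79],
    [49.77, 62.73, 69.77, 72.88, 80.18, 84.20, 88.88, 90.47]])

nt, nr = len(mt), len(mr)
p, dg = mt.shape[1], 15

mdiff = mt.mean(axis=0) - mr.mean(axis=0)              # mean difference T - R
sp = (np.cov(mt, rowvar=False) + np.cov(mr, rowvar=False)) / 2
spinv = np.linalg.inv(sp)                              # inverse pooled var-covar matrix

dm = float(np.sqrt(mdiff @ spinv @ mdiff))             # Mahalanobis distance
mdg = np.full(p, float(dg))
dm_max = float(np.sqrt(mdg @ spinv @ mdg))             # similarity criterion

df2 = nt + nr - p - 1
k = nt * nr / (nt + nr) * df2 / ((nt + nr - 2) * p)
fcrit = f_dist.ppf(0.9, p, df2)                        # as FINV(0.9, p, df2)

# vectors on the boundary of the 90% confidence region
c = np.sqrt(fcrit / (k * dm**2))
b1, b2 = mdiff * (1 + c), mdiff * (1 - c)
ci1 = float(np.sqrt(b1 @ spinv @ b1))
ci2 = float(np.sqrt(b2 @ spinv @ b2))
dm_lower, dm_upper = min(ci1, ci2), max(ci1, ci2)
similar = dm_upper <= dm_max
```
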

Happy coding.

All the best,
Shuanghe
Helmut
★★★
Vienna, Austria,
2014-02-10 18:15
(3699 d 04:51 ago)

@ Shuanghe
Posting: # 12384
Views: 19,239
 

 Wow, what a gem!

Hi Shuanghe,

❝ This is going to be a really lengthy post […]. I had the impression that the administrators don't like lengthy posts, but at least I couldn't find such a prohibition in the forum policy :-)


My own posts have a strong tendency to be verbose as well… The limit in the database is currently set to 15,000 characters. Yours has 10,131 including blanks, so there is still some headroom.

THX for the codes!


Edit 2014-07-30: I expect something in the near future from EMA (in the Q&A?). See:
CHMP, Agenda of the 21-24 July 2014 meeting, page 23:

CMDh request to PKWP on acceptability of Mahalanobis test for proof of comparable dissolution (bio­waiver) for additional strengths of a bio-batch strength


Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
arjunroy
☆    

India,
2014-12-19 11:56
(3387 d 11:10 ago)

@ Shuanghe
Posting: # 14117
Views: 18,609
 

 SAS/R code of MSD method for comparing dis­so­lution

Hi,
  1. The R code is working fine, though SAS IML v9.1.3 did not: I got a problem calculating the Cov matrix. An execution error is noted in the log, although the calculation can be done manually.
  2. Any idea how dm_max is calculated for a 2-parameter Weibull function? I am unable to verify it in the DDSolver output. For all kinds of data it sets dm_max=11.82.
    Any further insights will be appreciated.
regards


Edit: Full quote removed. Please delete everything from the text of the original poster which is not necessary in understanding your answer; see also this post! [Ohlbe]
Shuanghe
★★  

Spain,
2014-12-19 18:13
(3387 d 04:54 ago)

@ arjunroy
Posting: # 14119
Views: 19,125
 

 SAS/R code of MSD method for comparing dis­so­lution

Hi arjunroy,

❝ 1. The R code is working fine,


So I guess you have nt=nr; otherwise the code will give a wrong result. I noticed it some time ago, so I'll modify the code when I have some time.

❝ though SAS IML v9.1.3 did not: I got a problem calculating the Cov matrix. An execution error is noted in the log, although the calculation can be done manually.


Did you add the COV function before the run? As I mentioned in the post, SAS 9.2 and older versions have no COV function, so you have to add it manually. The COV example I gave only works when there are no missing values, which should be the case most of the time. But if not, well, you have to adapt the code.

❝ 2. Any idea how dm_max is calculated for a 2-parameter Weibull function? I am unable to verify it in the DDSolver output. For all kinds of data it sets dm_max=11.82.

❝ Any further insights will be appreciated.


If I remember correctly, that's a default value in DDSolver. You should use your own criterion instead of 11.82.

I'm not aware of any guideline that clearly defines how to set the similarity limit, but in my opinion the criterion can be obtained by, e.g., fitting a suitable model to at least 2 different reference batches (more batches would be better) and determining the MSD between them (let's call it MSDR). If you have more batches, compare them pair-wise (A vs B, B vs C, A vs C, ...) and pick the biggest MSDR value.

The rationale is that the difference between test and reference (the upper limit of the MSD) should not be greater than the maximum difference observed between the reference batches themselves. So the maximum value from your pair-wise comparisons between reference batches can be used as dm_max.
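This pair-wise criterion can be sketched as follows. The illustration below is my own, with made-up numbers (batch data, sizes, and variability are all hypothetical); for brevity it uses the raw profiles, whereas in a model-dependent approach the rows would be fitted parameters (e.g. Weibull) instead.

```python
import itertools
import numpy as np

def msd(a, b):
    """Mahalanobis distance between the mean dissolution profiles of
    two batches (rows = units, columns = time points)."""
    na, nb = len(a), len(b)
    diff = a.mean(axis=0) - b.mean(axis=0)
    # pooled variance-covariance matrix (weighted, so na != nb is fine)
    sp = ((na - 1) * np.cov(a, rowvar=False) +
          (nb - 1) * np.cov(b, rowvar=False)) / (na + nb - 2)
    return float(np.sqrt(diff @ np.linalg.inv(sp) @ diff))

# three hypothetical reference batches: 12 units x 4 time points each
rng = np.random.default_rng(42)
profile = np.array([30.0, 55.0, 75.0, 90.0])
refs = [profile + rng.normal(0.0, 2.0, size=(12, 4)) for _ in range(3)]

# pair-wise MSD between the reference batches (A vs B, A vs C, B vs C);
# the largest value serves as the similarity criterion dm_max
msd_r = [msd(refs[i], refs[j]) for i, j in itertools.combinations(range(3), 2)]
dm_max = max(msd_r)
```
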

All the best,
Shuanghe
arjunroy
☆    

India,
2014-12-22 06:42
(3384 d 16:25 ago)

(edited by Ohlbe on 2014-12-22 09:39)
@ Shuanghe
Posting: # 14135
Views: 18,340
 

 SAS/R code of MSD method for comparing dis­so­lution

❝ ❝ 1. The R code is working fine,


❝ So I guess you have nt=nr; otherwise the code will give a wrong result. I noticed it some time ago, so I'll modify the code when I have some time.


Thank you

❝ ❝ though SAS IML v9.1.3 did not: I got a problem calculating the Cov matrix. An execution error is noted in the log, although the calculation can be done manually.


❝ Did you add the COV function before the run? As I mentioned in the post, SAS 9.2 and older versions have no COV function, so you have to add it manually. The COV example I gave only works when there are no missing values, which should be the case most of the time. But if not, well, you have to adapt the code.


Here is the SAS IML v9.1.3 log error for your perusal.
proc iml;
NOTE: IML Ready
21   use one;
21 !          read all into one;
21 !                             close one;
22
23   mt = one[1:6,];
24   mr = one[7:12,];
25
26   START COV(m);
27       n = NROW(m);
27 !                               /* assume no missing values */
28       d = m - m[:,];
29
30       /* subtract mean to center the data */
31       RETURN((d`*d)/(n-1));
32     FINISH;
NOTE: Module COV defined.
33
34   nt=6;
34 !       nr=6;
34 !             p=8;
34 !                  dg=15;
35
36   tbar = (mt[:,])`;
36 !                               /* mean profile of T */
37   rbar = (mr[:,])`;
37 !                      /* mean profile of R */
38
39
40   mdiff = tbar-rbar;
40 !                         /* mean difference between T&R */
41
42   st = Cov(mt);
ERROR: (execution) Matrices do not conform to the operation.
 operation : - at line 28 column 11
 operands  : M, _TEM1001

M      6 rows      8 cols    (numeric)

     19.99      36.7     47.77     55.08     65.69     81.37     92.39      97.1
     22.08     39.29     49.46     56.79     67.22     82.42     89.93     95.62
     21.93     38.54     47.76     55.14     65.25     83.49     90.19     95.62
     22.44     39.46     49.72     58.67     69.21     84.93     94.12     95.51
     25.67     42.35     52.68     59.71     71.51     86.61      93.8      96.7
     26.37     41.34     51.01     57.75     69.44      85.9     94.45     98.07

_TEM1001      1 row       8 cols    (numeric)

     23.08 39.613333 49.733333     57.19 68.053333     84.12     92.48 96.436667

 statement : ASSIGN at line 28 column 5
 traceback : module COV at line 28 column 5

NOTE: Paused in module COV.
42 !                               /* variance-covariance matrix of T & R */
43   sr = Cov(mr);
ERROR: (execution) Matrix has not been set to a value.


Thank you so much.


Edit: Full quote removed. Please delete everything from the text of the original poster which is not necessary in understanding your answer; see also this post! [Ohlbe]
Shuanghe
★★  

Spain,
2014-12-24 12:09
(3382 d 10:57 ago)

@ arjunroy
Posting: # 14159
Views: 18,110
 

 SAS/R code of MSD method for comparing dis­so­lution

Hi arjunroy,

❝ Thank you

No problem.

For the pooled variance, instead of using (st + sr) / 2, it's better to use:
sp = ((nt-1)*st + (nr-1)*sr)/(nt+nr-2)
This works regardless of whether nt = nr.
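A quick numeric illustration of why the weighting matters (made-up data; only the unequal batch sizes are the point):

```python
import numpy as np

rng = np.random.default_rng(7)
# hypothetical case with unequal batch sizes: 6 test units, 12 reference units
mt = rng.normal(50.0, 3.0, size=(6, 4))
mr = rng.normal(52.0, 3.0, size=(12, 4))
nt, nr = len(mt), len(mr)

st = np.cov(mt, rowvar=False)
sr = np.cov(mr, rowvar=False)

simple   = (st + sr) / 2                                    # correct only for nt == nr
weighted = ((nt - 1) * st + (nr - 1) * sr) / (nt + nr - 2)  # correct in general
```

With nt = nr the two expressions are algebraically identical; with nt ≠ nr they differ, and the CI limits computed from the inverse pooled matrix shift accordingly.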

❝ Here is SAS IMLv9.1.3 log error for your perusal.

❝ ......


I use SAS 9.2 and the COV module defined as above worked. It seems version 9.1.3 doesn't support operations on matrices with different dimensions, therefore the line
d = m - m[:,];
gave the error you had. Weird.

You can try the following code:

  START COV(m);             
    n = NROW(m);             
    d = m - REPEAT(m[:,],n,1);
    RETURN((d`*d)/(n-1));
  FINISH;


The REPEAT function creates a matrix with n rows, each row equal to the mean profile. The matrices then have the same dimensions, so the subtraction works.

You can report it back here if it works.

All the best,
Shuanghe
arjunroy
☆    

India,
2014-12-24 12:44
(3382 d 10:23 ago)

@ Shuanghe
Posting: # 14160
Views: 18,117
 

 SAS/R code of MSD method for comparing dis­so­lution

❝ You can report it back here if it works.


Thank you, it works now. I used the REPEAT function.


Hi,
The SAS and R code give different results for the same data.
For the above example, (dm_lower, dm_upper) and dm_max:
SAS: (19.65, 33.32), 29.14
R: (1.66, 5.148), 4.43

Your insights will be appreciated.
Best regards


Edit: Full quote removed. Please delete everything from the text of the original poster which is not necessary in understanding your answer; see also this post! Two posts merged; you can edit the first one for 24 hours. [Helmut]
Shuanghe
★★  

Spain,
2015-01-08 12:16
(3367 d 10:50 ago)

@ arjunroy
Posting: # 14253
Views: 17,913
 

 SAS/R code of MSD method for comparing dis­so­lution

Hi arjunroy,

❝ Thank you, it works now. I used the REPEAT function.


Good to know.

❝ The SAS and R code give different results for the same data.

❝ For the above example, (dm_lower, dm_upper) and dm_max:

❝ SAS: (19.65, 33.32), 29.14

❝ R: (1.66, 5.148), 4.43


I've no idea how you got this. I did a quick check and both R and SAS give the same result as your SAS output when dg = 15. Check whether you have the correct data in R.

All the best,
Shuanghe
wligtenberg
☆    

The Netherlands,
2016-12-12 15:09
(2663 d 07:57 ago)

@ Shuanghe
Posting: # 16839
Views: 21,466
 

 SAS/R code of MSD method for comparing dis­so­lution

First of all, thank you for this very useful post!
I hope you are still active, or that others can help me out with this as well.

I have tried to use your code and verify it against the article, and so far I have been unable to get the same results.
I think you have a typo in the definition of your mr: you have the value 78.45 where in the article I think it is 78.54.
Interestingly, with this typo, you get exactly the same values as in the article on page 6.

I then started to check whether some calculations might have been off, and I compared the mean values calculated in your script with the mean values reported in the Tsong article.
The mean values calculated in your script are off compared to the reported ones:
tbar
[1] 23.08000 39.61333 49.73333 57.19000 68.05333 84.12000 92.48000 96.43667
rbar
[1] 46.77167 60.13333 67.27500 71.98667 78.08500 84.81667 89.09333 91.43833


TEST MEAN 23.08 37.95 49.73 57.19 68.05 84.12 92.48 96.34
REFERENCE MEAN 46.77 60.21 67.28 71.97 78.05 84.82 89.09 91.43

This can hardly be explained by rounding issues. Look at the following values:
mean(mt[, 2]) = 39.61 versus 37.95 in the article.
Even if I add 0.01 to all values (or subtract it) I don't get that value:
mean(mt[, 2] + 0.01)
[1] 39.62333
mean(mt[, 2] - 0.01)
[1] 39.60333


A similar thing happens with the covariance matrices: if I use your data (with the typo) I get results that are more similar (excluding some rounding errors) to the Tsong article than without the typo.
Which is uncanny, so maybe I missed a memo where this is a known typo and you actually used the correct data.
(which, however, does not explain why the mean values in the article are off)

I just wanted to verify my implementation before putting it into practice.
I might even create a package out of this in the end, because I think it would be nice to have an R package with a couple of methods combined.
Shuanghe
★★  

Spain,
2016-12-19 20:03
(2656 d 03:04 ago)

@ wligtenberg
Posting: # 16858
Views: 14,268
 

 SAS/R code of MSD method for comparing dis­so­lution

Hi wligtenberg,

❝ I think you have a typo in the definition of your mr: you have the value 78.45 where in the article I think it is 78.54.


Eagle eye! I never noticed it before. My copy of Tsong's article is old and in image format, so I copied the data from DDSolver (mentioned in my original post) instead, to avoid entering the data manually. I double-checked after you pointed out the typo, and it seems it's the only individual data point that's "wrong".

❝ I then started to check whether some calculations might have been off, and I compared the mean values calculated in your script with the mean values reported in the Tsong article.

❝ ...


To summarise, you are right. The individual data and the mean values in the article do not agree (4 time points in the reference batch and 2 time points in the test batch).

❝ A similar thing happens with the covariance matrices: if I use your data (with the typo) I get results that are more similar (excluding some rounding errors) to the Tsong article than without the typo.


❝ Which is uncanny, so maybe I missed a memo where this is a known typo and you actually used the correct data.


It seems that I missed it too. I guess the only way to be sure is to ask Tsong to provide the original data. But I have a feeling that the authors of DDSolver might have done just that; otherwise it's difficult to explain why the sample data generated by DDSolver gives a much better match with this "typo". So an alternative is to contact them.

❝ (which, however, does not explain why the mean values in the article are off)


My guess is that the individual data are the correct ones (except maybe unit 4 of the reference at 30 min); otherwise we'd see a much bigger difference in the variance-covariance table. The article was printed rather sloppily in my opinion. There are other mistakes too: the references to tables are all wrong, one time point is missing from the T-R mean difference, the data in the last table are never referenced... So I guess we should not be shocked by the discrepancy between individual and mean values.

Actually, there are some mistakes in the formulas I mentioned in the original post (in the article in the Encyclopedia of Biopharmaceutical Statistics by the same authors), so I don't know whether it was the authors or the editors who were sloppy. :-D

❝ I just wanted to verify my implementation before putting it into practice.


I believe that the method is correct, but that's just my opinion.

❝ I might even create a package out of this in the end, because I think it would be nice to have an R package with a couple of methods combined.


Good! Please let us know when it's available.

Just to be thorough, I paste below the roughly coded function (differing very slightly from the code above) that I sometimes use.

After reading the data from an external file into data frames...

msd <- function(test, ref, dg=10){
  mt <- as.matrix(test)
  mr <- as.matrix(ref)
  nt <- nrow(mt)
  nr <- nrow(mr)
  p  <- ncol(mt)

  mean_t    <- cbind(colMeans(mt))
  mean_r    <- cbind(colMeans(mr))
  mean_diff <- mean_t - mean_r

  st <- cov(mt)
  sr <- cov(mr)
  sp <- ((nt - 1) * st + (nr - 1) * sr) / (nt  + nr -2)
  sp_inv  <- solve(sp)
  m_dg    <- cbind(rep(dg, p))
  dm_max  <- sqrt(t(m_dg) %*% sp_inv %*% m_dg)

  dm <- sqrt(t(mean_diff) %*% sp_inv %*% mean_diff)

  df    <- nt + nr - p - 1
  k     <- nt * nr / (nt + nr) * df / ((nt + nr - 2) * p)
  fcrit <- qf(0.9, p, df)
  h2    <- nt * nr / (nt + nr) * dm^2

  y1 <- mean_diff %*% (1 + sqrt(fcrit / (k * (t(mean_diff) %*% sp_inv %*% mean_diff))))
  y2 <- mean_diff %*% (1 - sqrt(fcrit / (k * (t(mean_diff) %*% sp_inv %*% mean_diff))))

  ci1 <- sqrt(t(y1) %*% sp_inv %*% y1)
  ci2 <- sqrt(t(y2) %*% sp_inv %*% y2)

  dm_lower <- min(ci1, ci2)
  dm_upper <- max(ci1, ci2)

  result <- data.frame(cbind(dm, dm_lower, dm_upper, dg, dm_max, h2))
  names(result) <- c("Dm", "Dm_lower", "Dm_upper", "Dg (%)", "Dm_max", "Hotelling T Square")
  return(result)
}

All the best,
Shuanghe
wligtenberg
☆    

The Netherlands,
2016-12-20 09:14
(2655 d 13:53 ago)

(edited by wligtenberg on 2016-12-20 10:43)
@ Shuanghe
Posting: # 16861
Views: 14,252
 

 SAS/R code of MSD method for comparing dis­so­lution

Thank you very much for your reply.
It would indeed seem that the Tsong paper was not thoroughly checked.

I also started verifying my code against DDSolver, since it is used quite often.
For most of my tests it went fine, except for one generated data set.
All values are the same apart from the lower bound of the confidence region.
In DDSolver it is: -0.361857342251203
In my code (and yours) it is: 0.361857342251203

The rest is all the same, and the conclusion would also be the same, but I really dislike the idea that part of the calculation might be wrong, because (as you know) quite a large part of the calculation is shared between the upper and lower bounds of the confidence region.

Here is my data set (formatted so you should be able to drop it into your code).
reference <- structure(c(28.7911578671327, 27.7063076528163, 28.2263135554565,
28.3774876815685, 28.2493706626467, 27.9633179734855, 28.8699375412747,
27.9697438683459, 29.1540336099923, 27.9876475987802, 28.7541180117783,
29.3043600382281, 45.5068689363729, 46.5460525217093, 46.682230512727,
47.4023771396442, 46.5409373038651, 44.3202218507459, 44.5224174823224,
48.0428499621152, 46.5199811632234, 45.1394826170731, 46.6460992057861,
47.9441445472879, 61.6499246599356, 58.8871076421716, 59.0928637200424,
57.3039074196922, 59.9450737683508, 58.6381973772281, 59.9395529973835,
60.2358174860198, 60.6281637525813, 58.6751058188512, 59.9983634443677,
57.358737462136, 66.7744608781215, 66.6843050370333, 64.5632571665535,
67.887806253632, 68.118289910638, 67.3489221222566, 68.8674535916327,
66.8528203605585, 65.9823451984023, 68.4260330454663, 66.7379172478314,
69.798117787546, 72.8623052002997, 74.4602574302653, 73.9697085359241,
72.3443125360204, 75.8127101072032, 74.4415179132332, 73.6284427938045,
73.9030111699696, 74.4950079468374, 73.6285489815188, 69.0968679635935,
73.9152589300409), .Dim = c(12L, 5L), .Dimnames = list(NULL,
    c("10", "20", "30", "40", "50")))

test <- structure(c(29.9437005373757, 27.2315750015848, 28.5315897581853,
28.9095250734653, 28.5892325261607, 27.8741008032577, 30.1406497227308,
27.8901655404087, 30.8508898945248, 27.9349248664945, 29.8511008989897,
31.2267059651144, 43.5566152658779, 46.154574229219, 46.4950192067632,
48.2953857740562, 46.1417861846084, 40.5899975518104, 41.0954866307517,
49.8965678302338, 46.0893958330041, 42.6381494676286, 46.4046909394111,
49.6498042931656, 65.0270736686448, 58.1200311242348, 58.6344213189118,
54.1620305680362, 60.7649464396827, 57.4977554618761, 60.7511445122644,
61.4918057338553, 62.4726714002591, 57.5900265659338, 60.898170629725,
54.2991056741457, 65.1779582396222, 64.9525686369017, 59.6499489607023,
67.9613216783985, 68.5375308209135, 66.61411134996, 70.4104400234003,
65.3738569457147, 63.1976690403243, 69.3068886579843, 65.086599163897,
72.7371005131835, 71.9110116134174, 75.9058921883315, 74.6795199524783,
70.616029952719, 79.287023880676, 75.8590433957511, 73.8263555971792,
74.5127765375922, 75.9927684797615, 73.8266210664652, 62.4974185216518,
74.5433959377705), .Dim = c(12L, 5L), .Dimnames = list(NULL,
    c("10", "20", "30", "40", "50")))


If you have any idea why these results differ, you would really help me out.
I will see if I can figure out what happens in DDSolver.

OK, I now know why these values are different.
DDsolver basically calculates the lower and upper bounds as follows:

df    <- nt + nr - p - 1
k     <- nt * nr / (nt + nr) * df / ((nt + nr - 2) * p)
fcrit <- qf(0.9, p, df)

dm_lower <- dm - sqrt(fcrit/k)
dm_upper <- dm + sqrt(fcrit/k)


That is quite a bit simpler, is it also correct?
Shuanghe
★★  

Spain,
2016-12-22 10:23
(2653 d 12:44 ago)

@ wligtenberg
Posting: # 16873
Views: 14,108
 

 SAS/R code of MSD method for comparing dis­so­lution

Hi,

❝ OK, I now know why these values are different.

❝ DDsolver basically calculates the lower and upper bounds as follows:


❝ df    <- nt + nr - p - 1

❝ k     <- nt * nr / (nt + nr) * df / ((nt + nr - 2) * p)

❝ fcrit <- qf(0.9, p, df)


❝ dm_lower <- dm - sqrt(fcrit/k)

❝ dm_upper <- dm + sqrt(fcrit/k)



❝ That is quite a bit simpler, is it also correct?


DDSolver's source is locked, so I thought that, unless the authors let you, no one could have a peek at how exactly it works. I have no idea how you figured this out, but :thumb up: from me.

If they obtain the lower and upper limits as you said, then I don't think it's correct. I'm not a statistician, so my thinking might be naive. For me, the Mahalanobis distance here is a measure of the difference in terms of "distance": when there is no difference the distance would be zero; if there is a difference, the distance would always be a positive number. A negative distance? :confused: I couldn't comprehend that.

All the best,
Shuanghe
wligtenberg
☆    

The Netherlands,
2016-12-22 12:20
(2653 d 10:46 ago)

@ Shuanghe
Posting: # 16874
Views: 14,092
 

 SAS/R code of MSD method for comparing dis­so­lution

❝ DDSolver's source is locked, so I thought that, unless the authors let you, no one could have a peek at how exactly it works. I have no idea how you figured this out, but :thumb up: from me.


I didn't know that they tried to keep people from looking at the code.
I just opened it in LibreOffice (under Linux) and I could see the code; nothing fancy.

❝ If they obtain the lower and upper limits as you said, then I don't think it's correct. I'm not a statistician, so my thinking might be naive. For me, the Mahalanobis distance here is a measure of the difference in terms of "distance": when there is no difference the distance would be zero; if there is a difference, the distance would always be a positive number. A negative distance? :confused: I couldn't comprehend that.


I agree that the distance should never be negative, and that if your CR somehow goes below zero it should be truncated to zero.
However, the interesting thing is that for all the other data sets I tried, that calculation comes up with the same values as your calculation.

Also, it seems that the CR is always symmetrical around the Mahalanobis distance.
The DDSolver result fits that observation; the result of the R-code calculation does not in this case.

So even though the negative value of the lower bound might be incorrect (and should then be 0), for the rest it seems to work well.
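If I'm not mistaken, the two calculations are actually identical up to the sign: the boundary vectors are just mdiff scaled by (1 ± sqrt(fcrit/(k*dm^2))), so the quadratic form scales by the square of that factor and the CI limits become |dm ± sqrt(fcrit/k)|. In other words, the boundary-vector code returns the absolute value of DDSolver's signed limits, and the two can only disagree when dm - sqrt(fcrit/k) is negative. A small numpy check (the data, k, and fcrit are arbitrary made-up numbers; only positivity of k and fcrit matters):

```python
import numpy as np

rng = np.random.default_rng(3)
p = 4
mdiff = rng.normal(0.5, 1.0, size=p)      # arbitrary mean-difference vector
x = rng.normal(size=(10, p))
sp = np.cov(x, rowvar=False)              # arbitrary (pooled) covariance matrix
sp_inv = np.linalg.inv(sp)
k, fcrit = 0.3, 2.5                       # arbitrary positive constants

dm = float(np.sqrt(mdiff @ sp_inv @ mdiff))

# CI limits via the boundary vectors (as in the R code above)
c = np.sqrt(fcrit / (k * dm**2))
ci = [float(np.sqrt((mdiff * (1 + s * c)) @ sp_inv @ (mdiff * (1 + s * c))))
      for s in (1.0, -1.0)]

# DDSolver-style limits: dm +/- sqrt(fcrit/k)
dd = [dm + np.sqrt(fcrit / k), dm - np.sqrt(fcrit / k)]
```

So the symmetry around dm is exact, and the only divergence between the two implementations is the sign of the lower limit.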

:confused:
Olga
☆    

Belarus,
2018-04-21 09:01
(2168 d 15:06 ago)

@ Shuanghe
Posting: # 18712
Views: 10,294
 

 SAS/R code of MSD method for comparing dis­so­lution

Hi, Shuanghe,
Thanks a lot for R code.
It's very helpful!

I have a question about the maximum MSD.

You calculate it from the vector m_dg <- cbind(rep(dg, p)),
where dg is the same positive value (often 10%, the maximum inter-batch difference).
I wonder: why only a positive value?

I suppose the maximum inter-batch difference is an absolute value and may be positive or negative, or is that my mistake?

I modelled the vector m_dg like m_dg <- c(10, -10, ...); of course, the results were different.

Thanks in advance!


The Bioequivalence and Bioavailability Forum is hosted by
BEBAC Ing. Helmut Schütz