yjlee168
Senior
Kaohsiung, Taiwan,
2008-07-12 20:53
(edited by yjlee168 on 2008-07-13 02:31)

Posting: # 2020
Views: 49,612
 

 bear v1.0.0 - a data analytical tool for ABE in R [R for BE/BA]

Hi, all

We would like to announce in this forum the release of bear (GPL, free open source), a data analytical tool for average bioequivalence (ABE) in R. Bear performs sample size estimation for ABE. It can also do noncompartmental analysis (NCA) followed by ANOVA (GLM). Data can be imported into bear in .csv format.

With bear, all you need is a spreadsheet, such as MS-Excel (if it's affordable) or OpenOffice Calc (a very nice free program!), or even a plain text editor, to create the .csv file for importing your data. We have tested bear with one BE dataset and found all output results to be the same as those generated by commercial software (NCA with WinNonlin and ANOVA with SAS). Let us know if you have any questions or comments. Thank you.

Visit its website at http://pkpd.kmu.edu.tw/bear.

All the best,
Hsin-ya Lee & Yung-jin Lee,
College of Pharmacy, KMU
Kaohsiung, Taiwan
Helmut
Hero
Vienna, Austria,
2008-07-14 04:12

@ yjlee168
Posting: # 2021
Views: 46,287
 

 bear v1.0.0 for R - first impressions

 
Dear Hsin-ya and Yung-jin!

Wonderful, thank you!

I had a quick view already, some first impressions below:
  • The plots for selection of data points in NCA should be in semilog scale rather than linear.
  • Maybe it's possible to open only one graphics device and turn the history on - instead of opening a new device for every plot.
  • In sample size estimation you should consider rounding up to the next even number:
Example:
For 20% CV, theta 95%, power 80% bear comes up with:
--------------------------------------------------------------------------
                               <<Suggestion>>
--------------------------------------------------------------------------
             n>= 9.13  (sample size number for per sequence)
            Total sample size 18
--------------------------------------------------------------------------

I implemented Hauschke's approximation in Excel (sorry, a quickshot) and obtained:
9.1763 / sequence = 18.3526 / total -> rounded up to next even number = 20
David Dubins' FARTSSIE comes up with 19 / total
StudySize2.01: 9.1876 / sequence = 18.3752 / total
My R-code (modified from Patterson's/Jones' SAS) would give a total of 20 (power 83.47%).
With bear's suggested n=18 I got power of only 79.1% (StudySize).

I failed in reading in the NCA test file. I tried it with tabulator and semicolon as field separators and got an error:
Error in readChar(con, 5L, useBytes = TRUE) :
  cannot open connection
  • I would avoid the term "Tests of Hypothese using the Type I MS for SUBJECT(SEQUENCE) as an error term" and use "Tests of Hypothesis for SUBJECT(SEQUENCE) as an error term" instead. There were some discussions on the R-list and here also about Types I-III; don't fall into the trap of using SAS' proprietary terminology… ;-)
  • IMHO a posteriori power is meaningless.
Two points for the wish-list:
  • CVintra and CVinter
  • intra-subject residuals - something everybody misses in WinNonlin…
Again, many thanks; I will continue testing - time allowing.

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Science Quotes
yjlee168
Senior
Kaohsiung, Taiwan,
2008-07-14 10:26

@ Helmut
Posting: # 2022
Views: 46,316
 

 bear v1.0.0 for R - first impressions

 
Dear HS,

Thank you for your testing.

» I had a quick view already, some first impressions below:
» – The plots for selection of data points in NCA should be in semilog scale rather than linear.

Definitely. You're correct! We will change this asap.

» – Maybe it's possible to open only one graphics device and turn the history on - instead of opening a new device for every plot.

To keep the screen tidy, the plots should be displayed as you suggested. We will try and see if it can be done.

» – In sample size estimation you should consider rounding up to the next even number:
» Example:
» For 20% CV, theta 95%, power 80% bear comes up with:
» --------------------------------------------------------------------------
»                                <<Suggestion>>
» --------------------------------------------------------------------------
»              n>= 9.13  (sample size number for per sequence)
»             Total sample size 18
» --------------------------------------------------------------------------
» I implemented Hauschke's approximation in Excel (sorry, a quickshot) and obtained:
» 9.1763 / sequence = 18.3526 / total -> rounded up to next even number = 20
» David Dubins FARTSSIE comes up with 19 / total
» StudySize2.01: 9.1876 / sequence = 18.3752 / total
» My R-code (modified from Patterson's/Jones' SAS) would give a total of 20 (power 83.47%).
» With bear's suggested n=18 I got power of only 79.1% (StudySize).

Yes, we should round up to the next even number. It should be 20 when we have an estimated
total sample size of 18.3. It will be fixed in the next release.
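The rounding itself is a one-liner in R (a minimal sketch; the function name is just illustrative):

```r
# Round an estimated total sample size up to the next even number,
# so both sequences of a 2x2 crossover get equal group sizes
round_up_even <- function(n_total) 2 * ceiling(n_total / 2)

round_up_even(18.3526)  # 20
round_up_even(18)       # 18 (already even)
```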

» I failed in reading in the NCA test file. I tried it with tabulator and
» semicolon as field separators and got an error:
» Error in readChar(con, 5L, useBytes = TRUE) :
»   cannot open connection

We had not tested a .csv file using a semicolon (;) before. Please try a comma (,) to see whether it works normally. I just tried a semicolon as field separator, and it does cause an error.

» – I would avoid the term "Tests of Hypothese using the Type I MS for
» SUBJECT(SEQUENCE) as an error term" and use "Tests of
» Hypothesis for SUBJECT(SEQUENCE) as an error term" instead. There were some discussions on the R-list and here also about Types I-III; don't fall into the trap of using SAS' proprietary terminology… ;-)

Of course; it will be fixed, too.

» - IMHO a posteriori power is meaningless.

It is just for the purpose of submission to the regulatory agency (at least in Taiwan).
However, you're absolutely right.

» Two points for the wishlist:
» – CVintra and CVinter
» – intra-subject residuals - something everybody misses in WinNonlin…

These seem not too difficult to implement. We will add them as you suggest.

» Again, many thanks; I will continue testing - time allowing.

Thank you so much, HS.

Best regards,
Yung-jin Lee
Helmut
Hero
Vienna, Austria,
2008-07-14 14:02

@ yjlee168
Posting: # 2025
Views: 46,245
 

 bear v1.0.0 for R - first impressions

 
Dear Yung-jin!

» » I failed in reading in the NCA test file. I tried it with tabulator and semicolon as field separators and got an error:
» » Error in readChar(con, 5L, useBytes = TRUE) :
» »   cannot open connection
»
» We have not tested with a .csv file using semicolon(;) before. Please try with comma(,) to see if it can work normally. I just tried with semicolon as field separator, it can cause error.

Hhm, I failed with comma as well (file: [path to file]/TEST.CSV).
,subj,drug,seq,prd,Cmax,AUC0t,AUC0INF,LnCmax,LnAUC0t,LnAUC0INF
1,1,1,2,2,1739,14445.27,14933.21,7.461066,9.578123,9.611343
2,2,1,1,1,1481,12516.14,13185.57,7.300473,9.434774,9.486878
3,3,1,2,2,1780,15371.4,16032.71,7.484369,9.640264,9.682386
4,4,1,1,1,1374,11063.1,11669.01,7.225481,9.311371,9.364692
5,5,1,2,2,1555,13971.45,14557.62,7.349231,9.544771,9.58587
6,6,1,1,1,1756,15376.35,15964.84,7.470794,9.640586,9.678144
7,1,2,2,1,1633,12293.9,12972.87,7.398174,9.416858,9.470616
8,2,2,1,2,1837,15298.65,16210.46,7.515889,9.63552,9.693412
9,3,2,2,1,2073,15184,15691.62,7.636752,9.627998,9.660882
10,4,2,1,2,1629,13981.65,14650.8,7.395722,9.545501,9.592251
11,5,2,2,1,1385,11851.62,12550.94,7.233455,9.38022,9.437551
12,6,2,1,2,1522,13837.73,14343.58,7.327781,9.535154,9.571057


Importing the same file directly with R worked.
> data <- read.table("[path to file]/TEST.CSV", header=TRUE,
                   sep=",", na.strings="NA", dec=".",
                   strip.white=TRUE)
> data
    X subj drug seq prd Cmax    AUC0t  AUC0INF   LnCmax  LnAUC0t LnAUC0INF
1   1    1    1   2   2 1739 14445.27 14933.21 7.461066 9.578123  9.611343
2   2    2    1   1   1 1481 12516.14 13185.57 7.300473 9.434774  9.486878
3   3    3    1   2   2 1780 15371.40 16032.71 7.484369 9.640264  9.682386
4   4    4    1   1   1 1374 11063.10 11669.01 7.225481 9.311371  9.364692
5   5    5    1   2   2 1555 13971.45 14557.62 7.349231 9.544771  9.585870
6   6    6    1   1   1 1756 15376.35 15964.84 7.470794 9.640586  9.678144
7   7    1    2   2   1 1633 12293.90 12972.87 7.398174 9.416858  9.470616
8   8    2    2   1   2 1837 15298.65 16210.46 7.515889 9.635520  9.693412
9   9    3    2   2   1 2073 15184.00 15691.62 7.636752 9.627998  9.660882
10 10    4    2   1   2 1629 13981.65 14650.80 7.395722 9.545501  9.592251
11 11    5    2   2   1 1385 11851.62 12550.94 7.233455 9.380220  9.437551
12 12    6    2   1   2 1522 13837.73 14343.58 7.327781 9.535154  9.571057


Is bear looking at a particular path - different from the one set in R?
BTW, it should be documented somewhere which formats of numbers and field separators are allowed/needed. Although everybody using R knows that the decimal separator is the point '.', converting CSV files from some localized Excel versions can be cumbersome. For instance, the German standard for numbers uses the comma ',' as the decimal separator.
I think the only 'safe' field separators would be blank or tabulator.
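Just as an illustration (the file names are hypothetical), R itself copes with both conventions:

```r
# Standard CSV: comma-separated fields, point as decimal separator
# d1 <- read.csv("standard.csv")    # sep = ",", dec = "."
# German-style CSV: semicolon-separated, comma as decimal separator
# d2 <- read.csv2("german.csv")     # sep = ";", dec = ","

# spelled out with read.table() on an in-line German-style example:
d <- read.table(textConnection("x;y\n1,5;2,5"),
                sep = ";", dec = ",", header = TRUE)
d$x  # 1.5 - parsed as a proper number
```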

» » - IMHO a posteriori power is meaningless.
»
» Just for the purpose of submission to regulatory agent (at least in Taiwan).

Interesting!
The only country I knew before asking - sometimes - for a posteriori power was Greece.

Cheers,
Helmut Schütz
yjlee168
Senior
Kaohsiung, Taiwan,
2008-07-14 18:12
(edited by yjlee168 on 2008-07-14 18:32)

@ Helmut
Posting: # 2028
Views: 46,392
 

 bear v1.0.0 for R - first impressions

 
Dear HS,

» Is bear looking at a particular path - differently from the one set in R?

Yes, R uses its accessible file path. You can type getwd() on the R console to see R's working directory on your computer. Of course, you can change it yourself. Please see the R Windows FAQ or the R Mac OS FAQ under "What are HOME and working directories?". We will show this directory in the next version to avoid this error.
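For reference, this can be checked directly on the R console (the setwd() path is only an example):

```r
getwd()                  # R's current working directory - where bear looks for files
# setwd("C:/bear-data")  # change it if necessary (hypothetical path)
list.files(pattern = "\\.csv$", ignore.case = TRUE)  # .csv files visible there
```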

» BTW, somewhere it should be documented which format of both numbers and field separators are allowed/needed. Although everybody using R knows that the decimal separator is the point '.', converting CSV-files from some localized Excel versions can be cumbersome. For instance the German standard for numbers uses the comma ',' as a decimal separator.
» I think the only 'save' field separators would be blank or tabulator.

You're absolutely right. We will see what we can do about this.
In this case, we have to leave it as options to allow users to use semicolon (;) or comma (,) as the separator. Is there any other separators which are used in .csv format?

» Interesting!
» The only country I knew before asking - sometimes - for a posteriori power was Greece.

Even worse, in Taiwan a BE study with an MR dosage form still requires a multiple-dose design. We seem to need more time to change.
Thanks again, HS.

Best regards,
Yung-jin
Helmut
Hero
Vienna, Austria,
2008-07-14 18:56

@ yjlee168
Posting: # 2029
Views: 46,169
 

 bear v1.0.0 for R...

 
Dear Yung-jin!

» » Is bear looking at a particular path - differently from the one set in R?
» Yes, R uses its accessible file path.

Strange. Of course I changed the path to different locations (including the usual suspects: the desktop, the path of R, and R/bin) and placed the file in all these locations. In my example the path was set by R, and I was able to read the file from the console with read.table() - but got the error from bear.

» [...] we have to leave it as options to allow users to use semicolon (;) or comma (,) as the separator. Is there any other separators which are used in .csv format?

Actually there's not even a definition of what 'CSV' stands for. The original term was indeed 'Comma Separated Variables', but it is better read as 'Character Separated Variables'. Personally I have only received files with tabs, commas, blanks, semicolons, and - only once! - colons.
Maybe it's possible to implement a question/answer dialogue with the user about his/her file's content and format. Otherwise I expect you will get a lot of e-mails...
Since everything is possible, read.table() has these two arguments:
sep the field separator character. Values on each line of the file are separated by this character. If sep = "" (the default for read.table) the separator is 'white space', that is one or more spaces, tabs, newlines or carriage returns.
dec the character used in the file for decimal points.

» Thanks again, HS.

You are welcome. Please call me Helmut; stupid me chose my initials as the user name when registering to my own forum… ;-)

Cheers,
Helmut Schütz
yjlee168
Senior
Kaohsiung, Taiwan,
2008-07-14 19:30

@ Helmut
Posting: # 2031
Views: 46,169
 

 bear v1.0.0 for R...

 
Dear Helmut,

» in all these locations. In my example the path was set by R, and I was able to read it from the console by read.table() - but got the error from bear.

I see. I think you're talking about the NCA output. You first did the NCA and saved the output file, then loaded this NCA output when you wanted to do the GLM. The error occurred at that moment. We will check that import function again. If you choose the NCA → GLM function, the import takes only one step.

» Maybe it's possible to implement a question/answer game with the user about his/her file's content and format. Otherwise I expect you will get a lot of e-mails...

That's what I am afraid of. Lots of e-mails... :-(

» Since everything is possible, read.table() has these two arguments:
» sep the field separator character. Values on each line of the file are separated by this character. If sep = "" (the default for read.table) the separator is 'white space', that is one or more spaces, tabs, newlines or carriage returns.
» dec the character used in the file for decimal points.

You have proposed a very nice way to fix this problem. Yes, we will try to make it more flexible. Will users know what these options are for,
or are we still going to get lots of e-mails anyway? :-(

Best regards,
Yung-jin
Helmut
Hero
Vienna, Austria,
2008-07-14 20:35

@ yjlee168
Posting: # 2032
Views: 46,296
 

 bear v1.0.0 for R...

 
Dear Yung-jin,

» I see. I think you're talking about NCA output. You first did NCA and saved the output file. Then load this NCA output when you planned to do GLM. The error occurred at this moment.

Exactly! I must confess, my description was not a proper bug report...

» You have proposed a very nice way to fix this problem. Yes, we will try to make if more flexible. Does a user know what these options for?

No idea. Sometimes I believe that anybody doing PK should know about his/her file's content. But on the other hand, importing a file with field sep ';' and dec '.' isn't as straightforward as one might think (e.g., in WinNonlin):

Open -> Filetype ASCII Data (*.dat;*.dta;*.prn;*.csv) -> oops, everything in the first column…
Ah, there's an Options-button, hhm, select X Automatically determine file options -> same result…
Next try: Field delimiter ;… working, but header in first row, etc.
WinNonlin has no option to select the decimal separator anyway.

Of course there are combinations which are impossible to resolve, e.g. the file nasty.csv:
x,y
1,2
1,5,2,5

where in the third row every import routine will end with an error. No program can 'guess' that the file was produced by a badly configured German Excel and that the two numbers 1.5 and 2.5 were meant...
If we change nasty.csv (assuming a German Excel) to:
t;c
1;2
1,5;2,5

WinNonlin would read in the file, but treat the third row as plain text.
Writing =A1+5 to cell C1 we get 6, but for =A2+5 in cell C2 we get #VALUE!...
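For comparison, R's read.table() resolves this second variant cleanly once sep and dec match (a quick sketch):

```r
# the German-style file above, read with matching separators
d <- read.table(textConnection("t;c\n1;2\n1,5;2,5"),
                sep = ";", dec = ",", header = TRUE)
d$t + 5  # 6.0 and 6.5 - both rows are real numbers, no #VALUE! here
```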

» or still we are going to get lots of e-mails, anyway? :-(

Hard to tell. I wouldn't bother too much about making the input routines 'foolproof', but would invest a little effort in the help files, telling which formats are acceptable - and which are not.

Cheers,
Helmut Schütz
yjlee168
Senior
Kaohsiung, Taiwan,
2008-07-15 09:44

@ Helmut
Posting: # 2034
Views: 46,232
 

 bear v1.0.0 for R - first impressions

 
Dear Helmut,

Aha, I finally know why you got that error. You did not run the NCA through bear, but jumped directly into the GLM analysis with your own prepared NCA test file in .csv format (TEST.CSV). If you run the NCA through bear, bear generates an R-format output file with the extension .RData, not .CSV. Thus, in the following GLM step, bear loads an .RData file by default, not a .csv. That caused the error!
A nice mistake. You just reminded us that we should also allow users to load a .csv file in the GLM step. This explains your error.

» » » I failed in reading in the NCA test file. I tried it with tabulator and semicolon as field separators and got an error:
» » » Error in readChar(con, 5L, useBytes = TRUE) :
» » »   cannot open connection
» Hhm, I failed with comma as well (file: [path to file]/TEST.CSV).
» ,subj,drug,seq,prd,Cmax,AUC0t,AUC0INF,LnCmax,LnAUC0t,LnAUC0INF

All the best,
Yung-jin
martin
Senior

Austria,
2008-07-14 12:18

@ yjlee168
Posting: # 2023
Views: 46,157
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
dear hsin-ya and yung-jin

I would like to acknowledge your work !!

I suggest that you may think about implementing a more sophisticated approach for estimation of the terminal elimination rate as currently only the last 3 data points are used:

[...] lambda, the terminal phase rate constant, will be estimated from the slope of linear regression with the last three data points [...]

when using the OLS (ordinary least squares) criteria for estimation, the terminal elimination rate can be defined as a weighted sum of log-transformed concentrations where the weights are determined by the sampling schedule. in the case of a larger time gap between the last two measurement time points compared to the first two measurement time points, the concentration at the last time point is weighted more than the remaining two concentrations. this may lead to an inaccurate estimate of the terminal elimination rate when the last concentration is inaccurate for some reason (e.g. a higher inaccuracy of a bioanalytical method when dealing with values near to the limit of detection).
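as a small numerical illustration of these weights (made-up terminal data):

```r
# three terminal sampling times with a wide gap before the last point
t <- c(8, 12, 24)
y <- log(c(797, 383, 72))   # log-transformed concentrations

# the OLS slope is a weighted sum of the log-concentrations:
# w_i = (t_i - mean(t)) / sum((t - mean(t))^2)
w <- (t - mean(t)) / sum((t - mean(t))^2)
slope_weighted <- sum(w * y)
slope_lm <- coef(lm(y ~ t))[["t"]]
all.equal(slope_weighted, slope_lm)  # TRUE
which.max(abs(w))                    # 3 - the last point carries the largest weight
```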

more sophisticated and/or robust approaches for estimation of the terminal elimination rate are for example discussed in this post.

best regards

martin
yjlee168
Senior
Kaohsiung, Taiwan,
2008-07-14 19:02

@ martin
Posting: # 2030
Views: 46,222
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Dear Martin,

» I suggest that you may think about implementing a more sophisticated approach for estimation of the terminal elimination rate as currently only the last 3 data points are used:

At the beginning, when we planned how to calculate lambda in bear, we
looked at many options. The problem is that there seems to be no "standard"
method so far to calculate lambda. That's why we chose the last-3-point
regression method: it's simple and easy to implement. It's just like using
the linear trapezoidal (LT) method to calculate AUC - scientifically
speaking, none of us thinks that LT is as accurate as many other methods. In
order to validate the results obtained from bear, we chose WinNonlin as
the "reference" for comparison.

» [...] lambda, the terminal phase rate constant, will be estimated from the slope of linear regression with the last three data points [...]
» when using the OLS (ordinary least squares) criteria for estimation, the terminal elimination rate can be defined as a weighted sum of log-transformed concentrations where the weights are determined by the sampling schedule. in the case of a larger time gap between the last two measurement time points compared to the first two measurement time points, the concentration at the last time point is weighted more than the remaining two concentrations. this may lead to an inaccurate estimate of the terminal elimination rate when the last concentration is inaccurate for some reason (e.g. a higher inaccuracy of a bioanalytical method when dealing with values near to the limit of detection).

You're absolutely right. We will see what we can do about a more sophisticated method to calculate lambda.

» more sophisticated and/or robust approaches for estimation of the terminal elimination rate are for example discussed in this post.

A very good pointer and references there. Many thanks.

All the best,
Yung-jin
Aceto81
Regular

Belgium,
2008-07-15 10:07

@ yjlee168
Posting: # 2035
Views: 46,119
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
» » more sophisticated and/or robust approaches for estimation of the terminal elimination rate are for example discussed in this post.
» A very good pointer and references there. Many thanks.

Yung-jin,

first of all thanks for sharing the Bear package.
For the estimation of the terminal elimination rate and the number of points to pick, I wrote a function in the past which uses the WinNonlin approach, as in the thread mentioned before.
Maybe it is of some help (with some adjustments for your code).
# assumes two vectors, time and conc, of one concentration-time profile
dat <- data.frame(time, conc)
n_lambda <- 0
r.adj <- 0
# scan backwards from the last 4 points towards Cmax; accept a larger
# subset as long as the adjusted R^2 does not drop by 0.0001 or more
for (i in (nrow(dat)-3):which.max(dat$conc)) {
  if (r.adj - round(summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared,4) < 0.0001) {
    n_lambda <- nrow(dat)-i+1
    r.adj <- summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared
  }
}


Best regards

Ace

--
Edit: I changed the tabs in the post to serial blanks; otherwise on some displays the output gets confusing.
See also this post for updated code (preventing Cmax/tmax to be included). [Helmut]
yjlee168
Senior
Kaohsiung, Taiwan,
2008-07-16 07:22
(edited by Jaime_R on 2008-07-16 08:55)

@ Aceto81
Posting: # 2042
Views: 46,012
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Dear Ace,

Thank you very much for sharing, and sorry for the delayed reply. We will try to implement your algorithm in bear as another option for users to choose. Do you have a reference for the algorithm/code you provided?

All the best,
Yung-jin

--
Edit: Full quote removed. Please see this post! [Jaime]
Aceto81
Regular

Belgium,
2008-07-16 09:47
(edited by Aceto81 on 2008-07-16 10:55)

@ yjlee168
Posting: # 2045
Views: 45,968
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Dear Yung-jin,

I'm sorry, but the only reference I have is the thread mentioned above and my comparison with WinNonlin.

But the good news: maybe I've got a solution for the csv problem with the , and ;
read.multi <- function(file)
  read.table(file,
             sep = ifelse(regexpr(";", scan(file, "character", 1, quiet = TRUE)) > -1,
                          ";", ","),
             header = TRUE)

Kind Regards

Ace

--
Edit: Full quote removed. Please see this post! [Ohlbe] Edit: Added the code [Ace]
yjlee168
Senior
Kaohsiung, Taiwan,
2008-07-16 21:32

@ Aceto81
Posting: # 2049
Views: 45,926
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Dear Ace,

Great! Thank you for sharing your code. We will try to make bear as user-friendly as possible.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
yjlee168
Senior
Kaohsiung, Taiwan,
2008-09-23 12:13

@ Aceto81
Posting: # 2387
Views: 46,203
 

 lambda_z estimation

 
Dear Ace,

We've now tried to implement your method (code) in bear, but we found a difference between your method and WinNonlin. Here is your method as R code:
b <- c(0,0.25,0.5,0.75,1,1.5,2,3,4,8,12,24)
c <- c(0,36.1,125,567,963,1343,1739,1604,1460,797,383,72)
dat <- data.frame(time=b, conc=c)
n_lambda <- 1
r.adj <- 0

for (i in (nrow(dat)-3):which.max(dat$conc)) {
  if (r.adj - round(summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared,4) < 0.0001) {
    n_lambda <- nrow(dat)-i+1
    r.adj <- summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared
  }
}


And we got

Call:
lm(formula = log(conc) ~ time, data = dat[i:nrow(dat), ])

Residuals:
        7         8         9        10        11        12
-0.057481  0.009958  0.064143  0.051802 -0.088021  0.019598

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept)  7.815044   0.042438  184.15 5.22e-09 ***
time        -0.148249   0.003646  -40.66 2.19e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0677 on 4 degrees of freedom
Multiple R-squared: 0.9976,     Adjusted R-squared: 0.997
F-statistic:  1654 on 1 and 4 DF,  p-value: 2.186e-06


Your method picked 6 data points for this example. However, WinNonlin
picked 5 data points as follows:

Model:  Plasma Data, Extravascular Administration
Number of nonmissing observations:   12
Dose time:      0.00
Dose amount:    80000.00
Calculation method:  Linear Trapezoidal with Linear/Log Interpolation
Weighting for lambda_z calculations:  Uniform weighting
Lambda_z method:  Find best fit for lambda_z,  Log regression


Summary Table
-------------
      Time         Conc.      Pred.    Residual      AUC       AUMC      Weight
-------------------------------------------------------------------------------
     0.0000       0.0000                           0.0000     0.0000
     0.2500        36.10                            4.513      1.128
     0.5000        125.0                            24.65      10.07
     0.7500        567.0                            111.2      71.04
      1.000        932.0                            298.5      240.7
      1.500        1343.                            867.3      977.3
      2.000        1739.                            1638.      2350.
      3.000 *      1604.      1625.     -21.46      3309.      6495.      1.000
      4.000 *      1460.      1399.      60.79      4841. 1.182e+004      1.000
      8.000 *      797.0      768.3      28.73      9355. 3.625e+004      1.000
      12.00 *      383.0      421.8     -38.84 1.172e+004 5.820e+004      1.000
      24.00 *      72.00      69.83      2.172 1.445e+004 9.614e+004      1.000


*) Starred values were included in the estimation of Lambda_z.


Final Parameters
---------------
Rsq                                                       0.9979
Rsq_adjusted                                              0.9972
Corr_XY                                                  -0.9990
No_points_lambda_z                                        5
Lambda_z                                                  0.1499
Lambda_z_lower                                            3.0000
Lambda_z_upper                                           24.0000
HL_Lambda_z                                               4.6246
Tlag                                                      0.0000
Tmax                                                      2.0000
... (truncated after this line)


What are we missing here? Thanks.

All the best,
---Yung-jin Lee
d_labes
Hero

Berlin, Germany,
2008-09-23 14:41

@ yjlee168
Posting: # 2390
Views: 45,501
 

 lambda_z estimation

 
Dear Yung-jin Lee,

I do not own WinNonlin, but I suggest that the point (tmax, Cmax) should not be included in the terminal part.

I further suggest that the TTT (two-times tmax) rule should be implemented to restrict the number of points in the terminal part, provided that at least 3 points are available.
See this thread for the TTT rule.

Maybe the Akaike information criterion AIC (especially AICc, the small-sample formula) is a better choice than the adjusted R-squared.
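A rough sketch of both suggestions (the profile is made up; this is not a tested implementation):

```r
time <- c(0, 0.5, 1, 2, 3, 4, 8, 12, 24)
conc <- c(0, 125, 963, 1739, 1604, 1460, 797, 383, 72)

# TTT rule: restrict the terminal fit to points later than 2 * tmax
tmax <- time[which.max(conc)]   # 2
keep <- which(time > 2 * tmax)  # indices of t = 8, 12, 24

# AICc for a candidate terminal fit (needs n > k + 1, i.e. >= 5 points here)
aicc <- function(idx) {
  fit <- lm(log(conc[idx]) ~ time[idx])
  n <- length(idx)
  k <- length(coef(fit)) + 1    # slope, intercept, residual sigma
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}
c(six_pts = aicc(4:9), five_pts = aicc(5:9))  # smaller AICc = better candidate
```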

Regards,

Detlew
yjlee168
Senior
Kaohsiung, Taiwan,
2008-09-24 22:41

@ d_labes
Posting: # 2397
Views: 45,728
 

 lambda_z estimation

 
Dear d_labes,

Yes, Helmut also commented on this. He even said that Tmax and Cmax should not be included, even for a 1-compartment model. I read the previous threads you and Helmut pointed to. I think it is not appropriate to include Tmax and Cmax in calculating lambda_z for a drug with a 2-compartment model. However, for a drug with a 1-compartment model it should theoretically be o.k. :ponder: But I will think about that later.
Currently, we do not plan to include Tmax and Cmax when estimating lambda_z.
Some automatic algorithms (such as the one suggested by Ace, or WinNonlin's) may pick Cmax to estimate lambda_z. We now allow users to pick these data points visually and manually in bear.

We will try to implement the TTT rule in bear.

Using AIC to choose the appropriate data points for estimating lambda_z is very interesting. As far as I know, AIC is used for model selection or model discrimination: basically, we fit different models to the same observed data (points). I really don't know whether we can use AIC to vary the data points for the same model (linear regression). That's why, when doing PK/PD modeling, we do not try to abandon or pick some data points from the observed dataset for a model. But it is a very interesting and innovative approach.

All the best,
---Yung-jin Lee
Aceto81
Regular

Belgium,
2008-09-25 10:56

@ d_labes
Posting: # 2401
Views: 45,490
 

 lambda_z estimation

 
» I do not own WINNONLIN, but I suggest that the point tmax, Cmax should not be included in the terminal part.
»
» I further suggest that the TTT (two-times tmax) rule should be implemented to restrict the number of points in the terminal part, provided that at least 3 points are available.
» See this thread for the TTT rule.
»
» May be the Akaike information criterion AIC (especially the AICc, small sample formula) is a better choice than adj. RSquare.

If the Cmax may not be included, this explains the difference between the R code and WinNonlin.
If you change
   for (i in (nrow(dat)-3):which.max(dat$conc))
to
   for (i in (nrow(dat)-3):(which.max(dat$conc)+1))
then Cmax is excluded.

And you would get the same result as WinNonlin.

Ace
yjlee168
Senior
Kaohsiung, Taiwan,
2008-09-25 21:00

@ Aceto81
Posting: # 2414
Views: 45,393
 

 lambda_z estimation

 
Dear Ace,

Got it! We'll try it again and will report the result back here soon.
Many thanks.

» If the Cmax may not be included, this explains the difference between the R code and WinNonlin.
» If you change
»    for (i in (nrow(dat)-3):which.max(dat$conc))
» to
»    for (i in (nrow(dat)-3):(which.max(dat$conc)+1))
» then Cmax is excluded.
»
» And you would get the same result as WinNonlin.

All the best,
---Yung-jin Lee
yjlee168
Senior
Kaohsiung, Taiwan,
2008-09-26 08:03

@ Aceto81
Posting: # 2419
Views: 45,453
 

 lambda_z estimation

 
Dear Ace,

We tried three examples with code as follows:

Ex.01
    b<-c(0,0.25,0.5,0.75,1,1.5,2,3,4,8,12,24)
    c<-c(0,36.1,125,567,963,1343,1739,1604,1460,797,383,72)
    dat <- data.frame(time=b,conc=c)

    n_lambda=0
    r.adj=0

    for (i in (nrow(dat)-3):(which.max(dat$conc)+1)) {
      if (r.adj - round(summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared,4) < 0.0001) {
        n_lambda <- nrow(dat)-i+1
        r.adj <- summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared
      }
    }
    summary(lm(log(conc)~time,dat[i:nrow(dat),]))

    Call:
    lm(formula = log(conc) ~ time, data = dat[i:nrow(dat), ])

    Residuals:
           8        9       10       11       12
    -0.01329  0.04253  0.03672 -0.09658  0.03062

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)   
    (Intercept)  7.843188   0.050394  155.64 5.85e-07 ***
    time        -0.149881   0.003962  -37.83 4.06e-05 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.06733 on 3 degrees of freedom
    Multiple R-squared: 0.9979,     Adjusted R-squared: 0.9972
    F-statistic:  1431 on 1 and 3 DF,  p-value: 4.063e-05

Ex. 02
    b<-c(0,0.25,0.5,0.75,1,1.5,2,3,4,8,12,24)
    d<-c(0,84.5,192,629,873,1246,1633,1375,1006,616,379,84.4)
    dat <- data.frame(time=b,conc=d)
    n_lambda=0
    r.adj=0
    for (i in (nrow(dat)-3):(which.max(dat$conc)+1)) {
    if (r.adj - round(summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared,4) <
    (0.0001)) {
    n_lambda <- nrow(dat)-i+1
    r.adj <- summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared
    }
    }
    summary(lm(log(conc)~time,dat[i:nrow(dat),]))

    Call:
    lm(formula = log(conc) ~ time, data = dat[i:nrow(dat), ])

    Residuals:
           8        9       10       11       12
    0.11336 -0.07056 -0.04683 -0.01832  0.02236

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)   
    (Intercept)  7.498518   0.062411   120.1 1.27e-06 ***
    time        -0.128555   0.004907   -26.2 0.000122 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.08338 on 3 degrees of freedom
    Multiple R-squared: 0.9956,     Adjusted R-squared: 0.9942
    F-statistic: 686.5 on 1 and 3 DF,  p-value: 0.0001220

Ex. 03
    b<-c(0,0.5,0.75,1,1.5,2,3,4,8,12,24)
    c<-c(0,69.7,167,602,1023,1388,1481,1346,658,336,84)
    dat <- data.frame(time=b,conc=c)

    n_lambda=0
    r.adj=0

    for (i in (nrow(dat)-3):(which.max(dat$conc)+1)) {
    if (r.adj - round(summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared,4) <
    (0.0001)) {
    n_lambda <- nrow(dat)-i+1
    r.adj <- summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared
    }
    }
    summary(lm(log(conc)~time,dat[i:nrow(dat),]))

    Call:
    lm(formula = log(conc) ~ time, data = dat[i:nrow(dat), ])

    Residuals:
           8        9       10       11
    0.13274 -0.03963 -0.16840  0.07528

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)   
    (Intercept)   7.6155     0.1541   49.41 0.000409 ***
    time         -0.1358     0.0109  -12.46 0.006376 **
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 0.1631 on 2 degrees of freedom
    Multiple R-squared: 0.9873,     Adjusted R-squared: 0.9809
    F-statistic: 155.3 on 1 and 2 DF,  p-value: 0.006376


The final results showed that only Ex. 01 picked exactly the same data points (5) as WinNonlin. With Ex. 02 and Ex. 03, your method picked 5 and 4 data points, respectively, compared with the 4 and 3 picked by WinNonlin. There is still something wrong with your method/code.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
Aceto81
Regular

Belgium,
2008-09-26 10:25

@ yjlee168
Posting: # 2421
Views: 45,425
 

 lambda_z estimation

 
Dear Yung-jin,

Thanks for trying my code with different examples; I learn a lot too.
So here we go:
It looks like the rounding was not a good idea; this was the reason for the failure of example 2.
For example 3: (nrow(dat)-3):nrow(dat) gives 4 rows instead of 3, so the code wasn't able to pick fewer than 4 points.

So here is the new code. I wrapped it up into a function and ran your samples through it:
f<- function(dat) {
n_lambda=0
r.adj=0
for (i in (nrow(dat)-2):(which.max(dat$conc+1))) {
if (r.adj - summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared <
   (0.0001)) {
   n_lambda <- nrow(dat)-i+1
   r.adj <- summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared
 }
}
return(n_lambda)
}


So for your examples:
ex1:
b<-c(0,0.25,0.5,0.75,1,1.5,2,3,4,8,12,24)
c<-c(0,36.1,125,567,963,1343,1739,1604,1460,797,383,72)
dat <- data.frame(time=b,conc=c)
f(dat)
[1] 5


ex2:
b<-c(0,0.25,0.5,0.75,1,1.5,2,3,4,8,12,24)
d<-c(0,84.5,192,629,873,1246,1633,1375,1006,616,379,84.4)
dat <- data.frame(time=b,conc=d)
f(dat)
[1] 4


ex3:
b<-c(0,0.5,0.75,1,1.5,2,3,4,8,12,24)
c<-c(0,69.7,167,602,1023,1388,1481,1346,658,336,84)
dat <- data.frame(time=b,conc=c)
f(dat)
[1] 3


This matches the WinNonlin results (finally...).

Ace
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-09-28 00:08

@ Aceto81
Posting: # 2428
Views: 45,649
 

 lambda_z estimation

 
Dear Ace,

Thanks for your assistance. I looked through your code and found that it picks the correct number of data points this time. So I changed one line of your code from return(n_lambda) to summary(lm(log(conc)~time,dat[(nrow(dat)-n_lambda+1):nrow(dat),])), because I wanted to see whether the final results are the same as those obtained from WinNonlin. Our original code for this line was summary(lm(log(conc)~time,dat[i:nrow(dat),])), which apparently does not use n_lambda. The final results are the following (dumped from the R console):

f<- function(dat) {
  n_lambda=0
  r.adj=0
  for (i in (nrow(dat)-2):(which.max(dat$conc+1))) {
  if (r.adj - summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared <
     (0.0001)) {
     n_lambda <- nrow(dat)-i+1
     r.adj <- summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared
  }
  }
  summary(lm(log(conc)~time,dat[(nrow(dat)-n_lambda+1):nrow(dat),]))
  }
b<-c(0,0.25,0.5,0.75,1,1.5,2,3,4,8,12,24)
c<-c(0,36.1,125,567,963,1343,1739,1604,1460,797,383,72)
dat <- data.frame(time=b,conc=c)
f(dat)

Call:
lm(formula = log(conc) ~ time, data = dat[(nrow(dat) - n_lambda +
    1):nrow(dat), ])

Residuals:
       8        9       10       11       12
-0.01329  0.04253  0.03672 -0.09658  0.03062

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept)  7.843188   0.050394  155.64 5.85e-07 ***
time        -0.149881   0.003962  -37.83 4.06e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.06733 on 3 degrees of freedom
Multiple R-squared: 0.9979,     Adjusted R-squared: 0.9972
F-statistic:  1431 on 1 and 3 DF,  p-value: 4.063e-05

b<-c(0,0.25,0.5,0.75,1,1.5,2,3,4,8,12,24)
d<-c(0,84.5,192,629,873,1246,1633,1375,1006,616,379,84.4)
dat <- data.frame(time=b,conc=d)
f(dat)

Call:
lm(formula = log(conc) ~ time, data = dat[(nrow(dat) - n_lambda +
    1):nrow(dat), ])

Residuals:
         9         10         11         12
-0.0057874 -0.0002764  0.0100142 -0.0039504

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  7.4155262  0.0081685   907.8 1.21e-06 ***
time        -0.1240003  0.0005776  -214.7 2.17e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.008645 on 2 degrees of freedom
Multiple R-squared:     1,      Adjusted R-squared: 0.9999
F-statistic: 4.609e+04 on 1 and 2 DF,  p-value: 2.170e-05

b<-c(0,0.5,0.75,1,1.5,2,3,4,8,12,24)
c<-c(0,69.7,167,602,1023,1388,1481,1346,658,336,84)
dat <- data.frame(time=b,conc=c)
f(dat)

Call:
lm(formula = log(conc) ~ time, data = dat[(nrow(dat) - n_lambda +
    1):nrow(dat), ])

Residuals:
       9       10       11
 0.07269 -0.09692  0.02423

Coefficients:
            Estimate Std. Error t value Pr(>|t|) 
(Intercept)  7.42148    0.16961   43.76   0.0145 *
time        -0.12562    0.01049  -11.97   0.0530 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1236 on 1 degrees of freedom
Multiple R-squared: 0.9931,     Adjusted R-squared: 0.9861
F-statistic: 143.4 on 1 and 1 DF,  p-value: 0.05305

Bingo! The final results are exactly the same as those obtained from WinNonlin with these three examples. We will test more examples to confirm this.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-09-29 13:32

@ Aceto81
Posting: # 2437
Views: 45,339
 

 lambda_z estimation

 
Dear Ace,

This is my second reply to your previous post. In my first reply, I said that the three examples (Ex. 01-03) we tested all matched the data points picked by WinNonlin (WNL). We then went on to test three more examples (Ex. 04-06) and found that both your method and WNL may include the point (Tmax, Cmax) when estimating lambda_z, as with Ex. 04 and Ex. 06. Please see the following results.

Ace -Ex. 04
b<-c(0,0.25,0.5,0.75,1,1.5,2,3,4,8,12,24)
c<-c(0,30.1,211,1221,1485,1837,1615,1621,1411,763,424,109)
dat <- data.frame(time=b,conc=c)
...truncated here (due to limited characters; please see previous posts)
Call:
lm(formula = log(conc) ~ time, data = dat[(nrow(dat) - n_lambda +
    1):nrow(dat), ])

Residuals:
       6        7        8        9       10       11       12
0.01135 -0.05373  0.07742  0.06613 -0.03889 -0.11662  0.05434

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept)  7.695712   0.043288  177.78 1.07e-10 ***
time        -0.127446   0.004011  -31.77 5.80e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.07931 on 5 degrees of freedom
Multiple R-squared: 0.9951,     Adjusted R-squared: 0.9941
F-statistic:  1010 on 1 and 5 DF,  p-value: 5.8e-07


WNL - Ex. 04
...truncated here

Summary Table
-------------
      Time         Conc.      Pred.    Residual      AUC       AUMC      Weight
-------------------------------------------------------------------------------
     0.0000       0.0000                           0.0000     0.0000
     0.2500        30.10                            3.763     0.9406
     0.5000        211.0                            33.90      15.07
     0.7500        1221.                            212.9      142.7
      1.000        1485.                            551.2      442.8
      1.500 *      1837.      1816.      20.72      1382.      1503.      1.000
      2.000 *      1615.      1704.     -89.15      2245.      2999.      1.000
      3.000 *      1621.      1500.      120.8      3863.      7046.      1.000
      4.000 *      1411.      1321.      90.29      5379. 1.230e+004      1.000
      8.000 *      763.0      793.3     -30.25      9727. 3.580e+004      1.000
      12.00 *      424.0      476.4     -52.45 1.210e+004 5.818e+004      1.000
      24.00 *      109.0      103.2      5.765 1.530e+004 1.044e+005      1.000


*) Starred values were included in the estimation of Lambda_z.
...truncated here


Ace - Ex.05
b<-c(0,0.5,0.75,1,1.5,2,3,4,8,12,24)
c<-c(0,38.2,277,631,1002,1780,1776,1618,782,466,89.7)
... truncated here ...

Residuals:
        9        10        11
-0.010928  0.014570 -0.003643

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept)  7.759117   0.025498   304.3  0.00209 **
time        -0.135792   0.001577   -86.1  0.00739 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

... truncated here

WNL - Ex. 05
[code]... truncated here
Summary Table
-------------
      Time         Conc.      Pred.    Residual      AUC       AUMC      Weight
-------------------------------------------------------------------------------
     0.0000       0.0000                           0.0000     0.0000
     0.5000        38.20                            9.550      4.775
     0.7500        277.0                            48.95      33.13
      1.000        631.0                            162.5      138.0
      1.500        1002.                            570.7      671.5
      2.000        1780.                            1266.      1937.
      3.000        1776.                            3044.      6381.
      4.000        1618.                            4741. 1.228e+004
      8.000 *      782.0      790.6     -8.592      9541. 3.774e+004      1.000
      12.00 *      466.0      459.3      6.741 1.204e+004 6.143e+004      1.000
      24.00 *      89.70      90.03    -0.3273 1.537e+004 1.079e+005      1.000
...truncated here


Ace - Ex. 06
b<-c(0,0.25,0.5,0.75,1,1.5,2,3,4,8,12,24)
c<-c(0,32.8,181,271,402,783,2073,1842,1610,883,389,75.8)
...truncated here
Call:
lm(formula = log(conc) ~ time, data = dat[(nrow(dat) - n_lambda +
    1):nrow(dat), ])

Residuals:
       7        8        9       10       11       12
-0.01308  0.02206  0.04072  0.05320 -0.15341  0.05052

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept)  7.956404   0.055392  143.64 1.41e-08 ***
time        -0.153284   0.004759  -32.21 5.54e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.08836 on 4 degrees of freedom
Multiple R-squared: 0.9962,     Adjusted R-squared: 0.9952
F-statistic:  1038 on 1 and 4 DF,  p-value: 5.537e-06


WNL - Ex. 06
...truncated here

Summary Table
-------------
      Time         Conc.      Pred.    Residual      AUC       AUMC      Weight
-------------------------------------------------------------------------------
     0.0000       0.0000                           0.0000     0.0000
     0.2500        32.80                            4.100      1.025
     0.5000        181.0                            30.83      13.36
     0.7500        271.0                            87.33      50.08
      1.000        402.0                            171.5      125.7
      1.500        783.0                            467.7      519.9
      2.000 *      2073.      2100.     -27.30      1182.      1850.      1.000
      3.000 *      1842.      1802.      40.18      3139.      6686.      1.000
      4.000 *      1610.      1546.      64.25      4865. 1.267e+004      1.000
      8.000 *      883.0      837.3      45.74      9851. 3.968e+004      1.000
      12.00 *      389.0      453.5     -64.50 1.240e+004 6.314e+004      1.000
      24.00 *      75.80      72.07      3.734 1.518e+004 1.021e+005      1.000

*) Starred values were included in the estimation of Lambda_z.
... truncated here


It looks like your method and WNL are now quite consistent in data point selection for lambda_z estimation. The remaining question is that both methods still cannot absolutely rule out the Cmax data point (e.g. Ex. 04 and Ex. 06).

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
Helmut
Hero
avatar
Homepage
Vienna, Austria,
2008-09-29 15:54

@ yjlee168
Posting: # 2438
Views: 45,666
 

 WinNonlin 5.2.1 vs. 6 beta

 
Dear Yung-jin,

wonderful!
I don't know which version of WinNonlin you are using, but I could reproduce your results for Ex. 04 in the current version, v5.2.1. This contradicts the manual and the online help.
Now for the weird part: I tried the dataset in WinNonlin v6 beta (Phoenix) as well and got:
Summary Table
-------------
      Time         Conc.      Pred.    Residual      AUC       AUMC      Weight
-------------------------------------------------------------------------------
     0.0000       0.0000                           0.0000     0.0000
     0.2500        30.10                            3.763     0.9406
     0.5000        211.0                            33.90      15.07
     0.7500        1221.                            212.9      142.7
      1.000        1485.                            551.2      442.8
      1.500        1837.                            1382.      1503.
      2.000        1615.                            2245.      2999.
      3.000        1621.                            3863.      7046.
      4.000        1411.                            5379. 1.230e+004
      8.000 *      763.0      728.2      34.77      9727. 3.580e+004      1.000
      12.00 *      424.0      451.2     -27.20 1.210e+004 5.818e+004      1.000
      24.00 *      109.0      107.3      1.681 1.530e+004 1.044e+005      1.000


*) Starred values were included in the estimation of Lambda_z.


Final Parameters
---------------
Rsq                                                       0.9968
Rsq_adjusted                                              0.9937
Corr_XY                                                  -0.9984
No_points_lambda_z                                        3
Lambda_z                                                  0.1197
Lambda_z_lower                                            8.0000
Lambda_z_upper                                           24.0000
HL_Lambda_z                                               5.7919
...

As you can see, only the last three data points were picked, although according to the manual the algorithm should be the same as in previous versions. :confused:
In both cases we should consider contacting Pharsight's support to clarify the issue.

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Science Quotes
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-09-29 18:45
(edited by yjlee168 on 2008-09-29 19:16)

@ Helmut
Posting: # 2439
Views: 45,791
 

 WinNonlin 5.2.1 vs. 6 beta

 
Dear Helmut,

We used WNL v5.1.1 here. What WNL actually does when estimating lambda_z is to include the last three data points first and calculate the adjusted R2 for this step. It then includes one more data point (i.e. the 4th from the last) and calculates the adj. R2 again. If the difference between these two adj. R2s is less than or equal to 0.0001, it stops there; if not, it keeps the 4th data point and adds one more (i.e. the 5th from the last), and so on. It will not stop until it reaches Cmax, which is why WNL may include Cmax when estimating lambda_z; it should stop at the data point next to Cmax. The result you got from WNL v6 beta does not seem to meet this stopping criterion: the adj. R2 for Ex. 04 including Cmax is 0.9941 with WNL v5.1.1 (or your WNL v5.2.1), whereas v6 beta got only 0.9937.

WNL v6 beta - Ex. 04 (quoted from your previous post)
...
 Final Parameters
 ---------------
 Rsq                                                       0.9968
 Rsq_adjusted                                              0.9937
...

WNL - Ex. 04
Model:  Plasma Data, Extravascular Administration
Number of nonmissing observations:   12
Dose time:      0.00
Dose amount:    80000.00
Calculation method:  Linear Trapezoidal with Linear Interpolation
Weighting for lambda_z calculations:  Uniform weighting
Lambda_z method:  Find best fit for lambda_z,  Log regression

Summary Table
-------------
      Time         Conc.      Pred.    Residual      AUC       AUMC      Weight
-------------------------------------------------------------------------------
     0.0000       0.0000                           0.0000     0.0000
     0.2500        30.10                            3.763     0.9406
     0.5000        211.0                            33.90      15.07
     0.7500        1221.                            212.9      142.7
      1.000        1485.                            551.2      442.8
      1.500 *      1837.      1816.      20.72      1382.      1503.      1.000
      2.000 *      1615.      1704.     -89.15      2245.      2999.      1.000
      3.000 *      1621.      1500.      120.8      3863.      7046.      1.000
      4.000 *      1411.      1321.      90.29      5379. 1.230e+004      1.000
      8.000 *      763.0      793.3     -30.25      9727. 3.580e+004      1.000
      12.00 *      424.0      476.4     -52.45 1.210e+004 5.818e+004      1.000
      24.00 *      109.0      103.2      5.765 1.530e+004 1.044e+005      1.000

*) Starred values were included in the estimation of Lambda_z.

Final Parameters
---------------
Rsq                                                       0.9951
Rsq_adjusted                                              0.9941
Corr_XY                                                  -0.9975
No_points_lambda_z                                        7
Lambda_z                                                  0.1274
Lambda_z_lower                                            1.5000
Lambda_z_upper                                           24.0000


» For both cases we should consider contacting Pharsight's support to clarify the issue.

Yes, I agree with you. However, if you don't agree with including Cmax in the lambda_z estimation, then WNL (v5.x.x) should not be used for this purpose.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-09-30 08:10

@ Helmut
Posting: # 2440
Views: 45,415
 

 WinNonlin 5.2.1 vs. 6 beta --> new finding

 
Dear Helmut,

I tried to eliminate Cmax from the data point selection for lambda_z estimation with Ace's method (code/algorithm) and found something interesting: WinNonlin (WNL) v6 beta actually does exclude Cmax from the point selection! With the modified method I got the same results as WNL v6 beta did (your previous post) for Ex. 04 (and probably Ex. 06, too).
In my previous post, the algorithm I described for data point selection by WNL may not be correct. As a matter of fact, WNL v5.x.x calculates an adj. R2 value starting with the last three data points, then the last 4, the last 5, ... up to and including Cmax. Finally it picks the dataset with the maximum adj. R2 value and uses it in the linear regression that estimates lambda_z. With WNL v6 beta, however, the search stops at the data point next to Cmax. So I changed the line for (i in (nrow(dat)-2):(which.max(dat$conc+1))) in Ace's method to for (i in (nrow(dat)-2):(which.max(dat$conc))+1) (R scripts) and found that I could successfully eliminate Cmax from the data point selection. Then the results surprised me: we got the same results as those you got from WNL v6 beta! See the following (pasted from the R console).
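To make the comparison concrete, the "maximum adj. R2" selection can be sketched in a few lines of R (a sketch only; the function name best_fit_n and the exclude_cmax switch are mine, not bear's or WinNonlin's actual code):

```r
# Sketch of the "maximum adj. R^2" selection: fit log(conc) ~ time to the
# last 3, 4, ... points and keep the subset with the largest adjusted R^2.
# exclude_cmax = TRUE stops at the point next to Cmax (the WNL v6 beta
# behaviour); FALSE allows Cmax itself (the WNL v5.x.x behaviour).
best_fit_n <- function(dat, exclude_cmax = TRUE) {
  first  <- which.max(dat$conc) + if (exclude_cmax) 1 else 0
  starts <- (nrow(dat) - 2):first          # candidate first points
  adj    <- sapply(starts, function(i)
    summary(lm(log(conc) ~ time, dat[i:nrow(dat), ]))$adj.r.squared)
  nrow(dat) - starts[which.max(adj)] + 1   # number of points used
}
```

For Ex. 04 this returns 3 points with exclude_cmax = TRUE (matching WNL v6 beta) and 7 points with exclude_cmax = FALSE (matching WNL v5.x.x).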

Ex. 04
Call:
lm(formula = log(conc) ~ time, data = dat[(nrow(dat) - n_lambda +
    1):nrow(dat), ])

Residuals:
      10       11       12
 0.04664 -0.06218  0.01555

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept)  7.548030   0.108820   69.36  0.00918 **
time        -0.119676   0.006731  -17.78  0.03577 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.07927 on 1 degrees of freedom
Multiple R-squared: 0.9968,     Adjusted R-squared: 0.9937
F-statistic: 316.1 on 1 and 1 DF,  p-value: 0.03577

Ex. 06
Call:
lm(formula = log(conc) ~ time, data = dat[(nrow(dat) - n_lambda +
    1):nrow(dat), ])

Residuals:
       8        9       10       11       12
 0.01676  0.03580  0.04976 -0.15536  0.05303

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept)  7.962810   0.076072  104.67 1.92e-06 ***
time        -0.153656   0.005981  -25.69 0.000129 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1016 on 3 degrees of freedom
Multiple R-squared: 0.9955,     Adjusted R-squared: 0.994
F-statistic: 660.1 on 1 and 3 DF,  p-value: 0.0001293


Ex. 05 gave the same result as that obtained from the previous version. This happens when the dataset that includes Cmax does not have the maximum adj. R2. If you like, you can test Ex. 06 with WNL v6 beta (WNL v5.x.x included Cmax with this example, too) to see whether you get exactly the same results as ours. It should, I guess. We will keep testing more datasets, and if no other errors turn up we will include this method in bear. Time to move on to coding the TTT method...

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
d_labes
Hero

Berlin, Germany,
2008-09-30 09:43
(edited by d_labes on 2008-09-30 10:08)

@ yjlee168
Posting: # 2441
Views: 45,407
 

 Example 4 in SASophylistic

 
Dear folks,

if you need an independent opinion (no R, no Far Side, but 'the power to know' :-D ), here are the results of the log-linear regression with SAS for example 4, to 6 decimals, without the point (tmax, Cmax):
N used lambdaZ     Rsq      Adj.Rsq
 3     0.119676  0.996846  0.993692
 4     0.126202  0.993931  0.990895
 5     0.128630  0.994243  0.992324
 6     0.127205  0.994425  0.993032
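The same regressions can be reproduced in R as a cross-check (a sketch; the loop and the column labels are mine):

```r
# Log-linear fits of the last 3..6 points of example 4, excluding the
# point (tmax, Cmax); tabulate lambda_z, R^2 and adjusted R^2 per fit.
time <- c(0, 0.25, 0.5, 0.75, 1, 1.5, 2, 3, 4, 8, 12, 24)
conc <- c(0, 30.1, 211, 1221, 1485, 1837, 1615, 1621, 1411, 763, 424, 109)
dat  <- data.frame(time, conc)
tab  <- t(sapply(3:6, function(n) {
  fit <- lm(log(conc) ~ time, tail(dat, n))   # last n points only
  s   <- summary(fit)
  c(N = n, lambdaZ = -coef(fit)[["time"]],
    Rsq = s$r.squared, Adj.Rsq = s$adj.r.squared)
}))
round(tab, 6)
```

Rounded to 6 decimals this reproduces the SAS table above.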

If I understood the algorithm right, 3 points in the terminal phase should be the solution.

From the WinNonlin users guide (don't know which version; found it on the Internet):
"[...] WinNonlin repeats regressions using the last three points with non-zero concentrations, then the last four points, last five, etc. Points prior to Cmax [...] are not used [...] For each regression, an adjusted R2 is computed [...] The regression with the largest adjusted R2 is selected to estimate lambdaZ, with these caveats:
• If the adjusted R2 does not improve, but is within 0.0001 of the largest adjusted R2 value, the regression with the larger number of points is used.
• lambdaZ must be positive, and calculated from at least three data points
."
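Read literally, the quoted rule can be sketched in R as follows (my own sketch, not Pharsight's code; I assume here that the point (tmax, Cmax) itself is also excluded, which the ellipsis in the quote leaves ambiguous):

```r
# Users-guide rule: fit the last 3, 4, ... points (none prior to Cmax),
# take the largest adjusted R^2, but if a fit with more points comes
# within 0.0001 of it, use that fit instead.
wnl_rule_n <- function(dat) {
  first  <- which.max(dat$conc) + 1     # assumption: exclude Cmax itself
  starts <- (nrow(dat) - 2):first
  adj <- sapply(starts, function(i)
    summary(lm(log(conc) ~ time, dat[i:nrow(dat), ]))$adj.r.squared)
  n  <- nrow(dat) - starts + 1          # points used by each candidate
  ok <- adj >= max(adj) - 0.0001        # within 0.0001 of the best
  max(n[ok])                            # tie goes to more points
}
```

For example 4 this returns 3 points, in line with the SAS table above.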

If you use the point (tmax, Cmax), one gets:
N used lambdaZ     Rsq      Adj.Rsq
 7     0.127446  0.995071  0.994086

If the algorithm is not serial (stop if there is no increase in adj. R2), and this seems to be the case, then N used = 7.

Regards,

Detlew
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-09-30 12:23

@ d_labes
Posting: # 2445
Views: 45,165
 

 Example 4 in SASophylistic

 
Dear DLabes,

You're correct about the method of data point selection by WNL. Thanks.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-09-30 12:34

@ d_labes
Posting: # 2446
Views: 45,400
 

 TTT method for lambda_z estimation

 
Dear DLabes,

We have some initial results for the estimation of lambda_z using the TTT method. The following is the output from the R console.
b<-c(0,0.25,0.5,0.75,1,1.5,2,3,4,8,12,24)
c<-c(0,30.1,211,1221,1485,1837,1615,1621,1411,763,424,109)
dat <- data.frame(time=b,conc=c)
n_lambda=0
x<-which.max(dat$conc)
TTT<-dat[x,1]*2
for (i in (nrow(dat)-2):(which.max(dat$conc)+1)) {
 if (dat$time[[i]] < TTT) {
    n_lambda <- nrow(dat)-i
  }
 }
dat
    time   conc
1   0.00    0.0
2   0.25   30.1
3   0.50  211.0
4   0.75 1221.0
5   1.00 1485.0
6   1.50 1837.0 <-- Tmax
7   2.00 1615.0
8   3.00 1621.0 <-- Two Tmax Time (TTT)
9   4.00 1411.0
10  8.00  763.0
11 12.00  424.0
12 24.00  109.0
summary(lm(log(conc)~time,dat[(nrow(dat)-n_lambda)+1:nrow(dat),]))

Call:
lm(formula = log(conc) ~ time, data = dat[(nrow(dat) - n_lambda) +
    1:nrow(dat), ])

Residuals:
       8        9       10       11       12  <--- data points selected
 0.06042  0.05031 -0.04997 -0.12297  0.06220

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept)  7.716264   0.071884  107.34 1.78e-06 ***
time        -0.128630   0.005651  -22.76 0.000186 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.09604 on 3 degrees of freedom
  (7 observations deleted due to missingness)
Multiple R-squared: 0.9942,     Adjusted R-squared: 0.9923
F-statistic: 518.1 on 1 and 3 DF,  p-value: 0.0001857

It seems not too difficult to implement the TTT method in R. Moreover, it is easy to validate visually. Thanks.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
d_labes
Hero

Berlin, Germany,
2008-09-30 14:48

@ yjlee168
Posting: # 2449
Views: 45,287
 

 TTT method for lambda_z estimation

 
Dear Yung-jin Lee,

» We have some initial results for the estimation of lambda_z using the TTT method.

Congratulations.

Because I'm not a useR! I cannot totally follow your code.
But I think your implementation is the TTT rule as invented: take all points after 2 x tmax for estimating lambda_z. Am I right?

If yes, I wish to specify my suggestion further:
The TTT rule is only valid if the concentration-time course resembles a one-compartment curve (see the inventors' paper for this1)).
Thus my suggestion is to restrict the maximum number of time points used for lambda_z estimation to the TTT range, and within that range apply the search for points with adj. R2 or whatever fit criterion you wish to apply.

I have no systematic evidence, but after some cursory case evaluations I intuitively feel (OK, this is not very scientific :-D ) that this method should choose a reasonable number of points for the terminal phase, even in cases where the curves do not resemble a one-compartment shape.

1) Scheerans C, Derendorf H, Kloft C. Proposal for a Standardised Identification of the Mono-Exponential Terminal Phase for Orally Administered Drugs. Biopharm Drug Dispos 29, 145–157 (2008).

Regards,

Detlew
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-09-30 20:28

@ d_labes
Posting: # 2453
Views: 45,280
 

 TTT method for lambda_z estimation

 
Dear DLabes,

useR! <-- this is marvelous. I really like it very much.
» But I think your implementation is the TTT rule as invented: take all points after 2 x tmax for estimating lambda_z. Am I right?

Yes. We followed your suggestion when implementing the TTT method.

» Thus my suggestion is to restrict the maximum number of time points used for lambda_z estimation to the TTT range, and within that range apply the search for points with adj. R2 or whatever fit criterion you wish to apply.

You're absolutely right. The question is how we are going to set the maximum number of time points, and how to define a reasonable number of points. If there is an acceptable number, your suggestion should be easy to implement. It sounds like a combination of the TTT method with adj. R2 (Ace's proposed algorithm / the WinNonlin method).

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
d_labes
Hero

Berlin, Germany,
2008-10-01 08:43
(edited by d_labes on 2008-10-01 08:59)

@ yjlee168
Posting: # 2457
Views: 45,277
 

 TTT method plus best fit combined

 
Dear Yung-jin Lee,

» useR! <-- this is marvelous. I really like it very much.

Compensation for that is that I have 'The power to know'!!! :cool:
This is, on whatever grounds, seen as the 'industry standard' in the pharmaceutical industry I'm employed in.

» [...] The question is how we are going to set the maximum number of time points, and how to define a reasonable number of points.

Let us take our example 4.
My suggested algorithm is: take all the points from TTT to the last one as the maximum number to use, i.e. from t = 3 to 24 h (points 8-12). Within that range apply the maximum adj. R2 algorithm (or another best-fit criterion).
For example 4 this would give the last 3 points.

In this example the result coincides with the adj. R2 method using all points after Cmax; for other curves this need not be the case.

The combination of TTT with a best-fit method avoids choosing too many points, a common flaw of the adj. R2 method (see our thread here on TTT).
Furthermore, it reduces the computation time of the best-fit method (adj. R2 or whatever) if it is not implemented in a serial way.
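The combined rule can be sketched in R (my own sketch, not bear's implementation; ttt_best_fit_n is a made-up name, and I assume at least three sampling points at or after 2 x tmax):

```r
# Restrict candidates to points at or after 2 x tmax (the TTT range),
# then pick the subset of last points with the maximal adjusted R^2.
ttt_best_fit_n <- function(dat) {
  tmax   <- dat$time[which.max(dat$conc)]
  first  <- which(dat$time >= 2 * tmax)[1]  # first point in the TTT range
  starts <- (nrow(dat) - 2):first
  adj <- sapply(starts, function(i)
    summary(lm(log(conc) ~ time, dat[i:nrow(dat), ]))$adj.r.squared)
  nrow(dat) - starts[which.max(adj)] + 1    # number of points used
}
```

For example 4 this returns 3 points (t = 8, 12, 24 h), as suggested above.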

Regards,

Detlew
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-10-02 12:40

@ d_labes
Posting: # 2465
Views: 45,109
 

 AIC or ARS as the best fit criterion?

 
Dear DLabes,

» Within that range apply the maximum adj. R2 algorithm (or another best-fit criterion).
» For example 4 this would give the last 3 points.

I've got your point. Great. Now which best-fit criterion should we use for the combo method, AIC or ARS? I guess more people will vote for AIC, and it should be implementable with R too. Or both, leaving the option for the users to decide?

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
d_labes
Hero

Berlin, Germany,
2008-10-02 13:59

@ yjlee168
Posting: # 2466
Views: 45,006
 

 AIC or ARS as the best fit criterion?

 
Dear Yung-jin,

» [...] Now which best-fit criterion should we use for the combo method, AIC or ARS? I guess more people will vote for AIC, and it should be implementable with R too. Or both, leaving the option for the users to decide?

My personal choice is AIC, but with the full formula (see this message) to account for the varying number of data points.
Possibly the small-sample AICc should be considered:
AICc=AIC + 2*p*(p+1)/(n-p-1)
    =AIC + 12/(n-3) with p=2

But with that, n=3 is a problem (which I have dealt with in an empirical manner).
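A sketch of this correction in R (note that R's `AIC()` counts the error variance as an extra parameter, so its absolute values differ from a pure p = 2 count; the correction term below follows the formula above with p = 2):

```r
# AICc for the two-parameter log-linear terminal fit (sketch):
# AICc = AIC + 2*p*(p+1)/(n - p - 1), which blows up at n = p + 1 = 3.
aicc <- function(dat) {
  fit <- lm(log(conc) ~ time, dat)
  n <- nrow(dat)
  p <- 2                            # slope + intercept, as in the formula
  if (n - p - 1 <= 0) return(NA)    # n = 3: the problem case noted above
  AIC(fit) + 2 * p * (p + 1) / (n - p - 1)
}

dat <- data.frame(time = c(4, 8, 12, 24),
                  conc = c(1411, 763, 424, 109))
aicc(dat)           # finite for n = 4
aicc(dat[1:3, ])    # NA for n = 3
```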

But this is only an opinion. Therefore a choice from the user would be a good idea I think :yes:.

Regards,

Detlew
Aceto81
Regular

Belgium,
2008-09-30 15:11

@ yjlee168
Posting: # 2451
Views: 45,159
 

 TTT method for lambda_z estimation

 
» It seems not so difficult to implement TTT method using R. Meanwhile, it is easy to validate visually. Thanks.

without the for loop:
times <- c(0, 0.25, 0.5, 0.75, 1, 1.5, 2, 3, 4, 8, 12, 24)
conc <- c(0, 30.1, 211, 1221, 1485, 1837, 1615, 1621, 1411, 763, 424, 109)
dat <- data.frame(time = times, conc = conc)
x <- which.max(dat$conc)   # index of Cmax
# TTT: regress log(conc) on time for all points at or after twice tmax
summary(lm(log(conc) ~ time, dat[dat$time >= (dat$time[x] * 2), ]))


Ace
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-09-30 19:57

@ Aceto81
Posting: # 2452
Views: 45,135
 

 TTT method for lambda_z estimation

 
Dear Ace,

Genius. It works well too. Much, much better algorithm than ours. We learn a lot.

Many thanks.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
Aceto81
Regular

Belgium,
2008-10-01 14:18

@ yjlee168
Posting: # 2460
Views: 45,122
 

 TTT method for lambda_z estimation

 
» Genius. It works well too. Much, much better algorithm than ours. We learn a lot.

We all learn I think.

A suggestion to implement the number of points used: let the user choose which method to go for (I like d_labes' suggestion of the combination), but it shouldn't be too hard to implement all of them using some simple if statements:
method.chosen <- "TTT"   # one of "WNL", "TTT", "combination"
if (method.chosen == "WNL") {
  n_lambda <- function.Ace(dat)
}
if (method.chosen == "TTT") {
  n_lambda <- sum(dat$time >= dat$time[which.max(dat$conc)] * 2)
}
if (method.chosen == "combination") {
  n_lambda <- function.Ace(dat[dat$time >= dat$time[which.max(dat$conc)] * 2, ])
}
summary(lm(log(conc) ~ time, dat[(nrow(dat) - n_lambda + 1):nrow(dat), ]))

This only works if you wrap my previous code in a function that returns the number of points used (function.Ace).
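One possible reading of that helper (hypothetical sketch, not Ace's actual code: pick the start point after Cmax whose log-linear fit maximizes adjusted R2, and return the count of points used):

```r
# function.Ace (sketch): number of terminal points whose log-linear
# fit maximizes adjusted R2; assumes at least 3 points after Cmax.
function.Ace <- function(dat) {
  n <- nrow(dat)
  starts <- (which.max(dat$conc) + 1):(n - 2)   # candidate start indices
  fits <- sapply(starts, function(i)
    summary(lm(log(conc) ~ time, dat[i:n, ]))$adj.r.squared)
  n - starts[which.max(fits)] + 1               # number of points used
}

dat <- data.frame(time = c(0, 0.25, 0.5, 0.75, 1, 1.5, 2, 3, 4, 8, 12, 24),
                  conc = c(0, 30.1, 211, 1221, 1485, 1837, 1615, 1621,
                           1411, 763, 424, 109))
function.Ace(dat)   # points used for the terminal fit
```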

Ace
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-10-02 12:29

@ Aceto81
Posting: # 2464
Views: 45,148
 

 TTT method for lambda_z estimation

 
Dear Ace,

Yes, we've been planning to do that just following what you've proposed here.
Thank you so much for always being so helpful.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
Aceto81
Regular

Belgium,
2008-10-03 15:33

@ yjlee168
Posting: # 2473
Views: 45,147
 

 TTT method for lambda_z estimation

 
» Yes, we've been planning to do that just following what you've proposed here.
» Thank you so much for always being so helpful.

Hello,

if I've got some time, I'm happy to help you.
As for the selection of the number of points used, I wrote a quick function which gives you an overview:
f <- function(dat){
  m <- which.max(dat$conc)
  f2 <- function(m) {
    fit <- lm(log(conc) ~ time, dat[m:nrow(dat), ])
    cbind(nrow(dat) - m + 1, abs(extractAIC(fit))[2], summary(fit)$adj.r.squared)
  }
  overview <- as.data.frame(do.call(rbind,lapply((m+1):(nrow(dat)-2),f2)))
  names(overview) <- c("n","AIC","adjR2")
  n_ARS=0
  r.adj=0
  for (i in (nrow(dat)-2):(which.max(dat$conc)+1)) {
    if (r.adj - summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared <(0.0001)) {
      n_ARS = nrow(dat)-i+1
      r.adj = summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared
      }
    }   

  n_TTT_ARS=0
  r.adj2=0
  for (i in (nrow(dat)-2):(min(seq_along(dat$time)[dat$time>=dat$time[which.max(dat$conc)]*2]))) {
    if (r.adj2 - summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared <(0.0001)) {
      n_TTT_ARS = nrow(dat)-i+1
      r.adj2 = summary(lm(log(conc)~time,dat[i:nrow(dat),]))$adj.r.squared
    }
  }   

  require(PK)
  l <- lee(conc=dat$conc, time=dat$time)
  n_lee <- sum(dat$time> l$chgpt)
  n_TTT <- sum(dat$time> (dat$time[which.max(dat$conc)]*2))
  n_AIC <- overview$n[which.min(abs(overview$AIC))]
  plot(l, log="y")
  print(overview)
  cat("\n")
  return(data.frame(TTT=n_TTT, AIC=n_AIC, ARS=n_ARS,TTT_ARS=n_TTT_ARS,lee=n_lee))
}

The code compares TTT, the AIC criterion, ARS, the combination of TTT and ARS, and the lee method (although sometimes lee doesn't work, depending on the lt parameter set).

You can adjust the lee parameters by passing them to the function, e.g. f(dat, lt=TRUE, method="ols")

Maybe it's helpful for you.

Ace
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-10-03 21:22

@ Aceto81
Posting: # 2476
Views: 45,065
 

 TTT method for lambda_z estimation

 
Dear Ace,

Thank you so much. Your code just concludes this thread's discussion of lambda_z estimation using the TTT method with a best-fit criterion (either ARS or AIC). :clap: We will try your code asap. :ok:

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
ElMaestro
Hero

Denmark,
2008-07-22 14:35
(edited by ElMaestro on 2008-07-22 15:01)

@ yjlee168
Posting: # 2071
Views: 45,929
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Hi Hsin-ya & Yung-jin,

this effort is much appreciated. Thanks a lot, it must have taken you a lot of time.
Please allow me to ask a few questions or give a few comments:
  1. I typed "demoNCAGLM" and got a nice function dump.
    From this I get the impression that the ANOVA in fact is based on lm, not GLM; for example "lm(LnAUC0INF ~ seq + subj + prd + drug, data = TotalData)"?!
    Is this intended - shouldn't it be glm?

  2. The intrinsic R glm function does, AFAIR, not allow for random effects (in contrast to glm in SAS); analysing datasets with a syntax like "glm(LnAUC0INF ~ seq + subj + prd + drug)" might treat subj as fixed. What's the impact of that?

  3. As far as I remember, both lm and glm are unsafe in cases with imbalance for crossover studies (can't find a ref right now, though, sorry about it). I am under the impression that this type of analysis should be done explicitly with a mixed model (LME, using the REML approach) to ensure you get the correct sigma.

  4. Although Sums-of-Squares is pretty much a mine field, I think it would be wise to at least implement an option of getting type III SS. Some regulators seem unwilling to accept anything other than type III, while others couldn't care less (I have it on good account :-D). This means that the choice of type III is a safe choice for the time being.
These are just thoughts; I am no expert (neither in statistics nor in BE), I should add.

Best regards,
EM
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-07-23 09:55
(edited by yjlee168 on 2008-07-23 12:57)

@ ElMaestro
Posting: # 2072
Views: 45,941
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Dear EM,

» From this I get the impression that the ANOVA in fact is based on lm, not GLM; for example "lm(LnAUC0INF ~ seq + subj + prd + drug, data = TotalData)"?! Is this intended - shouldn't it be glm?
»
» 2. The intrinsic R glm function does afair not allow for random effects (in contrast to glm in SAS); analysing datasets with a syntax like "glm(LnAUC0INF ~ seq + subj + prd + drug)" might treat subj as fixed.

Exactly. You're right. We use R's "lm" function to run the ANOVA, not "glm". However, the outputs we have tested with one BE dataset are exactly the same as those generated by SAS with a proc glm; statement.
Yes, we tried "glm" at the beginning and got some different results from it. Then we changed to "lm". Bingo! Your comments help to explain something that we didn't know before.

» 3. As far as I remember, both lm and glm are unsafe in cases with imbalance for crossover studies (can't find a ref right now, though, sorry about it). I am under the impression that this type of analysis should be done explicitly with a mixed model (LME, using the REML approach) to ascertain you get the correct sigma.

Our package is currently built only for 2-treatment x 2-period x 2-sequence, balanced, non-replicated, crossover BE studies. A mixed model (such as proc mixed; in SAS or "lme" with REML) may be suitable for
replicated or imbalanced crossover BE studies. We plan to add more functions later; at that point, "lme" will be used.
Ref.: http://www.fda.gov/cder/guidance/3616fnl.pdf US FDA Guidance for Industry: Statistical approaches to establishing bioequivalence, Jan, 2001.

» 4. Although Sums-of-Squares is pretty much a mine field, I think it would be wise to at least implement an option of getting type III SS. Some regulators seem unwilling to accept anything other than type III, while others couldn't care less (I have it on good account :-D).

Yes, we will add Type III SS in the next release. We're fixing bear right now. Hopefully it can be done before the end of July.

» ...I am no expert (neither in statistics nor in BE) I should add.

You certainly are an expert in BE ;-). Thank you for your comments.


Edit: Link corrected for FDA’s new site. [Helmut]

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
ElMaestro
Hero

Denmark,
2008-07-24 10:04
(edited by ElMaestro on 2008-07-24 12:18)

@ yjlee168
Posting: # 2076
Views: 46,216
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Dear Yung-jin,

I don't think the software should state it uses GLM if it doesn't. But that's of course only a personal view and I mean no offense at all.

Regarding proc mixed and proc GLM in SAS my impression is* that for balanced data they achieve the same thing, but proc mixed is more computationally intensive. Therefore, in the old days when computers were slow GLM was preferred. Nowadays, when computational power is not an issue it makes little difference.
For data without balance, however, proc mixed is used and proc GLM may be unsafe. All in all, one can say it is always kind of safe to use proc mixed.

Thus, I'd suggest you to use a mixed model the following way:
"lmeYungjin <- LnAUC0INF ~ seq + prd + drug, random=~1|subj, method="REML")
From the lmeYungjin fit you can extract the correct sigma (which is correct whether or not you have balance) and the treatment difference and use this to construct the 90% CI.

Next, you can do a glm as follows:
"glmYungjin <- glm(LnAUC0INF ~ seq + prd + drug + subj)"
followed by "drop1(glmYungjin, test="F")" to get a glm ANOVA with type III SS.

Last you just need to update the anova with the correct error term for the sequence effect (between subj error).

Result: Anova and 90% CI done the way SASoholics like and it is resistant to imbalance....
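Putting the pieces together, here is a runnable sketch of that recipe on simulated 2x2 crossover data (a sketch only: the column names follow the thread, but the dataset, seed, and object names are made up, and lme(...) is written as the corrected call):

```r
library(nlme)   # lme ships with R as a recommended package

# Simulated balanced 2x2 crossover dataset, 12 subjects
set.seed(1)
subj <- factor(rep(1:12, each = 2))
seq  <- factor(rep(rep(c("RT", "TR"), each = 2), 6))
prd  <- factor(rep(1:2, 12))
drug <- factor(ifelse((seq == "RT") == (prd == 1), "R", "T"))
TotalData <- data.frame(subj, seq, prd, drug,
                        LnAUC0INF = 5 + rnorm(24, sd = 0.2))

# Mixed model: correct sigma and treatment difference, balanced or not
fit.lme <- lme(LnAUC0INF ~ seq + prd + drug, random = ~ 1 | subj,
               data = TotalData, method = "REML")
intervals(fit.lme, level = 0.90, which = "fixed")   # 90% CIs for fixed effects

# Type III style ANOVA via single-term deletions from a fixed-effects fit
fit.glm <- glm(LnAUC0INF ~ seq + prd + drug + subj, data = TotalData)
drop1(fit.glm, test = "F")
# As the recipe notes, the sequence line must still be re-tested against
# the between-subject mean square (subjects nested in sequence).
```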

Best regards
EM


*: When I say "my impression is" then it of course means I could be completely wrong and I'd be happy to stand corrected accordingly.
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-07-25 20:38

@ ElMaestro
Posting: # 2088
Views: 45,730
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Dear EM,

Yes, so we've decided to use "ANOVA (lm)" instead of "GLM" in bear. The "lme" method from package nlme will be implemented and tested, and "lme" will replace the current "lm" if everything runs o.k.

Thank you for your suggestions.

» "lmeYungjin <- LnAUC0INF ~ seq + prd + drug, random=~1|subj, method="REML")

Should this be lmeYungjin <- lme(LnAUC0INF...)?

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
ElMaestro
Hero

Denmark,
2008-07-28 08:42

@ yjlee168
Posting: # 2089
Views: 45,662
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
» » "lmeYungjin <- LnAUC0INF ~ seq + prd + drug, random=~1|subj, method="REML")
»
» Should this be lmeYungjin <- lme(LnAUC0INF...)?

Yes, that's correct. Sorry about the syntactical typo.

EM.
ElMaestro
Hero

Denmark,
2008-09-23 16:10

@ yjlee168
Posting: # 2391
Views: 45,425
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
» Yes, we tried "glm" at the beginning, and got some different results from it. Then we changed to "lm". Bingo! Your comments exactly help to solve something that we don't know before.

Hi again,

I am in principle a little surprised to hear that R's glm is not able to give you what you want, while lm is. glm and lm (and the accompanying anova) should give pretty much the same result as long as you don't play around with glm's functionality for the error distribution.

Do you have a sample dataset that illustrates the problem you identified with glm?

Best regards
EM.
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-09-24 22:00

@ ElMaestro
Posting: # 2396
Views: 45,505
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Dear EM,

Thanks for your comments. We went back to test glm in bear again and found that you're correct about glm. Basically, both lm and glm generate the same results with the tested dataset. The following are the Type I and Type III SS obtained from lm and glm.

lm - Type I SS
glmData <-lm(Cmax ~ sequence+ period + drug +subject, data=KK)
anova(glmData)

Analysis of Variance Table

Response: Cmax
          Df Sum Sq Mean Sq F value Pr(>F)
sequence   1    585     585  0.0160 0.9014
period     1  63175   63175  1.7293 0.2131
drug       1  58149   58149  1.5917 0.2311
subject   12 634325   52860  1.4469 0.2660
Residuals 12 438395   36533


lm - Type III SS
drop1(glmData, test="F")

Single term deletions

Model:
Cmax ~ sequence + period + drug + subject
         Df  Sum of Sq     RSS     AIC F value  Pr(F)
<none>                  438395     302               
sequence  0 -4.657e-10  438395     302               
period    1      63175  501570     304  1.7293 0.2131
drug      1      58149  496544     304  1.5917 0.2311
subject  12     634325 1072719     303  1.4469 0.2660   


glm - Type I SS
glmData <-glm(Cmax ~ sequence+ period + drug +subject, data=KK)
anova(glmData)

Analysis of Deviance Table

Model: gaussian, link: identity

Response: Cmax

Terms added sequentially (first to last)


         Df Deviance Resid. Df Resid. Dev
NULL                        27    1194629
sequence  1      585        26    1194044
period    1    63175        25    1130869
drug      1    58149        24    1072719
subject  12   634325        12     438395


glm - Type III SS
drop1(glmData, test="F")
Single term deletions

Model:
Cmax ~ sequence + period + drug + subject
         Df Deviance     AIC F value  Pr(F)
<none>        438395     384               
sequence  0   438395     384               
period    1   501570     386  1.7293 0.2131
drug      1   496544     385  1.5917 0.2311
subject  12  1072719     385  1.4469 0.2660


It looks like only the AICs differ between these two methods.
Since SAS does not report AICs in its output, we cannot compare which one is correct. The reason we chose lm instead of glm might be that the output generated by lm looks more similar to that of SAS; also, we didn't quite understand at the beginning what "Resid. Dev" (in the glm output) meant.
Again, both glm and lm generate almost the same results.
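For what it's worth, the AIC discrepancy between the two drop1 tables has a simple explanation: on an lm fit, drop1 reports extractAIC(), which drops additive constants of the log-likelihood, whereas on a glm fit it reports the full AIC. The two differ by a constant, so model rankings are unaffected. A quick check with R's built-in cars data (any lm fit will do):

```r
fit <- lm(dist ~ speed, data = cars)

partial <- extractAIC(fit)[2]   # AIC with constants dropped (drop1 on lm)
full <- AIC(fit)                # full AIC (drop1 on glm)
full - partial                  # a constant: n*(1 + log(2*pi)) + 2
```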

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
ElMaestro
Hero

Denmark,
2008-09-25 09:03

@ yjlee168
Posting: # 2400
Views: 45,622
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Hi again,

I am happy to hear that it is now working with glm.
Note that subject is "nested within sequence" - that is why you get slightly odd results (zero deviance ± convergence criterion) for your sequence effect after drop1. I think it goes like this*: first the model is fit with Subj, Treatm, Period and Sequence. Then, to find the SS for Sequence, a model with Subj, Treatm and Period is fit. The difference in residuals is supposed to be the type III Sequence effect. However, because subject is nested in sequence, removing Sequence from the full model corresponds to removing nothing (i.e. Sequence is already there through the inclusion of Subject). Thus, a compensation for this phenomenon is necessary. SAS apparently figures it out automatically, while R doesn't.

I am not aware of any syntactical trick to make R compensate for the nesting. If others know it then it would be relevant to hear about it. Otherwise I guess there is room for improvement on the part of the R development team?

EM.

*: As usual I could be completely wrong.
ElMaestro
Hero

Denmark,
2008-09-25 13:52

@ ElMaestro
Posting: # 2404
Views: 45,364
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
» Otherwise I guess there is room for improvement on the part of the R development team?

I have been thinking a little about this now... Perhaps one should consider it a feature rather than a bug that R's drop1 does not compensate for the nesting. After all, it does exactly what drop1 is supposed to do (which is dropping one factor at a time).
Comments, anyone?

EM.
Helmut
Hero
avatar
Homepage
Vienna, Austria,
2008-09-25 14:39

@ ElMaestro
Posting: # 2405
Views: 45,332
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Dear ElMaestro!

» Comments, anyone?

Maybe this thread and the references will help…

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Science Quotes
ElMaestro
Hero

Denmark,
2008-09-25 14:55

@ Helmut
Posting: # 2406
Views: 45,323
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
» Maybe this thread and the references will help…

Dear HS,

thanks for the help. However, I think that link misses the point a little bit. The question I have been discussing with myself this morning (yes, ok, that perhaps sounds borderline schizophrenic) is not what type III SS are, but rather how they should be presented in a nested situation; the R way of doing single term deletion is to do single term deletion, while the SAS way is to do single term deletion with the added bonus of pulling rabbits out of a hat for the factor in which another factor is nested.

You have an opinion perhaps?

EM.
Helmut
Hero
avatar
Homepage
Vienna, Austria,
2008-09-25 15:33

@ ElMaestro
Posting: # 2407
Views: 45,367
 

 bear v1.0.0 - a data analytical tool for ABE in R

 
Dear ElMaestro!

» You have an opinion perhaps?

You gave a wonderful description of the differences between R and SAS. I prefer the R-way, because IMHO it's more transparent – but I’m not an SAS-user.
Maybe we should wait for DLabes to pass by and give us more insights.

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Science Quotes
d_labes
Hero

Berlin, Germany,
2008-09-25 16:45

@ Helmut
Posting: # 2411
Views: 45,516
 

 GLM in R and the power to know

 
Dear ElMaestro, dear HS,

I cannot follow your discussion exactly because I have no experience with R up to now.

Without entering combat in the war of religions ;-) about type 1, type 3, and so on 'Sums of squares' (Ssq) and their appropriateness:

SAS Proc GLM behaves like R, I think.

If the model is specified as
(1) log(PKparm) = sequence treatment period subject
you will get DF=0 and type 3 Ssq missing or zero (I do not remember which, from my first attempt, in which I also tried this model).

Only if the model is specified as
(2) log(PKparm) = sequence treatment period subject(sequence)
where subject(sequence) means subject nested within sequence, an appropriate DF and a value for the type 3 Ssq are obtained. This is because Ssq(sequence) and Ssq(subject(sequence)) sum up to Ssq(subject), which means model (1) is over-specified.

So I presume that R gives analogous results if the model is specified according to formula (2).
I'm not sure (as already said, ≤zero experience with R), but I think I found that the nesting operator in R is | (pipe).
Thus do me the favor and try it in R.

Regards,

Detlew
ElMaestro
Hero

Denmark,
2008-09-25 19:23

@ d_labes
Posting: # 2412
Views: 45,436
 

 GLM in R and the power to know

 
» Thus do me the favor and try it in R.

Dear d_labes,

just did that. Actually, I do not know of "|" for specifying nesting.
I used "Subj %in% Seq" etc., but Seq still comes out null in a type III ANOVA.
Feature, not bug.

EM.
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-09-25 20:49

@ ElMaestro
Posting: # 2413
Views: 45,369
 

 nesting in R?

 
dear all,

I don't know if I follow your discussion exactly or not. We use ':' to
do nesting in R. For instance

glmData <-lm(Cmax ~ seq + period + treatm + subject:seq, data=KK)

anova(glmData)


Let me know if I don't follow your discussions well. Thank you all.

» just did that. Actually, I do not know of the "|" to specify nesting. I used "Subj%in%Seq" etc but still Seq comes out null in a type III anova.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
ElMaestro
Hero

Denmark,
2008-09-25 22:01

@ yjlee168
Posting: # 2415
Views: 45,308
 

 nesting in R?

 
» glmData <-lm(Cmax ~ seq + period + treatm + subject:seq, data=KK)

Hi,

that should work fine, I think.

Best regards
EM.
Helmut
Hero
avatar
Homepage
Vienna, Austria,
2008-09-26 00:23

@ ElMaestro
Posting: # 2416
Views: 45,666
 

 lm in R

 
Dear ElMaestro!

» Actually, I do not know of the "|" to specify nesting.
» I used "Subj%in%Seq"

I don't know both notations. See the documentation of lm under 'Details':

Details
Models for lm are specified symbolically. A typical model has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with duplicates removed. A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second.

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Science Quotes
ElMaestro
Hero

Denmark,
2008-09-26 09:15

@ Helmut
Posting: # 2420
Views: 45,258
 

 lm in R

 
Hi,

Help->Manuals (in PDF)->An introduction to R
(Page 51 with v 2.7.1)

y ~ A*B
y ~ A + B + A:B
y ~ B %in% A
y ~ A/B
Two-factor non-additive model of y on A and B. The first two specify the same crossed classification and the second two specify the same nested classification. In abstract terms all four specify the same model subspace.
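That equivalence is easy to check in R (simulated data; all names here are made up for the demonstration):

```r
# Four formula specifications spanning the same model subspace:
# the fitted values must agree even though the coefficient
# parameterizations differ.
set.seed(42)
d <- data.frame(A = factor(rep(1:2, each = 4)),
                B = factor(rep(1:2, times = 4)),
                y = rnorm(8))
m.crossed  <- lm(y ~ A * B, d)
m.expanded <- lm(y ~ A + B + A:B, d)
m.nested   <- lm(y ~ A / B, d)
all.equal(fitted(m.crossed), fitted(m.expanded))   # TRUE
all.equal(fitted(m.crossed), fitted(m.nested))     # TRUE: same subspace
```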


EM.
Aceto81
Regular

Belgium,
2008-09-26 12:26

@ Helmut
Posting: # 2422
Views: 45,879
 

 lm in R

 
» » Actually, I do not know of the "|" to specify nesting.
» » I used "Subj%in%Seq"

I think the "|" is only for lme (but I can be wrong).
It's also used for lattice plots, where it means "conditioned on" (see http://www.itc.nl/~rossiter/teach/R/RIntro_ov.pdf page 47)

Ace
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-10-18 21:03

@ ElMaestro
Posting: # 2557
Views: 45,062
 

 Type III SS in balanced, crossover BE studies?

 
dear EM,

Although bear v2.0.1 now calculates Type III SS with the ANOVA, Type III SS are the same as Type I SS when the BE study is a balanced crossover design. Bear is currently only applicable to balanced crossover BE studies. So, would 'some regulators' agree with this, or would they accept Type I SS in this situation? Thanks.

» 4. Although Sums-of-Squares is pretty much a mine field, I think it would be wise to at least implement an option of getting type III SS. Some regulators seem unwilling to accept anything other than type III, while others couldn't care less (I have it on good account :-D). This means that the choice of type III is a safe choice for the time being.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
ElMaestro
Hero

Denmark,
2008-10-21 13:09

@ yjlee168
Posting: # 2571
Views: 45,136
 

 Type III SS in balanced, crossover BE studies?

 
» So, can 'some regulators' agree with this or are they willing to accept Type I SS in this situation?

Hi,

I am not entirely sure that I understand you. You should ask a regulator, by the way...

You can do type III SS with "drop1", and it will work for unbalanced data so I don't see a problem in any way. You can also do it by fitting models one by one and comparing residuals. In the end it will work out to exactly the same result. I see no reason why you would not do this in your package - please tell me why ("Bear is now only applicable to a balanced, crossover designed BE study"???). And remember, drop1 is good with both glm and lm. You can do type III's via a mixed model, too, using the approach I showed you earlier, and again it will give you valid results for balanced and unbalanced data.

Whenever a study is balanced regulators will / should accept everything, because type I = type III (= type other, too). 99.999% don't care anyway, because they haven't got the faintest clue about statistics, as far as I've heard.

I hope this provided an answer to your question?

Have a nice day.
EM.
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-10-21 13:19

@ ElMaestro
Posting: # 2572
Views: 44,977
 

 Type III SS in balanced, crossover BE studies?

 
Dear EM,

Since bear is currently designed to analyze only data obtained from a balanced crossover BE study, I asked whether it is necessary to display Type III SS alongside Type I SS for such a study at this moment. Currently, it seems unnecessary.

Thanks for your message.

» Whenever a study is balanced regulators will / should accept everything, because type I = type III (= type other, too). 99.999% don't care anyway, because they haven't got the faintest clue about statistics, as far as I've heard.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
d_labes
Hero

Berlin, Germany,
2008-10-21 14:31

@ yjlee168
Posting: # 2573
Views: 44,877
 

 Only balanced, crossover BE studies?

 
Dear Yung-jin!

» Since bear now is designed only applicable to analyze data obtained
                      ^^^
» from a balanced, crossover BE study,
          ^^^

This would be a major flaw of bear, because imbalanced BE studies (with respect to the number of volunteers in the sequences) are more the rule than the exception!

This would mean that those studies cannot be analyzed with bear!? :surprised:

In view of all the goodies implemented until now this would be a pity.

Regards,

Detlew
yjlee168
Senior
avatar
Homepage
Kaohsiung, Taiwan,
2008-10-21 19:01

@ d_labes
Posting: # 2574
Views: 44,910
 

 Only balanced, crossover BE studies?

 
dear DLabes,

Indeed. We are currently implementing the lme function for bear. Once that is done, bear will be able to analyze data obtained from unbalanced/imbalanced BE studies.

All the best,
---Yung-jin Lee
bear v2.8.3-3:- created by Hsin-ya Lee & Yung-jin Lee
Kaohsiung, Taiwan http://pkpd.kmu.edu.tw/bear
Download link (updated) -> here
ElMaestro
Hero

Denmark,
2008-10-24 11:26

@ yjlee168
Posting: # 2578
Views: 44,832
 

 Only balanced, crossover BE studies?

 
Hi,

I really don't understand this.
You can use the exact same syntax to analyse balanced AND imbalanced data. So why not just do it? I see no point in (nor justification for) restricting the calculations or ambitions to balanced data only.

Best regards
EM.
Helmut
Hero
avatar
Homepage
Vienna, Austria,
2008-10-24 11:58

@ ElMaestro
Posting: # 2579
Views: 45,035
 

 Thread locked

 
Dear ElMaestro, Yung-jin and Hsin-ya Lee!

I agree with ElMaestro 100%.
I will lock the thread with this post, because the depth of replies slowly blocks the entire main page of the forum. Please open a new thread for the actual version of bear (2.0.1 upwards).
Thank you.

Cheers,
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. ☼
Science Quotes
Bioequivalence and Bioavailability Forum |  Admin contact

It’s easy to lie with statistics;
it is easier to lie without them.    Frederick Mosteller

The BIOEQUIVALENCE / BIOAVAILABILITY FORUM is hosted by
BEBAC Ing. Helmut Schütz