AB ☆ India, 2012-05-04 17:13 Posting: # 8514

Dear all,

we have done a BE study in a 3-way reference-replicate design for the FDA and ended up with the following error while running the SAS code for the unscaled 90% bioequivalence confidence intervals, as given in the "Draft Guidance on Progesterone", for the parameter AUCI:

"NOTE: 20 observations are not included because of missing values.
WARNING: Did not converge."

The 20 missing observations are AUCI values which were not calculated in the respective treatment groups because the r² values were less than 80%.

However, when we re-ran the analysis including these 20 AUCI values (recalculated without applying the r² criterion), there was no error. Please suggest the correct way to deal with this.

Thanks in advance

— Regards, AB
jag009 ★★★ NJ, 2012-05-04 17:36 @ AB Posting: # 8515

Hi AB, Give us more info... Sample size n? John 
AB ☆ India, 2012-05-05 08:45 @ jag009 Posting: # 8516

Hi Jag009,

❝ Give us more info...
❝ Sample size n?

The sample size is 60 in the study.

— Regards, AB
d_labes ★★★ Berlin, Germany, 2012-05-06 12:10 @ AB Posting: # 8518

Dear AB,

❝ "NOTE: 20 observations are not included because of missing values.
❝ WARNING: Did not converge."
❝
❝ These missing 20 observations are the ones (AUCI) which were not calculated in the respective treatment groups as the r² values are less than 80%.
❝
❝ However, when we re run the data including these 20 missing AUCI (recalculated without considering the r² criteria), there was no error.

Point 1: The FDA ABE code for replicate crossover studies using SAS Proc MIXED fits a model which is overspecified for data coming from a partial replicate design. In that design the intra-subject variability of the Test formulation is confounded with the subject-by-formulation interaction and is not identifiable on its own. This may lead to convergence problems in the REML method (see this thread) or to unreliable values of the intra-subject variances, especially for the Test formulation (see this thread). Your missing values seem to exacerbate the problem, but are not its source IMHO.

Ways out? Don't know exactly.

The logical way within the code / method given in the progesterone guidance would be not to use the Proc MIXED code, but to take the estimate plus its 90% confidence interval obtained from the evaluation of the intra-subject contrasts of T vs. R (step "Intermediate analysis - ilat") as the measure of ABE. I'm always wondering why the Proc MIXED code was recommended for the case that the evaluation via intra-subject contrasts indicated that scaled ABE was not applicable. On the other hand, if scaled ABE is applicable, the ABE criterion evaluated via intra-subject contrasts is part of the scaled ABE criterion. That's illogical to me.

Another possibility is to reduce the model in the FDA Proc MIXED code. Neglecting the subject-by-formulation interaction (setting it to zero) would lead to a somewhat better behaving model. You could achieve this by setting the covariance structure within RANDOM TRT/TYPE=FA0(2) SUB=SUBJ G; to CS instead of FA0(2) or CSH.

Point 2: Not calculating the AUC(0–inf) values if the fit of the terminal part of the concentration-time curves had an r² value of less than 80% is at least statistically not very sound, not to say nonsense IMHO, regardless of which study design is used. Use the search and you will find some (partly lengthy) discussions here in the Forum about that subject. See here and here for instance.

— Regards, Detlew
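The evaluation via intra-subject contrasts mentioned under Point 1 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the guidance's SAS code: a partial replicate design with three sequences, ilat = ln(T) − ½[ln(R1) + ln(R2)] per subject, the point estimate taken as the unweighted mean of the sequence means, and the t quantile supplied by the caller. The function name, the toy numbers and the hard-coded quantile are hypothetical.

```python
import math

def abe_from_ilat(ilat_by_seq, t_crit):
    """Point estimate and 90% CI of T/R from per-subject contrasts.

    ilat_by_seq: one list of ilat values per sequence (assumed layout).
    t_crit: one-sided 95% t quantile for df = N - number of sequences,
            passed in by the caller (taken from a table here).
    """
    k = len(ilat_by_seq)
    means = [sum(s) / len(s) for s in ilat_by_seq]
    # Unweighted average of the sequence means.
    est = sum(means) / k
    # Pooled within-sequence variance of the contrasts.
    ss = sum((x - m) ** 2 for s, m in zip(ilat_by_seq, means) for x in s)
    df = sum(len(s) for s in ilat_by_seq) - k
    mse = ss / df
    # Standard error of the average of k independent sequence means.
    se = math.sqrt(mse * sum(1.0 / len(s) for s in ilat_by_seq)) / k
    lo, hi = est - t_crit * se, est + t_crit * se
    # Back-transform from the log scale to a ratio.
    return math.exp(est), math.exp(lo), math.exp(hi)

# Hypothetical toy data: two subjects per sequence, df = 6 - 3 = 3.
toy = [[0.10, -0.10], [0.00, 0.20], [0.05, 0.05]]
T_095_DF3 = 2.3534  # tabulated t quantile (0.95, 3 df), typed in
pe, lower, upper = abe_from_ilat(toy, T_095_DF3)
print(f"PE {pe:.4f}, 90% CI {lower:.4f} to {upper:.4f}")
```

Because each subject contributes a single contrast, no variance components have to be estimated by REML, and a subject with a missing value in any period simply drops out of the contrast.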
Helmut ★★★ Vienna, Austria, 2012-05-06 15:29 @ d_labes Posting: # 8519

Dear Detlew & AB!

❝ Not to calculate the AUC(0–inf) values if the fit of the terminal part of the concentration-time curves had an r² value less than 80% is at least statistically not very sound, not to say nonsense IMHO .

Yes! Explained variance (sloppy: information) in regression (strongly!) depends on the sample size. See here and a rather lengthy thread of 2002 at David Bourne’s PKPDlist. Critical values of \(\small{r}\) according to Odeh (1982)^{1} (and modified for \(\small{r^2}\)); one-sided, 5%:

$$\small{\begin{array}{rcc} n & r & r^2\\\hline 3 & 0.9877 & 0.9755\\ 4 & 0.9000 & 0.8100\\ 5 & 0.805\phantom{0} & 0.6486\\ 6 & 0.729\phantom{0} & 0.5319\\ 7 & 0.669\phantom{0} & 0.4481\\ 8 & 0.621\phantom{0} & 0.3863\\ 9 & 0.582\phantom{0} & 0.3390\\ 10 & 0.549\phantom{0} & 0.3018\\ 11 & 0.521\phantom{0} & 0.2719\\ 12 & 0.497\phantom{0} & 0.2473\\ 13 & 0.476\phantom{0} & 0.2267\\ 14 & 0.457\phantom{0} & 0.2093\\ 15 & 0.441\phantom{0} & 0.1944\\\hline \end{array}}$$

In other words, an \(\small{r^2}\) of 0.6486 from five data points denotes the same ‘quality of fit’ as an \(\small{r^2}\) of 0.9755 from three.

Searching the forum I get the impression that you (AB) are not alone with a cut-off of 0.80. Justification: nil. Maybe there is some copy-pasting going on? If you really want to use a cut-off (which I don’t recommend and which is not required in any GL), take the number of data points into account. I strongly suggest revising your SOP.

BTW, visual inspection of fits is mandatory (see there with references). Don’t trust in numbers alone. A classical example is Anscombe’s quartet.^{2} All data sets: \(\small{\bar{x}=9.0,\,s_x^2=11,\,\bar{y}=7.5,\,s_y^2=4.1\rightarrow\widehat{y}=3+0.5\cdot x,\,r_{yx}=0.82,\,R_{yx}^2=0.67\ldots}\)
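The critical values in the table follow from the t-distribution: with n data points and n − 2 degrees of freedom, r_crit = t / √(t² + n − 2). A minimal Python sketch, assuming one-sided 5% t quantiles typed in by hand from a standard table (the dictionary and function names are hypothetical), reproduces the first three rows:

```python
import math

# One-sided 5% t quantiles for df = n - 2 (standard table values, typed in).
T_QUANTILE = {1: 6.3138, 2: 2.9200, 3: 2.3534}

def critical_r(n):
    """Critical correlation coefficient for n data points (df = n - 2)."""
    df = n - 2
    t = T_QUANTILE[df]
    return t / math.sqrt(t * t + df)

for n in (3, 4, 5):
    r = critical_r(n)
    print(f"n = {n}: r = {r:.4f}, r^2 = {r * r:.4f}")
```

A fixed cut-off of 0.80 therefore sits below the critical r² for three points, close to it for four, and well above it for five or more, which is why a single threshold cannot be applied across fits with different numbers of points.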
— Diftor heh smusma 🖖🏼 Довге життя Україна! Helmut Schütz The quality of responses received is directly proportional to the quality of the question asked.
d_labes ★★★ Berlin, Germany, 2012-05-07 10:48 @ Helmut Posting: # 8520

Dear Helmut!

❝ BTW, visual inspection of fits is mandatory (see here with references). Don’t trust in numbers alone. A classical example is Anscombe’s quartet.^{2}

Very nice illustration. Thanks for that. I was not aware of it up to now, although it was invented already in 1973 (if Wikipedia is correct).

— Regards, Detlew
Helmut ★★★ Vienna, Austria, 2012-05-07 13:15 @ d_labes Posting: # 8522

Dear Detlew!

❝ Very nice illustration.

I hijacked the code from Wikimedia Commons and learned that the data are available in R’s standard installation. Give it a try:
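A minimal sketch in Python rather than R, using only the standard library and assuming the four data sets as published by Anscombe (1973), typed in by hand. It prints the fitted line and R² for each set; the helper name `fit` is hypothetical.

```python
# Anscombe's four data sets, typed in from the 1973 paper.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
X4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (X4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
                5.25, 12.50, 5.56, 7.91, 6.89]),
}

def fit(x, y):
    """Ordinary least squares of y on x; returns intercept, slope, R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy * sxy / (sxx * syy)

for name, (x, y) in QUARTET.items():
    a, b, r2 = fit(x, y)
    print(f"{name:>3}: y = {a:.2f} + {b:.3f} x, R^2 = {r2:.3f}")
```

All four sets give practically the same line and the same R², although only the first one looks like a straight line with noise when plotted. The numbers alone cannot tell them apart, which is the argument for looking at every fit.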
AB ☆ India, 2012-05-07 14:07 @ Helmut Posting: # 8523

Dear Detlew & HS, Many thanks for your detailed insight. — Regards, AB 
FI ☆ Austria, 2012-10-08 12:39 @ Helmut Posting: # 9332

Dear Helmut,

❝ In other words, an \(\small{r^2}\) of 0.6486 from five data points denotes the same ‘quality of fit’ as an \(\small{r^2}\) of 0.9755 from three. Searching the forum I get the impression that you (AB) are not alone with a cut-off of 0.80. Justification: nil.
❝ BTW, visual inspection of fits is mandatory... Don’t trust in numbers alone. A classical example is Anscombe’s quartet.^{2}

Small add-on: the adj. r² method can be misleading in the sense of "the more (data points) the better"!

Considering the PK of azithromycin, "the less the better" could apply, because there are (at least?) 3 elimination phases for Azi (uptake into white blood cells, rapid distribution into tissue... would resemble Anscombe 2), and a very long t½, depending (!) on the time points used for the calculation.

As the terminal elimination needs to be calculated, and the adj. r² method in a previous study took mostly 3 to 5 points, but sometimes also 12 (!), should the number of time points be limited (to 3 or 4) to reflect the PK?

What if one concentration looks like an analytical mistake (?) that confounds t½ in such a way that the slope increases...?

Where to put the cut-off for adj. r²?

Thanks a lot in advance
FI
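For reference, the adjusted r² of a straight-line fit through n points is commonly defined as below. This is the standard textbook definition for one predictor, given here as an assumption about what the software in question computes:

$$\small{R_{adj}^2=1-\left(1-R^2\right)\frac{n-1}{n-2}}$$

The factor (n − 1)/(n − 2) equals 2 for n = 3 and approaches 1 as n grows, so for the same r² the penalty shrinks with every added point. An algorithm that maximises this value can therefore drift towards longer regression ranges, which is the "the more the better" behaviour described above.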
Helmut ★★★ Vienna, Austria, 2012-10-08 15:45 @ FI Posting: # 9334

Servus Franz!

❝ Considering the PK of Azithromycin, "the less the better" could be considered, because there are (at least?) 3 elimination phases […]. As terminal elimination needs to be calculated and the adj r² method from previous study took mostly 3 to 5 points, but sometimes also 12 (!), should the timepoints be limited (to 4 to 3), to reflect PK?

Multiphasic PK can be problematic, especially if volumes of distribution are variable. I once had to deal with a drug (3 phases) where the terminal half-life was ~3 days and the volume of distribution of the deep compartment was very large. Was this phase important? No. Running a PopPK model it turned out that this compartment accounted for <1% of the AUC. In my case V₂ showed little variability, but if we have large variability, the predominant half-life might be the second phase in some subjects and the third in others… See also Boxenbaum & Battle (1995).*

❝ What if one concentration looks to be an analytical mistake(?), that confounds t½ in such a way that the slope increases...?

Since according to the EMA’s bioanalytical GL a blind plausibility review of data leading to confirmation/rejection (aka “pharmacokinetic repeat”) is not acceptable any more – bad luck. If other countries are concerned: Have an SOP in place, repeat the analysis, and cross fingers. Other options (?):

❝ Where to put the cutoff for adj r²?

Nowhere. Forget it. Doesn’t make any sense, IMHO. For a bad example see this thread. I would be happy to see a publication justifying an algorithm which would allow automatic selection of the terminal phase in multi-compartment PK. If anybody knows a single one, please let me know. See also Ref.#2 at the end of this thread.
d_labes ★★★ Berlin, Germany, 2012-10-09 11:33 @ Helmut Posting: # 9352

Dear Helmut, dear FI!

❝ ❝ What if one concentration looks to be an analytical mistake(?), that confounds t½ in such a way that the slope increases...?
❝ ...
❝ Other options (?):

In this post I had claimed "I never have seen deficiency questions concerning the fit of the terminal phase of concentration time courses in my ~30 years career. Even if the 'fit' was done with only 2 points".

Never say never. Quite recently I got:

… The applicant should justify the calculation of the terminal rate constant for patient #xxx, reference/r.1, patient #yyy, test/r.2 …

(It was a replicate BE study in patients.)

The questioned cases had in common that the last measured concentration was increased compared to the preceding ones and did not fit into the linear part used for the log-linear regression. In order not to grossly overestimate the terminal half-life in such situations, it is my standard procedure to act according to Helmut's first option above. It seems that some regulators' opinions do not match mine.

— Regards, Detlew
Helmut ★★★ Vienna, Austria, 2012-10-09 16:09 @ d_labes Posting: # 9356

Dear Detlew!

❝ In this post I had claimed "I never have seen deficiency questions concerning the fit of the terminal phase of concentration time courses in my ~30 years career. Even if the 'fit' was done with only 2 points".

I remember this post very well. So far I have received only one request myself (from the sponsor, not an agency) asking why I had selected specific time points and not others. I answered with “visual inspection of the fit” (aka eyeball PK) as recommended in the literature. BTW, I don’t know a single reference suggesting maximum R²adj or making a statement about exclusion.

❝ Say never never.
❝ Quite recently I got: … The applicant should justify the calculation of the terminal rate constant for patient #xxx, reference/r.1, patient #yyy, test/r.2 … […].
❝ The questioned cases had in common that the last measured concentration was increasing compared to the preceding ones and doesn't fit into the linear part for the loglinear regression. To not grossly overestimate the terminal halflife in such situations it is my standard operation to act according to Helmut's first option above.

Oh no! What will you answer? Maybe this post helps. It seems that (some) regulators are not aware of the consequences of the (acceptable!) limitations of analytical methods (20% inaccuracy, 20% imprecision at the LLOQ) and take results as set in stone.

As a finger exercise we once solved the confidence bands of weighted inverse regression (aka calibration), which required some nasty algebra (partial derivatives, etc.).* Then you can come up not only with estimated concentrations but also with their confidence intervals (asymmetric, since the confidence bands of a linear function are two hyperbolas). If the CIs of two concentrations overlap, they are not significantly different…
d_labes ★★★ Berlin, Germany, 2012-10-09 17:15 @ Helmut Posting: # 9357

Dear Helmut!

❝ Oh no! What will you answer? Maybe this post helps. Seems that (some) regulators are not aware about the consequences of (acceptable!) limitations of analytical methods (20% inaccuracy, 20% imprecision at the LLOQ) and take results as set in stone. ...

My answer is somefink like this (maybe it helps others):

The concentration time points selected for the calculation of the terminal rate constant (lambdaZ) are chosen by visual inspection, as recommended in the literature. The last concentration in the questioned terminal rate constant calculations does not fit the linear part of the concentration-time curves (in the log-linear plot). This behaviour is not uncommon if the concentrations are in the range of the LLOQ. In order not to grossly overestimate the terminal half-life in such cases, it is recommended to leave out such measurement points. See for instance the book Hauschke, Steinijans, Pigeot "Bioequivalence Studies in Drug Development", Wiley, Chichester (2007), Chapter 2 “Metrics to characterize concentration-time profiles in single- and multiple-dose bioequivalence studies”, especially Fig. 2.3.

On the other hand, the reliable estimation of the terminal rate constant, which is needed for a reliable estimate of AUC(0–inf), has the one and only purpose of assuring that the observation time is long enough for AUC(0–tlast) to cover at least 80% of AUC(0–inf) (or, the other way round, that the residual area AUC(tlast–inf) is ≤20% of AUC(0–inf)).

AUC(0–inf) in itself is not a primary endpoint on which the bioequivalence decision will be based (neither according to the EMA guidance nor according to the Study Protocol). The bioequivalence decision taken in the Study Report is thus in no way affected by the way the terminal rate constant was calculated.

— Regards, Detlew
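The 80% coverage check described above can be sketched in a few lines of Python. This is an illustration under assumptions, not a validated NCA implementation: lambda_z comes from an ordinary log-linear regression over the last `n_terminal` points, AUC(0–tlast) from the linear trapezoidal rule, and the extrapolation uses the observed last concentration. The profile is hypothetical, as are all function names.

```python
import math

def lambda_z(times, concs):
    """Terminal rate constant: negative OLS slope of ln(C) versus t."""
    logs = [math.log(c) for c in concs]
    n = len(times)
    mt, ml = sum(times) / n, sum(logs) / n
    sxx = sum((t - mt) ** 2 for t in times)
    sxy = sum((t - mt) * (l - ml) for t, l in zip(times, logs))
    return -sxy / sxx

def auc_linear(times, concs):
    """AUC(0-tlast) by the linear trapezoidal rule."""
    return sum((t2 - t1) * (c1 + c2) / 2
               for t1, t2, c1, c2 in zip(times, times[1:], concs, concs[1:]))

def residual_fraction(times, concs, n_terminal):
    """Share of AUC(0-inf) that is extrapolated beyond tlast."""
    lz = lambda_z(times[-n_terminal:], concs[-n_terminal:])
    auc_t = auc_linear(times, concs)
    auc_inf = auc_t + concs[-1] / lz  # observed Clast (an assumption)
    return lz, (auc_inf - auc_t) / auc_inf

# Hypothetical profile: mono-exponential decline (0.1/h) from 2 h onwards.
times = [0, 1, 2, 4, 8, 12, 24]
concs = [0.0, 60.0] + [100 * math.exp(-0.1 * t) for t in times[2:]]
lz, frac = residual_fraction(times, concs, n_terminal=4)
print(f"lambda_z = {lz:.4f} 1/h, residual area = {100 * frac:.1f}%")
```

Rerunning `residual_fraction` with and without a doubtful last point shows directly how much the choice moves lambda_z and whether the residual area stays at or below 20% either way.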
Helmut ★★★ Vienna, Austria, 2012-10-09 21:43 @ d_labes Posting: # 9362

Dear Detlew,

good answer!

❝ On the other hand the reliable estimation of the terminal rate constant which is needed for a reliable estimate of AUC(0–inf) has the one and only purpose to assure that the observation time is long enough for ensuring that AUC(0–tlast) covers at least 80% of the AUC(0–inf) (or the other way round that the residual area AUC(tlast–inf) is <=20% of AUC(0–inf)).
❝
❝ AUC(0–inf) in itself is not a primary endpoint on which the bioequivalence decision will be based (not according to the EMA guidance and also not according to the Study protocol). The bioequivalence decision taken in the Study Report is thus in no way affected by the way of calculation of the terminal rate constant.

Agree, but some hardcore assessor might tell you that if you cannot reliably estimate AUC∞, you have not demonstrated that AUCt is an acceptable metric for the BE assessment. That’s a vicious circle.

BTW, I didn’t want to go too much off-topic (as I did above) and started another thread about analytical variability.