ssussu
☆

China,
2019-12-06 10:58
(327 d 10:41 ago)

Posting: # 20937
Views: 4,271

## Should subjects with data from only one period be included in the BE analysis? [Design Issues]

Dear guys,
We conducted a BE study with a two-way crossover design. One subject dropped out in the second period, so we have data from only one period for that subject. In this situation, should this subject's data from one period be included in the BE assessment? Why?

Best regards
Beholder
★

Russia,
2019-12-06 12:36
(327 d 09:02 ago)

(edited by Beholder on 2019-12-06 13:06)
@ ssussu
Posting: # 20940
Views: 3,527

## No way! But...

Dear ssussu!

» In this situation, should this subject's data from one period be included in the BE assessment?

No, you could not.

» Why?

Because you need a comparison between T and R within each subject. That is the aim: to compare. In your case you have either T or R (I don't know which treatment was given in period 2). At best, you could use the Cmax value from period 2 if the subject withdrew after Cmax was reached. AUC: no way.

Best regards
Beholder
ElMaestro
★★★

Belgium?,
2019-12-06 13:14
(327 d 08:25 ago)

@ Beholder
Posting: # 20941
Views: 3,509

## No way! But...

Hello, both,

» » In this situation, should this subject's data from one period be included in the BE assessment?
»
» No, you could not.

We do not need a T-R comparison in each subject to do an analysis. We are just stuck in that paradigm. We could in principle fit a model with missing period data. And all data could be used for the calculation of the CI.
I do not know why there is a regulatory tradition for only using completers - I would say there is even a potential scientific and ethical advantage of using all period data.

I could be wrong, but...

Best regards,
ElMaestro

No, of course you do not need to audit your CRO if it was inspected in 1968 by the agency of Crabongostan.
Helmut
★★★

Vienna, Austria,
2019-12-06 14:47
(327 d 06:51 ago)

@ ElMaestro
Posting: # 20942
Views: 3,498

## No way! But...

Ahoy my capt’n and welcome back to this side of the pond!

» I do not know why there is a regulatory tradition for only using completers …

Those days when studies were evaluated by a paired t-test with a pocket calculator?

» – I would say there is even a potential scientific and ethical advantage of using all period data.

Correct. If we would be allowed (pun!) to use a mixed-effects model. Patterson and Jones argued against this doubtful practice and the ethical implications of discarding data.*

• Patterson SD, Jones B. Viewpoint: observations on scaled average bioequivalence. Pharm Stat. 2011;11(1):1–7. doi:10.1002/pst.498.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
PharmCat
★

Russia,
2019-12-06 23:46
(326 d 21:52 ago)

@ Helmut
Posting: # 20954
Views: 3,468

## No way! But...

Hi all!

» Correct. If we would be allowed (pun!) to use a mixed-effects model. Patterson and Jones argued against this doubtful practice and the ethical implications of discarding data.

Really, there is no problem in using all data with a mixed-model approach if the model is constructed correctly. Maybe not all regulatory bodies understand how it works. It is not rare that I get requests like "Give me ANOVA!!! I don't want to see your GLM or MIXED...", pfff... or "In guideline *** (any guideline as ancient as mammoth's excrement) the ratio of geometric means is written, so please just divide one by the other and give me the CI"... So there are two answers: theoretically there is no problem in using all data, but regulatory authorities can ban you for it. (IMHO)
wienui
★

Germany, Oman,
2019-12-07 04:01
(326 d 17:38 ago)

(edited by wienui on 2019-12-07 06:47)
@ PharmCat
Posting: # 20955
Views: 3,463

## No way! But...

Dear all,

» So there are two answers: theoretically there is no problem in using all data, but regulatory authorities can ban you for it. (IMHO)

I also agree with this as a former academic, but as a regulator: which supporting regulatory guidelines (EMA & FDA) can I rely on in order to use data from study subjects who didn't complete all study periods?

Best regards,
Osama

Cheers,
Osama
mittyri
★★

Russia,
2019-12-07 20:58
(326 d 00:40 ago)

@ wienui
Posting: # 20959
Views: 3,428

## EMA guideline: no way...:

Dear Osama,

» I also agree with this as a former academic, but as a regulator: which supporting regulatory guidelines (EMA & FDA) can I rely on in order to use data from study subjects who didn't complete all study periods?

I doubt that something exists at the moment.
Please also see the related discussion and the citations of the EMA guideline.

Kind regards,
Mittyri
wienui
★

Germany, Oman,
2019-12-09 05:25
(324 d 16:14 ago)

@ mittyri
Posting: # 20963
Views: 3,326

## EMA guideline: no way...:

Dear Mittyri,

» I doubt that something exists at the moment.
» Please also see the related discussion and the citations of the EMA guideline.

Thank you very much, it is really helpful.

Best regards,

Osama

Cheers,
Osama
ElMaestro
★★★

Belgium?,
2019-12-08 01:31
(325 d 20:07 ago)

@ PharmCat
Posting: # 20961
Views: 3,404

## Slightly off topic, but related :-)

Hi all,

» » Correct. If we would be allowed (pun!) to use a mixed-effects model. Patterson and Jones argued against this doubtful practice and the ethical implications of discarding data.
»
» Really, there is no problem in using all data with a mixed-model approach if the model is constructed correctly.

But would we really need a mixed model ??
I seem to recall a thread about it some time ago. I believe the normal linear model fits without trouble to a 222BE dataset with a missing period, so why would it be necessary to use a mixed model for that situation at all?
As I recall, mixed models fitted with ML are proven to give unbiased estimates, but it has never been proven that REML fits are unbiased (and REML is generally used in BE whenever a discussion of the mixed model comes into play). Hence I can't imagine that bias would be a reason for picking the mixed model over a linear model when we have a simple missing-period situation?!? I am curious to learn more about this.

Here's an example in R:

    rm(list=ls(all=TRUE))
    library("emmeans")
    set.seed(123456)
    logCmax=rnorm(10, 6, 2)
    Subj=c(1,1,2,2,3,3,4,4,5,5)
    Seq=c("AB", "AB", "BA", "BA", "AB", "AB", "BA", "BA", "BA", "BA")
    Per=c(rep(c(1,2), 5))
    Trt=substr(Seq, Per, Per)
    X=data.frame(Subj, Per, Trt, Seq, logCmax)
    X
    M1=lm(logCmax ~ factor(Seq)+factor(Subj)+factor(Trt)+factor(Per), data=X)
    anova(M1)
    lsmeans(M1, "Trt")
    confint(pairs(lsmeans(M1, "Trt"), reverse=F), level=0.9)
    Xm=X[-7,] ## let's delete a period
    Xm
    M2=lm(logCmax ~ factor(Seq)+factor(Subj)+factor(Trt)+factor(Per), data=Xm)
    anova(M2)
    lsmeans(M2, "Trt")
    confint(pairs(lsmeans(M2, "Trt"), reverse=F), level=0.9)

Note e.g. that the two fits have different residuals and residual df's, which to me means incomplete subjects are not deleted (R does not know and is not being told something is incomplete; the full rank design matrix is still invertible and so on).
Many thanks for any light you can shed onto this.

Edit: I added set.seed(123456) to your code in order to get reproducible results. [Helmut]

I could be wrong, but...

Best regards,
ElMaestro

Shuanghe
★★

Spain,
2019-12-09 12:00
(324 d 09:38 ago)

@ ElMaestro
Posting: # 20964
Views: 3,257

## Slightly off topic, but related :-)

Hi ElMaestro,

» Note e.g. that the two fits have different residuals and residual df's, which to me means incomplete subjects are not deleted (R does not know and is not being told something is incomplete; the full rank design matrix is still invertible and so on).

    X3 <- Xm[-7,]
    M3 <- lm(logCmax ~ factor(Seq)+factor(Subj)+factor(Trt)+factor(Per), data=X3)
    anova(M3)
    lsmeans(M3, "Trt")
    confint(pairs(lsmeans(M3, "Trt"), reverse=F), level=0.9)

Xm has a missing period (1) for subject 4; X3 has no subject 4 at all. Compare anova(M2) and anova(M3): the residual sum of squares and the residual df are the same, and the 90% CI is also the same. So wouldn't that mean that R automatically dropped the remaining period (2) of subject 4 in Xm when doing the BE evaluation? The LSMeans are different, so subject 4's period 2 was kept for that calculation. I would say that this behaviour is the same as in SAS.
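A minimal base-R sketch of this comparison, assuming ElMaestro's simulated data with the same seed. It does not need emmeans and reads the treatment effect straight from the coefficient table; the helper name `trt_ci` is made up for this illustration. Note that the coefficient is B − A, i.e., the opposite sign of the A − B contrast reported by pairs().

```r
# Rebuild the simulated 2x2 crossover data from the thread
set.seed(123456)
logCmax <- rnorm(10, 6, 2)
Subj    <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
Seq     <- c("AB", "AB", "BA", "BA", "AB", "AB", "BA", "BA", "BA", "BA")
Per     <- rep(c(1, 2), 5)
Trt     <- substr(Seq, Per, Per)
X       <- data.frame(Subj, Per, Trt, Seq, logCmax)

Xm <- X[-7, ]           # subject 4 keeps only period 2
X3 <- X[X$Subj != 4, ]  # subject 4 removed completely

f  <- logCmax ~ factor(Seq) + factor(Subj) + factor(Trt) + factor(Per)
M2 <- lm(f, data = Xm)
M3 <- lm(f, data = X3)

# Hypothetical helper: 90% CI of the treatment coefficient (B - A)
trt_ci <- function(fit, level = 0.90) {
  est <- summary(fit)$coefficients["factor(Trt)B", ]
  q   <- qt(1 - (1 - level) / 2, df.residual(fit))
  c(lower = unname(est["Estimate"] - q * est["Std. Error"]),
    upper = unname(est["Estimate"] + q * est["Std. Error"]))
}

df.residual(M2); df.residual(M3)  # same residual df in both fits
trt_ci(M2); trt_ci(M3)            # same interval in both fits
resid(M2)["8"]                    # lone observation of subject 4
```

If the two intervals agree, the lone observation of subject 4 is still in the fit but is absorbed by that subject's own parameter, so it carries no information about the treatment difference.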

All the best,
Shuanghe
PharmCat
★

Russia,
2019-12-09 15:13
(324 d 06:25 ago)

@ ElMaestro
Posting: # 20966
Views: 3,248

## Slightly off topic, but related :-)

» But would we really need a mixed model ??

Hello, ElMaestro!

Look at residuals:

    resid(M1)
             1          2          3          4          5          6          7          8          9         10
    -0.1540073  0.1540073 -0.4297153  0.4297153  0.1540073 -0.1540073 -1.1774560  1.1774560  1.6071713 -1.6071713

    resid(M2)
                1             2             3             4             5             6             8             9            10
    -1.540073e-01  1.540073e-01 -1.018443e+00  1.018443e+00  1.540073e-01 -1.540073e-01 -3.608225e-16  1.018443e+00 -1.018443e+00

This is how I understand:

For observation 8 we have -3.608225e-16, I think, and that means this subject influences the "intra-variation" and makes it smaller; you can see that the SE is smaller. The "intra-individual part" went into the coefficient and became part of the inter-individual variation. Brr.. I don't know how to say it with clever words.

In lm / glm we treat each observation as statistically independent. And with lm we play a smart trick by "storing" the inter-individual variation in the model coefficients and excluding it from the model error. A missing observation violates this system.

In a mixed model we have another situation: all observations of a subject form one statistically independent observation, a realization of a multidimensional variable. With ML / REML estimation, such a substantial transfer of one component of the variation to another does not occur when the data contain missing values. ML and REML variance estimates are all biased, but less biased than lm with missing data.
ElMaestro
★★★

Belgium?,
2019-12-21 15:02
(312 d 06:37 ago)

@ PharmCat
Posting: # 21012
Views: 3,076

## Slightly off topic, but related :-)

Hi PharmCat,

» For observation 8 we have -3.608225e-16, I think,

This, I think, is around the "effective zero" for fits in R at default settings on 64- and 32-bit systems.

» I don't know how to say it with clever words.

Very unfortunate, because I did not understand what was being said. I would like to get the insight. It is at the limits of my conception.

» ML and REML variance estimates are all biased, but less biased than lm with missing data.

Is this a fact? How do we actually know this? Do you have a reference I could learn from (not Pinheiro and Bates, I don't understand a word of it)?
Does "less biased" apply to both the fixed effects and to the variance components?

I could be wrong, but...

Best regards,
ElMaestro

PharmCat
★

Russia,
2019-12-22 01:12
(311 d 20:26 ago)

@ ElMaestro
Posting: # 21013
Views: 3,069

## Slightly off topic, but related :-)

Hi ElMaestro!

» This, I think, is around the "effective zero" for fits in R at default settings on 64- and 32-bit systems.

Anyway, a residual "can't" be zero.

» Is this a fact? How do we actually know this? Do you have a reference I could learn from (not Pinheiro and Bates, I don't understand a word of it)?

For ML: yes, it's a fact. ML is always biased "by definition". proof

For REML: this is a more difficult question. We can see the following:

Some links are behind a paywall, but this problem can be solved with sci-hub.

What can we say: "REML does effectively reduce bias.", "The REML estimates are typically less biased than the ML methods."

REML does not always eliminate all of the bias in parameter estimation, since many methods for obtaining REML estimates cannot return negative estimates of a variance component. However, this source of bias also exists with ML, so REML is clearly the preferred method for analyzing large data sets with complex structure.

Hm... I understand that REML is not always unbiased. Maybe I'm wrong, but all of the above makes me think this.

» Does "less biased" apply to both the fixed effects and to the variance components?

Mmm. If we call the model coefficients fixed effects, they are unbiased; no problems with them. The variance component in an LM is, strictly speaking, the model error. An LM has an analytical solution and an unbiased estimator of the variance. But the model should be fitted correctly: each level of any factor should have at least 2 observations.
Helmut
★★★

Vienna, Austria,
2019-12-22 10:37
(311 d 11:01 ago)

@ ElMaestro
Posting: # 21016
Views: 3,101

## 2.220446e-16 ≈ 0

Hi ElMaestro,

» » For observation 8 we have -3.608225e-16, I think,
» I think, is around the "effective zero" for fits in R at default settings on 64- and 32-bit systems.

Yes, it is.

    x    <- -3.608225e-16
    zero <- .Machine$double.eps
    all.equal(x, zero)
    [1] TRUE
    zero
    [1] 2.220446e-16

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
ElMaestro
★★★

Belgium?,
2019-12-23 14:37
(310 d 07:02 ago)

@ Helmut
Posting: # 21021
Views: 2,986

## The optional tolerance argument

Hi Hötzi,

» » » For observation 8 we have -3.608225e-16, I think,
» » I think, is around the "effective zero" for fits in R at default settings on 64- and 32-bit systems.
»
» Yes, it is.
»
» x <- -3.608225e-16
» zero <- .Machine$double.eps
» all.equal(x, zero)
» [1] TRUE
» zero
» [1] 2.220446e-16

This comparison in your context is just a test of whether the difference is less than about 10^-8, since there is an implied tolerance argument for all.equal: the square root of .Machine$double.eps.
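A small sketch of that implied tolerance, using only base R. The comparison values 1e-9 and 1e-6 are arbitrary picks for illustration and do not come from the thread.

```r
tol <- sqrt(.Machine$double.eps)  # default tolerance of all.equal(), about 1.5e-8
tol

x <- -3.608225e-16
isTRUE(all.equal(x, .Machine$double.eps))  # TRUE, as in the quoted check
isTRUE(all.equal(x, 1e-9))                 # still TRUE: difference is below tol
isTRUE(all.equal(x, 1e-6))                 # FALSE: difference exceeds tol
abs(x) <= .Machine$double.eps              # FALSE: |x| is larger than eps itself
```

So a TRUE from all.equal() here says only that the two numbers differ by less than roughly 1.5e-8, not that x is within machine epsilon of zero.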

Effective zero residuals will be somewhat better than 10^-8 in practice. They will depend on the approach used to find the solution; in lm I believe the approach is via a QR decomposition of the model matrix, and R by default has a tol argument in that function of 10^-7, which lm may be leaning on.

Here's an example of a perfect fit, therefore having effective zero residuals:

    a=c(rep(1,5), rep(2,5), rep(3,5))
    b=c(rep("A",5), rep("B",5), rep("C",5))
    M=lm(a~0+b)
    resid(M)

It may actually not be the best example, since the dependents are all exactly representable internally in R's (and the computer's) binary format.

Perhaps this makes a better point:

    a=c(rep(pi,5), rep(sin(1.5+pi),5), rep(log(pi),5))
    b=c(rep("A",5), rep("B",5), rep("C",5))
    M=lm(a~0+b)
    resid(M)

I could be wrong, but...

Best regards,
ElMaestro

PharmCat
★

Russia,
2019-12-24 14:18
(309 d 07:20 ago)

@ Helmut
Posting: # 21025
Views: 2,995

## 2.220446e-16 ≈ 0

» » » For observation 8 we have -3.608225e-16, I think,
» » I think, is around the "effective zero" for fits in R at default settings on 64- and 32-bit systems.

Hi Helmut!

I suppose that this value comes from the QR decomposition with pivoting of a rank-deficient X matrix, so the real value is 0. Can the residual of a random variable be equal to zero? No => bias.
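One way to look at where the (effectively) zero residual comes from, sketched in base R and assuming the simulated data from ElMaestro's example with the same seed: subject 4 has a single observation and its own subject parameter, so that observation has a leverage of 1 and is fitted exactly.

```r
# Rebuild the data set with the missing period
set.seed(123456)
logCmax <- rnorm(10, 6, 2)
Subj    <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
Seq     <- c("AB", "AB", "BA", "BA", "AB", "AB", "BA", "BA", "BA", "BA")
Per     <- rep(c(1, 2), 5)
Trt     <- substr(Seq, Per, Per)
X       <- data.frame(Subj, Per, Trt, Seq, logCmax)
Xm      <- X[-7, ]   # subject 4 keeps only period 2

M2 <- lm(logCmax ~ factor(Seq) + factor(Subj) + factor(Trt) + factor(Per),
         data = Xm)

ncol(model.matrix(M2))  # columns in the model matrix
M2$rank                 # rank found by the pivoted QR (one column is aliased)
hatvalues(M2)           # leverage of each observation
resid(M2)["8"]          # residual of the lone observation of subject 4
```

A leverage of 1 means the fitted value equals the observed value whatever the data are, so this residual is zero by construction of the model rather than by chance.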
Helmut
★★★

Vienna, Austria,
2019-12-24 14:54
(309 d 06:44 ago)

@ PharmCat
Posting: # 21026
Views: 2,911

## Sum of residuals ~ ε

Hi PharmCat

» I suppose that this value comes from the QR decomposition with pivoting of a rank-deficient X matrix, so the real value is 0. Can the residual of a random variable be equal to zero? No => bias.

I disagree. In the model ε = 0. However in the fit, the sum of residuals is only asymptotically 0. We shouldn’t speak of bias when we obtain something sufficiently close to the numeric resolution of the machine.

Dif-tor heh smusma 🖖
Helmut Schütz

ElMaestro
★★★

Belgium?,
2019-12-24 15:10
(309 d 06:28 ago)

@ Helmut
Posting: # 21027
Views: 2,915

## Sum of residuals ~ ε

Hi both,

» I disagree. In the model ε = 0. However in the fit, the sum of residuals is only asymptotically 0. We shouldn’t speak of bias when we obtain something sufficiently close to the numeric resolution of the machine.

Perhaps I am getting it wrong; the sentence above sounds a little off and I have a feeling you may not be discussing the same thing?

The sum of residuals for a fitted normal linear model will be zero. Not asymptotically. If you sum them in R or any other software you will get zero, be it either like zero-zero or effectively zero, depending on the implementation.
This is because the underlying assumption is that epsilon be distributed with mean zero. If we end up with a non-zero sum, I'd say we have screwed up somewhere.

Try sum(resid(M))
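A short sketch on made-up data (arbitrary seed and coefficients, not from the thread). Algebraically, the residuals sum to zero whenever the constant vector lies in the column space of the model matrix, e.g., when the model has an intercept or a full set of group indicators as in the `a~0+b` fits above. A fit through the origin is shown for contrast.

```r
set.seed(42)                  # arbitrary seed; the data are simulated
x <- rnorm(20)
y <- 2 + 3 * x + rnorm(20)

M_int   <- lm(y ~ x)          # model with intercept
M_noint <- lm(y ~ 0 + x)      # regression through the origin

sum(resid(M_int))             # effectively zero (rounding error only)
sum(resid(M_noint))           # generally not zero
```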

I could be wrong, but...

Best regards,
ElMaestro

Helmut
★★★

Vienna, Austria,
2019-12-28 13:55
(305 d 07:43 ago)

@ ElMaestro
Posting: # 21032
Views: 2,821

## Wrong terminology

Hi ElMaestro,

» The sum of residuals for a fitted normal linear model will be zero. Not asymptotically.

Sorry, I did not mean asymptotically (central limit theorem) but approximately (due to numeric issues). Theoretically the sum should be zero, of course.
IIRC, in old versions of lm() fitting data without errors (like your first example) threw an error and there was even a warning in the man-page.

Dif-tor heh smusma 🖖
Helmut Schütz

PharmCat
★

Russia,
2019-12-24 18:40
(309 d 02:58 ago)

@ Helmut
Posting: # 21028
Views: 2,929

## Sum of residuals ~ ε

» I disagree. In the model ε = 0. However in the fit, the sum of residuals is only asymptotically 0. We shouldn’t speak of bias when we obtain something sufficiently close to the numeric resolution of the machine.

Hi Helmut!

I don't mean that we have bias because 0 is not equal to eps(). Numerically we can calculate with values lower than eps(), like (1E-40 + 2E-40) * 3E+40 = 9.0. Problems can start with things like 1 + 1E-40, for example: (1E+40 - 1E+40 + 1 == 1E+40 + 1 - 1E+40) == false. But this is not what I am trying to say.

I am trying to say that a zero residual (or an effectively zero residual) can't occur at all, "by definition", in a linear model.
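The floating-point remarks above can be checked in R; this is only a sketch of the arithmetic, with `lhs` and `rhs` as illustrative names.

```r
# Values far below machine epsilon can still be combined meaningfully
(1e-40 + 2e-40) * 3e40   # about 9

# Absorption: a tiny term vanishes next to a much larger one
1 + 1e-40 == 1           # TRUE

# Order of operations matters once absorption kicks in
lhs <- 1e40 - 1e40 + 1   # (1e40 - 1e40) + 1 gives 1
rhs <- 1e40 + 1 - 1e40   # (1e40 + 1) - 1e40 gives 0
lhs == rhs               # FALSE
```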