Helmut
★★★

Vienna, Austria,
2017-11-12 10:57

Posting: # 17971
Views: 6,977

## Two PK metrics: Inflation of the Type I Error [Two-Stage / GS Designs]

Dear all,

related to this thread about dropouts. To run the R-code you need package Power2Stage 0.4.6+.

Let’s assume a CV of 25% for Cmax and 15% for AUC, Potvin ‘Method B’ (adjusted α 0.0294). We want to play it safe and plan the first stage like a fixed sample design (T/R 0.95, 80% power). Hence, we start with 28 subjects. In the interim the CVs are higher than expected; for Cmax a CV of 30% and for AUC 20%. Say Cmax is not BE (94.12% CI) and power <80%. Hence, we should initiate the second stage. Re-estimated sample size:

    library(Power2Stage)
    print(sampleN2.TOST(CV=0.30, n1=28), row.names=FALSE) # Cmax
    #  Design  alpha  CV theta0 theta1 theta2 n1 Sample size Achieved power Target power
    #     2x2 0.0294 0.3   0.95    0.8   1.25 28          20      0.8177478          0.8

What does that mean? We would initiate the second stage with 20 subjects for Cmax but possibly shouldn’t for AUC:

    print(sampleN2.TOST(CV=0.20, n1=28), row.names=FALSE) # AUC
    #  Design  alpha  CV theta0 theta1 theta2 n1 Sample size Achieved power Target power
    #     2x2 0.0294 0.2   0.95    0.8   1.25 28           0      0.8922371          0.8
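
For readers without the package at hand, the re-estimation step can be approximated in base R. This is a minimal sketch, not the code of sampleN2.TOST(): the helper names power_tost() and n2_sketch() are made up for illustration, power is approximated via the noncentral t distribution, and the pooled analysis is assumed to have N - 3 degrees of freedom. It should land at or close to the 20 (Cmax) and 0 (AUC) shown above; small deviations from the package's result are possible.

```r
# Approximate TOST power for a 2x2 crossover via the noncentral t distribution.
# n is the total number of subjects, df the residual degrees of freedom.
# (Hypothetical helper, not part of Power2Stage or PowerTOST.)
power_tost <- function(alpha, CV, n, df, theta0 = 0.95,
                       theta1 = 0.80, theta2 = 1.25) {
  se   <- sqrt(2 * log(CV^2 + 1) / n)   # SE of the difference on the log scale
  tval <- qt(1 - alpha, df)
  d1   <- (log(theta0) - log(theta1)) / se
  d2   <- (log(theta0) - log(theta2)) / se
  pw   <- suppressWarnings(pt(-tval, df, ncp = d2) - pt(tval, df, ncp = d1))
  max(0, pw)
}

# Second-stage sample size: 0 if stage 1 already reaches the target power,
# otherwise the smallest even total N reaching it, minus n1.
# Assumption: df = n1 - 2 in stage 1 and df = N - 3 in the pooled analysis.
n2_sketch <- function(CV, n1, alpha = 0.0294, target = 0.80) {
  if (power_tost(alpha, CV, n1, n1 - 2) >= target) return(0)
  N <- n1 + 2
  while (power_tost(alpha, CV, N, N - 3) < target) N <- N + 2
  N - n1
}

n2_cmax <- n2_sketch(CV = 0.30, n1 = 28)
n2_auc  <- n2_sketch(CV = 0.20, n1 = 28)
print(c(Cmax = n2_cmax, AUC = n2_auc))
```

The point of the sketch is only the mismatch: with the same n1 and the same adjusted α, one metric asks for a second stage and the other does not.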

Since the Type I Error strongly depends on the sample size, the study would be overrun for AUC and an inflated TIE is quite possible. I have no R-code yet to estimate how much… Suggestions are welcome.
I think that in the past everybody (including myself) looked only at the PK metric with the highest variability and ignored the other one. Likely not a good idea.

Which options do we have for the PK metric with the lower variability?
1. Assess BE with a lower sample size. In the example above ignore the second stage entirely. If the CV were 25% instead of 20%, assess only the first six subjects of the 20 in the second stage (i.e., in the pooled analysis 28+6=34 instead of 48).
2. Use the data of all subjects and adjust α more (i.e., a wider CI). How?
3. Or?
#1 would preserve the TIE but would regulators accept it (not using all available data)? What about #2, when the GL tells us that the α has to be pre-specified in the protocol?

Of course, this issue is not limited to TSDs but applies to GSDs with (blinded/unblinded) sample size re-estimation as well.

Dif-tor heh smusma 🖖
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
d_labes
★★★

Berlin, Germany,
2017-11-12 16:46

@ Helmut
Posting: # 17972
Views: 6,343

## Two PK metrics: Inflation of the Type I Error?

Dear Helmut!

Very good question.
Next question.

My gut feeling says: Don't worry, be happy.
What we have to do if we take two or more metrics into consideration is to combine the results of both metrics, i.e., some sort of intersection-union test (IUT).
The IUT is known to be conservative to very conservative.
For illustration let's look at the results in a single stage design using Ben's function power.2TOST():
We don't know rho, the correlation between both PK metrics, so let's look at the extremes.

    library(PowerTOST)
    power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1., 1.25), rho=0)
    # [1] 0.03784
    power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1.25, 1), rho=0)
    # [1] 0.04958
    power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1.25, 1.25), rho=0)
    # [1] 0.00244
    power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1., 1.25), rho=1)
    # [1] 0.00416
    power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1.25, 1), rho=1)
    # [1] 0.04282
    power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1.25, 1.25), rho=1)
    # [1] 0.04977
green: conservative
red: very conservative

This behavior should protect against an additional alpha inflation due to combining the results of both metrics if you control the TIE (alpha) of each.
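
The rho = 0 case can be cross-checked with a few lines of base R. This is a rough sketch under stated assumptions, not the algorithm of power.2TOST(): each metric is simulated through its sufficient statistics (normal point estimate, chi-squared residual variance) for a balanced, complete 2x2 crossover, and the two metrics are drawn independently. The helper names sim_pass() and p_both() are invented for this illustration.

```r
set.seed(123456)

# Pass/fail of one PK metric in a single-stage 2x2 crossover, simulated via
# its sufficient statistics (assumption: balanced, complete, log-normal data).
sim_pass <- function(CV, theta0, n, nsim, alpha = 0.05,
                     theta1 = 0.80, theta2 = 1.25) {
  s2  <- log(CV^2 + 1)                              # within-subject variance
  df  <- n - 2
  pe  <- rnorm(nsim, log(theta0), sqrt(2 * s2 / n)) # point estimates (log scale)
  mse <- s2 * rchisq(nsim, df) / df                 # residual variances
  hw  <- qt(1 - alpha, df) * sqrt(2 * mse / n)      # half-widths of the 90% CI
  (pe - hw >= log(theta1)) & (pe + hw <= log(theta2))
}

# Probability that BOTH metrics pass (intersection-union test) for rho = 0,
# i.e. the two metrics are simulated independently of each other.
p_both <- function(theta0, CV = c(0.3, 0.2), n = 28, nsim = 1e5) {
  mean(sim_pass(CV[1], theta0[1], n, nsim) &
       sim_pass(CV[2], theta0[2], n, nsim))
}

res <- c(auc_nonBE  = p_both(c(1.00, 1.25)),
         cmax_nonBE = p_both(c(1.25, 1.00)),
         both_nonBE = p_both(c(1.25, 1.25)))
print(round(res, 4))
```

With independent metrics the joint probability is simply the product of the two marginal ones, which is why none of the three scenarios should exceed the nominal 0.05 beyond simulation noise.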

Ok. All this is only analogy and gut feeling.
We only know exactly what's going on, if we simulate.
But I doubt that the time and effort spent on this would pay off.

Edit: Values corrected after a bug-fix in PowerTOST v1.4-7 (see the “Details” section of the man-page). [Helmut]

Regards,

Detlew
ElMaestro
★★★

Belgium?,
2017-11-12 20:43

@ Helmut
Posting: # 17973
Views: 6,191

## A place to start

Hi Hötzi,

can you run a series of sims where you include "20 more subjects than necessary" in stage 2 where a stage 2 is called for?
Just to see how things look from that perspective.

I could be wrong, but...

Best regards,
ElMaestro

"Pass or fail" (D. Potvin et al., 2008)
d_labes
★★★

Berlin, Germany,
2017-11-13 15:02

@ ElMaestro
Posting: # 17974
Views: 6,131

## A place to start?

Dear ElMaestro,

» can you run a series of sims where you include "20 more subjects than necessary" in stage 2 where a stage 2 is called for?

that could indeed be done within the (in)famous package Power2Stage, function power.2stage.fC() using the argument min.n2. At least partially because min.n2 assures a minimum sample size for stage 2, if stage 2 is called for and the estimated sample size is smaller than min.n2.

But that's not the problem Helmut has described, at least as I understand it. Here we initiate a stage 2 from the perspective of the AUC evaluation where it is not necessary to initiate one.

Is having more subjects than necessary a problem at all? My gut feeling says no.

Regards,

Detlew
Helmut
★★★

Vienna, Austria,
2017-11-13 15:51

@ d_labes
Posting: # 17975
Views: 6,181

## A place to start?

Dear Detlew,

» that could indeed be done within the (in)famous package Power2Stage, function power.2stage.fC() using the argument min.n2.

… and power.2stage() without the need of specifying a futility criterion.

» At least partially because min.n2 assures a minimum sample size for stage 2, if stage 2 is called for and the estimated sample size is smaller than min.n2.

Right.

» But that's not the problem Helmut has described, at least as I understand it. Here we initiate a stage 2 from the perspective of the AUC evaluation where it is not necessary to initiate one.

Right as well. I tried to modify 5+ year old code for subject simulations but lost my patience.

» Is having more subjects than necessary a problem at all? My gut feeling says no.

From the producer’s perspective, of course not.
My gut feeling tells me: The adjusted α showed (well, in simulations…) that the TIE is controlled if we follow the frameworks exactly, i.e., initiate the second stage only if necessary based on the interim and with the re-estimated sample size.* If we increase the sample size, the chance of passing BE increases and hence, the TIE.
Misusing Power2Stage:

    library(Power2Stage)
    power.2stage(CV=0.2, theta0=1.25, n1=28)$pBE
    # [1] 0.029911
    power.2stage(CV=0.2, theta0=1.25, n1=28, min.n2=20)$pBE
    # [1] 0.030917

* A lower sample size (due to dropouts) is not problematic. Lower chance of passing = no inflation of the TIE.

Dif-tor heh smusma 🖖
Helmut Schütz

d_labes
★★★

Berlin, Germany,
2017-11-15 09:34

@ Helmut
Posting: # 17981
Views: 6,056

## Scientific gut feeling

Dear Helmut,

» My gut feeling tells me:

Women are better at gut feeling, NLYW especially.

» The adjusted α showed (well, in simulations…) that the TIE is controlled if we follow the frameworks exactly, i.e., initiate the second stage only if necessary based on the interim and with the re-estimated sample size. If we increase the sample size, the chance of passing BE increases and hence, the TIE.

Wait, wait ... I will be back in 5 min with some quick-shot code to verify your claim.

Meanwhile prepare a decision scheme with two metrics, so that I can see if my implementation is correct. My brain gets crowded if I think about the decision scheme. So many "What if". And you are trained in that as a scuba diver, somebody told me.

Regards,

Detlew
d_labes
★★★

Berlin, Germany,
2017-11-15 12:15
(edited by d_labes on 2017-11-15 12:47)

@ d_labes
Posting: # 17982
Views: 6,074

## Five minutes gone - power.tsd.2m() arose

Dear Helmut, dear All,

» » The adjusted α showed (well, in simulations…) that the TIE is controlled if we follow the frameworks exactly, i.e., initiate the second stage only if necessary based on the interim and with the re-estimated sample size. If we increase the sample size, the chance of passing BE increases and hence, the TIE.
» Wait, wait ... I will be back in 5 min with some quick-shot code to verify your claim.

5 minutes are gone.
See Power2Stage on GitHub. Function power.tsd.2m().
It's pure Potvin B, abbreviated with implicit power monitoring, i.e. via sample size estimation, which gives n2=0 in case of 'enough' power.
What's missing at the moment is some sort of correlation between the PK metrics analogous to what is possible in power.2TOST() with the argument rho. I don't yet know where it comes into play. Suggestions are welcome.
The function is not public (not exported) at the moment. Not in the mood to document.

Example:
    # install from GitHub via
    # devtools::install_github("Detlew/Power2Stage")

    library(Power2Stage)
    # Cmax not BE, but AUC assumed BE
    Power2Stage:::power.tsd.2m(CV=c(0.3, 0.2), theta0=c(1.25, 0.95), n1=28)$pBE
    # [1] 0.043414
    # Cmax assumed BE, but AUC not BE
    Power2Stage:::power.tsd.2m(CV=c(0.3, 0.2), theta0=c(0.95, 1.25), n1=28)$pBE
    # [1] 0.031875
    # Cmax not BE, AUC also not BE
    Power2Stage:::power.tsd.2m(CV=c(0.3, 0.2), theta0=c(1.25, 1.25), n1=28)$pBE
    # [1] 0.001452

Take the results with a grain of salt. Don't know how to validate. Suggestions here also welcome.
But I think the results look plausible.

Edited numbers after removing a little bug.

Regards,

Detlew
ElMaestro
★★★

Belgium?,
2017-11-13 23:04

@ d_labes
Posting: # 17976
Views: 6,176

## A better place to start.

Hi d_labes,

» But that's not the problem Helmut has described, at least as I understand it. Here we initiate a stage 2 from the perspective of the AUC evaluation where it is not necessary to initiate one.

Best I can come up with is as follows:
The only thing that determines the need for stage 2 is the CV.
AUCt generally has a lower CV than Cmax (for the record, I said 'generally', I did not say 'always'). Often the difference is about 5-20 percentage points.

So run the sims in the usual fashion, but add 5-20 points on the CV for the power and sample size estimation.

I could be wrong, but...

Best regards,
ElMaestro
Helmut
★★★

Vienna, Austria,
2017-11-13 23:21

@ ElMaestro
Posting: # 17977
Views: 6,222

## Nope

Hi ElMaestro,

» Best I can come up with is as follows:
» The only thing that determines the need for stage 2 is the CV.
»
» So run the sims in the usual fashion, but add 5-20 points on the CV for the power and sample size estimation.

That’s not my point. Meditate over my example. We would initiate the second stage with 20 subjects (driven by the CV of Cmax 0.30) but should stop after the first stage for AUC (CV 0.20).

Dif-tor heh smusma 🖖
Helmut Schütz

nobody
nothing

2017-11-14 07:11

@ Helmut
Posting: # 17978
Views: 6,112

## Nope

IANAL, but I think what San-Diego-man is proposing is to calculate the BE outcome (pass/fail) after stage 1 and compare it to the BE outcome after the (unnecessary) second stage for, let's say, a gazillion studies and see if there is a meaningful difference.

Isn't this a special example of the more general "forced BE" theme in one-stage designs?
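
Such a comparison could be sketched in base R along the following lines. Everything here is an assumption made for illustration and not taken from Power2Stage: AUC sits at the upper BE limit (theta0 = 1.25, CV 20%), a second stage of 20 subjects is forced on every study regardless of the interim, both analyses use the adjusted α of 0.0294, the pooled analysis has N - 3 degrees of freedom, and data are simulated through sufficient statistics. The "either" rule (pass at stage 1 or in the pooled analysis) is meant as a worst case for AUC, not as anybody's proposed decision scheme.

```r
set.seed(123456)

nsim  <- 1e5
n1    <- 28; n2 <- 20; N <- n1 + n2
alpha <- 0.0294                 # Potvin 'Method B' adjusted alpha
s2    <- log(0.20^2 + 1)        # within-subject variance of AUC (CV 20%)
lt0   <- log(1.25)              # true ratio at the upper BE limit (not BE)

# TRUE where the 100(1 - 2 alpha)% CI lies entirely within 80-125%
passes <- function(pe, mse, n, df) {
  hw <- qt(1 - alpha, df) * sqrt(2 * mse / n)
  (pe - hw >= log(0.80)) & (pe + hw <= log(1.25))
}

# Stage 1: point estimate and residual sum of squares
pe1 <- rnorm(nsim, lt0, sqrt(2 * s2 / n1))
ss1 <- s2 * rchisq(nsim, n1 - 2)
be1 <- passes(pe1, ss1 / (n1 - 2), n1, n1 - 2)

# Stage 2 (forced on every study) and pooled analysis
pe2  <- rnorm(nsim, lt0, sqrt(2 * s2 / n2))
ss2  <- s2 * rchisq(nsim, n2 - 2)
ssst <- (pe1 - pe2)^2 / (2 / n1 + 2 / n2)   # between-stage term, 1 df
peP  <- (n1 * pe1 + n2 * pe2) / N
beP  <- passes(peP, (ss1 + ss2 + ssst) / (N - 3), N, N - 3)

tie <- c(stage1_only = mean(be1),        # decision taken after stage 1
         pooled_only = mean(beP),        # decision taken after stage 2 only
         either      = mean(be1 | beP))  # pass at stage 1 or in pooled data
print(round(tie, 4))
```

Group-sequential reasoning (two looks, Pocock-type bound) suggests that the "either" rate should come out above 0.0294 but below 0.05; that is an expectation under the assumptions above, not a validated result for the two-metric problem.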

Kindest regards, nobody
ElMaestro
★★★

Belgium?,
2017-11-14 11:41

@ Helmut
Posting: # 17979
Views: 6,160

## I've meditated hard

Hi Hötzi,

» That’s not my point. Meditate over my example. We would initiate the second stage with 20 subjects (driven by the CV of Cmax 0.30) but should stop after the first stage for AUC (CV 0.20).

Can you reformulate?
I meditated hard and I don't see why my suggestion does not address it. It will treat AUCt as if it had a higher CV for everything except the BE evaluation. If I get your question correctly, that is exactly what is going on.
So perhaps I did not understand you. While I sit in the lotus position, humming gently, do you think you could somehow put other words to the part I don't get?

I could be wrong, but...

Best regards,
ElMaestro