Bioequivalence and Bioavailability Forum • Two PK metrics: Inflation of the Type I Error

Helmut
★★★

Vienna, Austria,
2017-11-12 12:57
(2791 d 16:44 ago)

Posting: # 17971
Views: 12,236

Two PK metrics: Inflation of the Type I Error [Two-Stage / GS Designs]

Dear all,

This relates to this thread about dropouts. To run the R-code you need package Power2Stage 0.4.6+.

Let’s assume a CV of 25% for C_max and 15% for AUC, Potvin ‘Method B’ (α_adj 0.0294). We want to play it safe and plan the first stage like a fixed sample design (T/R 0.95, 80% power). Hence, we start with 28 subjects. In the interim the CVs are higher than expected; for C_max a CV of 30% and for AUC 20%. Say C_max is not BE (94.12% CI) and power <80%. Hence, we should initiate the second stage. Re-estimated sample size:

library(Power2Stage)
print(sampleN2.TOST(CV=0.30, n1=28), row.names=FALSE) # Cmax
#  Design  alpha  CV theta0 theta1 theta2 n1 Sample size Achieved power Target power
#     2x2 0.0294 0.3   0.95    0.8   1.25 28          20      0.8177478          0.8

What does that mean? We would initiate the second stage with 20 subjects for C_max but possibly shouldn’t for AUC:

print(sampleN2.TOST(CV=0.20, n1=28), row.names=FALSE) # AUC
#  Design  alpha  CV theta0 theta1 theta2 n1 Sample size Achieved power Target power
#     2x2 0.0294 0.2   0.95    0.8   1.25 28           0      0.8922371          0.8

Since the Type I Error strongly depends on the sample size, the study would be overrun for AUC and an inflated TIE is quite possible. I have no R-code yet to estimate how much… Suggestions are welcome.
I think that in the past everybody (including myself) looked only at the PK metric with the highest variability and ignored the other one. Likely not a good idea.
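
For reference, the planning of the first stage and the confidence level implied by the adjusted α can be reproduced along these lines. This is a minimal sketch, assuming the PowerTOST package is installed; sampleN.TOST() is its fixed-sample function, and the variable names are only illustrative.

```r
library(PowerTOST)

# Fixed-sample planning with the assumed Cmax CV of 25%,
# T/R ratio 0.95, and 80% target power (2x2 crossover by default)
plan    <- sampleN.TOST(CV = 0.25, theta0 = 0.95, targetpower = 0.80,
                        print = FALSE)
n_fixed <- plan[["Sample size"]]

# Confidence level that goes with Potvin's adjusted alpha
alpha_adj <- 0.0294
ci_level  <- 100 * (1 - 2 * alpha_adj)

cat("n1 =", n_fixed, "\n")           # 28, as stated in the post
cat("CI level =", ci_level, "%\n")   # 94.12
```

The higher-variability metric drives n1 here, which is exactly why the lower-variability metric ends up with more subjects than it needs.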

Which options do we have for the PK metric with the lower variability?

1. Assess BE with a lower sample size. In the example above ignore the second stage entirely. If the CV were 25% instead of 20%, assess only the first six subjects of the 20 in the second stage (i.e., in the pooled analysis 28+6=34 instead of 48).
2. Use the data of all subjects and adjust α more (i.e., a wider CI). How?
3. Or?

#1 would preserve the TIE but would regulators accept it (not using all available data)? What about #2, when the GL tells us that the α has to be pre-specified in the protocol?
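
The pooled sample size in option #1 comes from re-running the sample size re-estimation with the alternative CV. This is a sketch using the same sampleN2.TOST() call as above; the variable names are illustrative, and the n2 of six is the figure quoted in the post, not something verified here.

```r
library(Power2Stage)

n1  <- 28
# Re-estimate the second stage at the adjusted alpha
# (0.0294 is what the calls above use without specifying it)
res <- sampleN2.TOST(CV = 0.25, n1 = n1)
n2  <- res[["Sample size"]]      # second-stage size for this metric

# Subjects entering the pooled analysis for this metric under option #1
n_pooled <- n1 + n2
print(c(n2 = n2, n_pooled = n_pooled))
```

Any second-stage subjects beyond n2 would simply be left out of the evaluation of this metric, which is the part regulators might object to.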

Of course, this issue is not limited to TSDs but applies to GSDs with (blinded/unblinded) sample size re-estimation as well.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz

The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes

d_labes
★★★

Berlin, Germany,
2017-11-12 18:46
(2791 d 10:55 ago)

@ Helmut
Posting: # 17972
Views: 10,921

Two PK metrics: Inflation of the Type I Error?

Dear Helmut!

Very good question.
Next question :-D

.

My gut feeling says: Don't worry, be happy :cool:

.
What we have to do if we take two or more metrics into consideration is to combine the results of both metrics, i.e., some sort of intersection-union test (IUT).
The IUT is known to be conservative up to very conservative.
For illustration let's look at the results in a single stage design using Ben's function power.2TOST():
We don't know rho, the correlation between both PK metrics, so let's look at the extremes.

library(PowerTOST)

power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1., 1.25), rho=0)

[1] 0.03784

power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1.25, 1), rho=0)

[1] 0.04958

power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1.25, 1.25), rho=0)

[1] 0.00244



power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1., 1.25), rho=1)

[1] 0.00416

power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1.25, 1), rho=1)

[1] 0.04282

power.2TOST(CV=c(0.3,0.2), n=28, theta0=c(1.25, 1.25), rho=1)

[1] 0.04977

green: conservative
red: very conservative

This behavior should protect against an additional alpha inflation due to combining the results of both metrics if you control the TIE (alpha) of each.
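
As a rough cross-check of the rho = 0 rows: if the two metrics are independent, the probability that both pass TOST is the product of the marginal probabilities. This is a sketch assuming PowerTOST's power.TOST() with its default 2x2 design; the variable names are illustrative, and the product is only expected to land near the power.2TOST() values above, not exactly on them.

```r
library(PowerTOST)

n <- 28
# Marginal probabilities of passing TOST for each metric
p_cmax_null <- power.TOST(CV = 0.3, n = n, theta0 = 1.25) # Cmax at the BE limit
p_auc_true  <- power.TOST(CV = 0.2, n = n, theta0 = 1)    # AUC truly equivalent

# Under independence (rho = 0) the joint probability is the product,
# so it can never exceed the smaller of the two marginals
p_joint <- p_cmax_null * p_auc_true
print(c(Cmax = p_cmax_null, AUC = p_auc_true, joint = p_joint))
```

The joint probability is capped by the marginal TIE of the metric sitting at the BE limit, which is the sense in which the IUT is conservative.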

Ok. All this is only analogy and gut feeling.
We only know exactly what's going on, if we simulate.
But I doubt if time spent and effort of doing this pays off.

Edit: Values corrected after a bug-fix in PowerTOST v1.4-7 (see the “Details” section of the man-page). [Helmut]

—
Regards,

Detlew

ElMaestro
★★★

Denmark,
2017-11-12 22:43
(2791 d 06:58 ago)

@ Helmut
Posting: # 17973
Views: 10,462

A place to start

Hi Hötzi,

can you run a series of sims where you include "20 more subjects than necessary" in stage 2 where a stage 2 is called for? Just to see how things look from that perspective.

—
Pass or fail!
ElMaestro

d_labes
★★★

Berlin, Germany,
2017-11-13 17:02
(2790 d 12:40 ago)

@ ElMaestro
Posting: # 17974
Views: 10,399

A place to start?

Dear ElMaestro,

❝ can you run a series of sims where you include "20 more subjects than necessary" in stage 2 where a stage 2 is called for?

that could indeed be done within the (in)famous package Power2Stage, function power.2stage.fC() using the argument min.n2. At least partially, because min.n2 assures a minimum sample size for stage 2 if stage 2 is called for and the estimated sample size is smaller than min.n2.

But that's not the problem Helmut has described, at least as I understand it. Here we initiate a stage 2 from the perspective of the AUC evaluation where it is not necessary to initiate one.

Is having more subjects than necessary a problem at all? My gut feeling says no.

—
Regards,

Detlew

Helmut
★★★

Vienna, Austria,
2017-11-13 17:51
(2790 d 11:50 ago)

@ d_labes
Posting: # 17975
Views: 10,516

A place to start?

Dear Detlew,

❝ that could indeed be done within the (in)famous package Power2Stage, function power.2stage.fC() using the argument min.n2.

… and power.2stage() without the need of specifying a futility criterion.

❝ At least partially, because min.n2 assures a minimum sample size for stage 2 if stage 2 is called for and the estimated sample size is smaller than min.n2.

Right.

❝ But that's not the problem Helmut has described, at least as I understand it. Here we initiate a stage 2 from the perspective of the AUC evaluation where it is not necessary to initiate one.

Right as well. I tried to modify 5+-year-old code for subject simulations but lost my patience.

❝ Is having more subjects than necessary a problem at all? My gut feeling says no.

From the producer’s perspective, of course not. :-D

My gut feeling tells me: The adjusted α showed (well, in simulations…) that the TIE is controlled if we follow the frameworks exactly, i.e., initiate the second stage only if necessary based on the interim and with the re-estimated sample size. If we increase the sample size, the chance of passing BE increases and hence, the TIE.
Misusing Power2Stage:

library(Power2Stage)
power.2stage(CV=0.2, theta0=1.25, n1=28)$pBE
# [1] 0.029911
power.2stage(CV=0.2, theta0=1.25, n1=28, min.n2=20)$pBE
# [1] 0.030917

A lower sample size – due to dropouts – is not problematic. Lower chance of passing = no inflation of the TIE.

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


d_labes
★★★

Berlin, Germany,
2017-11-15 11:34
(2788 d 18:07 ago)

@ Helmut
Posting: # 17981
Views: 10,297

Scientific gut feeling

Dear Helmut,

❝ My gut feeling tells me:

Women are better at gut feeling, NLYW especially :-D

❝ The adjusted α showed (well, in simulations…) that the TIE is controlled if we follow the frameworks exactly, i.e., initiate the second stage only if necessary based on the interim and with the re-estimated sample size. If we increase the sample size, the chance of passing BE increases and hence, the TIE.

Wait, wait ... I will be back in 5 min :cool:

with some quick-shot code to verify your claim.

Meanwhile, prepare a decision scheme with two metrics so that I can see whether my implementation is correct. My brain gets crowded when I think about the decision scheme. So many "what ifs". And you are a trained scuba diver, somebody told me ;-)

—
Regards,

Detlew

d_labes
★★★

Berlin, Germany,
2017-11-15 14:15
(2788 d 15:26 ago)

@ d_labes
Posting: # 17982
Views: 10,580

Five minutes gone - power.tsd.2m() arose

Dear Helmut, dear All,

❝ ❝ The adjusted α showed (well, in simulations…) that the TIE is controlled if we follow the frameworks exactly, i.e., initiate the second stage only if necessary based on the interim and with the re-estimated sample size. If we increase the sample size, the chance of passing BE increases and hence, the TIE.

❝ Wait, wait ... I will be back in 5 min :cool: with some quick-shot code to verify your claim.

5 minutes are gone :PCchaos:

.
See Power2Stage on Github. Function power.tsd.2m().
It's pure Potvin B, abbreviated with implicit power monitoring, i.e. via sample size estimation, which gives n2=0 in case of 'enough' power.
What's missing at the moment is some sort of correlation between the PK metrics, analogous to what is possible in power.2TOST() with the argument rho. I don't know yet at which place it comes into play. Suggestions are welcome.
The function is not public at the moment. Not in the mood to document it :sleeping:

.

Example:
# install from GitHub via
# devtools::install_github("Detlew/Power2Stage")

library(Power2Stage)

# Cmax not BE, but AUC assumed BE

Power2Stage:::power.tsd.2m(CV=c(0.3, 0.2), theta0=c(1.25, 0.95), n1=28)$pBE

[1] 0.043414

# Cmax assumed BE, but AUC not BE

Power2Stage:::power.tsd.2m(CV=c(0.3, 0.2), theta0=c(0.95, 1.25), n1=28)$pBE

[1] 0.031875

# Cmax not BE, AUC also not BE

Power2Stage:::power.tsd.2m(CV=c(0.3, 0.2), theta0=c(1.25, 1.25), n1=28)$pBE

[1] 0.001452

Take the results with a grain of salt. I don't know how to validate them. Suggestions are welcome here as well.
But I think the results look plausible.

Edited numbers after removing a little bug.
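
Since the decision scheme for two metrics is still open in this thread, here is one possible reading, written down as plain R so it can be argued about. It is an assumption, not the logic inside power.tsd.2m(): each metric is judged by Potvin 'Method B' at the interim, the study stops for failure if any metric fails with sufficient power, and otherwise the second stage uses the largest re-estimated n2. The helper names, the interim power for Cmax, and the assumption that AUC already passes are all hypothetical.

```r
# Hypothetical helper: Potvin 'Method B' decision for ONE metric at the interim.
# be_pass: is the 94.12% CI within 80.00-125.00%?
# power:   interim power at the adjusted alpha
metric_decision <- function(be_pass, power, target = 0.80) {
  if (be_pass) return("pass")            # BE shown in stage 1
  if (power >= target) return("fail")    # not BE although power was sufficient
  "stage2"                               # not BE, power too low: go on
}

# Hypothetical combination rule for several metrics (an assumption)
study_decision <- function(be_pass, power, n2, target = 0.80) {
  d <- mapply(metric_decision, be_pass, power,
              MoreArgs = list(target = target))
  if (any(d == "fail")) return(list(decision = "fail", n2 = 0))
  if (all(d == "pass")) return(list(decision = "pass", n2 = 0))
  # at least one metric needs a second stage: the largest n2 drives the study
  list(decision = "stage2", n2 = max(n2[d == "stage2"]))
}

# Helmut's example: Cmax not BE with low power (n2 = 20),
# AUC assumed to pass already (n2 = 0)
out <- study_decision(be_pass = c(Cmax = FALSE, AUC = TRUE),
                      power   = c(Cmax = 0.60,  AUC = 0.89),
                      n2      = c(Cmax = 20,    AUC = 0))
print(out)
```

Under this rule the AUC evaluation after stage 2 is based on 48 subjects although 28 were sufficient, which is the situation the original question is about.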

—
Regards,

Detlew

ElMaestro
★★★

Denmark,
2017-11-14 01:04
(2790 d 04:37 ago)

@ d_labes
Posting: # 17976
Views: 10,424

A better place to start.

Hi d_labes,

❝ But that's not the problem Helmut has described, at least as I understand it. Here we initiate a stage 2 from the perspective of the AUC evaluation where it is not necessary to initiate one.

The best I can come up with is as follows:
The only thing that determines the need for stage 2 is the CV.
AUCt generally has a lower CV than Cmax (for the record, I said 'generally', I did not say 'always'). Often the difference is about 5-20 percentage points.

So run the sims in the usual fashion, but add 5-20 points to the CV for the power and sample size estimation.

—
Pass or fail!
ElMaestro

Helmut
★★★

Vienna, Austria,
2017-11-14 01:21
(2790 d 04:21 ago)

@ ElMaestro
Posting: # 17977
Views: 10,525

Nope

Hi ElMaestro,

❝ Best I can come up with is as follows:

❝ The only thing that determines the need for stage 2 is the CV. :blahblah:

❝

❝ So run the sims in the usual fashion, but add 5-20 points on the CV for the power and sample size estimation.

That’s not my point. Meditate over my example. We would initiate the second stage with 20 subjects (driven by the CV of C_max 0.30) but should stop after the first stage for AUC (CV 0.20).

—
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz


nobody

nothing
2017-11-14 09:11
(2789 d 20:30 ago)

@ Helmut
Posting: # 17978
Views: 10,368

Nope

IANAL, but I think what San-Diego-man is proposing is to calculate the BE outcome (pass/fail) after stage 1 and compare it to the BE outcome after the (unnecessary) second stage for, let's say, 1 gazillion studies and see if there is a meaningful difference. Isn't this a special example of the more general "forced BE" theme in one-stage designs?

—
Kindest regards,
nobody

ElMaestro
★★★

Denmark,
2017-11-14 13:41
(2789 d 16:00 ago)

@ Helmut
Posting: # 17979
Views: 10,433

I've meditated hard

Hi Hötzi,

❝ That’s not my point. Meditate over my example. We would initiate the second stage with 20 subjects (driven by the CV of C_max 0.30) but should stop after the first stage for AUC (CV 0.20).

Can you reformulate?
I meditated hard and I don't see why my suggestion does not address it. It will treat AUCt as if it had a higher CV for everything except the BE evaluation. That is, if I get your question correctly, exactly what is going on.
So perhaps I did not understand you. While I sit in the lotus position, humming gently, do you think you could somehow put into other words the part I don't get?

—
Pass or fail!
ElMaestro