Bioequivalence and Bioavailability Forum

What's wrong with this picture? [General Statistics]

posted by Shuanghe – Spain, 2015-09-28 20:57 (3565 d 18:04 ago) – Posting: # 15482
Views: 29,262

Dear all,

When I said it seems the analysis is doable in R I made at least 2 mistakes.

I forgot the step 3 mentioned in warfarin's guidance. The 90% CI analysed by unscaled BE should be inside 80-125%. This (replicate BE), if I recall one of Detlew's earlier posts correctly, cannot be done in R at the moment, at least in the sense of "reproduce" SAS output.
Step 2 in guidance mentioned that 95% upper CI should be calculated as:
(Y_T - Y_R)² - θ * s²_WR,
where "Y_T and Y_R are means of the ln-transformed PK endpoint" for T and R respectively, which is easy to calculate (or so I thought...)

Let's fotget the 90% CI since it's impossible to do so in R. I played a bit with R for calculation of the rest parameters as a way of learning R and here it is the code. What's wrong with this picture (see comments in red colour)?

# read data from website, variable names: subj, per, seq, treat, pk, logpk

emadata  <- "https://dl.dropboxusercontent.com/u/7685360/permlink/emabe4p.csv"

ema <- read.csv(file = emadata, colClasses = (c(rep("factor", 4), rep("numeric", 2))))



# remove subjects who has imcomplete periods. More elegant solution?

library(dplyr)

d <- filter(ema, subj != 11 & subj != 20 & subj != 24 & subj != 31 &

                 subj != 42 & subj != 67 & subj != 69 & subj != 71)



# create data sets containing only R1, R2, T1 and T2, respectively.

# change variable name accordingly and drop unnecessary variables.

r1 <- d %>% filter((seq == "TRTR" & per == 2) | (seq == "RTRT" & per == 1)) %>%

            mutate(pkr1 = logpk) %>%           # rename variable

            select(-c(per, treat, pk, logpk))  # drop variables



r2 <- d %>% filter((seq == "TRTR" & per == 4) | (seq == "RTRT" & per == 3)) %>%

            mutate(pkr2 = logpk) %>%

            select(-c(per, treat, pk, logpk))



t1 <- d %>% filter((seq == "TRTR" & per == 1) | (seq == "RTRT" & per == 2)) %>%

            mutate(pkt1 = logpk) %>%

            select(-c(per, treat, pk, logpk))



t2 <- d %>% filter((seq == "TRTR" & per == 3) | (seq == "RTRT" & per == 4)) %>%

            mutate(pkt2 = logpk) %>%

            select(-c(per, treat, pk, logpk))



# merge data sets

r12 <- merge(r1, r2,   by = c("seq", "subj"))

t12 <- merge(t1, t2,   by = c("seq", "subj"))

drt <- merge(r12, t12, by = c("seq", "subj"))



# calculate mean difference between T and R, diff between R1 and R2, T1 and T2

scavbe <- mutate(drt,

                 diff_tr = 0.5 * (pkt1 + pkt2 - pkr1 - pkr2),  # FDA's ilat

                 diff_r  = pkr1 - pkr2,  # FDA's dlat, for swr estimation

                 diff_t  = pkt1 - pkt2)  # additional variable to estimate swt



# calculate mean difference by sequence ---------------------

mdiff <- scavbe %>% group_by(seq) %>%

            summarise(size     = n(),  # number of subjects

                      mdiff_tr = mean(diff_tr),

                      mdiff_r  = mean(diff_r),

                      mdiff_t  = mean(diff_t))

# calculate swr and swt. same as SAS output

all <- merge(scavbe, mdiff, by = "seq")

sw <- all %>% mutate(ss_r = (diff_r - mdiff_r)**2,

                     ss_t = (diff_t - mdiff_t)**2) %>%

              summarise(s2wr = sum(ss_r) / (2*(n() - 2)),  # formula in FDA's guidance

                        s2wt = sum(ss_t) / (2*(n() - 2)),

                        swr = sqrt(s2wr),

                        swt = sqrt(s2wt))



# 95% upper CI as (mean(T) - mean(R))**2 - theta*s2wr. Finding mean T and R

# Should mean value be calculated like this? It seems too simple. But if not, how? 

t <- filter(d, treat == "T") %>%

     mutate(mean_t = mean(logpk)) %>%

     distinct(treat) %>%

     select(mean_t)



r <- filter(d, treat == "R") %>%

     mutate(mean_r = mean(logpk)) %>%

     distinct(treat) %>%

     select(mean_r)



# critbound value is different from sas output.

# 90% CI for sigma(T)/sigma(R) are same as sas output

tr <- merge(t, r)

final <- merge(sw, tr) %>%

         mutate(size = nrow(distinct(d, subj)),

                theta = (log(1/0.9)/0.10)**2,

                critbound = (mean_t - mean_r)**2 - theta * s2wr,  ## different from SAS.

                cilower = swt/swr/(sqrt(qf(0.05, df1 = size - 2, df2 = size -2, lower.tail = FALSE))),

                ciupper = swt/swr/(sqrt(qf(0.95, df1 = size - 2, df2 = size -2, lower.tail = FALSE))))

So to summarise, I can get s_WR, s_WT, 90% CI for variability comparison (Thanks to Detlew), but not 95% upper limit for (Y_T - Y_R)² - θ * s²_WR. I'm sure the error comes from the mean T and R. Any thought?

—
All the best,
Shuanghe

Complete thread:

SAS and R (?) for variability comparison (FDA NTID Guidance) Shuanghe 2015-09-25 18:48
- SAS and R (?) for variability comparison (FDA NTID Guidance) jag009 2015-09-25 22:42
  - SAS and R (?) for variability comparison (FDA NTID Guidance) Shuanghe 2015-09-28 17:05
    - Missings d_labes 2015-09-30 10:52
      - Missings Shuanghe 2015-09-30 15:34
- LaTeX, MathML, PNGs Helmut 2015-09-27 14:23
  - OT: LaTeX, MathML, PNGs Shuanghe 2015-09-28 17:32
    - OT: LaTeX, MathML, PNGs Helmut 2015-09-28 17:46
- Quantiles of the F-distribution d_labes 2015-09-28 09:45
  - Quantiles of the F-distribution Shuanghe 2015-09-28 17:39
- What's wrong with this picture?Shuanghe 2015-09-28 18:57
  - Suggestion Helmut 2015-09-29 02:22
    - Suggestion Shuanghe 2015-09-29 18:44
      - OT: The Hadleyverse d_labes 2015-09-30 08:44
  - Another suggestion for a R-solution d_labes 2015-09-29 14:26
    - Wow – another gem from the master! Helmut 2015-09-29 16:31
    - Another suggestion for a R-solution Shuanghe 2015-09-29 18:37
      - Another R-solution d_labes 2015-09-29 21:55
    - @Shuanghe: Outdated URL gvk 2019-05-21 11:36
- SAS and R (?) for variability comparison (FDA NTID Guidance) M.tareq 2017-04-14 15:04
  - SAS and R (?) for variability comparison (FDA NTID Guidance) M.tareq 2017-04-15 01:01

What's wrong with this picture? [General Sta­tis­tics]

Complete thread:

What's wrong with this picture? [General Statistics]