What's wrong with this picture? [General Statistics]
Dear all,
When I said it seems the analysis is doable in R I made at least 2 mistakes.
So to summarise, I can get sWR, sWT, 90% CI for variability comparison (Thanks to Detlew), but not 95% upper limit for (YT - YR)2 - θ * s2WR. I'm sure the error comes from the mean T and R. Any thought?
When I said it seems the analysis is doable in R I made at least 2 mistakes.
- I forgot the step 3 mentioned in warfarin's guidance. The 90% CI analysed by unscaled BE should be inside 80-125%. This (replicate BE), if I recall one of Detlew's earlier posts correctly, cannot be done in R at the moment, at least in the sense of "reproduce" SAS output.
- Step 2 in guidance mentioned that 95% upper CI should be calculated as:
(YT - YR)2 - θ * s2WR,
where "YT and YR are means of the ln-transformed PK endpoint" for T and R respectively, which is easy to calculate (or so I thought...)
# read data from website, variable names: subj, per, seq, treat, pk, logpk
emadata <- "https://dl.dropboxusercontent.com/u/7685360/permlink/emabe4p.csv"
ema <- read.csv(file = emadata, colClasses = (c(rep("factor", 4), rep("numeric", 2))))
# remove subjects who has imcomplete periods. More elegant solution?
library(dplyr)
d <- filter(ema, subj != 11 & subj != 20 & subj != 24 & subj != 31 &
subj != 42 & subj != 67 & subj != 69 & subj != 71)
# create data sets containing only R1, R2, T1 and T2, respectively.
# change variable name accordingly and drop unnecessary variables.
r1 <- d %>% filter((seq == "TRTR" & per == 2) | (seq == "RTRT" & per == 1)) %>%
mutate(pkr1 = logpk) %>% # rename variable
select(-c(per, treat, pk, logpk)) # drop variables
r2 <- d %>% filter((seq == "TRTR" & per == 4) | (seq == "RTRT" & per == 3)) %>%
mutate(pkr2 = logpk) %>%
select(-c(per, treat, pk, logpk))
t1 <- d %>% filter((seq == "TRTR" & per == 1) | (seq == "RTRT" & per == 2)) %>%
mutate(pkt1 = logpk) %>%
select(-c(per, treat, pk, logpk))
t2 <- d %>% filter((seq == "TRTR" & per == 3) | (seq == "RTRT" & per == 4)) %>%
mutate(pkt2 = logpk) %>%
select(-c(per, treat, pk, logpk))
# merge data sets
r12 <- merge(r1, r2, by = c("seq", "subj"))
t12 <- merge(t1, t2, by = c("seq", "subj"))
drt <- merge(r12, t12, by = c("seq", "subj"))
# calculate mean difference between T and R, diff between R1 and R2, T1 and T2
scavbe <- mutate(drt,
diff_tr = 0.5 * (pkt1 + pkt2 - pkr1 - pkr2), # FDA's ilat
diff_r = pkr1 - pkr2, # FDA's dlat, for swr estimation
diff_t = pkt1 - pkt2) # additional variable to estimate swt
# calculate mean difference by sequence ---------------------
mdiff <- scavbe %>% group_by(seq) %>%
summarise(size = n(), # number of subjects
mdiff_tr = mean(diff_tr),
mdiff_r = mean(diff_r),
mdiff_t = mean(diff_t))
# calculate swr and swt. same as SAS output
all <- merge(scavbe, mdiff, by = "seq")
sw <- all %>% mutate(ss_r = (diff_r - mdiff_r)**2,
ss_t = (diff_t - mdiff_t)**2) %>%
summarise(s2wr = sum(ss_r) / (2*(n() - 2)), # formula in FDA's guidance
s2wt = sum(ss_t) / (2*(n() - 2)),
swr = sqrt(s2wr),
swt = sqrt(s2wt))
# 95% upper CI as (mean(T) - mean(R))**2 - theta*s2wr. Finding mean T and R
# Should mean value be calculated like this? It seems too simple. But if not, how?
t <- filter(d, treat == "T") %>%
mutate(mean_t = mean(logpk)) %>%
distinct(treat) %>%
select(mean_t)
r <- filter(d, treat == "R") %>%
mutate(mean_r = mean(logpk)) %>%
distinct(treat) %>%
select(mean_r)
# critbound value is different from sas output.
# 90% CI for sigma(T)/sigma(R) are same as sas output
tr <- merge(t, r)
final <- merge(sw, tr) %>%
mutate(size = nrow(distinct(d, subj)),
theta = (log(1/0.9)/0.10)**2,
critbound = (mean_t - mean_r)**2 - theta * s2wr, ## different from SAS.
cilower = swt/swr/(sqrt(qf(0.05, df1 = size - 2, df2 = size -2, lower.tail = FALSE))),
ciupper = swt/swr/(sqrt(qf(0.95, df1 = size - 2, df2 = size -2, lower.tail = FALSE))))
So to summarise, I can get sWR, sWT, 90% CI for variability comparison (Thanks to Detlew), but not 95% upper limit for (YT - YR)2 - θ * s2WR. I'm sure the error comes from the mean T and R. Any thought?
—
All the best,
Shuanghe
All the best,
Shuanghe
Complete thread:
- SAS and R (?) for variability comparison (FDA NTID Guidance) Shuanghe 2015-09-25 18:48 [General Statistics]
- SAS and R (?) for variability comparison (FDA NTID Guidance) jag009 2015-09-25 22:42
- SAS and R (?) for variability comparison (FDA NTID Guidance) Shuanghe 2015-09-28 17:05
- LaTeX, MathML, PNGs Helmut 2015-09-27 14:23
- OT: LaTeX, MathML, PNGs Shuanghe 2015-09-28 17:32
- OT: LaTeX, MathML, PNGs Helmut 2015-09-28 17:46
- OT: LaTeX, MathML, PNGs Shuanghe 2015-09-28 17:32
- Quantiles of the F-distribution d_labes 2015-09-28 09:45
- Quantiles of the F-distribution Shuanghe 2015-09-28 17:39
- What's wrong with this picture?Shuanghe 2015-09-28 18:57
- Suggestion Helmut 2015-09-29 02:22
- Suggestion Shuanghe 2015-09-29 18:44
- OT: The Hadleyverse d_labes 2015-09-30 08:44
- Suggestion Shuanghe 2015-09-29 18:44
- Another suggestion for a R-solution d_labes 2015-09-29 14:26
- Wow – another gem from the master! Helmut 2015-09-29 16:31
- Another suggestion for a R-solution Shuanghe 2015-09-29 18:37
- Another R-solution d_labes 2015-09-29 21:55
- @Shuanghe: Outdated URL gvk 2019-05-21 11:36
- Suggestion Helmut 2015-09-29 02:22
- SAS and R (?) for variability comparison (FDA NTID Guidance) M.tareq 2017-04-14 15:04
- SAS and R (?) for variability comparison (FDA NTID Guidance) M.tareq 2017-04-15 01:01
- SAS and R (?) for variability comparison (FDA NTID Guidance) jag009 2015-09-25 22:42