replicateBE: Data import [🇷 for BE/BA]
❝ First, as stated by roman_max, thanks you, Michael and Detlew for all efforts in this package!
Welcome.
❝ So, I am dealing with some troubles when I run the methods A and B in the package to analyse a pivotal study …
❝
❝ I prepare my datafile as descripted on "DataStructure" in help(p=replicateBE) and save as a CSV: subject, period, sequence, treatment, PK, logPK
So far, so good. You don’t need the column
logPK
. By default the library will use values given in the column PK
and transform the data internally in full numeric precision.❝ Following the example of GitHUB, …
Such an example is not given on GitHub.
❝ … I am using the code:
❝
❝ cabo <- read.csv(file="Cabo_Outcome.CSV", head=TRUE, sep=",")
❝ cabo
❝ subject period sequence treatment PK logPK
❝ 1 1 1 RTTR R 23017.637 4.362060731
❝ ...
❝ 20 5 4 RTTR R 25284.525 4.402854806
A pivotal study with five subjects? Your data are extremely imbalanced (four subjects in sequence RTTR and one in sequence TRRT). Pretty unusual. If you read the file in 🇷 you get
str(cabo)
'data.frame': 20 obs. of 6 variables:
$ subject : int 1 1 1 1 2 2 2 2 3 3 ...
$ period : int 1 2 3 4 1 2 3 4 1 2 ...
$ sequence : chr "RTTR" "RTTR" "RTTR" "RTTR" ...
$ treatment: chr "R" "T" "T" "R" ...
$ PK : num 23018 26945 27755 23902 7715 ...
$ logPK : num 4.36 4.43 4.44 4.38 3.89 ...
subject
, period
are integers and sequence
, treatment
are characters. For the linear model they have to be factors – and many more preparations have to be done before the data can be used.1❝ res <- method.A(verbose = TRUE, details = TRUE,
❝ print = FALSE, data = cabo)
❝
❝ Error in if (!act %in% ref) { : argument is of length zero
By design. You should not use
method.A(..., data = cabo)
. Although that’s possible in principle, I don’t recommend it. Let the library do the import as explained in help(method.A)
.library(replicateBE)
path <- "your path"
res <- method.A(path.in = path, path.out = path, file = "Cabo_Outcome",
ext = "CSV", print = FALSE, details = TRUE, verbose = TRUE)
Data set Cabo_Outcome: Method A by lm()
---------------------------------------
Type III Analysis of Variance Table
Response: log(PK)
Df Sum Sq Mean Sq F value Pr(>F)
sequence 1 0.150001 0.150001 0.15981 0.71609068
period 3 0.954294 0.318098 5.51168 0.01475522
treatment 1 0.094433 0.094433 1.63624 0.22716731
sequence:subject 3 2.815829 0.938610 16.26328 0.00023442
Residuals 11 0.634848 0.057713
treatment T – R:
Estimate Std. Error t value Pr(>|t|)
-0.171786 0.134296 -1.279150 0.227167
11 Degrees of Freedom
To show the results:
print(res, row.names = FALSE)
Design Method n nTT nRR Sub/seq Miss/seq Miss/per alpha DF CVwT(%)
TRRT|RTTR A 5 5 5 1|4 0|0 0|0|0|0 0.05 11 33.1855982312
CVwR(%) swT swR sw.ratio sw.ratio.CL
19.7497208298 0.323226975979 0.195611007311 1.65239666429 5.03279665904
BE.lo(%) BE.hi(%) CL.lo(%) CL.hi(%) PE(%) CI GMR BE
80 125 66.1684875056 107.185933986 84.2159790878 fail pass fail
log.half-width sw.ratio.rec.CL
0.241180348576 NA
I recommend to print to a file (that’s the default of the functions).
method.A(path.in = path, path.out = path, file = "Cabo_Outcome", ext = "CSV")
Cabo_Outcome_ABEL_MethodA.txt
:R version : 4.2.1 (2022-06-23 ucrt)
PowerTOST version : 1.5.4 (2022)
replicateBE version: 1.1.3 (2022)
───────────────────────────────────────────────────────────────────────────────
Function : CV.calc(): stats:lm() executed 2022-09-17 11:00:22 CEST
Fixed effects : sequence, subject(sequence), period
Data : treatment = R
Function : method.A(): stats:lm() executed 2022-09-17 11:00:22 CEST
Fixed effects : sequence, subject(sequence), period, treatment
Data : all
───────────────────────────────────────────────────────────────────────────────
Analysis performed on column ‘PK’ (data internally log-transformed)
Sequences (design) : TRRT|RTTR (4-period full replicate)
Subjects / sequence: 1|4 (unbalanced)
Subjects (total) : 5
Subj’s with T and R: 5 (calculation of the CI)
Less than 12 as required acc. to the BE-GL.
Subj’s with two Ts : 5
Subj’s with two Rs : 5
Degrees of freedom : 11
alpha : 0.05 (90% CI)
Regulator : EMA
Switching CV : 30.00%
Scaling cap : 50.00%
Regul. constant (k): 0.760
CVwT : 33.19%
swT : 0.32323
CVwR : 19.75% (reference-scaling not applicable)
BE-limits : 80.00% ... 125.00%
swT / swR : 1.6524 (T higher variability than R)
sw-ratio (upper CL): 5.0328 (T higher variability than R)
Confidence interval: 66.17% ... 107.19% fail
Point estimate : 84.22% pass
Mixed (CI & PE) : fail
❝ … and calculate CVw
The library will not give you CVw.2 In your full replicate design you get CVwT and CVwR. It’s a bit tricky if you are really interested in it (works also for a partial replicate design):
path <- "your path"
# use the internal function to import and prepare the data
x <- replicateBE:::get.data(path.in = path, path.out = path, file = "Cabo_Outcome",
set = "", ext = "CSV", print = FALSE, plot.bxp = FALSE,
data = NULL)
# conventional linear model
mod <- lm(log(PK) ~ sequence + subject %in% sequence + period + treatment, data = x$data)
mse <- anova(mod)["Residuals", "Mean Sq"] # residual mean squares error
CVw <- sqrt(exp(mse) - 1) # convert to within-subject CV
cat(sprintf("CVw = %.2f%%\n", 100 * CVw))
CVw = 24.37%
cat(sprintf("CVw = %.2f%%\n", 100 * sqrt(exp(0.057713) - 1)))
CVw = 24.37%
Hope that helps. If not, feel free to ask.
- The internal function
get.data()
will do a lot of stuff. Then the data is a list with this structure:
List of 16
$ data :'data.frame': 20 obs. of 6 variables:
..$ subject : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 2 2 2 2 3 3 ...
..$ period : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 4 1 2 ...
..$ sequence : Factor w/ 2 levels "RTTR","TRRT": 1 1 1 1 2 2 2 2 1 1 ...
..$ treatment: Factor w/ 2 levels "R","T": 1 2 2 1 2 1 1 2 1 2 ...
..$ PK : num [1:20] 23018 26945 27755 23902 7715 ...
..$ logPK : num [1:20] 4.36 4.43 4.44 4.38 3.89 ...
$ ref :'data.frame': 10 obs. of 6 variables:
..$ subject : Factor w/ 5 levels "1","2","3","4",..: 1 1 2 2 3 3 4 4 5 5
..$ period : Factor w/ 4 levels "1","2","3","4": 1 4 2 3 1 4 1 4 1 4
..$ sequence : Factor w/ 2 levels "RTTR","TRRT": 1 1 2 2 1 1 1 1 1 1
..$ treatment: Factor w/ 2 levels "R","T": 1 1 1 1 1 1 1 1 1 1
..$ PK : num [1:10] 23018 23902 21282 13604 8982 ...
..$ logPK : num [1:10] 4.36 4.38 4.33 4.13 3.95 ...
$ RR :'data.frame': 10 obs. of 6 variables:
..$ subject : Factor w/ 5 levels "1","2","3","4",..: 1 1 2 2 3 3 4 4 5 5
..$ period : Factor w/ 4 levels "1","2","3","4": 1 4 2 3 1 4 1 4 1 4
..$ sequence : Factor w/ 2 levels "RTTR","TRRT": 1 1 2 2 1 1 1 1 1 1
..$ treatment: Factor w/ 2 levels "R","T": 1 1 1 1 1 1 1 1 1 1
..$ PK : num [1:10] 23018 23902 21282 13604 8982 ...
..$ logPK : num [1:10] 4.36 4.38 4.33 4.13 3.95 ...
$ test :'data.frame': 10 obs. of 6 variables:
..$ subject : Factor w/ 5 levels "1","2","3","4",..: 1 1 2 2 3 3 4 4 5 5
..$ period : Factor w/ 4 levels "1","2","3","4": 2 3 1 4 2 3 2 3 2 3
..$ sequence : Factor w/ 2 levels "RTTR","TRRT": 1 1 2 2 1 1 1 1 1 1
..$ treatment: Factor w/ 2 levels "R","T": 2 2 2 2 2 2 2 2 2 2
..$ PK : num [1:10] 26945 27755 7715 11701 21978 ...
..$ logPK : num [1:10] 4.43 4.44 3.89 4.07 4.34 ...
$ type : chr "TRRT|RTTR"
$ n : int 5
$ nTT : int 5
$ nRR : int 5
$ design : chr "full"
$ txt : chr "\nAnalysis performed on column ‘PK’ (data internally log-transformed)\nSequences (design) : TRRT|RTTR (4-period"| __truncated__
$ Sub.Seq : 'table' int [1:2(1d)] 1 4
..- attr(*, "dimnames")=List of 1
.. ..$ : chr [1:2] "TRRT" "RTTR"
$ Miss.seq: Named num [1:2] 0 0
..- attr(*, "names")= chr [1:2] "TRRT" "RTTR"
$ Miss.per: num [1:4] 0 0 0 0
$ logtrans: logi TRUE
$ res.file: logi NA
$ png.path: logi NA
This explains❝ Error in if (!act %in% ref) { : argument is of length zero
Incobo
you have only the data framedata
(not factorized; see above) but the others are missing. The error is caused by the missing data frameref
containing only the data of the reference.
- You are the first one to ask for it (the packages had 22,000 downloads so far). If you explain why you need it, I will consider to add it to the next release.
Dif-tor heh smusma 🖖🏼 Довге життя Україна!
Helmut Schütz
The quality of responses received is directly proportional to the quality of the question asked. 🚮
Science Quotes
Complete thread:
- replicateBE_xlsx datafile roman_max 2020-06-10 17:32 [🇷 for BE/BA]
- replicateBE: data input Helmut 2020-06-10 18:14
- replicateBE: data input roman_max 2020-06-22 17:06
- replicateBE: error in data input thalita_m 2022-09-17 05:28
- replicateBE: Data importHelmut 2022-09-17 11:02
- replicateBE: data input Helmut 2020-06-10 18:14