## Three-way crossover example data set [Design Issues]

Dear Amilcar!

» I would like to have an example (explicit) where a three-way crossover study is appropriately analysed.

You have to use a Williams’ design (three periods, six sequences); the topic was covered in previous threads.

A 6×3 design is needed in order to ‘extract’ two 2×2 tables, which are also balanced. Although the full 6×3 table will be used in the analysis of AUC and Cmax, you will need these 2×2s for the nonparametric analysis of tmax (unfortunately there’s no nonparametric method based on confidence intervals available for more than two formulations/periods). The asterisks * denote pseudo-sequences and pseudo-periods, e.g. P1* means only that the treatment was given in a period prior to P2* – irrespective of the true study period:
+----+------------+  -->  +----+--------+  and  +----+--------+
|    | P1  P2  P3 |       |    | P1* P2*|       |    | P1* P2*|
+----+------------+       +----+--------+       +----+--------+
| S1 | T   R1  R2 |       | S1*| T   R1 |       | S1*| T   R2 |
| S2 | R1  R2  T  |       | S2*| R1  T  |       | S2*| R2  T  |
| S3 | R2  T   R1 |       | S1*| T   R1 |       | S2*| R2  T  |
| S4 | T   R2  R1 |       | S1*| T   R1 |       | S1*| T   R2 |
| S5 | R1  T   R2 |       | S2*| R1  T  |       | S1*| T   R2 |
| S6 | R2  R1  T  |       | S2*| R1  T  |       | S2*| R2  T  |
+----+------------+       +----+--------+       +----+--------+
                           ^   balanced          ^   balanced

A common mistake is to design the study as a set of 3×3 Latin squares, which will lead (especially if the sample size is small and in the case of dropouts) to extremely imbalanced data sets:
+----+------------+  -->  +----+--------+  and  +----+--------+
|    | P1  P2  P3 |       |    | P1* P2*|       |    | P1* P2*|
+----+------------+       +----+--------+       +----+--------+
| S1 | T   R1  R2 |       | S1*| T   R1 |       | S1*| T   R2 |
| S2 | R1  R2  T  |       | S2*| R1  T  |       | S2*| R2  T  |
| S3 | R2  T   R1 |       | S1*| T   R1 |       | S2*| R2  T  |
+----+------------+       +----+--------+       +----+--------+
                           ^ imbalanced          ^ imbalanced
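
The extraction of the 2×2 pseudo-sequences shown in both sets of tables can be sketched in a few lines of Python. This is only an illustration, not part of any of the cited programs: the names `pseudo_sequences` and `is_balanced` are made up here, and the check assumes an equal number of subjects in every true sequence.

```python
from collections import Counter

# Sequences as drawn in the tables above (treatment given in P1, P2, P3).
WILLIAMS = {
    "S1": ["T", "R1", "R2"],
    "S2": ["R1", "R2", "T"],
    "S3": ["R2", "T", "R1"],
    "S4": ["T", "R2", "R1"],
    "S5": ["R1", "T", "R2"],
    "S6": ["R2", "R1", "T"],
}
# The single 3x3 Latin square consists of the first three sequences only.
LATIN_SQUARE = {s: WILLIAMS[s] for s in ("S1", "S2", "S3")}


def pseudo_sequences(design, test, ref):
    """Map each true sequence to its pseudo-sequence in the test/ref 2x2 table.

    S1* = test given before ref, S2* = ref given before test,
    irrespective of the true study periods.
    """
    result = {}
    for seq, treatments in design.items():
        test_first = treatments.index(test) < treatments.index(ref)
        result[seq] = "S1*" if test_first else "S2*"
    return result


def is_balanced(design, test, ref):
    """True if both pseudo-sequences occur equally often.

    Assumption: the same number of subjects in every true sequence.
    """
    counts = Counter(pseudo_sequences(design, test, ref).values())
    return counts["S1*"] == counts["S2*"]


for name, design in (("Williams 6x3", WILLIAMS), ("Latin square 3x3", LATIN_SQUARE)):
    for ref in ("R1", "R2"):
        counts = Counter(pseudo_sequences(design, "T", ref).values())
        print(name, "T vs", ref, dict(counts),
              "balanced:", is_balanced(design, "T", ref))
```

With dropouts the per-sequence counts become unequal, so in practice one would count subjects rather than sequences.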

» Namely, I would like to know what kind of ANOVA should be performed and how the 90%CI should be calculated for the different combinations of study formulations (i.e. A vs B and A vs C).

For the design see Chapter 10 of

Chow S-C, Liu J-p.
Design and Analysis of Bioavailability and Bioequivalence Studies. New York: Marcel Dekker; 2nd ed. 2001, p. 302–32.

For a detailed discussion of variance balanced designs see Chapter 4 of

Jones B, Kenward MG.
Design and Analysis of Cross-over Trials. Boca Raton: Chapman & Hall/CRC; 2nd ed. 2003, p. 151–204.

For an example data set see Chapter 4 of

Patterson S, Jones B.
Bioequivalence and Statistics in Clinical Pharmacology. Boca Raton: Chapman & Hall/CRC; 2006, p. 79–132.

Their Example 4.5 matches your design (one test formulation is compared to two reference formulations).
You may download zipped datasets and programs (SAS/S+) from CRC’s website. If you don’t have access to commercial software, the S+ code will run in open-source R with minor modifications.

Patterson/Jones give results in Table 4.12 (p. 105) with:
T/R (% Test vs. Reference 1)
+----------+-------+---------------+
| Endpoint |  PE   |     90% CI    |
+----------+-------+---------------+
| AUC      | 116.2 | 109.0 , 123.9 |
| Cmax     | 130.0 | 119.1 , 141.8 |
+----------+-------+---------------+

T/S (% Test vs. Reference 2)
+----------+-------+---------------+
| Endpoint |  PE   |     90% CI    |
+----------+-------+---------------+
| AUC      |  82.8 |  77.6 ,  88.3 |
| Cmax     |  81.5 |  74.7 ,  89.0 |
+----------+-------+---------------+

WinNonlin 5.2 comes up with:
T/R (% Test vs. Reference 1)
+----------+-------+---------------+
| Endpoint |  PE   |     90% CI    |
+----------+-------+---------------+
| AUC      | 116.2 | 109.0 , 123.8 |
| Cmax     | 129.7 | 118.4 , 141.5 |
+----------+-------+---------------+

T/S (% Test vs. Reference 2)
+----------+-------+---------------+
| Endpoint |  PE   |     90% CI    |
+----------+-------+---------------+
| AUC      |  82.6 |  77.5 ,  88.1 |
| Cmax     |  81.2 |  74.3 ,  88.6 |
+----------+-------+---------------+

The results differ slightly. Both WinNonlin and SAS use GLM rather than ANOVA, but implementation, rounding, etc. differ, and the details are of course ‘proprietary information’ and not documented. I assume Patterson/Jones’ results were produced by SAS; I will check the results from their S+ code in the next few days.

» Perhaps this is a naive question,...

Not at all; little is published - and no worked examples at all.

There’s another point which is a little bit tricky: multiplicity.
If you are testing one test (A) against two references (B, C), any impression of ‘data dredging’ must be avoided, e.g., calculating 90% CIs of A/B and A/C and then picking only the better one to get an approval.
Since in the EU the innovator’s product from any European country may be used as the reference, you may run into problems (A vs B is BE, whereas A vs C is not BE). It may be wise to use 95% CIs instead (overall Bonferroni-corrected patient’s risk: α = 1 – (1 – 0.05/k)^k, where k is the number of simultaneous comparisons).
95% CIs should also be applied when testing for dose proportionality of three dose levels (or 96.67% for four levels).
IMHO the only case where 90% CIs should be used is the comparison of two test formulations against one reference, where only one of them will be further used in the approval process.
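
The arithmetic behind these confidence levels can be checked with a small Python sketch. The function name `bonferroni` is made up for illustration, and the code assumes that each comparison is assessed by two one-sided tests at level 0.05/k, so the matching CI is 100(1 – 2·0.05/k)%:

```python
def bonferroni(k, alpha=0.05):
    """Return (adjusted alpha per comparison, CI level in %, overall risk).

    Assumption: each comparison uses two one-sided tests at alpha/k,
    which corresponds to a 100 * (1 - 2 * alpha / k) % confidence interval.
    """
    alpha_adj = alpha / k                    # Bonferroni-adjusted level
    ci_level = 100 * (1 - 2 * alpha_adj)     # matching two-sided CI
    overall = 1 - (1 - alpha_adj) ** k       # overall patient's risk
    return alpha_adj, ci_level, overall


# k = 1: a single comparison; k = 2: one test vs two references;
# k = 3: e.g. four dose levels compared against one of them.
for k in (1, 2, 3):
    alpha_adj, ci_level, overall = bonferroni(k)
    print(f"k={k}: alpha/k={alpha_adj:.4f}, "
          f"CI={ci_level:.2f}%, overall alpha={overall:.4f}")
```

For k = 2 this gives the 95% CI and for k = 3 the 96.67% CI mentioned above; in both cases the overall risk stays just below 0.05.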

Dif-tor heh smusma 🖖
Helmut Schütz
