| 1 | +--- |
| 2 | +title: "Sample Size for Multiple Hypothesis Testing in Biosimilar Development" |
| 3 | +author: "Thomas Debray" |
| 4 | +output: rmarkdown::html_vignette |
| 5 | +vignette: > |
| 6 | + %\VignetteIndexEntry{Sample Size for Multiple Hypothesis Testing in Biosimilar Development} |
| 7 | + %\VignetteEngine{knitr::rmarkdown} |
| 8 | + %\VignetteEncoding{UTF-8} |
| 9 | +bibliography: 'references.bib' |
| 10 | +link-citations: yes |
| 11 | +--- |
| 12 | + |
| 13 | +```{r, include = FALSE} |
| 14 | +knitr::opts_chunk$set( |
| 15 | + collapse = TRUE, |
| 16 | + comment = "#>" |
| 17 | +) |
| 18 | +require(kableExtra) |
| 19 | +``` |
| 20 | + |
| 21 | +Here, we reproduce the examples of @mielke_sample_2018. In these examples, we estimate the minimal sample size that gives at least 80\% power to reject $k$ out of $m$ null hypotheses at a one-sided significance level of $\alpha = 0.05$ in a parallel-group design. The sample size, the true difference between the test (T) and reference (R) products, and the standard deviation are assumed to be equal across tests. |
| 22 | + |
| 23 | +# Multiple Independent Co-Primary Endpoints |
| 24 | +The first example assumes a ratio of 1.05 between the effect sizes of the test and reference products. @mielke_sample_2018 conduct a difference-of-means test on the log scale, with $\delta = \log(1.05)$. The standard deviation of the log-transformed response variable is assumed to be $\sigma = 0.3$, and all tests are assumed to be independent ($\rho = 0$). Below, we estimate the sample size required to demonstrate that the test and reference products are equivalent with respect to all $m=5$ endpoints. |
| 25 | + |
| 26 | +```{r, eval = TRUE} |
| 27 | +# Calculate required sample size for 80 % power |
| 28 | +ssMielke <- sampleSize_Mielke(power = 0.8, Nmax = 1000, m = 5, k = 5, rho = 0, |
| 29 | + sigma = 0.3, true.diff = log(1.05), |
| 30 | + equi.tol = log(1.25), design = "parallel", |
| 31 | + alpha = 0.05, adjust = "no", seed = 1234, |
| 32 | + nsim = 10000) |
| 33 | +ssMielke |
| 34 | +``` |
| 35 | +For 80\% power, `r ssMielke["SS"]` subjects per arm (`r ssMielke["SS"] * 2` in total) are required. |
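
As a rough cross-check of the simulation-based result, the calculation can be sketched analytically. This is a minimal sketch, not the method used by `sampleSize_Mielke`: it assumes a known standard deviation (normal approximation to the TOST) and relies on independence, so that all $m = 5$ tests succeed jointly with probability equal to the product of the individual powers. The helper `tost_power` is hypothetical and not part of the package.

```r
# Normal-approximation power of a single TOST in a parallel design
# (n subjects per arm). Assumption: sigma is treated as known, so the
# exact t-based power is slightly lower than this value.
tost_power <- function(n, delta, sigma, margin, alpha = 0.05) {
  se <- sigma * sqrt(2 / n)   # standard error of the difference in means
  z  <- qnorm(1 - alpha)      # one-sided critical value
  p  <- pnorm((margin - delta) / se - z) +
        pnorm((margin + delta) / se - z) - 1
  max(p, 0)
}

m             <- 5
target_joint  <- 0.80
# With independent tests, joint power = (single-test power)^m
target_single <- target_joint^(1 / m)

# Smallest n per arm whose single-test power reaches the target
n <- 2
while (tost_power(n, log(1.05), 0.3, log(1.25)) < target_single) {
  n <- n + 1
}
n_approx <- n

n_approx
tost_power(n_approx, log(1.05), 0.3, log(1.25))^m
```

Because the approximation ignores the uncertainty in the estimated variance, it should be read as a plausibility check on the simulated sample size rather than a replacement for it.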
| 36 | + |
| 37 | +We can perform the same sample size calculation using `sampleSize`, assuming that effect sizes are normally distributed on the log scale. We use a difference-of-means test (`ctype = "DOM"`). Because the values supplied in `mu_list` and `sigma_list` are already on the log scale, we set `lognorm = FALSE`. |
| 38 | + |
| 39 | +```{r, eval = TRUE} |
| 40 | +ss <- sampleSize(power = 0.8, alpha = 0.05, |
| 41 | + mu_list = list("R" = rep(log(1.00), 5), |
| 42 | + "T" = rep(log(1.05), 5)), |
| 43 | + sigma_list = list("R" = rep(0.3, 5), |
| 44 | + "T" = rep(0.3, 5)), |
| 45 | + lequi.tol = rep(log(0.80), 5), |
| 46 | + uequi.tol = rep(log(1.25), 5), |
| 47 | + dtype = "parallel", ctype = "DOM", lognorm = FALSE, |
| 48 | + adjust = "no", ncores = 1, nsim = 10000, seed = 1234) |
| 49 | +ss |
| 50 | +``` |
| 51 | + |
| 52 | +# Multiple Correlated Co-Primary Endpoints |
| 53 | +In the second example, we have $k=m=5$, $\sigma = 0.3$ and $\rho = 0.8$. Again, we can estimate the sample size using the functions provided by @mielke_sample_2018: |
| 54 | + |
| 55 | +```{r, eval = TRUE} |
| 56 | +ssMielke <- sampleSize_Mielke(power = 0.8, Nmax = 1000, m = 5, k = 5, rho = 0.8, |
| 57 | + sigma = 0.3, true.diff = log(1.05), |
| 58 | + equi.tol = log(1.25), design = "parallel", |
| 59 | + alpha = 0.05, adjust = "no", seed = 1234, |
| 60 | + nsim = 10000) |
| 61 | +ssMielke |
| 62 | +``` |
| 63 | +For 80\% power, `r ssMielke["SS"]` subjects per arm (`r ssMielke["SS"] * 2` in total) are required. |
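
The effect of the correlation on the joint power can be illustrated with a small base-R simulation. This is a simplified sketch under stated assumptions, not the algorithm of either function above: the variance is treated as known (z-tests instead of t-tests), the endpoints are equicorrelated, and the per-arm sample size of 67 is an arbitrary illustrative value. The function `joint_power_sim` is hypothetical.

```r
# Monte Carlo estimate of the probability that all m TOSTs succeed when the
# estimated differences are equicorrelated with correlation rho (rho >= 0).
joint_power_sim <- function(n, rho, m = 5, delta = log(1.05), sigma = 0.3,
                            margin = log(1.25), alpha = 0.05, nsim = 20000) {
  se <- sigma * sqrt(2 / n)   # standard error, parallel design, n per arm
  z  <- qnorm(1 - alpha)
  # Equicorrelated standard normals built from one shared factor per replicate
  shared <- rnorm(nsim)
  noise  <- matrix(rnorm(nsim * m), nrow = nsim, ncol = m)
  zmat   <- sqrt(rho) * shared + sqrt(1 - rho) * noise
  est    <- delta + se * zmat   # simulated estimated differences
  # TOST with known variance: both one-sided tests must reject
  pass <- (est + margin) / se > z & (margin - est) / se > z
  mean(rowSums(pass) == m)
}

set.seed(1234)
p_indep <- joint_power_sim(n = 67, rho = 0)
p_corr  <- joint_power_sim(n = 67, rho = 0.8)
c(independent = p_indep, correlated = p_corr)
```

With positively correlated endpoints the individual tests tend to succeed or fail together, so the probability that all of them succeed at a given sample size is higher than under independence.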
| 64 | + |
| 65 | +We can perform the same analysis using [sampleSize()](../man/sampleSize.html). In this case, we provide estimates for $\mu$ and $\sigma$ on the original scale and assume that the responses are normally distributed on the log scale (`lognorm = TRUE`). Instead of testing the difference of log-transformed means, we now test the ratio of the (untransformed) means (`ctype = "ROM"`). |
| 66 | + |
| 67 | +```{r, eval = TRUE} |
| 68 | +ss <- sampleSize(power = 0.8, alpha = 0.05, |
| 69 | + mu_list = list("R" = rep(1.00, 5), |
| 70 | + "T" = rep(1.05, 5)), |
| 71 | + sigma_list = list("R" = rep(0.3, 5), |
| 72 | + "T" = rep(0.3, 5)), |
| 73 | + rho = 0.8, # high correlation between the endpoints |
| 74 | + lequi.tol = rep(0.8, 5), |
| 75 | + uequi.tol = rep(1.25, 5), |
| 76 | + dtype = "parallel", ctype = "ROM", lognorm = TRUE, |
| 77 | + adjust = "no", ncores = 1, k = 5, nsim = 10000, seed = 1234) |
| 78 | +ss |
| 79 | +``` |
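
The call above supplies means and standard deviations on the original scale. For a log-normal variable, these map to log-scale parameters through the standard moment relations, sketched below. Whether `sampleSize()` applies exactly this conversion internally is an assumption that is not verified here, and the helper `lognormal_params` is hypothetical.

```r
# Convert the mean and SD of a log-normal variable (original scale)
# into the mean and SD of its logarithm.
lognormal_params <- function(mean, sd) {
  sigma2_log <- log(1 + (sd / mean)^2)
  list(mu_log    = log(mean) - sigma2_log / 2,
       sigma_log = sqrt(sigma2_log))
}

ref  <- lognormal_params(mean = 1.00, sd = 0.3)
test <- lognormal_params(mean = 1.05, sd = 0.3)

# Log-scale standard deviations implied by an original-scale SD of 0.3
c(R = ref$sigma_log, T = test$sigma_log)
```

Under these relations an original-scale standard deviation of 0.3 does not correspond exactly to a log-scale standard deviation of 0.3, so the two approaches in this section need not return identical sample sizes.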
| 80 | + |
| 81 | +# Example 3 |
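| 82 | +In the Zarzio example, the standard deviations and mean ratios of the $m = 5$ endpoints are derived from reported confidence intervals. We then estimate the sample size required for all five tests to be successful ($k = 5$) in a 2x2 crossover design, assuming independent tests ($\rho = 0$): |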
| 83 | + |
| 84 | +```{r, eval = FALSE} |
| 85 | +# Calculate the standard deviation and the mean using the |
| 86 | +# reported confidence intervals |
| 87 | +sigma <- c(sqrt(40)*(log(0.8884)-log(0.8249))/qt(0.95, df=40-2), |
| 88 | + sqrt(26)*(log(0.9882)-log(0.9576))/qt(0.95, df=26-2), |
| 89 | + sqrt(28)*(log(0.8661)-log(0.7863))/qt(0.95, df=28-2), |
| 90 | + sqrt(28)*(log(0.9591)-log(0.8873))/qt(0.95, df=28-2), |
| 91 | + sqrt(24)*(log(0.885)-log(0.8155))/qt(0.95, df=24-2) |
| 92 | + ) |
| 93 | + |
| 94 | +mu.ratio <- c(88.84, 98.82, 86.61, 95.91, 88.5)/100 |
| 95 | +mu <- log(mu.ratio) |
| 96 | + |
| 97 | +# Required sample size for all tests to be successful |
| 98 | +N_Mielke(power = 0.8, Nmax = 1000, m = 5, k = 5, rho = 0, sigma = sigma, |
| 99 | + true.diff = mu, equi.tol = log(1.25), design = "22co", alpha = 0.05, |
| 100 | + nsim = 10000) |
| 101 | +``` |
| 102 | +We find that 47 subjects per sequence are needed, that is, 94 subjects per study or 470 in total. The corresponding call to `simsamplesize::calopt` is set up as follows: |
| 103 | + |
| 104 | +```{r, eval = FALSE} |
| 105 | +simsamplesize::calopt(power = 0.8, # target power |
| 106 | + alpha = 0.05, |
| 107 | + mu_list = list("R" = c(log(1), log(1), log(1), log(1), log(1)), |
| 108 | + "T" = c(log(0.8884), log(0.9882), log(0.8661), log(0.9591), log(0.8850))), |
| 109 | + sigma_list = list("R" = sigma, |
| 110 | + "T" = sigma), |
| 111 | + lequi.tol = rep(log(0.80), 5), |
| 112 | + uequi.tol = rep(log(1.25), 5), |
| 113 | + dtype = "2x2", |
| 114 | + ctype = "DOM", |
| 115 | + lognorm = FALSE, |
| 116 | + adjust = "no", |
| 117 | + ncores = 1, |
| 118 | + nsim = 10000, |
| 119 | + seed = 1234) |
| 120 | +``` |
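
The standard deviations used in this example are back-calculated from a point estimate, a confidence bound, and a sample size per endpoint. The repeated expression can be wrapped in a small helper, sketched below. The helper name `sd_from_ci` is hypothetical, and reading the inputs as a point estimate, the lower bound of a 90\% confidence interval, and the number of subjects is an interpretation of the code above rather than something stated in the source.

```r
# Back-calculate a log-scale standard deviation from a ratio estimate and
# one confidence bound: half-width = t quantile * sd / sqrt(n).
sd_from_ci <- function(est, bound, n, level = 0.90) {
  half_width <- abs(log(est) - log(bound))
  tq <- qt(1 - (1 - level) / 2, df = n - 2)
  sqrt(n) * half_width / tq
}

est   <- c(0.8884, 0.9882, 0.8661, 0.9591, 0.8850)  # point estimates (ratios)
lower <- c(0.8249, 0.9576, 0.7863, 0.8873, 0.8155)  # assumed lower CI bounds
n     <- c(40, 26, 28, 28, 24)                      # sample sizes from the chunk above

sigma_check <- sd_from_ci(est, lower, n)
sigma_check
```

The vector `sigma_check` reproduces the `sigma` vector defined at the start of this example, one value per endpoint.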
| 121 | + |
| 122 | +# Example 4 |
| 123 | + |
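| 124 | +In the Zarzio example, the sample size required to demonstrate equivalence for at least 3 out of 5 tests is estimated below, without multiplicity adjustment and with two adjustments of the significance level. The objects `sigma` and `mu` are those defined in Example 3. |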
| 125 | + |
| 126 | +```{r, eval = FALSE} |
| 127 | +# No adjustment |
| 128 | +N_Mielke(power = 0.8, Nmax = 1000, m = 5, k = 3, rho = 0, sigma = sigma, |
| 129 | + true.diff = mu, equi.tol = log(1.25), design = "22co", alpha = 0.05, |
| 130 | + adjust = "no", nsim = 10000) |
| 131 | + |
| 132 | +# k-adjustment |
| 133 | +N_Mielke(power = 0.8, Nmax = 1000, m = 5, k = 3, rho = 0, sigma = sigma, |
| 134 | + true.diff = mu, equi.tol = log(1.25), design = "22co", alpha = 0.05, |
| 135 | + adjust = "k", nsim = 10000) |
| 136 | + |
| 137 | +# Bonferroni adjustment |
| 138 | +N_Mielke(power = 0.8, Nmax = 1000, m = 5, k = 3, rho = 0, sigma = sigma, |
| 139 | + true.diff = mu, equi.tol = log(1.25), design = "22co", alpha = 0.05, |
| 140 | + adjust = "bon", nsim = 10000) |
| 141 | +``` |
| 142 | +Without any adjustment, we find 18 subjects per study (hence 90 for the complete trial). With the k-adjustment, we find 22 subjects per study (hence 110 in total). Finally, with the Bonferroni adjustment, we find 34 subjects per study (hence 170 in total). |
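
The ordering of these sample sizes follows from the significance level applied to each individual test. The sketch below computes the per-test levels, assuming that the k-adjustment takes the form $k\alpha/m$ and the Bonferroni adjustment the form $\alpha/m$. The k-adjustment formula is an assumption about `adjust = "k"` and should be checked against the package documentation.

```r
alpha <- 0.05   # nominal one-sided significance level
m     <- 5      # number of tests
k     <- 3      # number of tests that must succeed

alpha_none <- alpha           # adjust = "no"
alpha_k    <- k * alpha / m   # adjust = "k" (assumed form)
alpha_bon  <- alpha / m       # adjust = "bon"

c(none = alpha_none, k = alpha_k, bonferroni = alpha_bon)
```

A smaller per-test level makes each individual test harder to pass, which is consistent with the increase from 18 to 22 to 34 subjects per study reported above.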
| 143 | + |
| 144 | + |
| 145 | +# References |