### Biological motivation for simulation study

The signals used in this study all came from a set of GWASs of 515 LCLs exposed to either temozolomide or 5‐fluorouracil, both cancer drugs. A strong association (*p* = 3.3 × 10^{−16}) was found between LCL cytotoxicity to temozolomide and locus *rs531572*, located within the gene coding for *O*^{6}*‐methylguanine‐DNA methyltransferase* [*MGMT*: ENSG00000170430]. In addition, a strong association (*p* = 6.8 × 10^{−26}) was found between the same locus and expression levels of *MGMT* transcripts [15]. *MGMT* is known to repair DNA damaged by temozolomide, and genetic variants affecting *MGMT* are known to be associated with temozolomide clinical efficacy [16]. Similarly, suggestive associations (*p* = 5.9 × 10^{−7} and) were found between LCL cytotoxicity to 5‐fluorouracil and locus *rs2270311*, located within the gene coding for chimerin 2 [*CHN2*: ENSG00000106069]. Significant differences in the expression of *CHN2* have been found between colon cancer cells having different levels of 5‐fluorouracil resistance [17].

Figure 1 illustrates the differences in mean viabilities between genotypes at each concentration for temozolomide and 5‐fluorouracil. The mean viability was corrected for potentially confounding covariates by least squares regression, with estimation at the sample mean of each covariate. These covariates include cellular growth rate, laboratory temperature, the first two genetic principal components and laboratory date (nominal). Both signals are due to single nucleotide polymorphisms (SNPs); black, red and blue circles represent genotypes with 0, 1 and 2 minor alleles, respectively.

After performing regression using the covariates mentioned above, error terms were assessed for multivariate normality. Although the error terms failed the Shapiro‐Wilk test for multivariate normality (*p* = 2.8 × 10^{−4} for temozolomide and *p* = 1.4 × 10^{−8} for 5‐fluorouracil) [18], the goal was to use real data as a guide for the simulation. To this end, residuals were first transformed to be standardized and uncorrelated, according to [19]. Histograms of the errors for each drug concentration were then overlaid with standard normal densities in Figures 2 and 3. In addition, Figures 4 and 5 show scatter plots of residuals between each pair of drug concentrations for temozolomide and 5‐fluorouracil. Although the error distributions are clearly not normal, these plots suggest, at least visually, that the deviations from normality (with the exception of a few outliers) are not severe.
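The standardizing and decorrelating step can be sketched as follows. The specific transformation of [19] is not reproduced here; this minimal sketch assumes a Cholesky-based whitening, and the names `whiten`, `resid` and `mix` are hypothetical, with synthetic residuals standing in for the real data.

```python
import numpy as np

def whiten(resid):
    """Transform residuals so their sample covariance is the identity.

    Assumption: Cholesky-based whitening; the method cited in the text may differ.
    resid: (n, p) array of regression residuals, one column per concentration.
    """
    centered = resid - resid.mean(axis=0)
    cov = np.cov(centered, rowvar=False)      # (p, p) sample covariance
    chol = np.linalg.cholesky(cov)            # cov = chol @ chol.T
    # Solve chol @ z.T = centered.T, so that cov(z) = identity
    return np.linalg.solve(chol, centered.T).T

rng = np.random.default_rng(0)
mix = rng.normal(size=(6, 6))                 # hypothetical mixing matrix
resid = rng.normal(size=(500, 6)) @ mix       # synthetic correlated residuals
z = whiten(resid)
```

Each column of `z` can then be compared against a standard normal density, and pairs of columns plotted against each other, as in the figures described above.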

### Simulation and power comparisons of cell line methods

A simulation study was performed using the appropriate estimated means and error covariances, as described in the previous section. Using parameter estimates from these biological signals, data were generated as multivariate normal according to:

$$
\begin{aligned}
Y_{ijk} &\sim \mathrm{N}_{6}(\tilde{\mu}_{i}, \hat{\Sigma}) \\
\tilde{\mu}_{i} &= \hat{\mu}_{0} + ES\left[\hat{\beta}_{1}\,\mathrm{I}(i=1) + \hat{\beta}_{2}\,\mathrm{I}(i=2)\right] \\
\hat{\beta}_{m} &= \hat{\mu}_{m} - \hat{\mu}_{0}, \qquad m \in \{1, 2\}
\end{aligned}
\tag{1}
$$

where $Y_{ijk}$ is the vector of viabilities for six concentrations for the $k$th replication of the $j$th individual having genotype $i$ (the number of minor alleles). Here, $\hat{\Sigma}$ is the sample covariance of errors, $\hat{\mu}_{i}$ is the vector of mean viabilities for genotype $i$, and $ES$ is the effect size. Effect sizes ranged from 0 (corresponding to the null) to 1.0 (the observed differences between genotypes). The sample size was set to 500 and the minor allele frequency (MAF) was set to 0.5.
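A minimal sketch of this data-generating step is given below. It assumes genotypes are drawn as Binomial(2, MAF) (Hardy-Weinberg proportions) and that each individual contributes a single replicate, neither of which is stated in the text. The function name `simulate_viabilities` and the values of `mu_hat` and `sigma_hat` are hypothetical placeholders, not the estimates used in the study.

```python
import numpy as np

def simulate_viabilities(mu_hat, sigma_hat, effect_size, n=500, maf=0.5, rng=None):
    """Draw one data set following Equation 1.

    mu_hat: (3, p) estimated mean viability vectors for genotypes 0, 1, 2.
    sigma_hat: (p, p) estimated error covariance.
    Assumptions: genotypes ~ Binomial(2, maf); one replicate per individual.
    """
    rng = np.random.default_rng() if rng is None else rng
    genotype = rng.binomial(2, maf, size=n)
    beta = mu_hat - mu_hat[0]                    # beta_m = mu_m - mu_0 (row 0 is zero)
    mu_tilde = mu_hat[0] + effect_size * beta    # (3, p) effect-size-scaled means
    errors = rng.multivariate_normal(np.zeros(mu_hat.shape[1]), sigma_hat, size=n)
    return genotype, mu_tilde[genotype] + errors

# Hypothetical placeholder parameters for six concentrations
mu_hat = np.array([[1.0, 0.95, 0.90, 0.80, 0.60, 0.40],
                   [1.0, 0.93, 0.85, 0.70, 0.50, 0.30],
                   [1.0, 0.90, 0.80, 0.60, 0.40, 0.20]])
sigma_hat = 0.01 * (0.5 * np.eye(6) + 0.5)       # positive definite by construction
genotype, y = simulate_viabilities(mu_hat, sigma_hat, effect_size=0.5,
                                   rng=np.random.default_rng(1))
```

With `effect_size=0` every genotype shares the mean vector of genotype 0, which corresponds to the null; with `effect_size=1` the genotype means equal the observed estimates.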

For each effect size and signal, 2500 data sets were used to calculate test statistics for four previously reported methods (*IC50*, *Slope*, *AUC*_{Emp} and *ANOVA*) [14], as well as a new method using Pillai’s trace from a multivariate analysis of variance (MANOVA) [20]. In addition, 10,000 data sets were created with an effect size of zero, and a different random number seed, to represent the null distribution. In this way, *p*‐values for each test statistic under the alternative distribution were estimated by the proportion of larger statistics under the null distribution, as described in [14]. This was required because the *ANOVA* method, as applied to all (non‐independent) observations, generated test statistics that did not follow the expected distribution under the null. Power curves describing the proportion of times the null hypothesis of no difference between genotypes was rejected, at the alpha = 0.05 level, are illustrated in Figure 6, where panels **A** and **B** represent the power curves for simulations using signals from temozolomide/MGMT and 5‐fluorouracil/CHN2, respectively.
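The MANOVA statistic and the empirical *p*‐value calculation can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the study: Pillai's trace is computed as tr(H(H + E)^{−1}) for a one-way design with genotype as the factor, the function names are hypothetical, and the toy data (standard normal errors, 200 individuals, a constant per-allele shift of 0.3, and far fewer replicates than in the text) are chosen only to keep the example fast.

```python
import numpy as np

def pillai_trace(y, groups):
    """Pillai's trace tr(H (H + E)^-1) for a one-way MANOVA."""
    grand_mean = y.mean(axis=0)
    p = y.shape[1]
    h = np.zeros((p, p))    # between-group (hypothesis) sums of squares and products
    e = np.zeros((p, p))    # within-group (error) sums of squares and products
    for g in np.unique(groups):
        yg = y[groups == g]
        d = (yg.mean(axis=0) - grand_mean)[:, None]
        h += len(yg) * (d @ d.T)
        r = yg - yg.mean(axis=0)
        e += r.T @ r
    return np.trace(h @ np.linalg.inv(h + e))

def empirical_p(stat, null_stats):
    """Proportion of null statistics larger than the observed statistic."""
    return np.mean(np.asarray(null_stats) > stat)

rng = np.random.default_rng(0)
groups = rng.binomial(2, 0.5, size=200)        # hypothetical genotypes
null_stats = [pillai_trace(rng.normal(size=(200, 6)), groups) for _ in range(300)]
shift = 0.3 * groups[:, None]                  # hypothetical additive effect at every dose
alt_stats = [pillai_trace(rng.normal(size=(200, 6)) + shift, groups) for _ in range(100)]
# Power: fraction of alternative data sets rejected at the 0.05 level
power = np.mean([empirical_p(s, null_stats) < 0.05 for s in alt_stats])
```

Because the *p*‐values are referenced to simulated null statistics rather than a theoretical distribution, the same `empirical_p` function applies unchanged to any of the other test statistics.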

In addition, each of these methods was compared using a previous simulation described in [14], where differences in the DR curves between genotypes are due to differences in the distribution of hill slope parameters. Figure 6 gives power curves for a representative sample of these simulations, where data were simulated under an additive genetic model, with equally spaced drug dosages and a MAF of 0.5. Panels **C** ‐ **E** represent power curves for each method where differences in curves between genotypes are due to the “Min”, “IC50” and “Slope” parameter distributions, respectively. Using the Friedman test, significant differences (*p* < 10^{−15} for all) in *p*‐values were found between methods for every positive (*i.e.* non‐null) effect size for both sets of simulations.

The effect of modifying the error structure in Equation 1 was also explored. Here, the mean vectors *μ*_{i} across genotypes *i* were taken from the signal for temozolomide/MGMT, but the covariance matrix Σ was modified to represent various contrived correlation structures. This was done to assess how sensitive the power of MANOVA was to the error structure and to the assumption of multivariate normality. The chosen error structures include compound symmetric (*i.e.* constant) correlation (with *ρ* = 0.25, 0.5 or 0.75), autoregressive (exponential attenuation) correlation (with *ρ* = 0.25, 0.5 or 0.75) and no correlation. In addition, errors were generated independently using a centered gamma distribution with parameters shape = 8 and scale = 0.125. The results for each of these simulations are shown in Figure 7.
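The contrived error structures can be sketched as follows. The sketch builds unit-variance correlation matrices; how these were scaled to the variances of the original covariance is not stated in the text, so that step is omitted. The function names are hypothetical, and the autoregressive form assumes correlation *ρ*^{|i − j|} between doses *i* and *j*.

```python
import numpy as np

def compound_symmetric(p, rho):
    """Correlation matrix with constant off-diagonal correlation rho."""
    return (1 - rho) * np.eye(p) + rho * np.ones((p, p))

def autoregressive(p, rho):
    """Correlation matrix with entries rho**|i - j| (exponential attenuation)."""
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def centered_gamma(n, p, shape=8.0, scale=0.125, rng=None):
    """Independent gamma errors shifted to mean zero (gamma mean = shape * scale)."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.gamma(shape, scale, size=(n, p)) - shape * scale

cs = compound_symmetric(6, 0.5)
ar = autoregressive(6, 0.5)
errors = centered_gamma(20000, 6, rng=np.random.default_rng(0))
```

With shape = 8 and scale = 0.125 the gamma distribution has mean 1 and variance 0.125 before centering, so the centered errors are right-skewed with mean zero, which probes the normality assumption without introducing correlation.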

Finally, simulations were performed to assess the strength of MANOVA with data generated as multivariate normal, but under non‐ideal situations. For these, mean vectors and covariances were constructed from 12 equally‐spaced doses. The differences in mean vectors between genotypes were designed to follow a specific univariate summary, including area under the curve and the hill slope parameters “Min”, “IC50” and “Slope”, as illustrated in Figure 8. Using these mean vectors, simulations were performed where error terms follow an exponential decay correlation structure, with *ρ* = 0.25. Power plots from these simulations are illustrated in Figure 9.