Fig. 3 | BioData Mining

Fig. 3

From: Principal component gene set enrichment (PCGSE)

Simulation results for the standardized mean difference statistic, \(S_{k}^{D}\). Boxplots show the distribution of PCGSE-computed enrichment p-values for the first 10 of 20 simulated gene sets relative to the first 2 PCs of 1000 datasets simulated according to the model described in Section “Benchmark PC gene set testing method” of the main PCGSE manuscript and illustrated in Fig. 2 above. For all displayed results, PCGSE was executed using the Fisher-transformed Pearson correlation coefficient between each genomic variable and each PC as the gene-level test statistic, with the standardized mean difference as the gene set test statistic. Plots a, b and c display the distribution of enrichment p-values for the first 10 gene sets relative to the first PC of all simulated datasets. Plots d, e and f display the enrichment p-values computed relative to the second PC. For plots a and d, the p-values were computed using a two-sided t-test on \(S_{k}^{D}\); for plots b and e, using a two-sided t-test on \(S_{k}^{D,adj}\); and for plots c and f, using a two-sided permutation test on \(S_{k}^{D}\). For PC 1 and gene set 2, the type I error rate at a nominal α of 0.05 was 0.382 for the unadjusted t-test, 0.057 for the correlation-adjusted t-test and 0.05 for the sample permutation test on \(S_{k}^{D}\). For PC 2 and gene set 1, the type I error rate at a nominal α of 0.05 was 0.257 for the unadjusted t-test, 0.016 for the correlation-adjusted t-test and 0.014 for the sample permutation test.
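
The quantities named in this caption can be illustrated with a minimal Python sketch. It is not the PCGSE implementation: the function names are hypothetical, and the standardized mean difference is written in the generic pooled-variance two-sample form, which is an assumption here (the exact definition of \(S_{k}^{D}\) is given in the main manuscript). The empirical type I error rate is taken as the fraction of null p-values at or below the nominal α.

```python
import math


def fisher_z(r):
    """Fisher transformation of a Pearson correlation coefficient r, |r| < 1."""
    return math.atanh(r)


def standardized_mean_difference(in_set, out_set):
    """Pooled-variance two-sample statistic comparing gene-level statistics
    inside a gene set with those outside it.

    Assumption: generic textbook form, not necessarily the exact S_k^D
    used by PCGSE.
    """
    n_in, n_out = len(in_set), len(out_set)
    mean_in = sum(in_set) / n_in
    mean_out = sum(out_set) / n_out
    # Sums of squared deviations within each group.
    ss_in = sum((z - mean_in) ** 2 for z in in_set)
    ss_out = sum((z - mean_out) ** 2 for z in out_set)
    pooled_var = (ss_in + ss_out) / (n_in + n_out - 2)
    std_err = math.sqrt(pooled_var * (1.0 / n_in + 1.0 / n_out))
    return (mean_in - mean_out) / std_err


def empirical_type1_error(null_p_values, alpha=0.05):
    """Fraction of p-values, computed under the null, that are <= alpha."""
    rejections = sum(1 for p in null_p_values if p <= alpha)
    return rejections / len(null_p_values)


if __name__ == "__main__":
    # Made-up illustrative values, not data from the simulation study.
    z_in = [fisher_z(r) for r in (0.30, 0.45, 0.25)]
    z_out = [fisher_z(r) for r in (0.05, -0.10, 0.02, 0.08)]
    print(standardized_mean_difference(z_in, z_out))
    print(empirical_type1_error([0.01, 0.20, 0.04, 0.50]))
```

In a simulation like the one summarized here, `empirical_type1_error` would be applied to the 1000 p-values obtained for a gene set with no true association with the PC; a well-calibrated test should return a value close to the nominal α.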
