Fig. 1 | BioData Mining

Fig. 1

From: Improving machine learning reproducibility in genetic association studies with proportional instance cross validation (PICV)

Comparing traditional cross validation and proportional instance cross validation (PICV). a The overall distribution of 9 SNP-SNP interaction genotypes (the 9 categories that result from the interaction of two SNPs) in a hypothetical population of individuals. Note: only one possible allocation is depicted. b Traditional cross validation, in which 2/3 of observations are randomly allocated to the training set and the remaining 1/3 to the testing set, can result in draws with imbalanced genotype proportions. c PICV randomly allocates 2/3 of the observations of each genotype to the training set and the remaining 1/3 to the testing set, ensuring that the relative proportions of genotypes are maintained.
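
The contrast between panels b and c can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the genotype labels, the per-genotype counts, and the function names `random_split` and `proportional_split` are all hypothetical and chosen only so that each genotype count divides evenly by three.

```python
import random
from collections import Counter, defaultdict


def random_split(labels, train_frac=2 / 3, seed=0):
    """Traditional split: shuffle all observations, ignoring genotype."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    n_train = round(train_frac * len(idx))
    return idx[:n_train], idx[n_train:]


def proportional_split(labels, train_frac=2 / 3, seed=0):
    """Proportional split: allocate train_frac of EACH genotype to training."""
    rng = random.Random(seed)
    by_genotype = defaultdict(list)
    for i, genotype in enumerate(labels):
        by_genotype[genotype].append(i)

    train, test = [], []
    for members in by_genotype.values():
        rng.shuffle(members)  # random allocation within one genotype
        n_train = round(train_frac * len(members))
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return train, test


# Hypothetical population: 9 two-SNP genotypes with made-up counts.
counts = {
    "AA/BB": 30, "AA/Bb": 24, "AA/bb": 6,
    "Aa/BB": 24, "Aa/Bb": 18, "Aa/bb": 6,
    "aa/BB": 6, "aa/Bb": 3, "aa/bb": 3,
}
labels = [g for g, n in counts.items() for _ in range(n)]

train, test = proportional_split(labels)
train_counts = Counter(labels[i] for i in train)
test_counts = Counter(labels[i] for i in test)

rand_train, rand_test = random_split(labels)
rand_test_counts = Counter(labels[i] for i in rand_test)

for g in counts:
    # Proportional split keeps a 2:1 ratio per genotype; the random split may not.
    print(g, train_counts[g], test_counts[g], rand_test_counts[g])
```

With the proportional split, every genotype contributes exactly twice as many observations to the training set as to the testing set. With the plain random split, rare genotypes such as the hypothetical `aa/bb` group can end up over- or under-represented in the testing set, or absent from it altogether, depending on the draw.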
