Figure 2

From: Neural networks for genetic epidemiology: past, present, and future

Overview of the GPNN method (adapted from Ritchie et al. 2003). First, GPNN has a set of parameters to be initialized before beginning the evolution of NN models. Second, the data are divided into 10 equal parts for 10-fold cross-validation. Third, training begins by generating an initial population of random solutions. Fourth, each NN is evaluated on the training set and its fitness (classification error) is recorded. Fifth, the best solutions are selected for crossover and reproduction using a fitness-proportionate selection technique. The new generation then begins the cycle again. This continues until a stopping criterion (a classification error of zero or a limit on the number of generations) is met. At the end of the GPNN evolution, the overall best solution is selected as the optimal NN. Sixth, this best GPNN model is tested on the 1/10 of the data left out to estimate the prediction error of the model. Steps two through six are performed ten times with the same parameter settings, each time using a different 9/10 of the data for training and 1/10 of the data for testing. The loci that are consistently present in the GPNN models are selected as the functional loci and are used as input to a final GPNN evolutionary process to estimate the classification and prediction error of the GPNN model.
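The sketch below is not the GPNN implementation; it is a minimal illustration of the cross-validation wrapper and fitness-proportionate evolutionary loop described in the caption. GPNN evolves NN architectures with genetic programming, whereas this sketch substitutes a simplified mutation-only search over fixed-size weight vectors, omits crossover, and uses hypothetical helpers (`make_toy_data`, `evolve`, `predict`) on toy case-control data.

```python
# Illustrative sketch only: 10-fold cross-validation around an evolutionary
# search for a simple classifier, mirroring steps two through six of the
# caption. The GP evolution of NN architectures is replaced by a toy
# fitness-proportionate search over weight vectors.
import random

random.seed(0)

def make_toy_data(n=200, n_loci=5):
    """Hypothetical case-control data: genotype vector (0/1/2 per locus) plus label."""
    data = []
    for _ in range(n):
        genotype = [random.randint(0, 2) for _ in range(n_loci)]
        # toy interaction between locus 0 and locus 1 determines case status
        label = 1 if (genotype[0] + genotype[1]) % 2 == 0 else 0
        data.append((genotype, label))
    return data

def predict(weights, genotype):
    """Single-layer NN stand-in: thresholded weighted sum of the genotype."""
    return 1 if sum(w * g for w, g in zip(weights, genotype)) > 0 else 0

def classification_error(weights, subset):
    wrong = sum(1 for x, y in subset if predict(weights, x) != y)
    return wrong / len(subset)

def evolve(train, n_loci, pop_size=50, generations=30):
    """Fitness-proportionate selection with mutation; stops early at zero error."""
    population = [[random.uniform(-1, 1) for _ in range(n_loci)] for _ in range(pop_size)]
    best = min(population, key=lambda w: classification_error(w, train))
    for _ in range(generations):
        errors = [classification_error(w, train) for w in population]
        if min(errors) == 0.0:                         # stopping criterion
            break
        fitness = [1.0 - e for e in errors]            # lower error -> higher fitness
        parents = random.choices(population, weights=fitness, k=pop_size)
        population = [[w + random.gauss(0, 0.1) for w in p] for p in parents]
        gen_best = min(population, key=lambda w: classification_error(w, train))
        if classification_error(gen_best, train) < classification_error(best, train):
            best = gen_best
    return best

data = make_toy_data()
n_loci = len(data[0][0])
fold_size = len(data) // 10
for k in range(10):                                    # 10-fold cross-validation
    test = data[k * fold_size:(k + 1) * fold_size]     # 1/10 held out for testing
    train = data[:k * fold_size] + data[(k + 1) * fold_size:]
    best_model = evolve(train, n_loci)
    print(f"fold {k}: training error {classification_error(best_model, train):.2f}, "
          f"prediction error {classification_error(best_model, test):.2f}")
```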
