Method | Strengths | Limitations | LOOCV (10-fold CV where noted) | Independent validation |
---|---|---|---|---|
AGEP | Good classifier. Results available per gene, with a biologically meaningful distance metric. | Computationally intensive. All genes are weighted equally. | 93.6% accuracy | 96.9% combined accuracy |
NN | Relatively robust and easy to set up. | Very sensitive to the choice of parameters and distance metric. No simple choice of distance metric. No simple way to interpret the gene-by-gene contribution to the similarity. | 90.2% accuracy | 94.4% combined accuracy |
SVM | Powerful classification performance if properly customized for the task. | No simple solution for kernel selection. Somewhat prone to overfitting on complex tasks. No gene-by-gene contribution available in a biologically interpretable manner. | 90.4% accuracy (10-fold cross-validation, used instead of LOOCV because of computational limitations) | 98.0% combined accuracy |
DNA barcode (Zilliox et al. 2007) | Good classifier. Easy-to-understand per-gene comparison. | Per-gene classification is binary, which misses much of the variation. | Not tested | Not tested |
Cancer molecular classification (Parmigiani et al. 2002) | Good classifier. Easy-to-understand per-gene comparison. | Per-gene classification is ternary, which misses much of the variation. | Not tested | Not tested |
Probabilistic retrieval and visualization of biologically relevant microarray experiments (Caldas et al. 2009) | Good at finding experiments that repeat biological responses. | Works only for gene sets derived from comparative experiments. | N/A | N/A |
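
The LOOCV column reports leave-one-out cross-validation accuracy: each sample is held out once, classified from all remaining samples, and the fraction of correct calls is reported. The sketch below illustrates that procedure for a nearest-neighbour classifier. It is not the evaluation code behind the table: the toy profiles, the Euclidean distance, and the function names `nearest_neighbor_predict` and `loocv_accuracy` are all assumptions made for illustration.

```python
import math


def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training sample closest to `query`.

    Euclidean distance is an assumption here; the table notes that
    there is no simple choice of metric for NN classifiers.
    """
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def loocv_accuracy(samples, labels):
    """Hold out each sample once, classify it from the rest,
    and return the fraction of held-out samples classified correctly."""
    correct = 0
    for i in range(len(samples)):
        # Training set is everything except sample i.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_neighbor_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Hypothetical two-gene expression profiles: two separated classes,
# plus one "B" sample that sits inside the "A" cluster.
samples = [
    [1.0, 1.0], [1.2, 0.9], [0.8, 1.3],  # class A
    [5.0, 5.0], [5.3, 4.8], [1.1, 1.4],  # class B (last one is an outlier)
]
labels = ["A", "A", "A", "B", "B", "B"]

accuracy = loocv_accuracy(samples, labels)
print(f"LOOCV accuracy: {accuracy:.1%}")
```

A 10-fold variant, as used for the SVM row, would hold out one tenth of the samples per round instead of a single sample, which reduces the number of training runs from the number of samples to ten.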