Table 2. Summary of the tissue identification capabilities of the most closely related methods

From: Alignment of gene expression profiles from test samples against a reference database: New method for context-specific interpretation of microarray data

| Method | Strengths | Limitations | LOOCV (or 10-fold CV) | Independent validation |
| --- | --- | --- | --- | --- |
| AGEP | Good classifier. Results available per gene, with a biologically meaningful distance metric. | Computationally intensive. All genes are weighted equally. | 93.6% accuracy | 96.9% combined accuracy |
| NN | Relatively robust and easy to set up. | Very sensitive to the selection of parameters and the distance metric chosen. No simple choice for distance metric. No simple way to interpret the gene-by-gene contribution to the similarity. | 90.2% accuracy | 94.4% combined accuracy |
| SVM | Powerful classifying performance if properly customized for the task. | No simple solution for selection of kernel. Somewhat subject to overfitting on complex tasks. No gene-by-gene contribution available in a biologically interpretable manner. | 90.4% accuracy (NOTE: due to computational limitations, this was actually 10-fold cross-validation) | 98.0% combined accuracy |
| DNA barcode (Zilliox et al. 2007) | Good classifier. Per-gene comparison is simple to understand. | Per-gene classification is binary, missing much of the variation. | Not tested | Not tested |
| Cancer molecular classification (Parmigiani et al. 2002) | Good classifier. Per-gene comparison is simple to understand. | Per-gene classification is ternary, missing much of the variation. | Not tested | Not tested |
| Probabilistic retrieval and visualization of biologically relevant microarray experiments (Caldas et al. 2009) | Good at finding experiments that repeat biological responses. | Works for gene sets derived from comparative experiments. | N/A | N/A |