
Table 2 Comparison of performances of Random Forest (RF) and k-Nearest Neighbors (k-NN) classifiers in terms of number of predicted annotations (N), their precision (Pr) and the average level (\(\overline {L}\)) of predicted annotation terms in the Gene Ontology DAG (when the term of a predicted annotation belongs to multiple Gene Ontology levels, only its lowest level was considered). SM is the single model method; AVG considers the average of the likelihood scores given by the models inferred from five different perturbation random seeds; ∩x/5 considers those predictions from x out of the five models. Probability of perturbation p and likelihood threshold ρ were set to their respective default values p=10% and ρ=0.8

From: Gene function finding through cross-organism ensemble learning

| Target | Ensemble method | RF N | RF Pr | RF \(\overline{L}\) | k-NN N | k-NN Pr | k-NN \(\overline{L}\) |
|---|---|---|---|---|---|---|---|
| Mus musculus | SM | 2,285 | 0.908 | 1.604 | 2,841 | 0.803 | 1.864 |
| | AVG | 1,204 | 0.952 | 1.799 | 1,227 | 0.817 | 2.487 |
| | 1/5 | 4,753 | 0.826 | 1.736 | 6,378 | 0.704 | 1.888 |
| | 2/5 | 2,896 | 0.916 | 1.653 | 3,380 | 0.836 | 1.826 |
| | 3/5 | 2,157 | 0.947 | 1.569 | 2,396 | 0.911 | 1.734 |
| | 4/5 | 1,764 | 0.973 | 1.491 | 1,499 | 0.937 | 1.774 |
| | 5/5 | 932 | 0.987 | 1.626 | 552 | 0.955 | 2.317 |
| Bos taurus | SM | 132 | 0.874 | 2.721 | 123 | 0.657 | 2.835 |
| | AVG | 57 | 0.947 | 3.037 | 44 | 0.568 | 3.000 |
| | 1/5 | 373 | 0.794 | 2.544 | 355 | 0.625 | 2.725 |
| | 2/5 | 173 | 0.925 | 2.831 | 155 | 0.710 | 2.864 |
| | 3/5 | 100 | 0.960 | 2.854 | 62 | 0.726 | 3.022 |
| | 4/5 | 60 | 0.967 | 2.931 | 32 | 0.656 | 3.238 |
| | 5/5 | 37 | 0.946 | 3.143 | 13 | 0.462 | 3.500 |
| Gallus gallus | SM | 69 | 0.721 | 2.701 | 50 | 0.534 | 3.255 |
| | AVG | 36 | 0.833 | 3.367 | 29 | 0.690 | 3.800 |
| | 1/5 | 175 | 0.617 | 2.157 | 137 | 0.416 | 2.509 |
| | 2/5 | 88 | 0.682 | 2.700 | 55 | 0.564 | 3.290 |
| | 3/5 | 56 | 0.857 | 2.958 | 31 | 0.742 | 3.652 |
| | 4/5 | 38 | 0.895 | 3.324 | 17 | 0.765 | 4.308 |
| | 5/5 | 24 | 0.917 | 3.545 | 11 | 0.909 | 5.000 |
| Dictyostelium discoideum | SM | 966 | 0.846 | 2.522 | 1,029 | 0.718 | 2.651 |
| | AVG | 773 | 0.906 | 2.500 | 869 | 0.794 | 2.733 |
| | 1/5 | 1,917 | 0.741 | 2.454 | 2,108 | 0.574 | 2.518 |
| | 2/5 | 1,334 | 0.833 | 2.531 | 1,233 | 0.737 | 2.664 |
| | 3/5 | 997 | 0.872 | 2.517 | 858 | 0.830 | 2.718 |
| | 4/5 | 760 | 0.905 | 2.529 | 622 | 0.883 | 2.703 |
| | 5/5 | 444 | 0.941 | 2.555 | 326 | 0.951 | 2.900 |
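
The two ensemble rules named in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy annotation keys and scores, and the use of `>=` for the threshold comparison are all assumptions made for illustration. The x/5 rule is read here as "predicted by at least x of the five models", an interpretation that is consistent with N shrinking as x grows in the table but is not stated explicitly in the caption.

```python
from statistics import mean

RHO = 0.8  # likelihood threshold; the caption gives 0.8 as the default


def predicted(scores, rho=RHO):
    """Annotations a single model predicts: those scoring at least rho (assumed >=)."""
    return {a for a, s in scores.items() if s >= rho}


def avg_ensemble(models, rho=RHO):
    """AVG-style rule: average each annotation's score over all models, then threshold."""
    annotations = set().union(*models)  # every annotation scored by any model
    return {a for a in annotations
            if mean(m.get(a, 0.0) for m in models) >= rho}


def vote_ensemble(models, x, rho=RHO):
    """x/5-style rule, read as: keep annotations predicted by at least x models."""
    votes = {}
    for m in models:
        for a in predicted(m, rho):
            votes[a] = votes.get(a, 0) + 1
    return {a for a, v in votes.items() if v >= x}


# Hypothetical likelihood scores from five models (one per perturbation seed).
# Keys are made-up "gene:term" labels, not data from the paper.
toy_models = [
    {"g1:T1": 0.95, "g2:T2": 0.85, "g3:T3": 0.60},
    {"g1:T1": 0.90, "g2:T2": 0.70, "g3:T3": 0.90},
    {"g1:T1": 0.92, "g2:T2": 0.82, "g3:T3": 0.50},
    {"g1:T1": 0.88, "g2:T2": 0.75, "g3:T3": 0.40},
    {"g1:T1": 0.97, "g2:T2": 0.90, "g3:T3": 0.30},
]

print(sorted(avg_ensemble(toy_models)))      # annotations kept by averaging
print(sorted(vote_ensemble(toy_models, 1)))  # kept when one vote suffices
print(sorted(vote_ensemble(toy_models, 5)))  # kept only with all five votes
```

In this toy data, raising x can only shrink the predicted set, which mirrors the trade-off visible in the table: fewer predictions (N) at higher x, usually with higher precision (Pr).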