Table 4 Summary of the five most important features for the classification of EFR according to the GMLVQ and RF methods

From: Application of an interpretable classification model on Early Folding Residues during protein folding

 

            GMLVQ                      Random Forest
Feature     Rank   Influence score     Rank   Influence score
PlipHpCL       1   0.159                  3   1.370
LF             2   0.127                  2   1.403
PlipBN         3   0.063                 15   0.900
SecSize        4   0.059                 19   0.854
e              5   0.042                  4   1.332
PlipCL         7   0.012                  1   1.700
ConvCC        23   0.009                  5   1.223

Importance scores for the RF were computed by the MATLAB implementation. Influence scores are in arbitrary units; higher values indicate features that are more important for class discrimination. The GMLVQ influence scores and the RF predictor importance values are method-specific and not directly comparable; therefore, the ranks of the top five features of each method are given
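
Because the two score scales are not comparable, the methods can only be compared through their ranks. The following is a minimal Python sketch of that comparison, using only the ranks transcribed from Table 4. The helper name `top_features` and the cutoff `TOP_N` are illustrative assumptions and are not part of the original analysis, which used MATLAB.

```python
# Ranks transcribed from Table 4: feature -> (GMLVQ rank, RF rank).
# Scores are deliberately left out, since the two scales are not comparable.
ranks = {
    "PlipHpCL": (1, 3),
    "LF": (2, 2),
    "PlipBN": (3, 15),
    "SecSize": (4, 19),
    "e": (5, 4),
    "PlipCL": (7, 1),
    "ConvCC": (23, 5),
}

TOP_N = 5  # assumed cutoff, matching the "top five" in the table caption


def top_features(ranks, method_index, top_n=TOP_N):
    """Return the features ranked within top_n for one method, best rank first.

    method_index is 0 for GMLVQ and 1 for Random Forest (hypothetical helper).
    """
    selected = [
        (r[method_index], name)
        for name, r in ranks.items()
        if r[method_index] <= top_n
    ]
    return [name for _, name in sorted(selected)]


gmlvq_top = top_features(ranks, 0)
rf_top = top_features(ranks, 1)

# Features that both methods place in their top five.
shared = sorted(set(gmlvq_top) & set(rf_top))
# Features that only one method places in its top five.
gmlvq_only = sorted(set(gmlvq_top) - set(rf_top))
rf_only = sorted(set(rf_top) - set(gmlvq_top))

print("Shared:", shared)
print("GMLVQ only:", gmlvq_only)
print("RF only:", rf_only)
```

With the table's ranks, three features (LF, PlipHpCL and e) appear in both top-five lists, which is why the table has seven rows rather than ten.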