Table 3 Evaluation of two classifiers A and B on the same two datasets

From: The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation

(a) Relative CM

| Classifier | Dataset | TP | FN | TN | FP | ϕ | TPR | TNR | BM | MCC |
|---|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 0.35 | 0.15 | 0.35 | 0.15 | 0.5 | 0.7 | 0.7 | 0.4 | 0.4 |
| A | 2 | 0.035 | 0.015 | 0.665 | 0.285 | 0.05 | 0.7 | 0.7 | 0.4 | 0.2 |
| B | 1 | 0.40 | 0.10 | 0.40 | 0.10 | 0.5 | 0.8 | 0.8 | 0.6 | 0.6 |
| B | 2 | 0.04 | 0.01 | 0.76 | 0.19 | 0.05 | 0.8 | 0.8 | 0.6 | 0.3 |

(b) Exemplary CM for a sample size of 200

| Classifier | Dataset | TP | FN | TN | FP | ϕ | TPR | TNR | BM | MCC | n+ | n− |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 70 | 30 | 70 | 30 | 0.5 | 0.7 | 0.7 | 0.4 | 0.4 | 100 | 100 |
| A | 2 | 7 | 3 | 133 | 57 | 0.05 | 0.7 | 0.7 | 0.4 | 0.2 | 10 | 190 |
| B | 1 | 80 | 20 | 80 | 20 | 0.5 | 0.8 | 0.8 | 0.6 | 0.6 | 100 | 100 |
| B | 2 | 8 | 2 | 152 | 38 | 0.05 | 0.8 | 0.8 | 0.6 | 0.3 | 10 | 190 |

  1. Ideally, both classifiers are evaluated on both datasets, as shown in this table. Otherwise, one should rely on metrics that are independent of the prevalence, such as BM. The Matthews correlation coefficient (MCC) may be unreliable when comparing classification results across datasets.
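
As a cross-check of the tabulated values, the following sketch recomputes the metrics from the confusion-matrix entries of panel (a), using the standard definitions: prevalence ϕ = TP + FN (for relative frequencies), TPR = TP/(TP + FN), TNR = TN/(TN + FP), BM = TPR + TNR − 1, and MCC = (TP·TN − FP·FN)/√((TP + FP)(TP + FN)(TN + FP)(TN + FN)). The function name `metrics` is my own choosing, not from the source article; the point it illustrates is the footnote's claim that BM is invariant to prevalence while MCC changes with it.

```python
import math

def metrics(tp, fn, tn, fp):
    """Compute prevalence (phi), TPR, TNR, bookmaker informedness (BM),
    and the Matthews correlation coefficient (MCC) from a 2x2 confusion
    matrix. Entries may be relative frequencies (panel a) or raw counts
    (panel b); all five metrics are invariant to the overall scale."""
    phi = (tp + fn) / (tp + fn + tn + fp)   # prevalence of the positive class
    tpr = tp / (tp + fn)                    # true positive rate (sensitivity)
    tnr = tn / (tn + fp)                    # true negative rate (specificity)
    bm = tpr + tnr - 1                      # bookmaker informedness
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return phi, tpr, tnr, bm, mcc

# Rows of panel (a): classifier, dataset, TP, FN, TN, FP
rows = [
    ("A", 1, 0.35, 0.15, 0.35, 0.15),
    ("A", 2, 0.035, 0.015, 0.665, 0.285),
    ("B", 1, 0.40, 0.10, 0.40, 0.10),
    ("B", 2, 0.04, 0.01, 0.76, 0.19),
]
for clf, ds, *cm in rows:
    phi, tpr, tnr, bm, mcc = metrics(*cm)
    print(f"{clf}/{ds}: phi={phi:.2f} TPR={tpr:.1f} TNR={tnr:.1f} "
          f"BM={bm:.1f} MCC={mcc:.2f}")
# A/1: phi=0.50 TPR=0.7 TNR=0.7 BM=0.4 MCC=0.40
# A/2: phi=0.05 TPR=0.7 TNR=0.7 BM=0.4 MCC=0.19  <- BM unchanged, MCC drops
# B/1: phi=0.50 TPR=0.8 TNR=0.8 BM=0.6 MCC=0.60
# B/2: phi=0.05 TPR=0.8 TNR=0.8 BM=0.6 MCC=0.31
# (The table rounds MCC to one decimal: 0.19 -> 0.2, 0.31 -> 0.3.)
```

Running the same rows with the raw counts of panel (b) gives identical metric values, since every term in the formulas scales linearly with the sample size.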