Table 2 Classification results (H. pylori positive vs negative) in clustered and raw data

From: Taxonomy-based data representation for data mining: an example of the magnitude of risk associated with H. pylori infection

 

| Algorithm | Data set | Area under ROC (95% CI) | Percent correct (95% CI) | True positive rate (95% CI) | True negative rate (95% CI) | Number of rules (95% CI) | Tree size (95% CI) | Serialized model size in bytes (95% CI) |
|---|---|---|---|---|---|---|---|---|
| FURIA | Clustered | 0.871ᵃ (0.87…0.873) | 87.3 (87.1…87.5) | 0.799ᵃ (0.795…0.802) | 0.930ᵃ (0.928…0.932) | 9.5ᵃ (9.2…9.7) | | 592230ᵃ (588,461…595,998) |
| FURIA | Raw | 0.880ᵃ (0.878…0.882) | 86.9 (86.7…87.1) | 0.824ᵃ (0.821…0.828) | 0.904ᵃ (0.902…0.907) | 11.6ᵃ (11.2…11.9) | | 976093ᵃ (970,497…981,689) |
| RIPPER | Clustered | 0.872ᵃ (0.87…0.874) | 87.1 (86.9…87.3) | 0.804ᵃ (0.801…0.807) | 0.923ᵃ (0.921…0.925) | 4.8ᵃ (4.7…4.8) | | 18746ᵃ (18,706…18,785) |
| RIPPER | Raw | 0.881ᵃ (0.879…0.883) | 86.9 (86.7…87.2) | 0.833ᵃ (0.83…0.837) | 0.897ᵃ (0.895…0.9) | 4.2ᵃ (4.2…4.3) | | 25909ᵃ (25,882…25,936) |
| RIDOR | Clustered | 0.847 (0.845…0.849) | 85.8 (85.6…86.0) | 0.760ᵃ (0.756…0.765) | 0.934ᵃ (0.931…0.936) | 7.3ᵃ (7.2…7.5) | | 7940ᵃ (7796…8083) |
| RIDOR | Raw | 0.85 (0.847…0.852) | 85.6 (85.4…85.8) | 0.802ᵃ (0.797…0.807) | 0.897ᵃ (0.894…0.9) | 6.4ᵃ (6.3…6.4) | | 5412ᵃ (5328…5495) |
| C4.5 | Clustered | 0.891ᵃ (0.889…0.894) | 86.9ᵃ (86.7…87.1) | 0.802ᵃ (0.798…0.806) | 0.921ᵃ (0.918…0.923) | 23.3ᵃ (22.8…23.8) | 28.6ᵃ (28.0…29.1) | 17914ᵃ (17,819…18,010) |
| C4.5 | Raw | 0.868ᵃ (0.865…0.87) | 86.1ᵃ (85.9…86.3) | 0.826ᵃ (0.822…0.829) | 0.889ᵃ (0.886…0.891) | 26.3ᵃ (25.5…27.1) | 37.3ᵃ (36.3…38.4) | 25801ᵃ (25,608…25,994) |
| CART | Clustered | 0.867ᵃ (0.865…0.869) | 86.1ᵃ (85.9…86.3) | 0.786ᵃ (0.783…0.79) | 0.919ᵃ (0.917…0.921) | | 6.3ᵃ (6.2…6.5) | 603873ᵃ (603,126…604,620) |
| CART | Raw | 0.889ᵃ (0.887…0.891) | 87.8ᵃ (87.6…88.0) | 0.845ᵃ (0.842…0.848) | 0.904ᵃ (0.901…0.906) | | 5.1ᵃ (5.1…5.2) | 909785ᵃ (908,345…911,226) |
| Random Forest | Clustered | 0.887ᵃ (0.885…0.889) | 82.2ᵃ (82.0…82.4) | 0.736ᵃ (0.732…0.74) | 0.888ᵃ (0.886…0.891) | | | 4537297ᵃ (4,532,554…4,542,040) |
| Random Forest | Raw | 0.915ᵃ (0.913…0.916) | 85.2ᵃ (85.0…85.4) | 0.754ᵃ (0.75…0.758) | 0.927ᵃ (0.925…0.929) | | | 4003216ᵃ (3,999,312…4,007,121) |

ᵃ Statistically significant difference (p < 0.05, Mann-Whitney U test)
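As a reading aid for the true positive rate, true negative rate, and percent correct columns, the sketch below shows how such figures are conventionally derived from a binary confusion matrix, with "positive" taken to mean H. pylori positive. The class name `ConfusionCounts`, the function names, and the counts are hypothetical and chosen only for illustration; they are not data from the study, and the source does not state how its evaluation software computed these values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical counts for a binary classifier (positive = H. pylori positive)."""
    tp: int  # positive cases predicted positive
    fn: int  # positive cases predicted negative
    tn: int  # negative cases predicted negative
    fp: int  # negative cases predicted positive


def true_positive_rate(c: ConfusionCounts) -> float:
    # Share of actual positives that were classified as positive (sensitivity).
    return c.tp / (c.tp + c.fn)


def true_negative_rate(c: ConfusionCounts) -> float:
    # Share of actual negatives that were classified as negative (specificity).
    return c.tn / (c.tn + c.fp)


def percent_correct(c: ConfusionCounts) -> float:
    # Overall accuracy, expressed as a percentage of all cases.
    total = c.tp + c.fn + c.tn + c.fp
    return 100.0 * (c.tp + c.tn) / total


# Made-up counts for illustration only; not taken from the table.
counts = ConfusionCounts(tp=80, fn=20, tn=90, fp=10)
tpr = true_positive_rate(counts)
tnr = true_negative_rate(counts)
pct = percent_correct(counts)
```

Under these definitions a classifier can raise its true negative rate at the cost of its true positive rate while percent correct stays nearly unchanged, which is why the table reports the three measures separately.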