Table 2 Classification results (H. pylori positive vs negative) in clustered and raw data

From: Taxonomy-based data representation for data mining: an example of the magnitude of risk associated with H. pylori infection

| Classifier | Data set | Area under ROC (95% CI) | Percent correct (95% CI) | True positive rate (95% CI) | True negative rate (95% CI) | Number of rules (95% CI) | Tree size (95% CI) | Serialized model size in bytes (95% CI) |
|---|---|---|---|---|---|---|---|---|
| FURIA | Clustered | 0.871a (0.87…0.873) | 87.3 (87.1…87.5) | 0.799a (0.795…0.802) | 0.930a (0.928…0.932) | 9.5a (9.2…9.7) | | 592230a (588,461…595,998) |
| FURIA | Raw | 0.880a (0.878…0.882) | 86.9 (86.7…87.1) | 0.824a (0.821…0.828) | 0.904a (0.902…0.907) | 11.6a (11.2…11.9) | | 976093a (970,497…981,689) |
| RIPPER | Clustered | 0.872a (0.87…0.874) | 87.1 (86.9…87.3) | 0.804a (0.801…0.807) | 0.923a (0.921…0.925) | 4.8a (4.7…4.8) | | 18746a (18,706…18,785) |
| RIPPER | Raw | 0.881a (0.879…0.883) | 86.9 (86.7…87.2) | 0.833a (0.83…0.837) | 0.897a (0.895…0.9) | 4.2a (4.2…4.3) | | 25909a (25,882…25,936) |
| RIDOR | Clustered | 0.847 (0.845…0.849) | 85.8 (85.6…86.0) | 0.760a (0.756…0.765) | 0.934a (0.931…0.936) | 7.3a (7.2…7.5) | | 7940a (7796…8083) |
| RIDOR | Raw | 0.85 (0.847…0.852) | 85.6 (85.4…85.8) | 0.802a (0.797…0.807) | 0.897a (0.894…0.9) | 6.4a (6.3…6.4) | | 5412a (5328…5495) |
| C4.5 | Clustered | 0.891a (0.889…0.894) | 86.9a (86.7…87.1) | 0.802a (0.798…0.806) | 0.921a (0.918…0.923) | 23.3a (22.8…23.8) | 28.6a (28.0…29.1) | 17914a (17,819…18,010) |
| C4.5 | Raw | 0.868a (0.865…0.87) | 86.1a (85.9…86.3) | 0.826a (0.822…0.829) | 0.889a (0.886…0.891) | 26.3a (25.5…27.1) | 37.3a (36.3…38.4) | 25801a (25,608…25,994) |
| CART | Clustered | 0.867a (0.865…0.869) | 86.1a (85.9…86.3) | 0.786a (0.783…0.79) | 0.919a (0.917…0.921) | | 6.3a (6.2…6.5) | 603873a (603,126…604,620) |
| CART | Raw | 0.889a (0.887…0.891) | 87.8a (87.6…88.0) | 0.845a (0.842…0.848) | 0.904a (0.901…0.906) | | 5.1a (5.1…5.2) | 909785a (908,345…911,226) |
| Random Forest | Clustered | 0.887a (0.885…0.889) | 82.2a (82.0…82.4) | 0.736a (0.732…0.74) | 0.888a (0.886…0.891) | | | 4537297a (4,532,554…4,542,040) |
| Random Forest | Raw | 0.915a (0.913…0.916) | 85.2a (85.0…85.4) | 0.754a (0.75…0.758) | 0.927a (0.925…0.929) | | | 4003216a (3,999,312…4,007,121) |
a Statistically significant difference (p < 0.05, Mann-Whitney U test)
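
The table does not spell out how the rate columns are derived. As a minimal sketch, assuming the standard confusion-matrix definitions and taking H. pylori positive as the positive class, percent correct, true positive rate and true negative rate could be computed as below. The function name `classification_rates` and the counts are hypothetical and are not taken from the study.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (percent_correct, true_positive_rate, true_negative_rate).

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    Standard definitions are assumed; the source table does not state them.
    """
    total = tp + fn + tn + fp
    percent_correct = 100.0 * (tp + tn) / total   # share of all cases classified correctly
    true_positive_rate = tp / (tp + fn)           # sensitivity
    true_negative_rate = tn / (tn + fp)           # specificity
    return percent_correct, true_positive_rate, true_negative_rate


# Hypothetical counts for illustration only, not data from the study.
pc, tpr, tnr = classification_rates(tp=80, fn=20, tn=90, fp=10)
print(f"percent correct={pc:.1f}, TPR={tpr:.3f}, TNR={tnr:.3f}")
```

Under these definitions a classifier can raise its true negative rate while lowering its true positive rate with little change in percent correct, which is one way to read the clustered and raw rows side by side.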