
Table 1 Descriptive values of the data set

From: Taxonomy-based data representation for data mining: an example of the magnitude of risk associated with H. pylori infection

| Attribute | Value | H. pylori histology positive (N = 609) | H. pylori histology negative (N = 473) | All (N = 1082) | p-value |
|---|---|---|---|---|---|
| Age (years) | | 52.10 ± 6.66 | 52.01 ± 6.65 | 52.06 ± 6.66 | 0.887** |
| Sex | Female | 326 (53.5%) | 286 (60.5%) | 612 (56.6%) | 0.022* |
| Education level (graduated) | N/A | 1 (0.2%) | 0 (0.0%) | 1 (0.1%) | 0.130* |
| | Secondary school | 22 (3.6%) | 14 (3.0%) | 36 (3.3%) | |
| | High school | 99 (16.3%) | 76 (16.1%) | 175 (16.2%) | |
| | Vocational school | 297 (48.8%) | 202 (42.7%) | 499 (46.1%) | |
| | Higher education (college/university) | 190 (31.2%) | 181 (38.3%) | 371 (34.3%) | |
| Income level | Don’t know | 25 (4.1%) | 17 (3.6%) | 42 (3.9%) | 0.876* |
| | < 100€ | 30 (4.9%) | 17 (3.6%) | 47 (4.3%) | |
| | 100€-250€ | 171 (28.1%) | 124 (26.2%) | 295 (27.3%) | |
| | 250€-500€ | 274 (45.0%) | 222 (46.9%) | 496 (45.8%) | |
| | 500€-1000€ | 80 (13.1%) | 66 (14.0%) | 146 (13.5%) | |
| | > 1000€ | 5 (0.8%) | 5 (1.1%) | 10 (0.9%) | |
| | Will not answer | 24 (3.9%) | 22 (4.7%) | 46 (4.3%) | |
| Has smoked at least 100 cigarettes | No | 326 (53.5%) | 289 (61.1%) | 615 (56.8%) | 0.043* |
| | Yes | 282 (46.3%) | 183 (38.7%) | 465 (43.0%) | |
| | N/A | 1 (0.2%) | 1 (0.2%) | 2 (0.2%) | |
| Alcohol consumption per month (ethanol, g) | | 131.2 ± 197.4 | 105.5 ± 175.7 | 112.0 ± 188.6 | 0.004** |
| H. pylori serology | Positive | 570 (93.6%) | 137 (29.0%) | 707 (65.3%) | < 0.001* |
| Total | | 609 | 473 | 1082 | |
  1. *Chi-square test
  2. **Mann-Whitney U-test (the attribute is not normally distributed)
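
As a rough illustration of how a chi-square p-value in the table could be checked from the published counts, the sketch below recomputes the test for the Sex row. It rests on two assumptions that the table does not state: the male counts are taken as each group's N minus the reported female count (no missing values), and the test is treated as an uncorrected Pearson chi-square on a 2x2 table (no continuity correction). The function names are hypothetical helpers, not part of any library.

```python
import math

# Counts from the "Sex" row of Table 1.
females = {"positive": 326, "negative": 286}
group_n = {"positive": 609, "negative": 473}

# ASSUMPTION: male counts are group size minus the reported female count.
males = {g: group_n[g] - females[g] for g in group_n}


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square statistic for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


chi2 = pearson_chi2_2x2(
    females["positive"], males["positive"],
    females["negative"], males["negative"],
)
p_value = chi2_sf_df1(chi2)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

Under these assumptions the p-value comes out at about 0.022, which is consistent with the value reported for Sex. Whether the original analysis used a continuity correction is not stated, so the agreement should be read as a plausibility check rather than a reproduction of the authors' procedure.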