Table 1 Statistics of candidate sentences. We sorted each abstract into a training, tuning, or testing set. Numbers in parentheses give the counts of positive (+) and negative (-) sentences that resulted from the hand-labeling process.

From: Expanding a database-derived biomedical knowledge graph via multi-relation extraction from biomedical abstracts

| Relationship | Train | Tune | Test |
| --- | --- | --- | --- |
| Disease-associates-Gene (DaG) | 2.49 M | 696 K (397+, 603-) | 348 K (351+, 649-) |
| Compound-binds-Gene (CbG) | 2.4 M | 684 K (37+, 463-) | 341 K (31+, 469-) |
| Compound-treats-Disease (CtD) | 1.5 M | 441 K (96+, 404-) | 223 K (112+, 388-) |
| Gene-interacts-Gene (GiG) | 11.2 M | 2.19 M (60+, 440-) | 1.62 M (76+, 424-) |
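
The hand-labeled counts in parentheses cover only a small subset of each tuning and testing set, and the balance of positives to negatives differs considerably between relationships. As a minimal sketch (not part of the original article), the following Python snippet computes the labeled subset size and the positive fraction for each entry. The counts are copied from Table 1; the names `LABELED`, `positive_fraction`, and `summarize` are hypothetical and chosen only for illustration.

```python
# Hand-labeled (positive, negative) counts copied from Table 1.
# The dictionary layout and all names here are illustrative assumptions.
LABELED = {
    "DaG": {"tune": (397, 603), "test": (351, 649)},
    "CbG": {"tune": (37, 463), "test": (31, 469)},
    "CtD": {"tune": (96, 404), "test": (112, 388)},
    "GiG": {"tune": (60, 440), "test": (76, 424)},
}


def positive_fraction(pos: int, neg: int) -> float:
    """Return the share of hand-labeled sentences marked positive."""
    return pos / (pos + neg)


def summarize(labeled=LABELED):
    """Return (relationship, split, labeled total, positive fraction) rows."""
    rows = []
    for relationship, splits in labeled.items():
        for split, (pos, neg) in splits.items():
            rows.append(
                (relationship, split, pos + neg, positive_fraction(pos, neg))
            )
    return rows


if __name__ == "__main__":
    for relationship, split, total, fraction in summarize():
        print(f"{relationship} {split}: {total} labeled, {fraction:.1%} positive")
```

Under these counts, DaG has 1000 labeled sentences per split while the other three relationships have 500, and CbG has the smallest share of positives.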