Fig. 2 | BioData Mining

Fig. 2

From: Building a glaucoma interaction network using a text mining approach

The Text Mining Pipeline. This pipeline corresponds to step 3 in Fig. 1. First, the segmenter module splits each article into its constituent sentences, denoted s1 to sn. Second, the sentence tokenizer module tokenizes each sentence into a bag of words, denoted w1 to wn. Third, the part-of-speech (POS) module identifies the role of each word in a sentence. Fourth, the named entity recognition (NER) module extracts gene mentions E1, E2, En from the words of the sentence. Finally, the relation extraction (RE) module extracts relations R1, R2, Rn from the words of the sentence. The output interaction produced by applying this sequence of modules has the form “Es, Rs, Es” and is saved in a database of interactions.
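
The five stages in the caption can be sketched as a chain of small functions. This is only an illustrative toy under stated assumptions, not the authors' implementation: the gene names (GENEA, GENEB, GENEC), the relation verb list, the regex-based segmenter and tokenizer, the lookup-based POS step, and all function names are hypothetical placeholders, and the caption does not say which tools or lexicons the real modules use.

```python
import re

# Hypothetical placeholder lexicons; the real resources are not given in the caption.
GENE_LEXICON = {"GENEA", "GENEB", "GENEC"}
RELATION_VERBS = {"activates", "inhibits", "binds", "regulates"}


def segment(article):
    """Segmenter: split an article into sentences s1..sn (naive punctuation split)."""
    parts = re.split(r"(?<=[.!?])\s+", article)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Tokenizer: split a sentence into words w1..wn."""
    return re.findall(r"[A-Za-z0-9]+", sentence)


def pos_tag(words):
    """Toy POS step: tag known relation verbs as VERB, everything else as OTHER."""
    return [(w, "VERB" if w.lower() in RELATION_VERBS else "OTHER") for w in words]


def recognize_entities(words):
    """Toy NER step: keep the words found in the gene lexicon, in sentence order."""
    return [w for w in words if w in GENE_LEXICON]


def extract_relations(tagged):
    """Toy RE step: keep the words tagged as relation verbs."""
    return [w.lower() for w, tag in tagged if tag == "VERB"]


def run_pipeline(article):
    """Apply the modules in order and collect (entity, relation, entity) triples."""
    interactions = []
    for sentence in segment(article):
        words = tokenize(sentence)
        tagged = pos_tag(words)
        entities = recognize_entities(words)
        relations = extract_relations(tagged)
        # Simplifying assumption: emit a triple only when a sentence contains
        # exactly two gene mentions and exactly one relation word.
        if len(entities) == 2 and len(relations) == 1:
            interactions.append((entities[0], relations[0], entities[1]))
    return interactions


if __name__ == "__main__":
    # Made-up text for illustration only; it makes no biological claim.
    text = (
        "GENEA inhibits GENEB in some cells. "
        "Nothing to extract here. "
        "GENEC regulates GENEA."
    )
    for triple in run_pipeline(text):
        print(triple)
```

Each returned tuple mirrors the “Es, Rs, Es” form described in the caption; in a fuller system the list would be written to the interaction database rather than printed.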
