Fig. 1 | BioData Mining

Fig. 1

From: Changing word meanings in biomedical literature reveal pandemics and new technologies

A In the first step of our data pipeline, PMCOA papers and BioRxiv/MedRxiv preprints are binned by posting year. After binning, we train ten word2vec models on each year’s manuscripts. B After training, we align every word2vec model onto an anchor model. C We capture token differences using an intra-year and an inter-year approach. Each arrow indicates a comparison of every token in one model with the same token in a different model. D The last step combines the above calculations into a single metric so that a time series can be constructed. Once the time series is constructed, we use a statistical technique to automatically detect the presence of a changepoint.
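
Panels C and D can be illustrated with a minimal sketch for a single token. It is not the authors' implementation and rests on several assumptions: the vectors are taken to be already aligned (panel B), three toy 2-D vectors per year stand in for the ten word2vec models, the combined metric is assumed to be the inter-year distance minus the average intra-year distance, and picking the largest score is only a naive stand-in for the statistical changepoint technique. All function names and data are hypothetical.

```python
import math
from itertools import combinations, product
from statistics import mean


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def intra_year_distance(models):
    """Mean distance between a token's vectors across models of one year."""
    return mean(cosine_distance(u, v) for u, v in combinations(models, 2))


def inter_year_distance(models_a, models_b):
    """Mean distance between a token's vectors across models of two years."""
    return mean(cosine_distance(u, v) for u, v in product(models_a, models_b))


def change_score(models_a, models_b):
    """Assumed combined metric: inter-year shift minus intra-year noise."""
    noise = (intra_year_distance(models_a) + intra_year_distance(models_b)) / 2
    return inter_year_distance(models_a, models_b) - noise


# Hypothetical, already-aligned vectors for one token: three models per year.
vectors = {
    2018: [[1.00, 0.00], [0.99, 0.05], [0.98, -0.04]],
    2019: [[1.00, 0.02], [0.97, 0.06], [0.99, -0.03]],
    2020: [[0.05, 1.00], [0.00, 0.98], [-0.04, 0.99]],
    2021: [[0.03, 1.00], [0.01, 0.97], [-0.02, 1.00]],
}

# Time series of scores over consecutive year pairs.
years = sorted(vectors)
scores = {
    (a, b): change_score(vectors[a], vectors[b])
    for a, b in zip(years, years[1:])
}

# Naive stand-in for changepoint detection: the transition with the top score.
peak = max(scores, key=scores.get)

for pair, score in scores.items():
    print(pair, round(score, 3))
print("largest shift between", peak)
```

Subtracting the intra-year term is one way to keep run-to-run variation between models trained on the same year from being mistaken for a real change in meaning. A ratio, or a proper changepoint model over the resulting time series, would be equally plausible choices.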
