
Novel digital approaches to the assessment of problematic opioid use

Abstract

The opioid epidemic continues to contribute to loss of life through overdose and significant social and economic burdens. Many individuals who develop problematic opioid use (POU) do so after being exposed to prescribed opioid analgesics. Therefore, it is important to accurately identify and classify risk factors for POU. In this review, we discuss the etiology of POU and highlight novel approaches to identifying its risk factors. These approaches include the application of polygenic risk scores (PRS) and diverse machine learning (ML) algorithms used in tandem with data from electronic health records (EHR), clinical notes, patient demographics, and digital footprints. The implementation and synergy of these types of data and approaches can greatly assist in reducing the incidence of POU and opioid-related mortality by increasing the knowledge base of patient-related risk factors, which can help to improve prescribing practices for opioid analgesics.


Background

The prevalence and consequences of problematic opioid use (POU) continue to be serious societal issues. Of the 71,000 overdose deaths that occurred in the United States in 2019, over 70% involved opioids [1, 2]. Although the majority of fatal opioid overdoses involve illicitly manufactured and/or obtained opioids [2], like fentanyl and heroin, most individuals who misuse opioids are initially prescribed opioids for pain management [3,4,5]. Therefore, developing effective approaches to identify risk factors associated with the initiation of POU in healthcare settings can contribute to safer opioid prescribing practices and fewer deleterious consequences of prolonged and elevated opioid use.

Major challenges associated with identifying the risk factors of POU stem, in part, from the complexity of the concept itself. POU is described by many terms, each existing on a continuum of severity with opioid misuse considered the least severe and opioid dependence commonly considered the most severe [6]. However, this varies in the literature [7]. In addition to these terms, there also exists a Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) clinical diagnosis for opioid use disorder (OUD), which incorporates terms from the POU continuum and provides a severity scale based on the number of criteria met [8]. Although the OUD diagnosis subsumes the POU continuum, terms like misuse, abuse, addiction, and dependence are still commonly used, both in the literature and in patient medical records, and can describe POU with or without an accompanying OUD diagnosis [9,10,11]. Because of the terminologic ambiguity and potential overlap among terms, the reported prevalence of POU-related traits varies widely [7, 9, 10] and is commonly not reported or underreported [12,13,14,15]. However, despite difficulties arising from the challenges of describing and quantifying POU as a determinant of health, many individuals exposed to opioid analgesics develop some level of problematic use.

A second difficulty in identifying risk factors for POU results from the sources of risk themselves. Variation in the POU phenotype comes from three sources: genetic variation, environmental variation, and the interaction between the two [16], the first two of which have been shown to be significant factors underlying POU prevalence and severity [17,18,19,20,21,22,23,24,25,26,27]. However, although humans live complex lives where important interactions and events occur regularly, potential and perhaps significant sources of risk are commonly overlooked or not recognized [28,29,30]. Expansion of the genetic and environmental dimensions of risk yields many sources to be considered including omics data, electronic medical records, demographics and personal histories, and digital footprints, which include, but are not limited to, social media activity (Fig. 1). Thus, capturing the greater phenotypic and environmental profile of an individual has the potential to greatly improve risk assessments for POU.

Fig. 1

Conceptual image illustrating potential sources of POU risk using puzzle pieces. The right puzzle pieces (Medical History Data and Lifetime & Current Psychiatric Disease Comorbidity) represent phenotypes that are usually mined from electronic health records (EHR), clinical notes, or structured assessments/questionnaires. The left puzzle pieces (Digital Footprint and Environmental & Societal Data) represent mostly environmental data that can be obtained from EHR, clinical notes, structured assessments/questionnaires, social media, and biometrics. The middle puzzle piece (Omics Data) represents large-scale genetic, epigenetic, proteomic, and metabolomic data. This piece links the two sides of the image as there are likely synergistic and/or causal relationships with the environment and the greater phenotype

The complexity of POU as a measurable trait presents many challenges from a data science perspective due to ambiguous POU-related terms and their complex, but poorly explored, root causes [6, 7, 28, 30, 31]. These issues make powerful digital approaches difficult to implement despite many recent advances in the fields of artificial intelligence, bioinformatics, and computational biomedicine. In this review, we discuss the importance of considering all available sources of data when assessing disease risk, ways in which POU can be explored as a trait of interest in biomedical research, and novel digital approaches and technologies that can be utilized to explore complex and diverse datasets. Our goal is to illustrate how diversifying and expanding both data acquisition and methodology can improve POU risk assessment and prediction, potentially alleviating adverse impacts of POU on patients, families, and society.

Review methods

For compiling the significant predictors of POU in the literature, we used the search terms “risk factors of opioid use disorder” and “predictors of opioid use disorder” in Google Scholar and identified scientific research articles and clinical studies over the past 10 years within the first 100 search results of both search terms. We used this information to create Fig. 2 (orange bars). To include the gene/locus information presented in Fig. 2 (blue bars), we integrated data from a literature search protocol implemented in a previous review [31].

Fig. 2

Bar plot (orange) of the count of psychiatric disorders and substance use disorders that were significant indicators of problematic opioid use (POU) phenotypes in our literature search (see Review Methods for criteria) and bar plot (blue) of the shared gene/locus count between psychiatric disorders and substance use disorders with POU (see reference [31] for methodology). Shared gene/locus associations reflect the relative representation of each disorder as significant predictors of POU. Depression, nicotine and alcohol use disorders, and anxiety disorders show high shared genetic liabilities with POU and are the most significant indicators of POU. However, schizophrenia displays high shared genetic liability with POU despite lower POU prediction

Risk factors of POU

Risk factors of POU are likely vast and include sources of data that may not be obvious or readily available. However, risk factors can stem from five basic sources (Fig. 1): lifetime and current psychiatric disease comorbidities that include both mental disorders and substance use disorders (including previous opioid use) [32,33,34], medical histories from EHR data and clinical notes [35, 36], environmental and societal factors that include demographics and personal histories [28], digital footprints including social media and biometrics [37,38,39], and omics data, which comprise genetic, epigenetic, transcriptomic, and other large-scale biological data [19,20,21,22,23,24,25,26,27]. In Fig. 1, the right side of the puzzle image represents medical/biological phenotypes while the left side represents mostly environmental features. In the center, omics data link the two sides of the conceptual image, illustrating that a patient’s genotype likely has synergistic and/or causal relationships with the environment and the greater phenotype.

Of the above sources of information, comorbid lifetime and current psychiatric disease diagnoses are perhaps the most significant indicators of POU and its development. For example, tobacco use disorder (TUD) has a comorbidity rate as high as 98% in populations of patients in medication-assisted treatment programs for POU [40,41,42,43]. TUD is also a common pre-morbid risk factor of POU, associated with the initiation and persistence of opioid use and OUD development [44]. Similar relationships with POU have also been described for cocaine use disorder [45, 46], alcohol use disorder [47], and cannabis use [48]. Mood and anxiety disorders are also commonly associated with POU. Depression alone has been linked to the risk of opioid relapse [34], opioid misuse [49], a diagnosis of OUD [50], and risk of death from OUD-related overdose [51], while anxiety disorders have been linked to opioid relapse [34], non-medical use [52], and misuse [53]. Figure 2 (orange bars) illustrates how often a psychiatric disease was found to be a significant indicator of a POU phenotype within our literature search criteria (see Review Methods) [10, 17, 44, 47, 49, 50, 52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81]. Depression, nicotine use/smoking status, alcohol use, and anxiety disorders were the most common predictors of POU (Fig. 2; orange bars). In addition to these comorbidities as risk factors of POU, the strongest indicators of future POU development are past instances of opioid use or POU [10, 17, 44, 47, 49, 50, 52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81]. Due to the nature of the relationship between psychiatric disease and POU, histories of substance use/abuse and mental illness, although sometimes challenging to obtain, are important sources of data for POU risk assessment.

Other major contributions to POU risk are an individual’s genetic and epigenetic profiles [22, 24, 27, 82, 83]. Several genetic variants have been associated with opioid use and dependence via genome-wide association studies (GWAS) and candidate gene studies. Foremost of these are variants located in the gene encoding the μ-opioid receptor 1 (OPRM1), which have putative roles in the genetics and pharmacogenetics of POU [22, 24, 27, 82]. Structural and epigenetic variation has also been associated with POU. For example, copy number variation in the genes KCND2 and MAP3K4 has been associated with opioid dependence [83], and opioid exposure has been shown to induce marked changes in histone acetylation, histone methylation, DNA methylation, and non-coding RNA expression, which collectively have the capacity to affect the expression of many gene targets [84]. Many of the genetic variants associated with various POU phenotypes have also been associated with other psychiatric disorders. Figure 2 (blue bars) illustrates how many genes or loci, per disorder, have also been associated with a POU phenotype [31]. Somewhat reflecting the relative representation of each disorder as important predictors of POU in the literature (Fig. 2; orange bars), genes associated with nicotine use, schizophrenia, depression, and alcohol use have the strongest shared genetic liabilities with POU (Fig. 2; blue bars) [31]. Although considerable research has been conducted to identify genetic factors contributing to POU and those shared with other disorders, new variants are being discovered using modern and robust approaches, highlighting the importance of gathering high quality genomic data when assessing POU risk [85].

A patient’s medical history, usually accessible via EHR structured data and clinical notes, can provide extensive information useful for POU risk assessment. However, the structure and format of EHR data and clinical notes are not uniform and can be limited and/or difficult to navigate depending on the healthcare institution [86, 87]. Despite this, efforts should be made to collect as much data as possible on each patient when considering risk of POU, as many of the non-biological predictors associated with it include level of education, marital status, income, geographic location, and insurance status [17, 49, 63, 88]. There are also, of course, biological predictors of POU risk that include but are not limited to BMI, race, age, sex, medication history, procedure/operation history, comorbid disorders, and endophenotypes [17, 49,50,51, 88, 89]. Endophenotypes are defined as physiological traits related or contributing to a disease trait [89]. For example, an endophenotype for hypertension is blood pressure. Clinical notes can also be a robust source of risk indicators for POU as they can capture critical pieces of information not available in structured EHR data. During a clinical encounter, a patient may discuss topics with a healthcare professional that may be associated with risk of POU development but not be captured in specific EHR data entry fields [36]. If this information is recorded electronically, natural language processing (NLP) approaches can identify features from typed language that can assist in the development of risk assessment protocols [64, 90]. Other potential sources of risk can be derived from pain and mental health assessments during hospital stays, as ratings of both have been associated with POU [66, 91, 92].
An example of a system designed to capture these types of information is the Patient-Reported Outcomes Measurement Information System® (PROMIS®), which produces scores for several metrics including anger, anxiety, depression, pain behavior, and pain interference. If available, these types of data should be incorporated in POU risk assessments due to the strength of the associations between mental health and perceived pain with POU.

Perhaps the most elusive source of data for POU risk assessment comes from an individual’s environment, both physical and digital. As highlighted above, many aspects of an individual’s personal life have been shown to be significant indicators of POU [17, 18]. Much of this information is not readily available in EHR data but may be accessible via clinical notes, structured assessments, or questionnaires or derived from social media data. For example, analysis of text from opioid-related groups on Reddit.com identified significant risk factors for OUD, opioid relapse, and recovery seeking behavior [39]. Most strikingly, 72% of individuals who had relapsed exhibited strong emotional language in 2 of 10 possible emotional categories: “Joy” and “Negative”. This implies that relapse is related to extreme emotions and that treatments aimed at supporting the regulation of emotion could reduce the risk of relapse or increased opioid use [39]. Another example comes from utilizing over 9000 opioid-related posts and demographic information from Twitter [37]. Using NLP and ML classification, a significant correlation was found between posts classified as related to opioid misuse and real-time overdose death rates in Pennsylvania [37]. Although these approaches are powerful, the data in these examples are either anonymous or not linked to a person’s medical history and are therefore difficult to use in POU risk assessment in clinical settings. However, text and demographic data from Facebook, linked to an individual’s EHR data, were used in various ML algorithms to predict medical conditions that include diabetes, hypertension, depression, and digestive issues [38]. Further, the inclusion of data from Facebook significantly improved the prediction accuracy of models in 18 of the 21 disease categories explored [38]. Notably, among mental health conditions, predictions of anxiety, depression, and psychosis showed the most improvement [38].
Data from social media may also provide additional insight into risk factors, which EHR data cannot, insofar as many aspects of patients’ personal histories and daily experiences are not available in the medical record. These studies illustrate how encoded social media language and demographic data, much like genetic variants, can be used as disease factors in ML approaches. Also, the predictive implications for psychiatric traits suggest that social media data could be powerful in assessing factors contributing to the risk of many substance use disorders including POU [38]. Finally, biometric data from sources like smart watches and phone apps can provide insight into a patient’s exercise regime or sleep schedule, among other things. Because greater physical activity has been associated with lower opioid use [17] and greater opioid use has been associated with interrupted sleep [93], biometric data can be useful in POU risk assessment. Linking these types of data to a patient’s medical history can significantly improve POU risk assessment.

Quantifying and categorizing POU

Because POU is a broad phenotype, categorizing it or quantifying its levels as a phenotype or response variable presents challenges. Terms like use, misuse, dependence, addiction, and abuse do not have universally accepted criteria and therefore commonly do not translate into useful comparisons across experiments [7]. However, because OUD is a clinical diagnosis, both its presence or absence and its severity can be useful phenotypes for risk assessment. Despite this, there are limitations in relying on the diagnosis of OUD for risk assessment. The first of these is the underdiagnosis of the disorder, which results in a failure to detect risk factors and/or underperformance of predictive models [12,13,14,15]. The second limitation is that there is variation in the interpretation of criteria and/or severity of OUD diagnoses across clinicians or healthcare settings [94]. A possible solution to this limitation is the quantification of opioid exposure or usage. Although there are many opioid-based analgesics prescribed in clinical settings, the dosage of various opioids can be standardized as morphine equivalent dosage (MED), also called morphine milligram equivalent daily dosage (MEDD) or morphine milligram equivalents (MME) [95]. Other conversion metrics that exist internationally include the defined daily dose (DDD) [96] and oral morphine equivalent (OMEQ) [97]. A potential downside of this approach is that MME does not account for biological, genetic, or pharmacokinetic differences among individuals. This highlights the importance of collecting data from diverse sources when performing risk assessments and of using MME in combination with a clinical diagnosis of OUD or other POU-related terms, where available.

MME is also advantageous over an OUD diagnosis or POU-related terms, particularly those derived from the EHR, in that it is a continuous variable. Continuous variables are statistically more powerful in regression-based approaches in ML pipelines and can be converted to discrete variables to take advantage of classification-based approaches. Discrete variables, like OUD presence/absence or severity as measured by criteria met, are limited only to classification-based approaches as they cannot be inherently converted to continuous variables. MME can be converted into discrete levels based on pre-determined dosage ranges or by increments of change before and after a medical procedure like surgery.
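As a minimal sketch of the conversion and discretization described above, the following Python example computes a daily MME from a list of prescriptions and then bins it into levels and into a before/after change category. The conversion factors are illustrative values that should be checked against a current published conversion table, and the cutoffs, the tolerance, and all names (`Prescription`, `daily_mme`, `mme_level`, `mme_change`) are hypothetical choices made for this example rather than values taken from the studies cited here.

```python
from dataclasses import dataclass

# Illustrative oral MME conversion factors (assumed values; confirm against a
# current published conversion table before any real use).
MME_FACTORS = {"morphine": 1.0, "hydrocodone": 1.0, "oxycodone": 1.5, "codeine": 0.15}


@dataclass
class Prescription:
    drug: str             # key into MME_FACTORS
    dose_mg: float        # milligrams per dose
    doses_per_day: float  # doses taken per day


def daily_mme(prescriptions):
    """Sum dose x frequency x conversion factor across all opioid prescriptions."""
    return sum(p.dose_mg * p.doses_per_day * MME_FACTORS[p.drug] for p in prescriptions)


def mme_level(mme, cutoffs=(50.0, 90.0)):
    """Bin a continuous daily MME into three levels (cutoffs are hypothetical)."""
    low, high = cutoffs
    if mme < low:
        return "low"
    if mme < high:
        return "moderate"
    return "high"


def mme_change(pre_mme, post_mme, tolerance=5.0):
    """Label the change in daily MME across a procedure (tolerance is hypothetical)."""
    delta = post_mme - pre_mme
    if delta > tolerance:
        return "increase"
    if delta < -tolerance:
        return "decrease"
    return "stable"


# Hypothetical patient: regimen before and after surgery.
before = [Prescription("oxycodone", 10, 4), Prescription("codeine", 30, 2)]
after = [Prescription("oxycodone", 5, 2)]
pre, post = daily_mme(before), daily_mme(after)
print(pre, mme_level(pre), post, mme_level(post), mme_change(pre, post))
```

Keeping the continuous value and deriving the categories from it, rather than storing only the category, leaves both regression-based and classification-based approaches available later.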

Predicting MME after surgery can be a powerful indicator of surgical success as the primary goal of elective surgery is to decrease patient pain, and therefore MME. However, many patients experience greater pain after surgery [98], potentially leading to higher levels of MME. Identifying the contributors to this type of outcome can assist clinicians in determining whether surgery is in the best interest of the patient, as high levels of MME are associated with death by opioid overdose and opioid-related toxicity [81]. With increasing awareness of this risk, efforts are being made to reduce the use of opioid analgesics by using, for example, non-steroidal anti-inflammatory drugs or other non-opioid analgesics when surgery is indicated [99, 100]. In addition to MME, it is also important to collect and include pain ratings provided by the patient, both before and after surgery [92]. These data can be merged with MME to predict the pain response following surgery, as MME and pain are positively correlated [98].

Advances and limitations of digital approaches for POU risk assessment

Polygenic risk scores

Polygenic risk scores (PRSs) are a useful method of estimating an individual’s genetic risk for a specific trait and a promising approach for disease risk assessment. The literature on PRSs for POU, although limited, is growing. In a recent study of a large, mixed ancestry cohort, PRSs were calculated for four substance use traits (alcohol use disorder, OUD, smoking initiation, and lifetime cannabis use) [101]. The PRS for alcohol use disorder among African Americans, and the PRSs for alcohol use disorder, OUD, and smoking initiation among European Americans, were associated with their respective Diagnostic and Statistical Manual of Mental Disorders (DSM) diagnoses and criterion counts, highlighting the predictive power of PRS. Phenome-wide association studies (PheWAS) of these PRSs showed the most associations with other substance use phenotypes. For example, the PRS for OUD was associated with 7 substance-use phenotype categories, the strongest of which was the DSM-5 diagnosis of tobacco dependence. A large meta-GWAS of European Americans identified loci for problematic alcohol use with significant genetic correlations between problematic alcohol use and 138 phenotypes [102]. The highest genetic correlations were with other alcohol phenotypes, tobacco phenotypes, and psychiatric disorders including depression, schizophrenia, and bipolar disorder. These results highlight the utility of PRS for identifying substance use disorder risk and shared genetic liabilities among various psychiatric comorbidities.
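In its simplest and most common form, a PRS is a weighted sum of an individual’s effect-allele counts, with weights taken from GWAS summary statistics. The Python sketch below illustrates that calculation under stated assumptions: the variant identifiers, effect sizes, and genotypes are invented for illustration, missing genotypes are treated as zero copies, and no clumping, thresholding, or ancestry adjustment is performed, so it should not be read as the method used in the cited studies.

```python
from statistics import mean, pstdev

# Hypothetical per-allele effect sizes (e.g., log odds ratios from GWAS summary statistics).
WEIGHTS = {"variant_A": 0.10, "variant_B": -0.05, "variant_C": 0.20}


def polygenic_risk_score(dosages, weights=WEIGHTS):
    """Weighted sum of effect-allele dosages (0, 1, or 2 copies per variant)."""
    # Assumption: a variant with no genotype for this individual contributes 0.
    return sum(beta * dosages.get(variant, 0) for variant, beta in weights.items())


def standardize(scores):
    """Convert raw scores to z-scores within the cohort they were computed in."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(s - mu) / sigma for s in scores]


# Hypothetical genotypes for three individuals.
cohort = {
    "person_1": {"variant_A": 2, "variant_B": 1, "variant_C": 0},
    "person_2": {"variant_A": 0, "variant_B": 2, "variant_C": 1},
    "person_3": {"variant_A": 1, "variant_B": 0, "variant_C": 2},
}

raw = {pid: polygenic_risk_score(genotype) for pid, genotype in cohort.items()}
z = dict(zip(raw, standardize(list(raw.values()))))
print(raw)
print(z)
```

Because the z-scores are relative to the cohort in which they are computed, a standardized score from one population is not directly comparable to one from another.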

Despite the potential utility of PRS for identifying and quantifying disease risk, it has inherent limitations. Many studies are of samples with a specific ancestral background, which limits the applicability of the PRS. Nevertheless, limiting cohorts to a specific ancestral background is common practice because pooling multiple ancestral groups, except by meta-analysis, overlooks their different genetic architectures and allele frequencies [103]. PRSs have been shown to provide only limited utility across population groups [104]. Further, PRSs can differ significantly even within population groups when the data are stratified by characteristics such as socioeconomic status, age, and sex [105].

Because large samples are difficult to recruit and assess for GWAS, it is becoming increasingly common to use EHR data linked to genomic data for PRS prediction, given the wide array of phenotypes available, the speed at which studies can be performed, and the potentially high levels of reproducibility. However, these benefits come with their own challenges. A recent review highlights some of the challenges of linking medical records to genomic biobank data and considerations on how to limit or remove them [89]. Potential difficulties include defining disease phenotypes consistently (which can be difficult for POU given the absence of a universally agreed-upon phenotype), complexities and redundancies in the International Classification of Diseases (ICD) code systems, the limited applicability of GWAS summary statistics from datasets that represent only one ancestral group, and the small effect sizes associated with common variants [89]. However, these limitations and challenges can be mitigated by utilizing the right tools. For instance, much of the ambiguity introduced by ICD codes or complex disease phenotypes, like POU, can be alleviated by using phenotyping algorithms specifically designed to deal with diverse types of data, like EHR data [89, 106,107,108]. The Phenotype KnowledgeBase is a public repository that houses numerous algorithms for this distinct purpose and can help to identify difficult phenotypes [109]. Furthermore, by utilizing ML-based approaches, EHRs can drastically improve PRS study design, reproducibility, and prediction as the use of EHR-mined phenotypes can reduce the time required to build a cohort while ensuring that the population in which the PRS is being estimated is representative of the healthcare system population, increasing overall diversity [89].

Another limitation of PRSs is that they do not have the statistical power to detect epistatic (i.e., gene-gene) interactions when assessing polygenic risk. A recent approach, the Multilocus Risk Score (MRS), uses model-based multifactor dimensionality reduction to detect epistasis between loci. A study that tested the efficacy of this approach compared standard PRS methods with the MRS method in a diverse collection of simulated datasets [110]. In 335 of 450 datasets, MRS produced a greater area under the receiver operating characteristic curve (auROC) than PRS, even when no epistatic interactions were detected. Using a Wilcoxon signed rank test, the improvement of MRS over PRS was significant (P < 10⁻⁵) [110]. Thoughtful considerations and improvements, as highlighted in [110], can be used to improve the efficacy of PRS so that genetic relationships for POU and other disorders can more robustly be described, detected, and generalized.
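The auROC used in comparisons such as the one above can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, with ties counted as one half. The Python sketch below computes it directly from that definition and counts how often one method outperforms another across datasets. The labels, scores, and per-dataset values are invented for illustration, and the function names `auroc` and `count_wins` are hypothetical; the pairwise loop is quadratic in sample size and is meant only to make the definition concrete.

```python
def auroc(labels, scores):
    """Probability that a random case (label 1) outscores a random control (label 0)."""
    cases = [s for label, s in zip(labels, scores) if label == 1]
    controls = [s for label, s in zip(labels, scores) if label == 0]
    if not cases or not controls:
        raise ValueError("auROC needs at least one case and one control")
    # Each case/control pair scores 1 for a correct ordering, 0.5 for a tie, 0 otherwise.
    wins = sum(
        1.0 if case > control else 0.5 if case == control else 0.0
        for case in cases
        for control in controls
    )
    return wins / (len(cases) * len(controls))


def count_wins(auroc_a, auroc_b):
    """Number of paired datasets in which method A has the higher auROC."""
    return sum(a > b for a, b in zip(auroc_a, auroc_b))


# Hypothetical risk scores for two controls and two cases.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(auroc(labels, scores))

# Hypothetical per-dataset auROC values for two scoring methods.
method_a = [0.71, 0.64, 0.80, 0.58]
method_b = [0.69, 0.66, 0.75, 0.55]
print(count_wins(method_a, method_b), "of", len(method_a))
```

A win count alone does not establish significance; a paired test such as the Wilcoxon signed rank test mentioned above is needed for that.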

Machine learning and artificial intelligence

Advances in ML are appearing daily and several of these have the potential to be useful in OUD research. There has been substantial attention given to neural networks as an ML method. In particular, deep learning (DL) has been developed to extend the architecture of neural networks to include many layers of nodes, thus greatly improving their ability to perform tasks such as image recognition [111]. It will be important to explore how best to adapt these algorithms to the study of OUD. One promising approach is the application of knowledge of biology and biochemical pathways to guide the architecture of a DL neural network [112]. Adapting this approach to the conduct of research on OUD is promising because researchers can build on the existing knowledgebase to help reduce the computational complexity of algorithms by reducing the feature spaces in informative ways. Another promising area to explore is automated ML (autoML). One of the challenges of ML is knowing which methods to select. Each method looks at the data in a different way and it is difficult to know a priori which method is best for detecting unknown patterns in a specific data set. The goal of autoML is to let the computer explore the space of possible algorithms and parameter settings to automatically select the best method [113]. An example of an autoML package that can be used for big biomedical datasets is TPOT, which uses genetic programming (GP) to optimize potential ML pipelines [114,115,116]. The goal of the GP applied in TPOT is to assign fitness scores to each ML pipeline and, through generations of reproduction and mutation, arrive at an optimized solution in terms of model accuracy. Approaches like this can take some of the guesswork out of ML, as the technology becomes more accessible to individuals with less experience or skill in applying the methods.
Finally, interpretation is key to translating ML results into improvements in our understanding of a phenomenon or in leading to new biological or clinical studies. Making sense of ML results is, in some cases, more challenging than developing the models themselves. This is where the human element comes in. ML and artificial intelligence (AI) are tools that need human interpretation and experience to turn data into knowledge. Interpretability, transparency, and trust are new frontiers in ML research.
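The evolutionary search described above for autoML (fitness scoring followed by generations of reproduction and mutation) can be illustrated with a deliberately small Python sketch. This is not TPOT’s implementation or API: here a “pipeline” is only a feature index plus a decision threshold, the dataset is synthetic, and every name and parameter (`random_pipeline`, `fitness`, `crossover`, `mutate`, `evolve`, the mutation rates) is a hypothetical choice made for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic dataset: two features in [0, 1); the label depends only on feature 1.
X = [(rng.random(), rng.random()) for _ in range(200)]
y = [1 if row[1] > 0.6 else 0 for row in X]


def random_pipeline():
    """A toy 'pipeline': which feature to use and where to threshold it."""
    return {"feature": rng.randrange(2), "threshold": rng.random()}


def fitness(pipeline):
    """Fitness score = accuracy of the rule 'feature value > threshold'."""
    hits = sum(
        (row[pipeline["feature"]] > pipeline["threshold"]) == bool(label)
        for row, label in zip(X, y)
    )
    return hits / len(y)


def crossover(parent_a, parent_b):
    """Reproduction: take the feature from one parent and the threshold from the other."""
    return {"feature": parent_a["feature"], "threshold": parent_b["threshold"]}


def mutate(pipeline):
    """Mutation: occasionally switch feature, and jitter the threshold within [0, 1]."""
    child = dict(pipeline)
    if rng.random() < 0.2:
        child["feature"] = rng.randrange(2)
    child["threshold"] = min(1.0, max(0.0, child["threshold"] + rng.gauss(0, 0.1)))
    return child


def evolve(generations=20, population_size=20):
    population = [random_pipeline() for _ in range(population_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: population_size // 2]  # the fittest half is kept unchanged
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors)))
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness), history


best, history = evolve()
print(best, fitness(best))
```

Because the fittest half of each generation is carried forward unchanged, the best fitness can never decrease from one generation to the next; real autoML systems search a far larger space of preprocessing steps, models, and hyperparameters and score candidates with cross-validation rather than training accuracy.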

In their application to POU, ML algorithms and approaches aimed at extracting phenotypes from EHR data are extremely useful because of the number of terms, diagnoses, and metrics that can translate to some level of problematic use. Several recent studies have incorporated and evaluated a diverse set of ML methods to derive phenotypes from EHR data for disorders including atopic dermatitis [117], rheumatoid arthritis [118, 119], and type 2 diabetes mellitus [120]. In the case of type 2 diabetes, several ML methods were evaluated including k-nearest neighbor, decision tree, random forest, support vector machine, and naïve Bayes. All of these approaches yielded higher auROC (average across methods was 0.98) than the state-of-the-art linear regression algorithm (auROC = 0.71) [120]. ML algorithms, in addition to detecting occurrences of a particular phenotype, can also enrich current phenotypes by expanding them into levels of severity or subtypes. As examples, two recent papers used latent class analysis to identify sub-phenotypes of acute respiratory distress syndrome [121] and pediatric sepsis [122]. Elucidating phenotype stratification is important as different sub-phenotypes often require different treatment strategies and responses. POU could particularly benefit from EHR mining as the phenotype is diverse and complicated and improvements in its detection will improve both treatment and risk assessment strategies as the knowledge base expands.

Although NLP is a branch of AI, its usefulness and robustness in the identification and risk assessment of POU warrants a focused discussion. ICD codes, which are used by physicians to diagnose and categorize patients, can help to identify both POU and OUD. However, standardized systems like ICD codes or EHR fields often underestimate the total number of patients who exhibit one or more of these diagnoses [123]. When used in clinical settings, NLP creates a dictionary of terms and phrases from text sources (structured or unstructured) using automated algorithms to identify individuals who have, or may be at risk of having, a diagnosis of interest. Thus, NLP may identify patterns from clinical notes associated with a certain diagnosis that standardized classifications (e.g., ICD codes) cannot. Indeed, NLP-assisted manual review of EHR data has been shown to greatly assist the classification of POU by identifying additional instances of POU that ICD code identification alone misses [90, 124]. However, in these examples, NLP methods alone did not identify all patients with POU ICD codes. This lack of overlap highlights the importance of using both detection methods in tandem to enhance POU identification. NLP also has the potential to identify risk factors of POU. For example, NLP methods accurately predicted opioid agreement violations in chronic non-cancer pain patients (sensitivity of 96.1%, specificity of 92.8%, and positive predictive value of 92.6%) [125]. Because of the high probability of developing OUD when prescribed opioids, clinicians and patients can enter into an opioid or pain management agreement in which the patient agrees to undergo random drug screenings and/or pill counts. Identifying patients who have violated or are at risk of violating these agreements is important to responsible opioid dispensing. Finally, improvements in text-based classifiers can have significant positive effects on NLP performance.
A recent study highlights one such improvement. The researchers performed manual reviews of hospital discharge summaries and identified several text classes describing potential POU [36]. Annotated sentences were used to generate features using the open-source knowledge bases Empath [126], the Unified Medical Language System [127], and PyConText [128]. Several ML classifiers were used to predict sentence classification. Of these classifiers, AutoGluon had the best performance among classes in testing sets (average precision [P] = 81.4, recall [R] = 77.8, and F1 = 78.2, compared with average P = 81.2, R = 65.8, and F1 = 70 for logistic regression). AutoGluon is an autoML package that incorporates DL for text, image, and tabular data classification from structured data and focuses on multi-layer model stacking instead of model and hyperparameter selection [129]. Stacking allows the predictions of base models to improve later models, using both the prediction information and the feature space from the previous layer. This yields greater accuracy and faster computational times than several other autoML frameworks [129]. AutoML packages like TPOT and AutoGluon represent significant advances in model selection and optimization and have the potential to improve significantly the classification and prediction of complex phenotypes like POU.
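The P, R, and F1 values reported above are class-averaged precision, recall, and F1 scores. As a minimal sketch, assuming a simple unweighted (macro) average across classes, the Python example below shows how such figures are computed from true and predicted sentence labels. The class names and labels are invented for illustration, and the cited study may have used a different averaging scheme.

```python
def class_metrics(y_true, y_pred, cls):
    """Precision, recall, and F1 for one class, treating it as the positive label."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def macro_average(y_true, y_pred):
    """Unweighted mean of per-class precision, recall, and F1."""
    classes = sorted(set(y_true))
    per_class = [class_metrics(y_true, y_pred, cls) for cls in classes]
    return tuple(sum(values) / len(classes) for values in zip(*per_class))


# Hypothetical sentence-level labels.
y_true = ["pou_mention", "pou_mention", "no_mention", "no_mention"]
y_pred = ["pou_mention", "no_mention", "no_mention", "no_mention"]

macro_p, macro_r, macro_f1 = macro_average(y_true, y_pred)
print(round(macro_p, 3), round(macro_r, 3), round(macro_f1, 3))
```

Under macro averaging, the averaged F1 is the mean of the per-class F1 scores, not the harmonic mean of the averaged precision and recall, so the three reported numbers need not satisfy the usual F1 formula among themselves.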

Conclusions and synthesis

In this review, we have highlighted the difficulties in classifying and identifying POU as a biomedical phenotype, the complex potential risk factors associated with POU that can inform feature identification and engineering, recommendations on how to quantify and classify the phenotype itself, and several methods, approaches, and advancements in the fields of ML, AI, and bioinformatics to identify POU and its risk factors. Throughout the review, we have sought to emphasize the importance of incorporating diverse types of data and multiple methods and approaches to assess and predict POU risk. Figure 3 conceptually reinforces this idea. Each pipeline alone has its own potential to yield important features and risk predictions. However, combining various sources of data and methodological pipelines increases the potential knowledge base, which yields more robust models and better identification and prediction. The workflow from data to knowledge to prediction can be greatly improved by accessing all available data sources and incorporating novel digital approaches. We recommend that future work in the fields of POU prediction and POU risk assessment incorporate diverse types of data (e.g., environmental data, digital footprints, comorbidities, and omics data) as well as multiple methodologies to create robust models and pipelines. Although the collection of varied data can be particularly challenging, we implore researchers to develop novel ways to capture the complex lives of their cohort(s). It is our hope that improving the knowledge base of POU will lead to the development of more efficient and accurate opioid risk prediction and assessment techniques, which are essential to limiting the exposure of individuals at risk and managing this public health crisis.

Fig. 3

Conceptual image illustrating the flow from data to knowledge to prediction. In A, the puzzle pieces from Fig. 1 are not connected, illustrating that each part of the puzzle is being treated separately. Three individual data extraction methods are shown with blue arrows passing through a filter. The filter represents feature cleanup and engineering. Data build a knowledge base (gray cylinder) before various methodologies (gears) create models for prediction. The area under the receiver operating characteristic curve (auROC) plots and feature importance plots (horizontal bar graphs) represent levels of model accuracy. In B, the puzzle pieces from Fig. 1 are connected. Individual data streams are filtered together to create a single source of data that contributes to a larger knowledge base. Multiple methodologies working together (interlinked gears) result in better models (as measured by higher auROC and greater feature importance).

Abbreviations

AI:

Artificial intelligence

autoML:

Automated machine learning

auROC:

Area under the receiver operating characteristic curve

DL:

Deep learning

DSM:

Diagnostic and statistical manual of mental disorders

DSM-5:

Diagnostic and statistical manual of mental disorders, fifth edition

EHR:

Electronic health record(s)

GP:

Genetic programming

GWAS:

Genome-wide association study(ies)

ICD:

International classification of diseases

ML:

Machine learning

MME:

Morphine milligram equivalent(s)

MRS:

Multi-locus risk score(s)

NLP:

Natural language processing

OPRD1:

δ-opioid receptor 1

OPRM1:

μ-opioid receptor 1

OUD:

Opioid use disorder

PheWAS:

Phenome-wide association study(ies)

POU:

Problematic opioid use

PRS:

Polygenic risk score(s)

SNP:

Single nucleotide polymorphism

TUD:

Tobacco use disorder

References

  1. Wide-ranging online data for epidemiologic research (WONDER). CDC Natl. Cent. Health Stat; 2020. Available from: http://wonder.cdc.gov. Accessed 1 Aug 2021.

  2. Mattson CL. Trends and geographic patterns in drug and synthetic opioid overdose deaths — United States, 2013–2019. MMWR Morb Mortal Wkly Rep. 2021;70 [cited 2022 Mar 31]. Available from: https://www.cdc.gov/mmwr/volumes/70/wr/mm7006a4.htm.

  3. Jones CM. Heroin use and heroin use risk behaviors among nonmedical users of prescription opioid pain relievers – United States, 2002–2004 and 2008–2010. Drug Alcohol Depend. 2013;132:95–100.

  4. Lankenau SE, Teti M, Silva K, Bloom JJ, Harocopos A, Treese M. Initiation into prescription opioid misuse amongst young injection drug users. Int J Drug Policy. 2012;23:37–44.

  5. Cicero TJ, Ellis MS, Surratt HL, Kurtz SP. The changing face of heroin use in the United States: a retrospective analysis of the past 50 years. JAMA Psychiatry. 2014;71:821–6.

  6. Smith SM, Dart RC, Katz NP, Paillard F, Adams EH, Comer SD, et al. Classification and definition of misuse, abuse, and related events in clinical trials: ACTTION systematic review and recommendations. PAIN®. 2013;154:2287–96.

  7. Vowles KE, McEntee ML, Julnes PS, Frohe T, Ney JP, van der Goes DN. Rates of opioid misuse, abuse, and addiction in chronic pain: a systematic review and data synthesis. PAIN. 2015;156:569–76.

  8. American Psychiatric Association: Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition. Arlington: American Psychiatric Association; 2013. http://repository.poltekkes-kaltim.ac.id/657/1/Diagnostic%20and%20statistical%20manual%20of%20mental%20disorders%20_%20DSM-5%20%28%20PDFDrive.com%20%29.pdf.

  9. Boscarino JA, Rukstalis MR, Hoffman SN, Han JJ, Erlich PM, Ross S, et al. Prevalence of prescription opioid-use disorder among chronic pain patients: comparison of the DSM-5 vs. DSM-4 diagnostic criteria. J Addict Dis. 2011;30:185–94.

  10. Boscarino JA, Hoffman S, Han J. Opioid-use disorder among patients on long-term opioid therapy: impact of final DSM-5 diagnostic criteria on prevalence and correlates. Subst Abus Rehabil. 2015;6:83.

  11. Cheatle MD. Facing the challenge of pain management and opioid misuse, abuse and opioid-related fatalities. Expert Rev Clin Pharmacol. 2016;9:751–4.

  12. Le Roux C, Tang Y, Drexler K. Alcohol and opioid use disorder in older adults: neglected and treatable illnesses. Curr Psychiatry Rep. 2016;18:87.

  13. Hallgren KA, Witwer E, West I, Baldwin L-M, Donovan D, Stuvek B, et al. Prevalence of documented alcohol and opioid use disorder diagnoses and treatments in a regional primary care practice-based research network. J Subst Abus Treat. 2020;110:18–27.

  14. Bowman S, Eiserman J, Beletsky L, Stancliff S, Bruce RD. Reducing the health consequences of opioid addiction in primary care. Am J Med. 2013;126:565–71.

  15. Rieckmann T, Muench J, McBurnie MA, Leo MC, Crawford P, Ford D, et al. Medication-assisted treatment for substance use disorders within a national community health center research network. Subst Abuse. 2016;37:625–34.

  16. Lande R, Arnold SJ. The measurement of selection on correlated characters. Evolution. 1983;37:1210–26 [Society for the Study of Evolution, Wiley].

  17. Bernard DM, Encinosa W, Cohen J, Fang Z. Patient factors that affect opioid use among adults with and without chronic pain. Res Soc Adm Pharm. 2021;17:1059–65.

  18. Shaw WS, Roelofs C, Punnett L. Work environment factors and prevention of opioid-related deaths. Am J Public Health. 2020;110:1235–41 American Public Health Association.

  19. Cheng Z, Zhou H, Sherva R, Farrer LA, Kranzler HR, Gelernter J. Genome-wide association study identifies a regulatory variant of RGMA associated with opioid dependence in European Americans. Biol Psychiatry. 2018;84:762–70.

  20. Cheng Z, Yang B, Zhou H, Nunez Y, Kranzler HR, Gelernter J. Genome-wide scan identifies opioid overdose risk locus close to MCOLN1. Addict Biol. 2020;25 [cited 2020 Feb 13]. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1111/adb.12811.

  21. Coombs ID, Soto D, Zonouzi M, Renzi M, Shelley C, Farrant M, et al. Cornichons modify channel properties of recombinant and glial AMPA receptors. J Neurosci. 2012;32:9796–804.

  22. Crist RC, Clarke T-K, Ang A, Ambrose-Lanci LM, Lohoff FW, Saxon AJ, et al. An intronic variant in OPRD1 predicts treatment outcome for opioid dependence in African-Americans. Neuropsychopharmacology. 2013;38:2003–10.

  23. Gelernter J, Kranzler HR, Sherva R, Koesterer R, Almasy L, Zhao H, et al. Genome-wide association study of opioid dependence: multiple associations mapped to calcium and potassium pathways. Biol Psychiatry. 2014;76:66–74.

  24. Hancock DB, Levy JL, Gaddis NC, Glasheen C, Saccone NL, Page GP, et al. Cis-expression quantitative trait loci mapping reveals replicable associations with heroin addiction in OPRM1. Biol Psychiatry. 2015;78:474–84.

  25. Mayer P, Rochlitz H, Rauch E, Rommelspacher H, Hasse HE, Schmidt S, et al. Association between a delta opioid receptor gene polymorphism and heroin dependence in man. NeuroReport. 1997;8:2547–50.

  26. Nielsen DA, Ji F, Yuferov V, Ho A, He C, Ott J, et al. Genome-wide association study identifies genes that may contribute to risk for developing heroin addiction. Psychiatr Genet. 2010;20:207–14.

  27. Zhang H, Kranzler HR, Yang B-Z, Luo X, Gelernter J. The OPRD1 and OPRK1 loci in alcohol or drug dependence: OPRD1 variation modulates substance dependence risk. Mol Psychiatry. 2008;13:531–43.

  28. Scherbaum N, Specka M. Factors influencing the course of opiate addiction. Int J Methods Psychiatr Res. 2008;17:S39–44.

  29. Badiani A, Robinson TE. Drug-induced neurobehavioral plasticity: the role of environmental context. Behav Pharmacol. 2004;15:327–39.

  30. Eitan S, Emery MA, Bates MLS, Horrax C. Opioid addiction: who are your real friends? Neurosci Biobehav Rev. 2017;83:697–712.

  31. Freda PJ, Moore JH, Kranzler HR. The phenomics and genetics of addictive and affective comorbidity in opioid use disorder. Drug Alcohol Depend. 2021;221:108602.

  32. Wittenauer Welsh J, Knight JR, Hou SS-Y, Malowney M, Schram P, Sherritt L, et al. Association between substance use diagnoses and psychiatric disorders in an adolescent and young adult clinic-based population. J Adolesc Health. 2017;60:648–52.

  33. Goesling J, Henry MJ, Moser SE, Rastogi M, Hassett AL, Clauw DJ, et al. Symptoms of depression are associated with opioid use regardless of pain severity and physical functioning among treatment-seeking patients with chronic pain. J Pain. 2015;16:844–51.

  34. Ferri M, Finlayson AJR, Wang L, Martin PR. Predictive factors for relapse in patients on buprenorphine maintenance: relapse factors in buprenorphine maintenance. Am J Addict. 2014;23:62–7.

  35. Hser Y-I, Mooney LJ, Saxon AJ, Miotto K, Bell DS, Huang D. Chronic pain among patients with opioid use disorder: results from electronic health records data. J Subst Abus Treat. 2017;77:26–30.

  36. Poulsen MN, Freda PJ, Troiani V, Davoudi A, Mowery DL. Classifying characteristics of opioid use disorder from hospital discharge summaries using natural language processing. Front Public Health. 2022;10 [cited 2022 Jun 2]. Available from: https://www.frontiersin.org/article/10.3389/fpubh.2022.850619.

  37. Sarker A, Gonzalez-Hernandez G, Ruan Y, Perrone J. Machine learning and natural language processing for geolocation-centric monitoring and characterization of opioid-related social media chatter. JAMA Netw Open. 2019;2:e1914672.

  38. Merchant RM, Asch DA, Crutchley P, Ungar LH, Guntuku SC, Eichstaedt JC, et al. Evaluating the predictability of medical conditions from social media posts. PLoS One. 2019;14:e0215476 Public Library of Science.

  39. Yang Z, Bradshaw S, Hewett R, Jin F. Discovering opioid use patterns from social media for relapse prevention. Computer. 2022;55:23–33.

  40. Best D, Lehmann P, Gossop M, Harris J, Noble A, Strang J. Eating too little, smoking and drinking too much: wider lifestyle problems among methadone maintenance patients. Addict Res. 1998;6:489–98.

  41. Chun J, Haug NA, Guydish JR, Sorensen JL, Delucchi K. Cigarette smoking among opioid-dependent clients in a therapeutic community. Am J Addict. 2009;18:316–20.

  42. Clemmey P. Smoking habits and attitudes in a methadone maintenance treatment population. Drug Alcohol Depend. 1997;44:123–32.

  43. Pajusco B, Chiamulera C, Quaglio G, Moro L, Casari R, Amen G, et al. Tobacco addiction and smoking status in heroin addicts under methadone vs. buprenorphine therapy. Int J Environ Res Public Health. 2012;9:932–42.

  44. Rajabi A, Dehghani M, Shojaei A, Farjam M, Motevalian SA. Association between tobacco smoking and opioid use: a meta-analysis. Addict Behav. 2019;92:225–35.

  45. Grella CE, Anglin MD, Wugalter SE. Cocaine and crack use and HIV risk behaviors among high-risk methadone maintenance clients. Drug Alcohol Depend. 1995;37:15–21.

  46. Tzilos GK, Rhodes GL, Ledgerwood DM, Greenwald MK. Predicting cocaine group treatment outcome in cocaine-abusing methadone patients. Exp Clin Psychopharmacol. 2009;17:320–5.

  47. Ives TJ, Chelminski PR, Hammett-Stabler CA, Malone RM, Perhac JS, Potisek NM, et al. Predictors of opioid misuse in patients with chronic pain: a prospective cohort study. BMC Health Serv Res. 2006;6:46.

  48. Arterberry BJ, Horbal SR, Buu A, Lin H-C. The effects of alcohol, cannabis, and cigarette use on the initiation, reinitiation and persistence of non-medical use of opioids, sedatives, and tranquilizers in adults. Drug Alcohol Depend. 2016;159:86–92.

  49. Hah JM, Sturgeon JA, Zocca J, Sharifzadeh Y, Mackey SC. Factors associated with prescription opioid misuse in a cross-sectional cohort of patients with chronic non-cancer pain. J Pain Res. 2017;10:979–87.

  50. Bilal M, Chatila A, Siddiqui MT, Al-Hanayneh M, Shah AR, Desai M, et al. Rising prevalence of opioid use disorder and predictors for opioid use disorder among hospitalized patients with chronic pancreatitis. Pancreas. 2019;48:1386–92.

  51. Foley M, Schwab-Reese LM. Associations of state-level rates of depression and fatal opioid overdose in the United States, 2011–2015. Soc Psychiatry Psychiatr Epidemiol. 2019;54:131–4.

  52. Martins SS, Fenton MC, Keyes KM, Blanco C, Zhu H, Storr CL. Mood and anxiety disorders and their association with non-medical prescription opioid use and prescription opioid-use disorder: longitudinal evidence from the national epidemiologic study on alcohol and related conditions. Psychol Med. 2012;42:1261–72 Cambridge University Press.

  53. Morasco BJ, Turk DC, Donovan DM, Dobscha SK. Risk for prescription opioid misuse among patients with a history of substance use disorder. Drug Alcohol Depend. 2013;127:193–9.

  54. Bateman BT, Franklin JM, Bykov K, Avorn J, Shrank WH, Brennan TA, et al. Persistent opioid use following cesarean delivery: patterns and predictors among opioid-naïve women. Am J Obstet Gynecol. 2016;215:353.e1–353.e18.

  55. Blanco C, Iza M, Schwartz RP, Rafful C, Wang S, Olfson M. Probability and predictors of treatment-seeking for prescription opioid use disorders: a national study. Drug Alcohol Depend. 2013;131:143–8.

  56. Boscarino JA, Kirchner HL, Pitcavage JM, Nadipelli VR, Ronquest NA, Fitzpatrick MH, et al. Factors associated with opioid overdose: a 10-year retrospective study of patients in a large integrated health care system. Subst Abus Rehabil. 2016;7:131–41.

  57. Carlson RG, Nahhas RW, Martins SS, Daniulaityte R. Predictors of transition to heroin use among initially non-opioid dependent illicit pharmaceutical opioid users: a natural history study. Drug Alcohol Depend. 2016;160:127–34.

  58. Cochran BN, Flentje A, Heck NC, Van Den Bos J, Perlman D, Torres J, et al. Factors predicting development of opioid use disorders among individuals who receive an initial opioid prescription: mathematical modeling using a database of commercially-insured individuals. Drug Alcohol Depend. 2014;138:202–8.

  59. Connolly J, Javed Z, Raji MA, Chan W, Kuo Y-F, Baillargeon J. Predictors of long term opioid use following lumbar fusion surgery. Spine. 2017;42:1405–11.

  60. Dilokthornsakul P, Moore G, Campbell JD, Lodge R, Traugott C, Zerzan J, et al. Risk factors of prescription opioid overdose among Colorado Medicaid beneficiaries. J Pain. 2016;17:436–43.

  61. Glanz JM, Narwaney KJ, Mueller SR, Gardner EM, Calcaterra SL, Xu S, et al. Prediction model for two-year risk of opioid overdose among patients prescribed chronic opioid therapy. J Gen Intern Med. 2018;33:1646–53.

  62. Hadlandsmyth K, Vander Weg MW, McCoy KD, Mosher HJ, Vaughan-Sarrazin MS, Lund BC. Risk for prolonged opioid use following total knee arthroplasty in veterans. J Arthroplast. 2018;33:119–23.

  63. Han B, Compton WM, Blanco C, Jones CM. Correlates of prescription opioid use, misuse, use disorders, and motivations for misuse among US adults. J Clin Psychiatry. 2018;79:15323 Physicians Postgraduate Press, Inc.

  64. Hylan TR, Von Korff M, Saunders K, Masters E, Palmer RE, Carrell D, et al. Automated prediction of risk for problem opioid use in a primary care setting. J Pain. 2015;16:380–7.

  65. Inacio MCS, Hansen C, Pratt NL, Graves SE, Roughead EE. Risk factors for persistent and new chronic opioid use in patients undergoing total hip arthroplasty: a retrospective cohort study. BMJ Open. 2016;6:e010664 British Medical Journal Publishing Group.

  66. Karhade AV, Schwab JH, Bedair HS. Development of machine learning algorithms for prediction of sustained postoperative opioid prescriptions after total hip arthroplasty. J Arthroplast. 2019;34:2272–2277.e1 Elsevier.

  67. Kim SC, Choudhry N, Franklin JM, Bykov K, Eikermann M, Lii J, et al. Patterns and predictors of persistent opioid use following hip or knee arthroplasty. Osteoarthr Cartil. 2017;25:1399–406.

  68. Lalic S, Gisev N, Bell JS, Korhonen MJ, Ilomäki J. Predictors of persistent prescription opioid analgesic use among people without cancer in Australia. Br J Clin Pharmacol. 2018;84:1267–78.

  69. Lee D, Armaghani S, Archer KR, Bible J, Shau D, Kay H, et al. Preoperative opioid use as a predictor of adverse postoperative self-reported outcomes in patients undergoing spine surgery. JBJS. 2014;96:e89.

  70. Leece P, Cavacuiti C, Macdonald EM, Gomes T, Kahan M, Srivastava A, et al. Predictors of opioid-related death during methadone therapy. J Subst Abus Treat. 2015;57:30–5.

  71. Leroux TS, Saltzman BM, Sumner SA, Maldonado-Rodriguez N, Agarwalla A, Ravi B, et al. Elective shoulder surgery in the opioid Naïve: rates of and risk factors for long-term postoperative opioid use. Am J Sports Med. 2019;47:1051–6.

  72. Levine AR, Lundahl LH, Ledgerwood DM, Lisieski M, Rhodes GL, Greenwald MK. Gender-specific predictors of retention and opioid abstinence during methadone maintenance treatment. J Subst Abus Treat. 2015;54:37–43.

  73. Olfson M, Wall MM, Liu S-M, Blanco C. Cannabis use and risk of prescription opioid use disorder in the United States. Am J Psychiatry. 2018;175:47–53 American Psychiatric Publishing.

  74. Rosenbloom BN, McCartney CJL, Canzian S, Kreder HJ, Katz J. Predictors of prescription opioid use 4 months after traumatic musculoskeletal injury and corrective surgery: a prospective study. J Pain. 2017;18:956–63.

  75. Samples H, Williams AR, Olfson M, Crystal S. Risk factors for discontinuation of buprenorphine treatment for opioid use disorders in a multi-state sample of Medicaid enrollees. J Subst Abus Treat. 2018;95:9–17.

  76. Saunders KW, Von Korff M, Campbell CI, Banta-Green CJ, Sullivan MD, Merrill JO, et al. Concurrent use of alcohol and sedatives among persons prescribed chronic opioid therapy: prevalence and risk factors. J Pain. 2012;13:266–75.

  77. Schoenfeld AJ, Nwosu K, Jiang W, Yau AL, Chaudhary MA, Scully RE, et al. Risk factors for prolonged opioid use following spine surgery, and the association with surgical intensity, among opioid-naive patients. JBJS. 2017;99:1247–52.

  78. Schoenfeld AJ, Belmont PJJ, Blucher JA, Jiang W, Chaudhary MA, Koehlmoos T, et al. Sustained preoperative opioid use is a predictor of continued use following spine surgery. JBJS. 2018;100:914–21.

  79. Sun J, Bi J, Chan G, Oslin D, Farrer L, Gelernter J, et al. Improved methods to identify stable, highly heritable subtypes of opioid use and related behaviors. Addict Behav. 2012;37:1138–44.

  80. von Oelreich E, Eriksson M, Brattström O, Sjölund K-F, Discacciati A, Larsson E, et al. Risk factors and outcomes of chronic opioid use following trauma. Br J Surg. 2020;107:413–21.

  81. Zedler B, Xie L, Wang L, Joyce A, Vick C, Kariburyo F, et al. Risk factors for serious prescription opioid-related toxicity or overdose among veterans health administration patients. Pain Med. 2014;15:1911–29.

  82. Berrettini W. A brief review of the genetics and pharmacogenetics of opioid use disorders. Dialogues Clin Neurosci. 2017;19:9.

  83. Li D, Zhao H, Kranzler HR, Li MD, Jensen KP, Zayats T, et al. Genome-wide association study of copy number variations (CNVs) with opioid dependence. Neuropsychopharmacology. 2015;40:1016–26.

  84. Browne CJ, Godino A, Salery M, Nestler EJ. Epigenetic mechanisms of opioid addiction. Biol Psychiatry. 2020;87:22–33.

  85. Kember RL, Vickers-Smith R, Xu H, Toikumo S, Niarchou M, Zhou H, et al. Cross-ancestry meta-analysis of opioid use disorder uncovers novel loci with predominant effects on brain. medRxiv. 2021:2021.12.13.21267480 [cited 2022 May 6]. Available from: https://www.medrxiv.org/content/10.1101/2021.12.13.21267480v1.

  86. Hoffman S, Podgurski A. Big bad data: law, public health, and biomedical databases. J Law Med Ethics. 2013;41:56–60 Cambridge University Press.

  87. Kohane IS, Aronow BJ, Avillach P, Beaulieu-Jones BK, Bellazzi R, Bradford RL, et al. What every reader should know about studies using electronic health record data but may be afraid to ask. J Med Internet Res. 2021;23:e22219.

  88. Green TC, Grau LE, Carver HW, Kinzly M, Heimer R. Epidemiologic trends and geographic patterns of fatal opioid intoxications in Connecticut, USA: 1997–2007. Drug Alcohol Depend. 2011;115:221–8.

  89. Li R, Chen Y, Ritchie MD, Moore JH. Electronic health records and polygenic risk scores for predicting disease risk. Nat Rev Genet. 2020;21:493–502.

  90. Carrell DS, Cronkite D, Palmer RE, Saunders K, Gross DE, Masters ET, et al. Using natural language processing to identify problem usage of prescription opioids. Int J Med Inf. 2015;84:1057–64.

  91. Weiner JA, Snavely JE, Johnson DJ, Hsu WK, Patel AA. Impact of preoperative opioid use on postoperative patient-reported outcomes in lumbar spine surgery patients. Clin Spine Surg. 2021;34:E154–9.

  92. You DS, Hah JM, Collins S, Ziadni MS, Domingue BW, Cook KF, et al. Evaluation of the preliminary validity of misuse of prescription pain medication items from the patient-reported outcomes measurement information system (PROMIS)®. Pain Med. 2019;20:1925–33.

  93. Correa D, Farney RJ, Chung F, Prasad A, Lam D, Wong J. Chronic opioid use and central sleep apnea: a review of the prevalence, mechanisms, and perioperative considerations. Anesth Analg. 2015;120:1273–85.

  94. Haight SC, Ko JY, Tong VT, Bohm MK, Callaghan WM. Opioid use disorder documented at delivery hospitalization — United States, 1999–2014. Morb Mortal Wkly Rep. 2018;67:845–9.

  95. Centers for Disease Control and Prevention. 2018 Annual Surveillance Report of Drug-Related Risks and Outcomes — United States. Surveillance Special Report. Centers for Disease Control and Prevention, U.S. Department of Health and Human Services. 2018. Accessed from https://www.cdc.gov/.

  96. World Health Organization. Defined daily dose (DDD). 2022. [cited 2022 Apr 21]. Available from: https://www.who.int/tools/atc-ddd-toolkit/about-ddd.

  97. Zin C, Chen L-C, Knaggs R. Changes in trends and pattern of strong opioid prescribing in primary care. Eur J Pain Lond Engl. 2014;18:1343–51.

  98. Glare P, Aubrey KR, Myles PS. Transition from acute to chronic pain after surgery. Lancet. 2019;393:1537–46.

  99. Nicol AL, Hurley RW, Benzon HT. Alternatives to opioids in the pharmacologic management of chronic pain syndromes: a narrative review of randomized, controlled, and blinded clinical trials. Anesth Analg. 2017;125:1682–703.

  100. Duncan RW, Smith KL, Maguire M, Stader DE. Alternatives to opioids for pain management in the emergency department decreases opioid usage and maintains patient satisfaction. Am J Emerg Med. 2019;37:38–44.

  101. Kember RL, Hartwell EE, Xu H, Rotenberg J, Almasy L, Zhou H, et al. Phenome-wide association analysis of substance use disorders in a deeply phenotyped sample. medRxiv. 2022:2022.02.09.22270737 [cited 2022 May 6]. Available from: https://www.medrxiv.org/content/10.1101/2022.02.09.22270737v1.

  102. Zhou H, Sealock JM, Sanchez-Roige S, Clarke T-K, Levey DF, Cheng Z, et al. Genome-wide meta-analysis of problematic alcohol use in 435,563 individuals yields insights into biology and relationships with other traits. Nat Neurosci. 2020;23:809–18 Nature Publishing Group.

  103. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet. 2019;51:584–91 Nature Publishing Group.

  104. Kim MS, Patel KP, Teng AK, Berens AJ, Lachance J. Genetic disease risks can be misestimated across global populations. Genome Biol. 2018;19:179.

  105. Mostafavi H, Harpak A, Agarwal I, Conley D, Pritchard JK, Przeworski M. Variable prediction accuracy of polygenic scores within an ancestry group. eLife. 2020;9:e48376.

  106. Polubriaginof FCG, Vanguri R, Quinnies K, Belbin GM, Yahi A, Salmasian H, et al. Disease heritability inferred from familial relationships reported in medical records. Cell. 2018;173:1692–1704.e11.

  107. DeBoever C, Tanigawa Y, Aguirre M, McInnes G, Lavertu A, Rivas MA. Assessing digital phenotyping to enhance genetic studies of human diseases. Am J Hum Genet. 2020;106:611–22.

  108. Halpern Y, Horng S, Choi Y, Sontag D. Electronic medical record phenotyping using the anchor and learn framework. J Am Med Inform Assoc. 2016;23:731–40.

  109. Kirby JC, Speltz P, Rasmussen LV, Basford M, Gottesman O, Peissig PL, et al. PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J Am Med Inform Assoc. 2016;23:1046–52.

  110. Le TT, Gong H, Orzechowski P, Manduchi E, Moore JH. Expanding polygenic risk scores to include automatic genotype encodings and gene-gene interactions. Proc 13th Int Jt Conf Biomed Eng Syst Technol BIOSTEC. 2020;3:79–84.

  111. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–44 Nature Publishing Group.

  112. Ma J, Yu MK, Fong S, Ono K, Sage E, Demchak B, et al. Using deep learning to model the hierarchical structure and function of a cell. Nat Methods. 2018;15:290–8 Nature Publishing Group.

  113. Hutter F, Kotthoff L, Vanschoren J, editors. Automated machine learning: methods, systems, challenges: Springer International Publishing; 2019. [cited 2020 Apr 28]. Available from: https://www.springer.com/de/book/9783030053178

  114. Le TT, Fu W, Moore JH. Scaling tree-based automated machine learning to biomedical big data with a feature set selector. Bioinformatics. 2020;36:250–6.

  115. Olson RS, Urbanowicz RJ, Andrews PC, Lavender NA, Kidd LC, Moore JH. Automating biomedical data science through tree-based pipeline optimization. In: Squillero G, Burelli P, editors. Appl Evol Comput. Cham: Springer International Publishing; 2016. p. 123–37.

  116. Olson RS, Bartley N, Urbanowicz RJ, Moore JH. Evaluation of a tree-based pipeline optimization tool for automating data science. Proc Genet Evol Comput Conf 2016. New York: Association for Computing Machinery; 2016. p. 485–92. [cited 2022 Apr 21]. Available from: https://doi.org/10.1145/2908812.2908918

  117. Gustafson E, Pacheco J, Wehbe F, Silverberg J, Thompson W. A machine learning algorithm for identifying atopic dermatitis in adults from electronic health records. 2017 IEEE Int Conf Healthc Inform ICHI; 2017. p. 83–90.

  118. Zhou S-M, Fernandez-Gutierrez F, Kennedy J, Cooksey R, Atkinson M, Denaxas S, et al. Defining disease phenotypes in primary care electronic health records by a machine learning approach: a case study in identifying rheumatoid arthritis. PLoS One. 2016;11:e0154515.

  119. Carroll RJ, Eyler AE, Denny JC. Naïve electronic health record phenotype identification for rheumatoid arthritis. AMIA Annu Symp Proc. 2011;2011:189–96.

  120. Zheng T, Xie W, Xu L, He X, Zhang Y, You M, et al. A machine learning-based framework to identify type 2 diabetes through electronic health records. Int J Med Inf. 2017;97:120–7.

  121. Maddali MV, Churpek M, Pham T, Rezoagli E, Zhuo H, Zhao W, et al. Validation and utility of ARDS subphenotypes identified by machine-learning models using clinical data: an observational, multicohort, retrospective analysis. Lancet Respir Med. 2022;10:367–77.

  122. Koutroulis I, Velez T, Wang T, Yohannes S, Galarraga JE, Morales JA, et al. Pediatric sepsis phenotypes for enhanced therapeutics: an application of clustering to electronic health records. J Am Coll Emerg Physicians Open. 2022;3:e12660.

  123. O’Malley KJ, Cook KF, Price MD, Wildes KR, Hurdle JF, Ashton CM. Measuring diagnoses: ICD code accuracy. Health Serv Res. 2005;40:1620–39.

  124. Palmer RE, Carrell DS, Cronkite D, Saunders K, Gross DE, Masters E, et al. The prevalence of problem opioid use in patients receiving chronic opioid therapy: computer-assisted review of electronic health record clinical notes. PAIN. 2015;156:1208–14.

  125. Haller IV, Renier CM, Juusola M, Hitz P, Steffen W, Asmus MJ, et al. Enhancing risk assessment in patients receiving chronic opioid analgesic therapy using natural language processing. Pain Med. 2017;18:1952–60.

  126. Fast E, Chen B, Bernstein MS. Empath: understanding topic signals in large-scale text. Proc 2016 CHI Conf Hum Factors Comput Syst. New York: Association for Computing Machinery; 2016. p. 4647–57. [cited 2022 Apr 21]. Available from: https://doi.org/10.1145/2858036.2858535

  127. Bodenreider O. The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Res. 2004;32:D267–70.

  128. Chapman BE, Lee S, Kang HP, Chapman WW. Document-level classification of CT pulmonary angiography reports based on an extension of the ConText algorithm. J Biomed Inform. 2011;44:728–37.

  129. Erickson N, Mueller J, Shirkov A, Zhang H, Larroy P, Li M, et al. AutoGluon-tabular: robust and accurate AutoML for structured data. arXiv:2003.06505 [cs, stat]. 2020. [cited 2022 Apr 20]. Available from: http://arxiv.org/abs/2003.06505.

Funding

This work was supported by the Commonwealth of PA Dept. of Health Tobacco Settlement Act 2001-77 grant # 4100083337 to JHM, the Ruth L. Kirschstein National Research Service Award (T32 HG009495) to PJF, and NIH grant P30 DA046345 to HRK.

Author information

Contributions

PJF wrote the original manuscript, performed the literature review, and helped design the overall scope of the manuscript. HRK provided feedback, edits, and corrections. JHM wrote the subsection on machine learning/artificial intelligence, provided feedback, and helped design the overall scope of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Philip J. Freda Jr.

Ethics declarations

Competing interests

Dr. Kranzler is a member of advisory boards for Dicerna Pharmaceuticals, Sophrosyne Pharmaceuticals, and Enthion Pharmaceuticals; a consultant for Sobrera Pharmaceuticals; a recipient from Alkermes of funds and study medication for investigator-initiated research; a member of the American Society of Clinical Psychopharmacology’s Alcohol Clinical Trials Initiative, which was supported in the last 3 years by Alkermes, Dicerna, Ethypharm, Lundbeck, Mitsubishi, and Otsuka; and holds U.S. Patent 10,900,082: Genotype-guided Dosing of Opioid Receptor Agonists, 26 Jan. 2021. The other authors have no disclosures to make.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Freda, P.J., Kranzler, H.R. & Moore, J.H. Novel digital approaches to the assessment of problematic opioid use. BioData Mining 15, 14 (2022). https://doi.org/10.1186/s13040-022-00301-1

  • DOI: https://doi.org/10.1186/s13040-022-00301-1

Keywords