ADE data source
We retrieved drug event reports from the Food and Drug Administration’s openFDA [46] download page, utilizing an API key with extended permissions, containing the FAERS data. Using custom python notebooks and scripts available in the ‘openFDA_drug_event-parsing’ github repository (DOI: https://doi.org/10.5281/zenodo.4464544), we extracted and formatted all drug event reports prior to the third quarter of 2019. Data fields included the safety report identifier, age value, age code e.g. year, adverse event MedDRA concept code (preferred terms), and drug RxNorm code (various) used in our analyses. The age value was standardized to year units for categorizing reports into the 7 child development stages according to the Eunice Kennedy Shriver National Institute of Child and Human Development [45]. Adverse drug event MedDRA codes were mapped to standard concept identifiers using concept tables [47] from the OMOP common data model. The drug RxNorm code was similarly translated to the standard RxNorm concept identifier (ingredient level) in OMOP and was further mapped to the equivalent ATC concept identifier (ATC 5th level) using the concept relationship table. The occurrence of an adverse drug event is defined as any safety report where both the adverse event and drug concepts are reported together. The pediatric report space for any adverse drug event is all reports which have age above zero and less than or equal to 21 years old which is the upper bound for the late adolescence child development stage. The drug event data for a given drug-event pair composed of 339,741 safety reports with a binary indicator for reports of the event and drug, as well as the category of NICHD child development stage for the report’s patient.
Simulated ADE dynamics
The objective of this study was to evaluate detection of drug-event reporting as the reporting rate changes across child development stages with varying dynamics and effect sizes. We assert that reporting dynamics during childhood reflect ontogenic profiles observed on molecular, functional, and structural levels [7, 37, 38].
We simulated dynamic ADE reporting by combining hyperbolic tangent functions that produced symmetric probability distributions around a given effect size to define the probability of event reporting at drug reports. These dynamic reporting classes represent nonlinear trends of drug-event reports across childhood. The average drug and event reporting across reports equaled the event reporting rate multiplied by a fold change factor resulting in the effect size of dynamic drug event reporting. The fold change followed a negative exponential distribution with rate parameter 0.75 resulting in a fold change distribution ranging from 1 to 10 (Figure S13). The simulated reporting probabilities set the Bernoulli random variable to assign the presence (1) or absence (0) of the event being reported for each of the 339,741 safety reports. We designed 5 different dynamic reporting rates, namely ‘uniform’ (random), ‘increase’, ‘decrease’, ‘plateau’, and ‘inverse_plateau’ (Figure S14).
ADE data augmentation
We augmented the original drug event data from FAERS with the simulated drug event reporting dynamics. We randomly selected 500 drug-event pairs to be the positive control set for drug-events with putative drug event reporting dynamics. We augmented the drug event data for each pair with the previously described dynamics that we want to detect. We then randomly selected 10,000 mutually exclusive drug-event pairs to be the negative control set which were not augmented and represented observed reporting of drugs with events within FAERS. Differences of the average drug event reporting between the drug-event sets was computed by comparing 10 million resamples of each distribution.
Augmenting the positive control drug-event pairs resulted in 5 sets of 500 drug-event pairs, forming (drug-event, stage, dynamic) triples. The (drug-event, stage, uniform dynamic) triple scores were the reference distribution for comparing the average difference in scores, after 20 resamples, with ADE risk scores from the other dynamics classes.
ADE detection methods
Our study evaluates population stratification versus modeling statistical methods to detect co-occurring adverse events and drugs in observational data. The proportional reporting ratio (PRR) stratified the pediatric population into child development stages to quantify the odds for event reporting prevalence with a drug compared to without a drug reported. The logistic generalized additive model (GAM) quantifies the log odds for event reporting with the drug compared to without the drug across child development stages. The methods are applied onto the drug event data to evaluate adverse drug event detection in the presence versus the absence of putative drug event reporting dynamics.
We employed the Proportional Reporting Ratio (PRR):
$$ \frac{\frac{a}{a+c}}{\frac{b}{b+d}} $$
where ‘a’ is the number of reports with the drug and event, ‘b’ is the number of reports without the drug and with the event, ‘c’ is the number of reports with the drug and without the event, and ‘d’ is the number of reports without the drug or event of interest. These are the four parameters of the PRR equation. For each drug-event pair, we applied the PRR to each of the 7 child development stages resulting in 7 scores. The PRR scores were log10 transformed when conducting the Shapiro-Wilk test for normality.
We also evaluated the logistic generalized additive model [48] (GAM):
$$ g\left(E(Event)\right)=s\left( nichd, by= Drug\right) $$
where g is a logit link function, E(Event) is the expected value of event reporting, s is a spline function with a penalized cubic basis, nichd is the child development stage of the report’s subject, and Drug is an indicator i.e. 0 or 1 of drug reporting. Details for GAMs can be found at references [42, 49] and we specified the model using the mgcv package in R.
Briefly, the GAM is a flexible statistical model that captures nonlinear effects of covariates onto a response. In this paper, we model the effect of the child development stage interacting with drug reporting on the reporting of an event where the event is the reporting of the MedDRA preferred term and the drug is the reporting of the ATC 5th level drug concept. The s() function is a spline function where the interaction of the child development stage (main effect) and the drug (interaction using the ‘by’ variable) is modeled according to a set of basis functions. Each development stage defines the knot (7 in total) in which the expectation of event reporting is quantified. In the spline function, a penalized cubic spline basis (bs = ‘cs’) is used for fitting the basis functions where the first and second derivative of the event expectation is zero at each knot, resulting in a smooth event expectation across stages. To mitigate overfitting or ‘wiggliness’, we used a penalized iterative restricted likelihood approach, called ‘fREML’, with a wiggliness penalty in the objective function. Fitting the GAM model (using the ‘bam’ function and discrete = T) produces coefficient terms, similar to beta coefficients in logistic regression, for each child development stage. This produces a model with 8 parameters, one for each coefficient and intercept term (the intercept term is not utilized in this study). For each drug-event, GAM scores are generated for each child development stage resulting in 7 scores. It is important to note that all GAM scores produced are finite, nonzero values.
The scores generated by each method have different variations and uncertainty in the estimated population value. We additionally determined the lower confidence bound in which the population-based score would be greater than 90% of score replicates. The population score and the 90% lower confidence bound, called ‘score’ and ‘90mse’ respectively, are the score types for each method.
ADE dynamic detection power analysis
We performed a power analysis to determine which of the positive control drug-event pairs could be detected for each method and score type. The generated scores may not show a drug and event association (score above the null statistic or another significance threshold) for a child development stage due to the method’s different assumptions and biases when applied onto observational data. To mitigate these issues, we determined the drug event data characteristics, namely the number of drug reports and the effect size, for each method in which reporting dynamics could be detected at or above t = 80% power (the power or true positive rate is the fraction of scores above a number of drug reports and effect size out of all scores; Figure S15). Specifically, for the (drug-event, stage, dynamic) triple scores in the positive control set, we determined the power to differentiate scores at high reporting rates about a given score threshold (GAM score threshold==0; PRR score threshold==1). The reporting rates were higher at different child development stages for each dynamics class e.g. the ‘increase’ dynamics class had higher reporting at the ‘early_adolescence’ and ‘late_adolescence’ stages (Table S1). The scores from (drug-event, stage, dynamic) triples with a high reporting rate were only considered for reflecting dynamic drug event reporting associations. The scores from (drug-event, stage, dynamic) triples with a low reporting rate were not considered further due to spurious scores generated at stages without injected signal. The drug event characteristics were determined for both the estimated population score (‘score’) and the 90% lower bound score (‘90mse’) that represent scores with lower and higher confidence, respectively, for the ‘true’ population score.
Choosing drug-event pairs at or exceeding the characteristics for each method and score type at or above t = 80% power resulted in a superset of (drug-event, stage, dynamic) triples designated as positives in a reference standard for each drug event reporting dynamics class (Table S2). The negative control set contained the same (drug-event, stage) doubles or 70,000 scores for each reference standard. Excluding the drug-event scores generated by the uniform class, there were 4 reference standards of positive and negative drug-event pairs for each ADE reporting dynamics class used for detection performance evaluation.
ADE dynamic detection performance
We evaluated the GAM and PRR methods to detect drug event reporting dynamics across the child development stages. Specifically, we determined the performance in differentiating scores from (drug-event, stage, dynamic) triples in the positive control set versus the negative control (drug-event, stage) score doubles. The positive control set contained a superset of the 500 (drug-event, stage, dynamic) score triples (Table S2). The negative control set contained the same (drug-event, stage) doubles or 70,000 scores for each reference standard. For each of the four reference standards, we quantified the area under the receiver operating characteristic (AUROC) curve using the R package ROCR for each detection method and score type. Confidence intervals for other performance metrics including the AUROC were calculated using 100 bootstraps of the risk score distributions. Performance metrics followed the conventional definitions:
-
AUROC was defined as the probability of a randomly chosen positive control score being greater than a randomly chosen negative control score.
-
Sensitivity (i.e. power and true positive rate or TPR) was defined as the fraction of positive drug-events risks greater than a score threshold out of all positive drug-events.
-
Positive predictive value (i.e. precision) was defined as the fraction of positive drug-events risks greater than a score threshold out of drug-events predicted as positive.
-
Negative predictive value was defined as the fraction of negative drug-events risks at or less than a score threshold out of drug-events predicted as negative.
Dynamics sensitivity analysis
We assessed the sensitivity of the ADE detection methods to detect drug event reporting dynamics within child development stages. We artificially removed, at 10% decrements, random drug reports at each child development stage separately. Removing drug reports lowers the rate of drug reporting while maintaining the event reporting rate. Specifically, at each stage, we determined the performance of each method and score type to detect (drug-event, stage, dynamic) score triples compared to the same negative control (drug-event, stage) score doubles at that same reduced stage. Sensitivity was assessed iteratively at the 10% decrements within each child development stage. We calculated performance metrics such as the above to quantify detection of drug event reporting dynamics for each method and score type.
Real-world ADE validation
We applied the ADE detection methods on observed FAERS data for drug-event pairs within the pediatric drug-event reference standard from the Global Research in Pediatrics consortium [50]. A machine-readable dataset can be found at the ‘GRiP_pediatric_ADE-reference_set’ github repository (DOI: https://doi.org/10.5281/zenodo.4453379). We assigned drug-event pairs with epidemiological or mechanistic evidence in children (Control==‘C’) as the positive class (N = 26), and the cross-product of all drugs and events that were complementary to drug-event pairs in the reference set as the negative class (N = 123). We calculated the AUROC using the ROCR package in R and the true positive rate using the null statistic of each method as the prediction threshold.