Fig. 1 | BioData Mining

Fig. 1

From: Comparative analysis, applications, and interpretation of electronic health record-based stroke phenotyping methods

Performance of select models on the Stroke Service holdout test set: (a) AUROC and (b) F1 (circle: median; bars: 50% CI). Different combinations of cases and controls are shown on the y-axis. Model abbreviations: (LR) logistic regression with L1 penalty, (RF) random forest, (AB) AdaBoost, (GB) gradient boosting, (EN) logistic regression with elastic net penalty. Cases (first letter) may be one of cerebrovascular (C), T-L (T), or Stroke Service (S). Controls (second and third letters) may be one of random (R), cerebrovascular disease but no AIS code (CI), no cerebrovascular disease (C), no AIS code (I), or a stroke mimetic disease (N). See Methods and Supplementary Table 1 for definitions of sets. The threshold used to compute the F1 score on the testing set was the one that yielded the maximum F1 in cross-validation on the training set (Methods, Supplementary Table 10).
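
The threshold selection described in the caption can be illustrated with a minimal sketch. This is not the authors' code: the function names (`f1_score`, `best_f1_threshold`), the toy labels and scores, and the choice to scan every observed score as a candidate threshold are assumptions made for illustration only. The idea is to pick the threshold that maximizes F1 on cross-validation predictions from the training set, then apply that fixed threshold to the held-out test set.

```python
def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, scores):
    """Scan each observed score as a candidate threshold (assumption)
    and return the (threshold, F1) pair with the highest F1.
    A sample is predicted positive when score >= threshold."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:  # ties keep the lowest threshold
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical cross-validation labels and predicted probabilities
# from the training set (toy values, not from the paper).
cv_labels = [0, 0, 1, 0, 1, 1]
cv_scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7]
threshold, cv_f1 = best_f1_threshold(cv_labels, cv_scores)

# Apply the fixed threshold to hypothetical held-out test predictions.
test_labels = [1, 0, 1, 0]
test_scores = [0.9, 0.3, 0.5, 0.36]
test_preds = [1 if s >= threshold else 0 for s in test_scores]
test_f1 = f1_score(test_labels, test_preds)
```

Fixing the threshold before touching the test set keeps the reported test F1 free of tuning on test labels, which is the point of choosing it in cross-validation.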
