Fig. 4 | BioData Mining

Fig. 4

From: Comparative analysis, applications, and interpretation of electronic health record-based stroke phenotyping methods

Schematic of model training, testing, evaluation, and application to the UK Biobank. See Methods for case/control abbreviations. The case:control ratio was 1:1. Subjects meeting both the case and control definitions were removed from the control set, and subjects appearing in both the training and testing sets were removed from the testing set before any training or testing. Models included Random Forest (RF), Logistic Regression with L1 penalty (LR), Neural Network (NN), Gradient Boosting (GB), Logistic Regression with Elastic Net penalty (EN), and AdaBoost (AB). AUROC: Area Under the Receiver Operating Characteristic Curve; AUPR: Area Under the Precision-Recall Curve; Sens: Sensitivity; Spec: Specificity; PPV: Positive Predictive Value; NPV: Negative Predictive Value.
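
The overlap-removal steps and the threshold-based metrics named in the caption can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `drop_overlap` and `confusion_metrics` and the toy subject IDs and labels are hypothetical, chosen only to show the set logic and the standard definitions of Sens, Spec, PPV, and NPV.

```python
import math


def drop_overlap(keep_from, other):
    """Return the subjects in keep_from that do not appear in other (order kept)."""
    other = set(other)
    return [s for s in keep_from if s not in other]


def _ratio(num, den):
    """Divide safely; return NaN when the denominator is zero."""
    return num / den if den else math.nan


def confusion_metrics(y_true, y_pred):
    """Compute Sens, Spec, PPV, and NPV from binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return {
        "sens": _ratio(tp, tp + fn),  # true positives / all actual cases
        "spec": _ratio(tn, tn + fp),  # true negatives / all actual controls
        "ppv": _ratio(tp, tp + fp),   # true positives / all predicted cases
        "npv": _ratio(tn, tn + fn),   # true negatives / all predicted controls
    }


# Toy data (hypothetical subject IDs and labels).
cases = ["s1", "s2", "s3"]
controls = ["s3", "s4", "s5"]
controls = drop_overlap(controls, cases)      # controls that are also cases are dropped

train_ids = ["s1", "s4"]
test_ids = ["s1", "s2", "s5"]
test_ids = drop_overlap(test_ids, train_ids)  # test subjects seen in training are dropped

metrics = confusion_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

AUROC and AUPR are computed from continuous model scores rather than thresholded predictions, so they are left out of this sketch.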
