Skip to main content

Prediction of the risk of developing end-stage renal diseases in newly diagnosed type 2 diabetes mellitus using artificial intelligence algorithms



Type 2 diabetes mellitus (T2DM) imposes a great burden on healthcare systems, and these patients experience higher long-term risks for developing end-stage renal disease (ESRD). Managing diabetic nephropathy becomes more challenging when kidney function starts declining. Therefore, developing predictive models for the risk of developing ESRD in newly diagnosed T2DM patients may be helpful in clinical settings.


We established machine learning models constructed from a subset of clinical features collected from 53,477 newly diagnosed T2DM patients from January 2008 to December 2018 and then selected the best model. The cohort was divided, with 70% and 30% of patients randomly assigned to the training and testing sets, respectively.


The discriminative ability of our machine learning models, including logistic regression, extra tree classifier, random forest, gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and light gradient boosting machine were evaluated across the cohort. XGBoost yielded the highest area under the receiver operating characteristic curve (AUC) of 0.953, followed by extra tree and GBDT, with AUC values of 0.952 and 0.938 on the testing dataset. The SHapley Additive explanation summary plot in the XGBoost model illustrated that the top five important features included baseline serum creatinine, mean serum creatine within 1 year before the diagnosis of T2DM, high-sensitivity C-reactive protein, spot urine protein-to-creatinine ratio and female gender.


Because our machine learning prediction models were based on routinely collected clinical features, they can be used as risk assessment tools for developing ESRD. By identifying high-risk patients, intervention strategies may be provided at an early stage.

Peer Review reports


Type 2 diabetes mellitus (T2DM) is a major challenge to public health worldwide, and the assessment and management of this chronic disease impose a heavy economic burden [1, 2]. T2DM is associated with many complications and problematic symptoms, including micro- and macrovascular complications [3, 4]. Among these complications, diabetic kidney disease (DKD) is a leading cause of chronic kidney disease (CKD) and is associated with a future risk of progression to end-stage renal disease (ESRD) [5, 6]. However, a diagnosis of DKD is often delayed, particularly in the early stages of the disease, because most patients remain asymptomatic with respect to kidney dysfunction [7]. Therefore, identifying DKD patients with a rapid decline in the estimated glomerular filtration rate (eGFR) might be helpful for allowing early nephroprotective treatment to be administered to delay or prevent the progression of kidney failure.

Previous large-scale population-based cohort studies have identified multiple factors potentially contributing to rapid eGFR decline, such as hypertension [8, 9], proteinuria [10], demographic factors, and underlying comorbidities [7]. A meta-analysis of demographic and clinical laboratory data from twenty cohorts representing 41,271 T2DM patients was conducted to develop a categorization point system for DKD prediction [11]. The prediction model achieved an average area under the receiver operating characteristic curve (AUC) of 0.765. Because electronic health record usage provides hundreds of clinical features and a large volume of data, prediction models using a categorization point system may be insufficient to effectively make use of unaligned and correlated data structures. Recently, artificial intelligence (AI) has changed modern procedures, and the progress of machine learning with big data analysis has improved the capacity of predictive model development [12].

In a cohort study consisting of diabetic patients, an AI model using logistic regression was developed to predict the progression of DKD according to 3073 features [13], and it achieved an AUC of 0.743 and an average accuracy of 71%. However, only logistic regression was applied in this study, and the predictive ability of other machine learning models with respect to renal function progression in diabetic patients remains unknown. In addition, in the abovementioned study, the AI model predicted DKD progression for 6 months after the enrollment period; therefore, its predictive ability for a longer follow-up period is unknown.

In our study, we used a large-scale newly diagnosed DM cohort to perform machine learning models by using clinical features, including demographic characteristics, comorbidities, laboratory data and concomitant medications from outpatient department and emergency room visits as well as hospital admissions, to predict the risks of developing ESRD with a long follow-up period. We also used SHapley Additive exPlanation (SHAP) values to evaluate the accurate attribution values for each important feature within machine learning models.


Data sources and study population

During the period of January 2008 to December 2018, we constructed a T2DM 10-year retrospective longitudinal cohort based on the information of patients with newly diagnosed T2DM from the Big Data Center, which includes the detailed patient demographic, underlying comorbidities, medication prescriptions, and laboratory data from all inpatient, outpatient and emergency services [14]. Patients without at least two eGFR values were excluded from our analyses. In addition, we excluded T2DM patients who had undergone renal replacement therapy, such as hemodialysis, peritoneal dialysis, and kidney transplant, before the enrollment points. This study was approved by the Institutional Review Board (Taipei Veterans General Hospitals, Approval no. 2022–03-006 AC), and the need for informed consent was waived because the data were deidentified.

Feature selection

We extracted 78 features used for machine learning, including demographic characteristics, underlying comorbidities, laboratory data and concomitant drugs. The demographic characteristics included age, gender, smoking and alcohol consumption. Underlying comorbidities included histories of hypertension, transient ischemic attack, ischemic stroke, hemorrhagic stroke, myocardial infarction, coronary artery disease, congestive heart failure, chronic liver disease, cirrhosis, peptic ulcer disease, autoimmune disease, chronic obstructive pulmonary disease, asthma, peripheral arterial occlusive disease, cancer, gout, atrial fibrillation, valvular heart disease and diabetic retinopathy. The laboratory data included baseline serum creatinine, mean serum creatinine assessed within 1 year before the diagnosis of T2DM, cholesterol, triglycerides, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, uric acid, calcium, phosphate, white blood cells, hemoglobin, albumin, alanine aminotransferase, aspartate aminotransferase, total bilirubin, direct bilirubin, alkaline phosphatase, gamma-glutamyl transferase, glycated hemoglobin, glucose, the international normalized ratio, activated partial thromboplastin time, high-sensitivity C-reactive protein, iron, thyroid-stimulating hormone, free thyroxine, and spot urine protein-to-creatinine ratio (UPCR). Concomitant medications included renin-angiotensin-aldosterone system (RAAS) inhibitors, alpha blockers, beta blockers, calcium channel blockers, warfarins, direct oral anticoagulants, aspirins, clopidogrels, nitrates, statins, diuretics, spironolactones, metformins, sulfonylureas, meglitinides, sodium–glucose cotransporter 2 inhibitors, glucagon-like peptide-1 receptor agonists, dipeptidyl peptidase-4 inhibitors, thiazolidinediones, alpha-glucosidase inhibitors, insulins, nonsteroidal anti-inflammatory drugs, cyclooxygenase-2 inhibitors, proton pump inhibitors, steroids, allopurinols, febuxostats and benzbromarones.

Class definition

In our study, the class was annotated as 1 if there was ESRD occurrence during the follow-up periods (defined as eGFR < 15 ml/min/1.73 m2 or the receipt of maintenance dialysis or kidney transplant), and the class was annotated as 0 if there was no ESRD occurrence. We calculated eGFR using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equations [15].

Data cleaning and machine learning model development

Categorical variables are presented as numbers (proportions) and continuous parametric variables are shown as the median (interquartile ranges [IQRs]). To impute the missing values of the clinical features, the K-nearest neighbor (KNN) algorithm was used before the machine learning methods [16, 17]. For model development, the study cohort was randomly divided to create a 70%:30% training set to test set ratio. Because the number of ESRD cases was much smaller than the number of non-ESRD cases, we performed the synthetic minority over-sampling technique (SMOTE)-Tomek algorithms to balance the number of samples taken for imbalanced data [18, 19]. Six machine learning models, including logistic regression, extra trees [20], random forest [21], gradient boosting decision tree (GBDT) [22], extreme gradient boosting models (XGBoost) [23], and light gradient boosting machine (LGBM) [24], are performed. We used forward-feature selection for the reduction in dimensions, which selects the most useful subset of features from all available features [25, 26]. Five-fold cross-validation is performed on the training set to estimate the performance and validate the stability of the applied machine learning models [27, 28].

Hyperparameter optimization

A grid search in combination with the five-fold cross-validation was conducted to optimize the hyperparameters of logistic regression, extra trees, random forest, GBDT, XGBoost, and LGBM to achieve the best F1 score [29,30,31]. The details of hyperparameter optimization for each ensemble model are listed in Table 1. Grid searches determine the best hyperparameter value based on a set of given values.

Table 1 Hyperparameters of machine learning models

Model evaluation

The discriminative abilities of the different machine learning models were compared based on their AUCs. In addition, the F1 score, accuracy, precision, recall, average precision and log loss values of each model by using testing dataset were also presented. SHapley Additive exPlanations (SHAP) was used to evaluate the risk of developing ESRD in T2DM and to provide explanations for the attribution values of clinical features in a unified framework to interpret model predictions.

Software and package applicating for modeling

We used Python (Python Software Foundation version 3.7.6, available at and open-source Scikit-learn library for the establishment of machine learning models and SAS version 9.4 (SAS Institute, Cary, NC) for statistical analysis [32]. We used Python and Scikit-learn library packages, including sklearn.impute.KNNImputer for missing value imputation, sklearn.model_selection.train_test_split for randomly dividing data into train and test sets, sklearn.model_selection.GridSearchCV for hyperparameter optimization, sklearn.linear_model.LogisticRegression for development of the logistic regression model, sklearn.ensemble.ExtraTreesClassifier for development of the extra tree model, sklearn.ensemble.RandomForestClassifier for development of the random forest model, sklearn.ensemble.GradientBoostingClassifier for development of the GBDT model, XGBoost Python package for development of the XGBoost model, lightgbm.LGBMClassifier Python package for development of the LGBM model, and sklearn.model_selection.StratifiedKFold for cross-validation. A P value of 0.05 was considered statistically significant.


Characteristics and distribution of patients

A total of 105,234 T2DM patients aged > 20 years old were identified during the 10-year study period, of whom 34,059 had no eGFR measurements, 16,351 did not have at least two eGFR values, and 1347 patients receiving renal replacement therapy were excluded, which resulted in a final cohort of 53,477 T2DM patients. The detailed patient demographic data are provided in Table 2. The median patient age was 67.05 years (IQR 57.37 to 77.74 years), and 41.4% of the patients were female. In addition, 58.2% of patients had hypertension, 19.8% had coronary artery disease, and 23.4% had cancer. Regarding renal function, T2DM patients had baseline serum creatinine levels of 0.94 mg/dL (IQR 0.75 to 1.27 mg/dL), mean serum creatinine of 0.95 mg/dL (IQR 0.76 to 1.26 mg/dL) within 1 year before the diagnosis of T2DM. The dataset was randomly divided into a training set (70%) and a testing set (30%). Of all the T2DM patients, 4769 (8.9%) patients developed ESRD. A total of 3334 (8.9%) patients developed ESRD on the training set, and 1435 (8.9%) patients developed ESRD on the testing set.

Table 2 Demographics and clinical features between T2DM patients

Model prediction ability

Six machine learning models, i.e., logistic regression, extra tree classifier, random forest, GBDT, XGBoost, and LGBM, were performed, and the AUCs and other performance indices, such as accuracy, F1 score, precision, recall and average precision achieved by the machine learning models after data augmentation are presented in Supplementary Table 1. The AUCs resulting from 5-fold cross-validation of XGBoost models with a mean of 0.984 (Supplementary Fig. 1). On the testing dataset, AUCs showed that the XGBoost model had the highest predictive ability, with an AUC of 0.953, followed by the extra tree model with an AUC of 0.952 (Fig. 1).

Fig. 1
figure 1

A Receiver operating characteristic curves and B precision–recall curves of machine learning models on the testing dataset. C XGBoost yielded the highest area under the ROC curve for prediction of end-stage renal disease followed by extra trees classifier and GBDT on the testing dataset. Abbreviations: ROC, receiver operating characteristic; PR, precision–recall; AUC, area under curve of receiver operating characteristic curve; A.precision, average precision; AUC PRC, area under curve of precision-recall curve; GBDT, gradient boosting decision tree; XGBoost, extreme gradient boosting; LGBM, light gradient boosting machine

Ranks of feature importance and SHAP values in the XGBoost model

We performed feature importance plots of the XGBoost model based on the SHAP values and listed the top important features sorted by the impacts in descending order (Fig. 2A). The top five important features were baseline serum creatinine, mean serum creatinine within 1 year before the diagnosis of T2DM, high-sensitivity C-reactive protein, UPCR and female gender. The impacts of feature importance on model output were also illustrated in the SHAP summary plot (Fig. 2B). Higher SHAP values of important features indicate a higher probability of impacts of the prediction in the XGBoost model. SHAP values in red dots indicate an increase in prediction, while those in blue dots indicate a decrease in prediction. Baseline serum creatinine, mean serum creatinine within 1 year before the diagnosis of T2DM, high-sensitivity C-reactive protein, and UPCR showed positive impacts on the prediction of developing ESRD risk, while the female gender showed a negative impact.

Fig. 2
figure 2

A The feature importance plot and B SHAP summary plot showed the top clinical important features for predicting risks of developing end-stage renal disease in the XGBoost model. Abbreviations: XGBoost, extreme gradient boosting; HSCRP, high-sensitivity C-reactive protein; UPCR, spot urine protein-to-creatinine ratio; ALT, alanine transaminase; DPP4i, dipeptidyl peptidase 4 inhibitors; HGB, hemoglobin; HbA1c, glycated hemoglobin; ALB, albumin; NSAID, nonsteroidal anti-inflammatory drug; HTN, hypertension; INR, international normalized ratio; PI, phosphate

The dependent plots of interactions between serum creatinine, high-sensitivity C-reactive protein, UPCR and female gender

As shown in Fig. 3, the dependent plots illustrated the SHAP values and the interactions between serum creatinine, high-sensitivity C-reactive protein, UPCR and female gender in the XGBoost model. The risks of developing ESRD increased as baseline or mean serum creatinine increased and then reached a plateau when creatinine > 5 mg/dL (Fig. 3A–B). Figure 3C–F illustrates the interaction between the SHAP values of baseline serum creatinine, mean serum creatinine, high-sensitivity C-reactive protein, UPCR and female gender. The values on the y-axis indicate the interaction SHAP values between baseline serum creatinine and other important features, and values on the x-axis are the levels of baseline serum creatinine. Mean serum creatinine, high-sensitivity C-reactive protein, UPCR and female gender were positively correlated with the predictive value of baseline serum creatinine.

Fig. 3
figure 3

The plots of SHAP value of (A) baseline serum creatinine and (B) mean serum creatinine within 1 year before diagnosis showed increased creatinine levels were associated with increased SHAP values. SHAP interaction plots showed the interaction impacts between baseline serum creatinine and (C) mean serum creatinine (D) HSCRP, (E) UPCR, and (F) female gender on the prediction model’s output. Abbreviations: SHAP, SHapley Additive exPlanations; HSCRP, high-sensitivity C-reactive protein; UPCR, spot urine protein-creatinine ratio


In the current study, we developed machine learning models to predict the development of ESRD among T2DM patients based on electronic medical records. We used the machine learning system to conduct feature selection and compare the AUCs among the different machine learning models. We found that the XGBoost model had the highest predictive performance with the highest AUC of 0.953 on the testing dataset compared to other machine learning algorithms. The top five important features were baseline serum creatinine, mean serum creatinine within 1 year before the diagnosis of T2DM, high-sensitivity C-reactive protein, UPCR and female gender.

Previous studies in nondiabetic populations have attempted to find useful markers to predict ESRD. A Norwegian large-scale general health study including 65,589 adults aged > 20 years from 1995 through 1997 established a clinical predictive model (incorporating age, gender, physical activity, diabetes, systolic blood pressure, antihypertensive medication, and high-density lipoprotein) for the future risk of ESRD, and the AUC reached 0.864 [33]. After adding albuminuria and eGFR, the AUC of the model was increased to 0.936. Ishani et al [34]. studied 12,866 men who were at high risk for heart disease and found that dipstick proteinuria, eGFR < 60 ml/min/1.73 m2, and hematocrit were related to the development of ESRD. Because the study populations were limited to nondiabetic populations, the findings of these studies may not be generalizable to T2DM groups. For diabetic patients, proteinuria [35, 36], diabetic retinopathy [37, 38], increased glycated hemoglobin levels [39], hypertension [40], and cardiovascular diseases [41, 42] may precede kidney function decline and have been demonstrated to be associated with renal function progression.

A customized software program for CKD risk identification in Australia (the Electronic Diagnosis and Management Assistance to Primary Care in Chronic Kidney Disease (eMAP:CKD) program) was developed to integrate primary care electronic health records from more than 150,000 patients [43]. After the initiation of the program, there was a significant improvement in CKD documentation from 0.48 to 1.55%. In addition, the proportions of at-risk patients diagnosed with CKD at 15 months were found to be significantly increased from 7.8 to 24.40%. Furthermore, recent studies have applied AI to predict the risks of CKD. Kanda et al. [44] conducted a study including 7465 subjects and found that AI models with support vector machine (SVM) models can help predict CKD progression in both high-risk and low-risk subjects. After the 3-year follow-up, the accuracy of the SVM models was increased. Chen et al. [45] used three different models, i.e., K-nearest neighbor (KNN), SVM, and soft independent modeling of class analogy (SIMCA), to analyze data from 386 patients with or without CKD for clinical risk assessment and achieved accuracies over 93%. In their study, KNN and SVM achieved better performance than SIMCA. Almansour et al. [46] studied data from 400 patients with the goal of diagnosing CKD at an early stage and found that artificial neural networks (accuracy: 99.75%) performed better than SVMs (accuracy: 97.75%).

Although several studies have developed machine-learning models to detect diabetes and diabetic complications, to date, only one machine learning model has been developed to detect renal function progression in diabetic patients. Makino et al. [13] conducted a longitudinal data analysis with big data representing diabetes patients with stage 1 to 2 diabetic nephropathy and found that logistic regression models can predict DKD aggravation with 71% accuracy. A higher risk of hemodialysis was associated with DKD aggravation than with nonaggravation. However, the study was limited to the early stage of DKD and a single machine learning model with logistic regression. In our study, we found that the machine learning XGBoost model predicted the risk of developing ESRD, achieving an AUC value of 0.953 on the testing dataset.

With a positive SHAP value, the machine learning models revealed that baseline serum creatinine showed the greatest impact on predicting the risk of developing ESRD. A previous study found that better baseline renal function was protective against renal function decline [47]. Our models also found that mean serum creatinine within 1 year before diagnosis of T2DM was an important predictor of developing ESRD. The possible explanation may be that mean serum creatinine is reflective of the usual renal status. According to the SHAP dependence plots, the interaction with high-sensitivity C-reactive protein increases the prediction of risks of developing ESRD. Elevated high-sensitivity C-reactive protein was found to be independently associated with an increased risk of renal function decline in patients with diabetes and the general non-diabetic population [48, 49]. Higher UPCR levels at the time of diagnosis of T2DM were also associated with higher risks of developing ESRD, which was similar to previous research that found a positive correlation between UPCR and ESRD [50]. In contrast, female gender was associated with lower SHAP values and decreased risks of developing ESRD. A previous study also found that renal function decline in women was slower compared to men among middle-aged and elderly individuals [51].

Our study has several strengths. We established a predictive model by inputting big EMR data into the machine learning algorithm. The novelty of this study is the use of a 10-year longitudinal cohort to predict the risk of developing ESRD in newly diagnosed T2DM patients with baseline median creatinine of 0.94 mg/dL. The machine learning algorithm compared discriminative ability among different machine learning models and selected the best models. This approach offers not only improvement in AUCs but also selection of the best predicting model in cases where it is unclear what machine learning models are most suitable. In addition, the SHAP algorithm was used to interpret the model predictions, and the impacts of important features on developing ESRD were explored. Using SHAP summary plots, we demonstrated the strength and direction of each feature (positive or negative effects).

Our study also has real and perceived limitations. First, as patient information, including demographic data, underlying comorbidities and concomitant medications, was obtained from electronic health record systems and coding procedures, we could not identify mild diseases without coding in T2DM patients. Second, the inclusion of data on the duration and frequency of laboratory visits was not uniform but varied among patients. Finally, the training data and testing data were from the same dataset. Further validation in other cohorts is necessary.


Our machine learning models employing longitudinal data from electronic health records were effective in predicting the risks of developing ESRD in T2DM patients in real-world clinical scenarios over a 10-year study period of observation. In addition, we used the SHAP method to provide explanations for the selected features to interpret model predictions. The developed model has the potential to predict the T2DM patients at increased risks for developing ESRD and thus, consequently initiating prevention or treatment plans for patients. In the future, external validation studies are necessary to convenient machine learning models to be developed for widespread use in clinical practice.

Availability of data and materials

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.


  1. Al-Lawati JA. Diabetes mellitus: a local and global public health emergency! Oman Med J. 2017;32:177–9.

    Article  PubMed  PubMed Central  Google Scholar 

  2. Ganasegeran K, Hor CP, Jamil MFA, Loh HC, Noor JM, Hamid NA, et al. A systematic review of the economic burden of type 2 diabetes in Malaysia. Int J Environ Res Public Health. 2020:17(16):5723.

  3. Nazimek-Siewniak B, Moczulski D, Grzeszczak W. Risk of macrovascular and microvascular complications in type 2 diabetes: results of longitudinal study design. J Diabetes Complicat. 2002;16:271–6.

    Article  Google Scholar 

  4. Krentz AJ, Clough G, Byrne CD. Interactions between microvascular and macrovascular disease in diabetes: pathophysiology and therapeutic implications. Diabetes Obes Metab. 2007;9:781–91.

    Article  CAS  PubMed  Google Scholar 

  5. Garla V, Kanduri S, Yanes-Cardozo L, Lién LF. Management of diabetes mellitus in chronic kidney disease. Minerva Endocrinol. 2019;44:273–87.

    Article  PubMed  Google Scholar 

  6. Navaneethan SD, Schold JD, Jolly SE, Arrigain S, Winkelmayer WC, Nally JV Jr. Diabetes control and the risks of ESRD and mortality in patients with CKD. Am J Kidney Dis. 2017;70:191–8.

    Article  PubMed  PubMed Central  Google Scholar 

  7. Żyłka A, Gala-Błądzińska A, Rybak K, Dumnicka P, Drożdż R, Kuśnierz-Cabala B. Role of new biomarkers for the diagnosis of nephropathy associated with diabetes type 2. Folia Med Cracov. 2015;55:21–33.

    PubMed  Google Scholar 

  8. Polonia J, Azevedo A, Monte M, Silva JA, Bertoquini S. Annual deterioration of renal function in hypertensive patients with and without diabetes. Vasc Health Risk Manag. 2017;13:231–7.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Hobeika L, Hunt KJ, Neely BA, Arthur JM. Comparison of the rate of renal function decline in NonProteinuric patients with and without diabetes. Am J Med Sci. 2015;350:447–52.

    Article  PubMed  PubMed Central  Google Scholar 

  10. Lim CTS, Nordin NZ, Fadhlina NZ, Anim MS, Kalaiselvam T, Haikal WZ, et al. Rapid decline of renal function in patients with type 2 diabetes with heavy proteinuria: a report of three cases. BMC Nephrol. 2019;20:22.

    Article  PubMed  PubMed Central  Google Scholar 

  11. Jiang W, Wang J, Shen X, Lu W, Wang Y, Li W, et al. Establishment and validation of a risk prediction model for early diabetic kidney disease based on a systematic review and Meta-analysis of 20 cohorts. Diabetes Care. 2020;43:925–33.

    Article  CAS  PubMed  Google Scholar 

  12. Arnold MH. Teasing out artificial intelligence in medicine: an ethical critique of artificial intelligence and machine learning in medicine. J Bioethical Inquiry. 2021:18(1):121–139.

  13. Makino M, Yoshimoto R, Ono M, Itoko T, Katsuki T, Koseki A, et al. Artificial intelligence predicts the progression of diabetic kidney disease using big data machine learning. Sci Rep. 2019;9:11862.

    Article  PubMed  PubMed Central  Google Scholar 

  14. Kuan AS, Chen TJ. Healthcare data research: the inception of the Taipei veterans general hospital big data center. J Chinese Med Assoc. 2019;82:679.

    Article  Google Scholar 

  15. Levey AS, Stevens LA, Schmid CH, Zhang YL, Castro AF 3rd, Feldman HI, et al. A new equation to estimate glomerular filtration rate. Ann Intern Med. 2009;150:604–12.

    Article  PubMed  PubMed Central  Google Scholar 

  16. Stevens JR, Suyundikov A, Slattery ML. Accounting for missing data in clinical research. JAMA. 2016;315:517–8.

    Article  PubMed  PubMed Central  Google Scholar 

  17. Li YM, Zhao P, Yang YH, Wang JX, Yan H, Chen FY. Simulation study on missing data imputation methods for longitudinal data in cohort studies. Zhonghua Liu Xing Bing Xue Za Zhi. 2021;42:1889–94.

    CAS  PubMed  Google Scholar 

  18. Kibria HB, Nahiduzzaman M, Goni MOF, Ahsan M, Haider J. An ensemble approach for the prediction of diabetes mellitus using a soft voting classifier with an explainable AI. Sensors. Basel. 2022;22(19):7268.

  19. Ijaz MF, Attique M, Son Y. Data-driven cervical Cancer prediction model with outlier detection and over-sampling methods. Sensors. Basel. 2020;15;20(10):2809.

  20. Ghiasi MM, Zendehboudi S. Application of decision tree-based ensemble learning in the classification of breast cancer. Comput Biol Med. 2021;128:104089.

    Article  PubMed  Google Scholar 

  21. Rigatti SJ. Random Forest. J Insur Med. 2017;47:31–9.

    Article  PubMed  Google Scholar 

  22. Natekin A, Knoll A. Gradient boosting machines, a tutorial. Front Neurorobot. 2013;7:21.

    Article  PubMed  PubMed Central  Google Scholar 

  23. Babajide Mustapha I, Saeed F. Bioactive molecule prediction using extreme gradient boosting. Molecules. 2016;28;21(8):983.

  24. Karabayir I, Goldman SM, Pappu S, Akbilgic O. Gradient boosting for Parkinson's disease diagnosis from voice recordings. BMC Med Inform Decis Mak. 2020;20:228.

    Article  PubMed  PubMed Central  Google Scholar 

  25. Rahman MM, Usman OL, Muniyandi RC, Sahran S, Mohamed S, Razak RA. A review of machine learning methods of feature selection and classification for autism Spectrum disorder. Brain Sci. 2020:7;10(12):949.

  26. Kulan H, Dag T. In silico identification of critical proteins associated with learning process and immune system for Down syndrome. PLoS One. 2019;14:e0210954.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  27. Jung Y, Hu J. A K-fold averaging cross-validation procedure. J Nonparametric Stat. 2015;27:167–79.

    Article  Google Scholar 

  28. Little MA, Varoquaux G, Saeb S, Lonini L, Jayaraman A, Mohr DC, et al. Using and understanding cross-validation strategies. Perspectives on Saeb et al GigaScience. 2017;6:1–6.

    PubMed  Google Scholar 

  29. Chadha A, Kaushik B. A hybrid deep learning model using grid search and cross-validation for effective classification and prediction of suicidal ideation from social network data. N Gener Comput. 2022;40:1–26.

    Google Scholar 

  30. Diao X, Huo Y, Zhao S, Yuan J, Cui M, Wang Y, et al. Automated ICD coding for primary diagnosis via clinically interpretable machine learning. Int J Med Inform. 2021;153:104543.

    Article  PubMed  Google Scholar 

  31. Jiang X, Xu C. Deep learning and machine learning with grid search to predict later occurrence of breast Cancer metastasis using clinical data. J Clin Med. 2022:29;11(19):5772.

  32. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;2825–30.

  33. Hallan SI, Ritz E, Lydersen S, Romundstad S, Kvenild K, Orth SR. Combining GFR and albuminuria to classify CKD improves prediction of ESRD. J Am Soc Nephrol. 2009;20:1069–77.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  34. Ishani A, Grandits GA, Grimm RH, Svendsen KH, Collins AJ, Prineas RJ, et al. Association of single measurements of dipstick proteinuria, estimated glomerular filtration rate, and hematocrit with 25-year incidence of end-stage renal disease in the multiple risk factor intervention trial. J Am Soc Nephrol. 2006;17:1444–52.

    Article  CAS  PubMed  Google Scholar 

  35. Hovind P, Rossing P, Tarnow L, Smidt UM, Parving HH. Progression of diabetic nephropathy. Kidney Int. 2001;59:702–9.

    Article  CAS  PubMed  Google Scholar 

  36. Viberti GC, Hill RD, Jarrett RJ, Argyropoulos A, Mahmud U, Keen H. Microalbuminuria as a predictor of clinical nephropathy in insulin-dependent diabetes mellitus. Lancet. 1982;1:1430–2.

    Article  CAS  PubMed  Google Scholar 

  37. Trevisan R, Vedovato M, Mazzon C, Coracina A, Iori E, Tiengo A, et al. Concomitance of diabetic retinopathy and proteinuria accelerates the rate of decline of kidney function in type 2 diabetic patients. Diabetes Care. 2002;25:2026–31.

    Article  PubMed  Google Scholar 

  38. Sakata M, Oniki K, Kita A, Kajiwara A, Uchiyashiki Y, Saruwatari J, et al. Clinical features associated with a rapid decline in renal function among Japanese patients with type 2 diabetes mellitus: microscopic hematuria coexisting with diabetic retinopathy. Diabetes Res Clin Pract. 2013;100:e39–41.

    Article  PubMed  Google Scholar 

  39. Bash LD, Selvin E, Steffes M, Coresh J, Astor BC. Poor glycemic control in diabetes and the risk of incident chronic kidney disease even in the absence of albuminuria and retinopathy: atherosclerosis risk in communities (ARIC) study. Arch Intern Med. 2008;168:2440–7.

    Article  PubMed  PubMed Central  Google Scholar 

  40. Bakris GL, Weir MR, Shanifar S, Zhang Z, Douglas J, van Dijk DJ, et al. Effects of blood pressure level on progression of diabetic nephropathy: results from the RENAAL study. Arch Intern Med. 2003;163:1555–65.

    Article  PubMed  Google Scholar 

  41. Pálsson R, Patel UD. Cardiovascular complications of diabetic kidney disease. Adv Chronic Kidney Dis. 2014;21:273–80.

    Article  PubMed  PubMed Central  Google Scholar 

  42. Jenks SJ, Conway BR, McLachlan S, Teoh WL, Williamson RM, Webb DJ, et al. Cardiovascular disease biomarkers are associated with declining renal function in type 2 diabetes. Diabetologia. 2017;60:1400–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  43. Pefanis A, Botlero R, Langham RG, Nelson CL. eMAP:CKD: electronic diagnosis and management assistance to primary care in chronic kidney disease. Nephrol Dial Transplant. 2018;33:121–8.

    PubMed  Google Scholar 

  44. Kanda E, Kanno Y, Katsukawa F. Identifying progressive CKD from healthy population using Bayesian network and artificial intelligence: a worksite-based cohort study. Sci Rep. 2019;9:5082.

    Article  PubMed  PubMed Central  Google Scholar 

  45. Chen Z, Zhang X, Zhang Z. Clinical risk assessment of patients with chronic kidney disease by using clinical data and multivariate models. Int Urol Nephrol. 2016;48:2069–75.

    Article  PubMed  Google Scholar 

  46. Almansour NA, Syed HF, Khayat NR, Altheeb RK, Juri RE, Alhiyafi J, et al. Neural network and support vector machine for the prediction of chronic kidney disease: a comparative study. Comput Biol Med. 2019;109:101–11.

    Article  PubMed  Google Scholar 

  47. Takagi M, Babazono T, Uchigata Y. Differences in risk factors for the onset of albuminuria and decrease in glomerular filtration rate in people with type 2 diabetes mellitus: implications for the pathogenesis of diabetic kidney disease. Diabet Med. 2015;32:1354–60.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  48. Liu L, Gao B, Wang J, Yang C, Wu S, Wu Y, et al. Clinical significance of single and persistent elevation of serum high-sensitivity C-reactive protein levels for prediction of kidney outcomes in patients with impaired fasting glucose or diabetes mellitus. J Nephrol. 2021;34:1179–88.

    Article  CAS  PubMed  Google Scholar 

  49. Schei J, Stefansson VT, Eriksen BO, Jenssen TG, Solbu MD, Wilsgaard T, et al. Association of TNF receptor 2 and CRP with GFR decline in the general nondiabetic population. Clin J Am Soc Nephrol. 2017;12:624–34.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  50. Ying T, Clayton P, Naresh C, Chadban S. Predictive value of spot versus 24-hour measures of proteinuria for death, end-stage kidney disease or chronic kidney disease progression. BMC Nephrol. 2018;19:55.

    Article  PubMed  PubMed Central  Google Scholar 

  51. Melsom T, Norvik JV, Enoksen IT, Stefansson V, Mathisen UD, Fuskevåg OM, et al. Sex differences in age-related loss of kidney function. J Am Soc Nephrol. 2022;33:1891–902.

    Article  CAS  PubMed  Google Scholar 

Download references


Not applicable.


This work was supported in part by the Ministry of Science and Technology, Taiwan (MOST 106–2314-B-010-039-MY3, MOST 107–2314-B-075-052, MOST 108–2314- B-075-008, NSTC 109–2314-B-075-067-MY3, MOST 109–2320-B-075-006, MOST 109–2314-B-075-097-MY3, MOST 110–2312-B-075-002, MOST 110–2634-F-A49–005, MOST 110–2320-B-075- 004-MY3, NSTC 111–2634-F-A49-014); Taipei Veterans General Hospital (V107B- 027, V108B-023, V108C-103, V108D42–004-MY3–2, V109B-022, V109C-114, V109D50–001-MY3–1, V109D50–001-MY3–2, V109D50–001-MY3–3, V109D50–002-MY3–3, V109E-008- 5(110), V110C-152, V110E-003–2,V110E-003–2, V111E-002-3,V111C-171, V111C-151, V111D60–004-MY3–1, V112C-175, V112D66–002-MY2–1, V112E-001-2); Taipei Veterans General Hospital-National Yang-Ming University Excellent Physician Scientists Cultivation Program (No.104-V-B-044). Taipei, Taichung, Kaohsiung Veterans General Hospital, Tri-Service General Hospital, Academia Sinica Joint Research Program (VTA110-V1–3-1) and Foundation for Poison Control (FPC-109-002). The funders did not play any role in the study design, data collection or analysis, decision to publish, or preparation of the manuscript.

Author information

Authors and Affiliations



Conception and study design: Shuo-Ming Ou, Ming-Tsun Tsai, Kuo-Hua Lee, Wei-Cheng Tseng, Chih-Yu Yang, Tz-Heng Chen, Pin-Jie Bin, Tzeng-Ji Chen, Yao-Ping Lin, Wayne Huey-Herng Sheu, Yuan-Chia Chu, Der-Cherng Tarng; data acquisition: Shuo-Ming Ou, Yuan-Chia Chu, Der-Cherng Tarng; data analysis and interpretation: Shuo-Ming Ou, Ming-Tsun Tsai, Kuo-Hua Lee, Yuan-Chia Chu, Der-Cherng Tarng; statistical analysis: Shuo-Ming Ou, Yuan-Chia Chu; drafting of the manuscript: Shuo-Ming Ou, Yuan-Chia Chu, Der-Cherng Tarng. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Yuan-Chia Chu or Der-Cherng Tarng.

Ethics declarations

Ethics approval and consent to participate

The electronic data for this study was approval from the Institutional Review Board (Taipei Veterans General Hospitals, Approval no. 2022–03-006 AC) and informed consent was waived due to the de-identified data were analyzed.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The corresponding authors have the right to grant on behalf of all authors and do grant on behalf of all authors a worldwide license to the Publishers and its licensees in perpetuity.

All authors take responsibility for all aspects of the reliability and freedom from bias of the data presented and their discussed interpretation.

Supplementary Information

Additional file 1: Supplementary Table 1.

The performance of machine learning models after data augmentation for predicting the risk of end-stage renal disease in newly diagnosed type 2 diabetes mellitus. Supplementary Figure 1. Area under the receiver operating characteristic curve for the 5-fold cross-validation of the XGBoost machine learning models.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Ou, SM., Tsai, MT., Lee, KH. et al. Prediction of the risk of developing end-stage renal diseases in newly diagnosed type 2 diabetes mellitus using artificial intelligence algorithms. BioData Mining 16, 8 (2023).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI:


  • Artificial intelligence
  • Diabetes mellitus
  • End-stage renal disease
  • Machine learning