Original Paper
Abstract
Background: Most artificial intelligence–based research on acute kidney injury (AKI) prediction has focused on intensive care unit settings, limiting their generalizability to general wards. The lack of standardized AKI definitions and reliance on intensive care units further hinder the clinical applicability of these models.
Objective: This study aims to develop and validate a machine learning–based framework to assist in managing AKI and acute kidney disease (AKD) in general ward patients, using a refined operational definition of AKI to improve predictive performance and clinical relevance.
Methods: This retrospective multicenter cohort study analyzed electronic health record data from 3 hospitals in South Korea. AKI and AKD were defined using a refined version of the Kidney Disease: Improving Global Outcomes criteria, which included adjustments to baseline serum creatinine estimation and a stricter minimum increase threshold to reduce misclassification due to transient fluctuations. The primary outcome was the development of machine learning models for early prediction of AKI (within 3 days before onset) and AKD (nonrecovery within 7 days after AKI).
Results: The final analysis included 135,068 patients. A total of 7658 (8%) patients in the internal cohort and 2898 (7.3%) patients in the external cohort developed AKI. Among the 5429 patients in the internal cohort and 1998 patients in the external cohort for whom AKD progression could be assessed, 896 (16.5%) patients and 287 (14.4%) patients, respectively, progressed to AKD. Using the refined criteria, 2898 cases of AKI were identified, whereas applying the standard Kidney Disease: Improving Global Outcomes criteria resulted in the identification of 5407 cases. Among the 2509 patients who were not classified as having AKI under the refined criteria, 2242 had a baseline serum creatinine level below 0.6 mg/dL, while the remaining 267 experienced a decrease in serum creatinine before the onset of AKI. The final selected early prediction model for AKI achieved an area under the receiver operating characteristic curve of 0.9053 in the internal cohort and 0.8860 in the external cohort. The early prediction model for AKD achieved an area under the receiver operating characteristic curve of 0.8202 in the internal cohort and 0.7833 in the external cohort.
Conclusions: The proposed machine learning framework successfully predicted AKI and AKD in general ward patients with high accuracy. The refined AKI definition significantly reduced the classification of patients with transient serum creatinine fluctuations as AKI cases compared to the previous criteria. These findings suggest that integrating this machine learning framework into hospital workflows could enable earlier interventions, optimize resource allocation, and improve patient outcomes.
doi:10.2196/66568
Introduction
Acute kidney injury (AKI) is an escalating critical health and socioeconomic issue, marked by prolonged hospital stays, elevated medical costs, and high mortality rates. AKI is a secondary condition arising from hospital interventions, such as medication, surgery, and infections, with a prevalence of 10%-15% among general inpatients. Studies have shown that the duration of AKI correlates with increased risks of complications and mortality. AKI and its progression to acute kidney disease (AKD) are associated with significant increases in postdischarge morbidity and mortality. The prolonged duration of AKI in general wards has been associated with higher risks of complications and increased mortality. Early prediction of AKI and its progression to AKD using artificial intelligence (AI) models can improve patient outcomes through timely intervention and personalized management strategies. Research findings also indicate that applying a clinical decision support system for actual AKI occurrences has significantly improved patient outcomes. However, there is a lack of standardized operational definitions for AKI, particularly in retrospective studies, where data collection may not be systematic. This inconsistency in defining AKI, especially in terms of baseline serum creatinine (SCr) and recovery criteria for AKD, leads to ambiguity in labeling and makes it difficult to compare the performance of different AI models across studies. The variability in baseline SCr determination further complicates the issue, resulting in differing AKI labels for the same patient. Studies that investigated the impact of baseline SCr on model performance are limited to the intensive care unit (ICU) setting and do not address real-time early prediction in general wards. Defining the recovery for AKD is even more challenging.

General ward patients typically exhibit milder disease severity, less frequent laboratory measurements, and different baseline characteristics compared with ICU patients. Applying the criterion of SCr being at least 1.5 times the baseline requires more caution, especially in cases with low SCr levels. For instance, in a patient with 3 SCr measurements within 48 hours showing values of 0.9 mg/dL, 0.6 mg/dL, and 0.9 mg/dL in chronological order, additional consideration may be needed to determine whether the patient should be classified as experiencing AKI. Furthermore, when the baseline SCr is set below 0.6 mg/dL, even small fluctuations can easily meet the 1.5-fold criterion within 7 days, leading to the classification of AKI occurrence. This study aimed to develop and validate a machine learning-based framework for early prediction of AKI and AKD, applicable to general ward patients, using a refined operational definition of AKI.
Methods
Study Setting and Participants
This multicenter, retrospective study was conducted across the general wards of 3 different hospitals. Data were collected from patients admitted to Korea University Guro Hospital and Anam Hospital between January 1, 2015, and December 31, 2021 (internal cohort), and from Soonchunhyang University Cheonan Hospital between March 1, 2016, and March 31, 2021 (external cohort). Patients younger than 19 years, those with fewer than 3 SCr measurements during their hospital stay, and those with an estimated glomerular filtration rate (eGFR) ≤60 mL/minute on the first day of SCr measurement were excluded.
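The exclusion rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: eGFR is computed with the CKD-EPI 2021 creatinine equation (the equation the preprocessing description names), the first listed SCr value stands in for the first day of SCr measurement, and the record fields (`age`, `sex`, `scr_values`) and function names are hypothetical rather than taken from the study's code.

```python
def egfr_ckd_epi_2021(scr_mg_dl, age, sex):
    """CKD-EPI 2021 creatinine equation, in mL/min/1.73 m^2."""
    female = sex == "F"
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = 142 * min(ratio, 1) ** alpha * max(ratio, 1) ** -1.200 * 0.9938 ** age
    return egfr * 1.012 if female else egfr


def is_eligible(patient):
    """Apply the three exclusion rules to one hypothetical patient record."""
    if patient["age"] < 19:
        return False  # younger than 19 years
    if len(patient["scr_values"]) < 3:
        return False  # fewer than 3 SCr measurements during the stay
    first_egfr = egfr_ckd_epi_2021(
        patient["scr_values"][0], patient["age"], patient["sex"]
    )
    return first_egfr > 60  # eGFR <= 60 at the first SCr measurement is excluded
```

For example, a 45-year-old man whose first SCr is 0.9 mg/dL has an eGFR above 100 and stays in the cohort, whereas a 70-year-old man with a first SCr of 2.0 mg/dL is excluded.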
Operational Definition
AKI was defined based on the Kidney Disease: Improving Global Outcomes (KDIGO) criteria as follows:
- An increase in SCr level ≥0.3 mg/dL (or ≥26.5 μmol/L) within 48 h.
- An increase in SCr ≥1.5 times the baseline within 7 days.
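As a minimal sketch, the two SCr-based criteria listed above can be written as a single check. The function name, its arguments, and the choice of the lowest value in the preceding 48 hours as the reference for the absolute rise are illustrative assumptions, not details from the study.

```python
EPS = 1e-9  # tolerance for floating-point comparisons at the thresholds


def kdigo_aki(current_scr, lowest_scr_48h, baseline_scr_7d):
    """Return True if either SCr-based criterion listed above is met.

    current_scr: latest SCr (mg/dL)
    lowest_scr_48h: lowest SCr in the preceding 48 hours (mg/dL), assumed reference
    baseline_scr_7d: baseline SCr from the preceding 7 days (mg/dL)
    """
    absolute_rise = current_scr - lowest_scr_48h >= 0.3 - EPS
    relative_rise = current_scr >= 1.5 * baseline_scr_7d - EPS
    return absolute_rise or relative_rise
```

With a baseline of 0.5 mg/dL, a value of 0.75 mg/dL already satisfies the 1.5-fold rule even though the absolute change is only 0.25 mg/dL, which is the kind of low-baseline case the text cautions about.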
The KDIGO criteria are widely used; however, they are sometimes modified based on the characteristics of the cohort or the objectives of the study. Compared with ICU patients, general ward patients are in relatively better condition, with less frequent measurements of vital signs and laboratory data. Additionally, individuals with healthy kidneys and low SCr levels can be identified as having AKI due to simple fluctuations. There may be more bias in real-time predictions, as SCr estimation is more frequently performed for patients with missing SCr values. Therefore, to prevent these issues, we restricted the application of baseline SCr levels and refined the labeling. Baseline SCr was defined as the lowest SCr measured within the previous 7 days. If this lowest value was ≥0.3 mg/dL lower than the median and recent SCr values, it was excluded as a baseline to avoid errors due to measurement inaccuracies or temporary fluctuations caused by medication and other factors. AKI required a minimum SCr increase of 0.3 mg/dL to ensure appropriate identification without mislabeling due to minor variations. AKD was defined as the persistence of AKI for more than 7 days: cases in which SCr did not return to <1.5 times the baseline SCr within 7 days were identified as AKD. None of the previous definitions or studies have taken these aspects into consideration.

To develop an early prediction model for AKI occurrence, data from 1 to 3 days before AKI onset were labeled as 1, and the remaining data were labeled as 0. For this model, the data from the day of AKI onset were not used. For patients who did not develop AKI, all data were labeled as 0. Days without SCr measurements within 7 days were excluded from training and evaluation. Additionally, patients with ambiguous AKI labeling were excluded from the training and evaluation. To predict AKD progression, data from the day of AKI onset were used. Patients were labeled 0 or 1 based on whether they recovered within 7 days after AKI onset.
Patients with insufficient SCr measurements after AKI onset were excluded.
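The refined labeling rules described above can be sketched as follows. This is one possible reading, not the study's implementation: the text does not say what replaces an excluded baseline, so trying the next-lowest value is an assumption, as are treating post-onset days as label 0 and all function names.

```python
from statistics import median

EPS = 1e-9  # tolerance for floating-point comparisons at the thresholds


def refined_baseline(prior_scr):
    """Pick a baseline from SCr values (mg/dL, chronological) of the previous 7 days.

    Assumption: a candidate that is >= 0.3 mg/dL below both the window median
    and the most recent value is skipped, and the next-lowest value is tried.
    """
    if not prior_scr:
        return None
    window_median = median(prior_scr)
    most_recent = prior_scr[-1]
    for candidate in sorted(prior_scr):
        below_median = window_median - candidate >= 0.3 - EPS
        below_recent = most_recent - candidate >= 0.3 - EPS
        if not (below_median and below_recent):
            return candidate
    return None  # not reached: the largest value is never below the median


def refined_aki(current_scr, baseline_scr):
    """1.5-fold rise that must also be an absolute rise of >= 0.3 mg/dL."""
    relative = current_scr >= 1.5 * baseline_scr - EPS
    absolute = current_scr - baseline_scr >= 0.3 - EPS
    return relative and absolute


def label_days(n_days, onset_day=None):
    """Label hospital days 0..n_days-1 for the early AKI model.

    The 1-3 days before onset are 1, the onset day is dropped, and every
    other day (including post-onset days, by assumption) is 0.
    """
    labels = {}
    for day in range(n_days):
        if onset_day is not None and day == onset_day:
            continue
        positive = onset_day is not None and 1 <= onset_day - day <= 3
        labels[day] = 1 if positive else 0
    return labels
```

Under this reading, a single dip to 0.6 mg/dL among values near 1.0 mg/dL is not used as the baseline, and a rise from 0.5 to 0.75 mg/dL is no longer labeled AKI because the increase is below 0.3 mg/dL.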
The accompanying figure illustrates the previous KDIGO criteria, the improved AKI criteria proposed in this study, and examples of AKI and AKD labeling.
Data Preprocessing
The data collected from the electronic health records contained numerous missing values, including measurement values, times, and specific variables. To address this issue, data were summarized at 24-hour intervals, a method validated in previous studies. Vital signs, laboratory data, and variables such as nephrotoxic drugs (eg, nonsteroidal anti-inflammatory drugs, nephrotoxic antibiotics [ANTIs], and cytotoxic chemotherapeutic agents), vascular imaging examinations, surgeries under general anesthesia, contrast-enhanced computed tomography (CECT), and ICU transfers within the past 7 days were collected.

Vital sign data measured multiple times within 24 hours were summarized using mean, maximum, and minimum values, and the number of measurements. The laboratory test results were determined based on the most recent measurements. Additionally, variables such as nephrotoxic drugs, vascular imaging examinations, surgeries under general anesthesia, CECT, and ICU transfers within the past 7 days were included. eGFR was calculated using the Chronic Kidney Disease Epidemiology Collaboration 2021 equation. The “BUN/Cr ratio” was defined as blood urea nitrogen (BUN) level divided by serum creatinine level.

Robust scaling was applied to all continuous variables and one-hot encoding was used for categorical variables. Approximately 120 features were extracted from the electronic health records, including basic patient information, vital sign data, laboratory test results, and other factors. Through a comprehensive literature review and consultation with specialists, a feature selection process was undertaken, and 42 features were ultimately selected based on their correlation coefficients and missing value ratios.

To handle outliers, the data distribution for each feature was reviewed along with the individual patient records. Outliers for some numerical variables were determined using histograms and quantiles with input from clinical experts. The missing values were imputed in 2 stages. First, where feasible, missing values were replaced with previous values to maintain data continuity. Second, for variables with a low missing rate (less than 20%), the multiple imputation by chained equations method was used. For variables with a missing rate exceeding 20%, the missing indicator method was used to denote missing values as unknown. Numerical variables were categorized into 2 to 4 groups based on data distribution and expert consultation, and missing values were assigned to the “missing” category. For laboratory test results not subjected to missing indicators, changes in each variable were calculated by subtracting the median of the previous values from the current value.
Model Training and Evaluation
Various traditional machine learning models were used, including logistic regression, random forest, eXtreme gradient boosting, light gradient boosting machine, and categorical boosting (CAT). Detailed model training procedures are provided in the supplementary material. The model evaluation metrics included accuracy, precision, recall, specificity, F1-score, the area under the receiver operating characteristic curve (AUROC), and the area under the precision-recall curve. The primary outcomes were the development of models for early AKI prediction within 3 days and its progression to AKD, along with the establishment of a framework. Simulations of the designed framework were repeatedly performed using the external validation cohort by dividing the period into 1-year increments and assessing the generalization performance of the models across different institutions and over time.
Statistical Analysis
Descriptive statistics were used to present the baseline differences between the days with and without AKI. The distribution of continuous and categorical variables was expressed as means and SDs and counts and percentages, respectively. Normally distributed continuous variables were evaluated using 2-tailed t tests. For those that did not follow a normal distribution, the Mann-Whitney U test was used. The chi-square test was used to analyze categorical variables. Statistical significance was set at P<.05. Calibration plots were used to assess the agreement between the predicted probabilities and observed outcomes. The Cox proportional hazards model was used to compare hazard ratios (HRs) between patients with actual AKD and those predicted by the model.
Ethical Considerations
The study was conducted in accordance with the ethical principles of the Declaration of Helsinki and was approved by the institutional review boards of Soonchunhyang University Cheonan Hospital, Korea University Anam Hospital, and Guro Hospital (approvals 2019-10-023, 2023AN0145, and 2023GR0425, respectively). The need for individual consent was waived due to the retrospective nature of the study and the use of anonymized clinical data. To ensure privacy and data security, only fully anonymized data were used, and all analyses were conducted within a designated secure environment with restricted access. No financial compensation was provided to participants, as the study was purely observational and used deidentified retrospective data. All processes adhered to the guidelines developed for machine learning model development in the biomedical field and followed the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines for observational studies.
Results
Labeling
The accompanying figure is a flowchart of cohort formation and labeling. To compare the previous labeling criteria with the labeling criteria used in this study, we developed and evaluated the model using the same methodology. Additionally, we calculated the HRs for a 30% or 40% reduction in eGFR at the time of AKI onset. In Table S7 in the supplementary material, individuals with a baseline SCr<0.6 mg/dL were included, leading to the classification of AKI in patients with very low changes in SCr or very low baseline SCr levels. Additionally, subtracting the baseline median from the SCr at this point yielded a median of 0.10 (IQR 0.00-0.20).
Panel A of the accompanying figure illustrates this trend. Of the 2509 patients classified as having AKI only under the previous criteria, 2242 (89.36%) had a baseline SCr<0.6 mg/dL at the time of AKI occurrence. The remaining 267 patients experienced decreased SCr levels before being diagnosed with AKI. This trend is illustrated in Figure S5 in the supplementary material. Additionally, of the 2072 patients whose AKD status could be determined, only 50 (2.41%) progressed to AKD, and only 1 patient experienced a more than 30% decrease in eGFR within 30 days of AKI onset.


Early AKI Prediction Model
The final analysis included 95,555 and 39,513 cases in the internal and external cohorts, respectively. Within these cohorts, AKI was identified in 7658 (8.1%) and 2898 (7.3%) patients, respectively. The median length of hospital stay was 11 (IQR 7-20) days for the internal cohort and 10 (IQR 6-18) days for the external cohort. The median number of days to AKI occurrence from admission was 8 (IQR 3-17) days for the internal cohort and 7 (IQR 3-16) days for the external cohort. Tables S9 and S10 in the supplementary material present the basic statistics of patients with and without AKI. Comparing the days on which AKI occurred to the days on which it did not, there were statistically significant differences in most characteristics in both the internal and external cohorts. However, differences in blood pressure and sodium and chloride levels were observed between the internal and external cohorts.

The models were evaluated based on the results of 5-fold cross-validation, with all performance metrics presented at a cutoff of 0.5. The CAT model demonstrated a strong predictive capability (AUROC=0.9134). The logistic regression model performed poorly (AUROC=0.7754). Model evaluation was conducted through both internal and external validations; external validation showed a slight decrease in AUROC relative to internal validation (range 0.0097-0.0216).

The table below presents the performance evaluation results of the early AKI prediction model. The hyperparameters used in the model are listed in Table S11 in the supplementary material. The accompanying figure shows the Shapley additive explanations (SHAP) analysis for the early AKI prediction model. The interpretation of the CAT model using SHAP values indicated that the most important features were the SCr level and its changes. Other significant features included heart rate, BUN level, activated partial thromboplastin time (aPTT), total bilirubin level, and BMI, all of which increased the AKI risk. Conversely, lower eGFR, platelet count, body temperature, and pH were associated with a higher risk of AKI. Figure S7 in the supplementary material shows the results of testing the probability of the early AKI prediction model. Calibration plots were generated for both the internal and external cohorts. The slope of the calibration plot was 1.15 for the internal cohort and 1.08 for the external cohort, indicating a good calibration of the model's predicted probabilities. Additionally, the model probabilities were evaluated by dividing them into five groups, which demonstrated excellent calibration. A box plot of the probabilities in relation to the AKI status showed a clear distinction between AKI and non-AKI, further validating the predictive value of the model (Figure S8 in the supplementary material).
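The five-group check mentioned above can be illustrated with a short sketch that compares the mean predicted probability with the observed event rate in each risk group. How the study formed its groups and derived the reported slopes is not stated, so the equal-sized groups and the function name here are assumptions.

```python
def calibration_by_groups(probs, outcomes, n_groups=5):
    """Return (mean predicted probability, observed event rate) per risk group.

    Samples are sorted by predicted probability and split into n_groups
    nearly equal parts; the last group absorbs any remainder.
    """
    if len(probs) != len(outcomes) or len(probs) < n_groups:
        raise ValueError("need matching inputs with at least n_groups samples")
    pairs = sorted(zip(probs, outcomes))
    size = len(pairs) // n_groups
    groups = []
    for g in range(n_groups):
        start = g * size
        end = (g + 1) * size if g < n_groups - 1 else len(pairs)
        chunk = pairs[start:end]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        groups.append((mean_pred, observed))
    return groups
```

A well-calibrated model yields pairs that lie close to the diagonal, that is, the observed rate in each group is close to the mean predicted probability.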
Validation and model | Accuracy | Precision | Recall | F1-score | AUROC^a | AUPRC^b
Cross-validation, mean (SD)
LR^c | 0.9490 (0.0004) | 0.3764 (0.0193) | 0.0747 (0.0074) | 0.1246 (0.0104) | 0.7754 (0.0049) | 0.1810 (0.0045)
RF^d | 0.9739 (0.0004) | 0.9875 (0.0020) | 0.4687 (0.0091) | 0.6356 (0.0080) | 0.9076 (0.0060) | 0.6830 (0.0076)
XGB^e | 0.9720 (0.0006) | 0.8149 (0.0148) | 0.5485 (0.0091) | 0.6555 (0.0066) | 0.9070 (0.0057) | 0.6830 (0.0062)
LGBM^f | 0.9734 (0.0006) | 0.8679 (0.0184) | 0.5362 (0.0066) | 0.6627 (0.0064) | 0.9132 (0.0053) | 0.6916 (0.0061)
CAT^g | 0.9747 (0.0005) | 0.9423 (0.0095) | 0.5115 (0.0093) | 0.6630 (0.0078) | 0.9134 (0.0053) | 0.6924 (0.0059)
Internal
LR | 0.9485 | 0.3728 | 0.0709 | 0.1192 | 0.7584 | 0.1712
RF | 0.9736 | 0.9902 | 0.4675 | 0.6352 | 0.9021 | 0.6826
XGB | 0.9724 | 0.8322 | 0.5490 | 0.6616 | 0.9027 | 0.6834
LGBM | 0.9738 | 0.8863 | 0.5348 | 0.6671 | 0.9058 | 0.6910
CAT | 0.9749 | 0.9527 | 0.5142 | 0.6680 | 0.9053 | 0.6890
External
LR | 0.9560 | 0.3612 | 0.0865 | 0.1395 | 0.7487 | 0.1630
RF | 0.9769 | 0.9789 | 0.4502 | 0.6168 | 0.8833 | 0.6303
XGB | 0.9707 | 0.6938 | 0.5208 | 0.5950 | 0.8811 | 0.6196
LGBM | 0.9732 | 0.7651 | 0.5071 | 0.6099 | 0.8853 | 0.6238
CAT | 0.9755 | 0.8644 | 0.4823 | 0.6191 | 0.8860 | 0.6290
^a AUROC: area under the receiver operating characteristic curve.
^b AUPRC: area under the precision-recall curve.
^c LR: logistic regression.
^d RF: random forest.
^e XGB: eXtreme gradient boosting.
^f LGBM: light gradient boosting machine.
^g CAT: categorical boosting.
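For readers who want to reproduce the kind of metrics reported in the table above, the following is a minimal pure-Python sketch. It assumes binary labels and predicted probabilities as plain lists; the function names are illustrative, and the study's own tooling is not described in the text.

```python
def threshold_metrics(y_true, y_prob, cutoff=0.5):
    """Accuracy, precision, recall, and F1-score at a probability cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= cutoff else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auroc(y_true, y_prob):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

As a consistency check on the table, a precision of 0.9527 and a recall of 0.5142 give an F1-score of about 0.668, which matches the internal CAT row. The combination of high accuracy and modest recall is what one would expect at a 0.5 cutoff when positive days are rare.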

Early AKD Prediction Model
For the analysis of AKD progression, 5429 and 1998 patients in the internal and external cohorts, respectively, were included after excluding those with unclear AKD status. Among them, 896 (16.5%) patients in the internal cohort and 287 (14.4%) in the external cohort were diagnosed with AKD. The median duration from AKI to discharge was 10 (IQR 4-23) days in the internal cohort and 9 (IQR 3-21) days in the external cohort. For patients who recovered within 7 days after AKI, the median recovery time was 5 (IQR 3-7) days in both cohorts (Figure S9 in the supplementary material). The characteristics of progression to AKD in each cohort are presented in Tables S12 and S13 in the supplementary material. Variables that demonstrated statistical significance in both cohorts included hemoglobin, albumin, SCr, eGFR, glucose, alanine aminotransferase, blood sugar, calcium, urine specific gravity, C-reactive protein (CRP), ANTIs, and surgery. Some variables such as sex, diastolic blood pressure (DBP), white blood cell (WBC) count, BUN/Cr ratio, and total carbon dioxide showed different trends between the cohorts.

The CAT model achieved the highest AUROC in the internal validation (AUROC=0.8202) and was essentially on par with the best-performing model in the external validation (AUROC=0.7833 vs 0.7837 for random forest). The external validation showed a slight decrease in performance based on the AUROC (range 0.0262-0.0416).

The table below presents the performance evaluation results of the early prediction model for AKD. The hyperparameters used in the model are listed in Table S14 in the supplementary material. The accompanying figure shows the SHAP analysis of the early-stage AKD prediction model. For AKD prediction, an increase in SCr level at the time of AKI was a strong predictor of delayed recovery. Exposure to ANTIs, high DBP, CRP, aPTT, low eGFR, low pH, and elevated alkaline phosphatase were associated with an increased risk of AKD. Patients who underwent surgery or CECT tended to recover more quickly from AKI.
Figure S10 in the supplementary material shows the results of testing the probability of the early AKD prediction model. Calibration plots for both the internal and external cohorts were generated to assess model performance. The slopes of the calibration plots were 1.27 for the internal cohort and 1.20 for the external cohort, indicating a good calibration of the model’s predicted probabilities. The evaluation of the model probabilities by box plots showed excellent calibration, similar to that of the early prediction model for AKI (Figure S11 in the supplementary material).

Validation and model | Accuracy | Precision | Recall | F1-score | AUROC^a | AUPRC^b
Cross-validation, mean (SD)
LR^c | 0.8442 (0.0097) | 0.6146 (0.0400) | 0.1972 (0.0248) | 0.2980 (0.0321) | 0.7710 (0.0197) | 0.4528 (0.0228)
RF^d | 0.8370 (0.0089) | 0.7711 (0.0728) | 0.0432 (0.0066) | 0.0819 (0.0123) | 0.7702 (0.0158) | 0.4346 (0.0166)
XGB^e | 0.8439 (0.0165) | 0.5972 (0.0848) | 0.2372 (0.0327) | 0.3382 (0.0406) | 0.7622 (0.0250) | 0.4490 (0.0409)
LGBM^f | 0.8436 (0.0095) | 0.6066 (0.0545) | 0.2142 (0.0173) | 0.3149 (0.0109) | 0.7623 (0.0178) | 0.4421 (0.0067)
CAT^g | 0.8402 (0.0113) | 0.6873 (0.0779) | 0.0942 (0.0110) | 0.1655 (0.0186) | 0.7806 (0.0181) | 0.4695 (0.0255)
Internal
LR | 0.8460 | 0.7347 | 0.2182 | 0.3364 | 0.7967 | 0.5376
RF | 0.8297 | 0.9000 | 0.0545 | 0.1029 | 0.8099 | 0.5302
XGB | 0.8547 | 0.7460 | 0.2848 | 0.4123 | 0.8097 | 0.5721
LGBM | 0.8557 | 0.7667 | 0.2788 | 0.4089 | 0.7964 | 0.5598
CAT | 0.8341 | 0.8000 | 0.0970 | 0.1730 | 0.8202 | 0.5558
External
LR | 0.8475 | 0.5893 | 0.1019 | 0.1737 | 0.7659 | 0.3959
RF | 0.8446 | 0.6429 | 0.0278 | 0.0533 | 0.7837 | 0.4203
XGB | 0.8494 | 0.5761 | 0.1636 | 0.2548 | 0.7682 | 0.3990
LGBM | 0.8475 | 0.5543 | 0.1574 | 0.2452 | 0.7685 | 0.3945
CAT | 0.8475 | 0.6786 | 0.0586 | 0.1080 | 0.7833 | 0.4368
^a AUROC: area under the receiver operating characteristic curve.
^b AUPRC: area under the precision-recall curve.
^c LR: logistic regression.
^d RF: random forest.
^e XGB: eXtreme gradient boosting.
^f LGBM: light gradient boosting machine.
^g CAT: categorical boosting.

Framework
The framework to assist in managing AKI is shown in the accompanying figure. Panel A of the accompanying simulation figure presents the framework simulation results for the early AKI prediction model. The designed framework was simulated using the external validation cohort at 1-year intervals over 5 years. Among an average of 7284 patients included annually in the external validation cohort, 501 (6.9%) experienced AKI. Of these, 439 (87.68%) were predicted at least 1 day in advance. The F1-score showed a decreasing trend after 2018 compared to evaluations between 2016 and 2018, with a minimum of 0.39 and a maximum of 0.46. Panel B presents the framework simulation results for the early prediction model for AKD. Excluding patients with insufficient SCr tracking from an average of 410 patients annually, 64 (15.6%) did not recover from AKI within 7 days. Among them, approximately 38 (58.4%) were predicted early.
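The year-by-year evaluation described above can be sketched as a simple grouping step. The record layout `(year, true_label, predicted_probability)`, the 0.5 cutoff, and the function name are assumptions for illustration; the study's simulation code is not available in the text.

```python
from collections import defaultdict


def f1_by_year(records, cutoff=0.5):
    """Compute the F1-score separately for each calendar year.

    records: iterable of (year, true_label, predicted_probability).
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for year, y, p in records:
        c = counts[year]  # registers the year even if it only has true negatives
        pred = 1 if p >= cutoff else 0
        if pred == 1 and y == 1:
            c["tp"] += 1
        elif pred == 1 and y == 0:
            c["fp"] += 1
        elif pred == 0 and y == 1:
            c["fn"] += 1
    result = {}
    for year in sorted(counts):
        c = counts[year]
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        result[year] = 2 * c["tp"] / denom if denom else 0.0
    return result
```

Tracking the metric per year in this way makes drift visible, such as the decline in F1-score after 2018 reported above.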
Discussion
Principal Findings
In this study, we developed a framework to assist in the management of AKI. The early AKI prediction model demonstrated high performance with an AUROC of 0.9053, while the early AKD prediction model achieved an AUROC of 0.8202. External validation results showed excellent model performance, although slight variations over time were observed, likely due to changes in disease incidence rates. Our study represents an integrated effort to identify patients with AKI early and, in cases where AKI occurs, classify them into high-risk and low-risk groups. Furthermore, we successfully refined and applied the KDIGO criteria to retrospective data, addressing its limitations and enhancing its applicability for AKI-related research in general ward patients.
Furthermore, our model predicted AKD in patients with AKI and showed an adjusted HR of 2.03 (95% CI 1.38-3.00) for events where eGFR decreased by 30% within 30 days from the onset of AKI. This adjusted HR was higher than that of the model developed using previous criteria, which had an HR of 1.66 (95% CI 1.17-2.34). This indicates that the refined criteria more precisely identify high-risk groups and suggests that the model developed using these criteria performs better. The AKI incidence was higher using the previous criteria, whereas the AKD incidence was higher using the refined criteria. This indicated that the previous criteria identified milder cases. Although the previous criteria captured these additional milder cases, the model developed using the refined criteria was associated with a higher risk of a poorer prognosis. Among patients tracked for more than 30 days after AKI, there were 934 and 482 patients under the previous and refined criteria, respectively, with 138 and 103 showing poor prognosis. In other words, the previous criteria identified 35 more patients with a poor prognosis but also 452 additional AKI cases. As shown in Figure S5 in the supplementary material, many patients who met only the previous criteria were those whose AKI status was difficult to determine owing to the significant sensitivity to fluctuations. Regardless of how the baseline SCr level is imputed, it is important in general wards to set a minimum increase criterion when applying a relative standard. Additionally, it is crucial to ensure that the baseline SCr level does not become too low when the SCr decreases. Advancements in labeling criteria should consider both patient prognosis and clinical settings. Excessive identification of AKI can lead to the detection of more patients with poor prognoses but may also result in inefficient allocation of medical resources.

Using SHAP, factors contributing to AKI and AKD were identified and quantitatively measured. These results are largely aligned with the trends suggested by previous studies on risk factors and variable tendencies. Factors indicating poor patient status, such as low albumin or high alkaline phosphatase levels, were associated with increased AKI and AKD risks. Specifically, albumin is considered a critical biomarker closely related to renal function and may decrease in conditions associated with liver dysfunction. Elevated WBC, heart rate, body temperature, and respiratory rate may indicate infection in patients, while the use of nephrotoxic drugs such as ANTIs, nonsteroidal anti-inflammatory drugs, and cytotoxic chemotherapeutic agents increases the risk of AKI. Analysis of SHAP for AKD risk factors indicated that patients with increased WBC and high body temperature tended to recover renal function relatively quickly. Resolving infections appears to reduce the risk of AKD development in patients with AKI. ANTIs have emerged as significant risk factors for AKI and AKD. The difficulty in discontinuing these drugs due to patient conditions may exacerbate negative outcomes. Surgery is closely associated with AKI, although AKI following general anesthesia appears to show transient SCr fluctuations and overall health improvement following surgical resolution. Renal function emerged as a crucial factor, consistent with existing clinical studies identifying baseline renal function as a major factor in AKI. Additionally, higher age, heart rate, respiratory rate, total bilirubin, aPTT, and CRP showed increased AKI risk, indicating that poorer patient condition may correlate with increased AKI incidence.

Similarly, poor renal function has been identified as a major risk factor for AKD. Elevated DBP reflects a tendency toward increased blood vessel volume after AKI, suggesting a possible correlation between increased volume and AKD. AKI accompanied by surgery may recover relatively quickly due to transient hemodynamic changes during surgery, while cardiac surgery is known to have a relationship with AKI. Contrast-induced nephropathy often shows peak levels approximately 3-5 days after exposure and often returns to baseline within 7-14 days, indicating relatively good recovery. Conversely, ANTIs are associated with intrinsic AKIs such as acute tubular necrosis and slower recovery despite expected renal function impairment. A high urine specific gravity suggests dehydration, which can often be corrected through fluid supplementation alone, leading to fast recovery. Indicators reflecting infection showed that higher values tended to be associated with good renal recovery, while the significant infection marker CRP yielded ambiguous results. Because the interpretation of our model aligns with that of previous clinical studies, the patterns learned by the model are reasonable. Although biomarkers such as cystatin-C are commonly suggested in various studies, they were not used in this study due to their low measurement frequency. Future research should expand the features used in this study to include such biomarkers.
Limitations
This study had several limitations. First, owing to the lack of consensus on the AKI recovery criteria, the definitions had to rely on previous research findings. Second, there was a lack of data related to dialysis or kidney transplantation, which was addressed by excluding patients with an eGFR<60 mL/minute. As a result, caution is required when applying and interpreting the model for patients with preexisting chronic kidney disease or poor kidney function from the initial stages of hospitalization. However, the primary aim of this study was to predict “unexpected AKI.” Therefore, the analysis focused on patients with relatively preserved kidney function, who were considered to be at lower risk of AKI. Third, the study did not account for interventions or treatments before or post-AKI, which is crucial because patients’ preexisting conditions are significant factors in AKI and AKD. Fourth, in this study, the model was applied only in cases where AKI occurrences could be clearly identified, specifically when baseline SCr could be estimated at a specific time point and SCr measurements were available at that time. However, the proportion of AKI labels may differ from the actual occurrences. Therefore, despite the frequent measurement of SCr, caution is needed when applying and interpreting the model in situations where SCr measurements have not been conducted. Fifth, although our study used data extracted from different hospital information systems in various regions, it predominantly included data from Korean individuals. Therefore, we could not sufficiently consider racial diversity. Since there may be various differences, including kidney function, depending on ethnicity, future studies should include a multiethnic population.
Conclusions
This study introduces a machine learning framework aimed at assisting in the early management of AKI in general ward patients. To develop the model, we used retrospective data from general wards, refining the operational definition of AKI and externally validating our approach. Our findings demonstrate that AI-driven methods can enhance risk stratification and enable timely interventions. Beyond improving predictive accuracy, this study underscores the potential of AI to streamline clinical workflows, optimize resource allocation, and ultimately reduce the burden of AKI-related complications. Integrating such models into routine hospital practice may support proactive decision-making, allowing physicians to implement tailored interventions based on individual patient risk profiles. Future research should focus on prospective validation, real-time clinical integration, and incorporating additional biomarkers to improve model generalizability and clinical relevance.
Acknowledgments
This research was supported by the Ministry of Science and Information and Communication Technology, Korea, under the ICAN (Information and Communication Technology Challenge and Advanced Network of Human Resource Development) support program (IITP-2025-RS-2022-00156439) supervised by the Institute for Information & Communications Technology Planning & Evaluation and the Korea Institute for Advancement of Technology grant funded by the Korea Government (P0023675, Human Resource Development Program for Industrial Innovation).
Data Availability
The dataset used in this study and the models developed are not publicly available due to ethical restrictions, patient confidentiality, and institutional policies. However, they can be made available on reasonable request to the corresponding author.
Authors' Contributions
HL had full access to all the data in the study and was responsible for data integrity and the accuracy of the data analysis. This manuscript was written and edited entirely by the authors without the use of any generative AI tools. NC played a key role in conceptualizing and designing the study, contributed substantially to data acquisition, conducted statistical analyses, and drafted the manuscript. IJ focused on the study design, performed in-depth data analysis, and contributed to manuscript drafting. SA assisted with manuscript preparation and provided administrative and technical support. HG was instrumental in the study's concept and design, provided critical data, and supervised the research process. HL provided significant data for the study, secured funding, and offered administrative and technical support. All authors participated in data interpretation, critically reviewed the manuscript for important intellectual content, and approved the final version.
Conflicts of Interest
None declared.
References
- Khwaja A. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clin Pract. 2012;120(4):c179-c184. [CrossRef] [Medline]
- Hoste EAJ, Bagshaw SM, Bellomo R, Cely CM, Colman R, Cruz DN, et al. Epidemiology of acute kidney injury in critically ill patients: the multinational AKI-EPI study. Intensive Care Med. Aug 2015;41(8):1411-1423. [CrossRef] [Medline]
- Kellum JA, Sileanu FE, Bihorac A, Hoste EAJ, Chawla LS. Recovery after Acute Kidney Injury. Am J Respir Crit Care Med. Mar 15, 2017;195(6):784-791. [FREE Full text] [CrossRef] [Medline]
- Wang H, Lambourg E, Guthrie B, Morales DR, Donnan PT, Bell S. Patient outcomes following AKI and AKD: a population-based cohort study. BMC Med. 2022;20(1):229. [FREE Full text] [CrossRef] [Medline]
- Koyner JL, Carey KA, Edelson DP, Churpek MM. The development of a machine learning inpatient acute kidney injury prediction model. Crit Care Med. 2018;46(7):1070-1077. [CrossRef] [Medline]
- Churpek MM, Carey KA, Edelson DP, Singh T, Astor BC, Gilbert ER, et al. Internal and external validation of a machine learning risk score for acute kidney injury. JAMA Netw Open. 2020;3(8):e2012892. [FREE Full text] [CrossRef] [Medline]
- Song X, Yu ASL, Kellum JA, Waitman LR, Matheny ME, Simpson SQ, et al. Cross-site transportability of an explainable artificial intelligence model for acute kidney injury prediction. Nat Commun. 2020;11(1):5668. [FREE Full text] [CrossRef] [Medline]
- Al-Jaghbeer M, Dealmeida D, Bilderback A, Ambrosino R, Kellum JA. Clinical decision support for in-hospital AKI. J Am Soc Nephrol. 2018;29(2):654-660. [FREE Full text] [CrossRef] [Medline]
- Sun H, Depraetere K, Meesseman L, Cabanillas Silva P, Szymanowsky R, Fliegenschmidt J, et al. Machine learning-based prediction models for different clinical risks in different hospitals: evaluation of live performance. J Med Internet Res. 2022;24(6):e34295. [FREE Full text] [CrossRef] [Medline]
- Heo S, Kang EA, Yu JY, Kim HR, Lee S, Kim K, et al. Time series AI model for acute kidney injury detection based on a multicenter distributed research network: development and verification study. JMIR Med Inform. 2024;12:e47693. [FREE Full text] [CrossRef] [Medline]
- Zhang H, Wang AY, Wu S, Ngo J, Feng Y, He X, et al. Artificial intelligence for the prediction of acute kidney injury during the perioperative period: systematic review and meta-analysis of diagnostic test accuracy. BMC Nephrol. 2022;23(1):405. [FREE Full text] [CrossRef] [Medline]
- Kamel Rahimi A, Ghadimi M, van der Vegt AH, Canfell OJ, Pole JD, Sullivan C, et al. Machine learning clinical prediction models for acute kidney injury: the impact of baseline creatinine on prediction efficacy. BMC Med Inform Decis Mak. 2023;23(1):207. [FREE Full text] [CrossRef] [Medline]
- Nateghi Haredasht F, Antonatou M, Cavalier E, Delanaye P, Pottel H, Makris K. The effect of different consensus definitions on diagnosing acute kidney injury events and their association with in-hospital mortality. J Nephrol. 2022;35(8):2087-2095. [FREE Full text] [CrossRef] [Medline]
- Jeong I, Cho NJ, Ahn SJ, Lee H, Gil HW. Machine learning approaches toward an understanding of acute kidney injury: current trends and future directions. Korean J Intern Med. 2024;39(6):882-897. [FREE Full text] [CrossRef] [Medline]
- Li Y, Yao L, Mao C, Srivastava A, Jiang X, Luo Y. Early prediction of acute kidney injury in critical care setting using clinical notes. 2018. Presented at: IEEE International Conference on Bioinformatics and Biomedicine (BIBM); December 6, 2018:683-686; Madrid, Spain. [CrossRef]
- Zimmerman LP, Reyfman PA, Smith ADR, Zeng Z, Kho A, Sanchez-Pinto LN, et al. Early prediction of acute kidney injury following ICU admission using a multivariate panel of physiological measurements. BMC Med Inform Decis Mak. 2019;19:16. [FREE Full text] [CrossRef] [Medline]
- Sato N, Uchino E, Kojima R, Hiragi S, Yanagita M, Okuno Y. Prediction and visualization of acute kidney injury in intensive care unit using one-dimensional convolutional neural networks based on routinely collected data. Comput Methods Programs Biomed. 2021;206:106129. [FREE Full text] [CrossRef] [Medline]
- Machado GD, Santos LL, Libório AB. Redefining urine output thresholds for acute kidney injury criteria in critically ill patients: a derivation and validation study. Crit Care. 2024;28(1):272. [FREE Full text] [CrossRef] [Medline]
- Porschen C, Ernsting J, Brauckmann P, Weiss R, Würdemann T, Booke H, et al. pyAKI-An open source solution to automated acute kidney injury classification. PLoS One. 2025;20(1):e0315325. [FREE Full text] [CrossRef] [Medline]
- Sun S, Annadi RR, Chaudhri I, Munir K, Hajagos J, Saltz J, et al. Short- and long-term recovery after moderate/severe AKI in patients with and without COVID-19. Kidney360. 2022;3(2):242-257. [CrossRef]
- Luo XQ, Yan P, Zhang NY, Luo B, Wang M, Deng Y, et al. Machine learning for early discrimination between transient and persistent acute kidney injury in critically ill patients with sepsis. Sci Rep. 2021;11(1):20269. [FREE Full text] [CrossRef] [Medline]
- Liu CL, Tain YL, Lin YC, Hsu CN. Prediction and clinically important factors of acute kidney injury non-recovery. Front Med. 2021;8:789874. [FREE Full text] [CrossRef] [Medline]
- Cho NJ, Jeong I, Kim Y, Kim DO, Ahn S, Kang S, et al. A machine learning-based approach for predicting renal function recovery in general ward patients with acute kidney injury. Kidney Res Clin Pract. 2024;43(4):538-547. [FREE Full text] [CrossRef] [Medline]
- Rank N, Pfahringer B, Kempfert J, Stamm C, Kühne T, Schoenrath F, et al. Deep-learning–based real-time prediction of acute kidney injury outperforms human predictive performance. NPJ Digit Med. 2020;3:139. [FREE Full text] [CrossRef] [Medline]
- Jiang X, Hu Y, Guo S, Du C, Cheng X. Prediction of persistent acute kidney injury in postoperative intensive care unit patients using integrated machine learning: a retrospective cohort study. Sci Rep. 2022;12(1):17134. [FREE Full text] [CrossRef] [Medline]
- Shawwa K, Ghosh E, Lanius S, Schwager E, Eshelman L, Kashani KB. Predicting acute kidney injury in critically ill patients using comorbid conditions utilizing machine learning. Clin Kidney J. 2021;14(5):1428-1435. [FREE Full text] [CrossRef] [Medline]
- Zheng L, Lin Y, Fang K, Wu J, Zheng M. Derivation and validation of a risk score to predict acute kidney injury in critically ill cirrhotic patients. Hepatol Res. 2023;53(8):701-712. [CrossRef] [Medline]
- Sparrow HG, Swan JT, Moore LW, Gaber AO, Suki WN. Disparate outcomes observed within Kidney Disease: Improving Global Outcomes (KDIGO) acute kidney injury stage 1. Kidney Int. 2019;95(4):905-913. [FREE Full text] [CrossRef] [Medline]
- Chawla LS, Bellomo R, Bihorac A, Goldstein SL, Siew ED, Bagshaw SM, et al. Acute kidney disease and renal recovery: consensus report of the Acute Disease Quality Initiative (ADQI) 16 Workgroup. Nat Rev Nephrol. 2017;13(4):241-257. [FREE Full text] [CrossRef] [Medline]
- Bellomo R, Kellum JA, Ronco C. Acute kidney injury. Lancet. 2012;380(9843):756-766. [CrossRef] [Medline]
- He J, Lin J, Duan M. Application of machine learning to predict acute kidney disease in patients with sepsis associated acute kidney injury. Front Med. 2021;8:792974. [FREE Full text] [CrossRef] [Medline]
- Neyra JA, Ortiz-Soriano V, Liu LJ, Smith TD, Li X, Xie D, et al. Prediction of mortality and major adverse kidney events in critically ill patients with acute kidney injury. Am J Kidney Dis. 2023;81(1):36-47. [FREE Full text] [CrossRef] [Medline]
- Daniels B, Havard A, Myton R, Lee C, Chidwick K. Evaluating the accuracy of data extracted from electronic health records into MedicineInsight, a national Australian general practice database. Int J Popul Data Sci. 2022;7(1):1713. [FREE Full text] [CrossRef] [Medline]
- Inker LA, Eneanya ND, Coresh J, Tighiouart H, Wang D, Sang Y, et al. New creatinine- and cystatin C-based equations to estimate GFR without race. N Engl J Med. 2021;385(19):1737-1749. [FREE Full text] [CrossRef] [Medline]
- Zhu K, Song H, Zhang Z, Ma B, Bao X, Zhang Q, et al. Acute kidney injury in solitary kidney patients after partial nephrectomy: incidence, risk factors and prediction. Transl Androl Urol. 2020;9(3):1232-1243. [FREE Full text] [CrossRef] [Medline]
- White IR, Royston P, Wood AM. Multiple imputation using chained equations: issues and guidance for practice. Stat Med. 2011;30(4):377-399. [CrossRef] [Medline]
- Mera-Gaona M, Neumann U, Vargas-Canas R, López DM. Evaluating the impact of multivariate imputation by MICE in feature selection. PLoS One. 2021;16(7):e0254720. [FREE Full text] [CrossRef] [Medline]
- Tomašev N, Glorot X, Rae JW, Zielinski M, Askham H, Saraiva A, et al. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature. 2019;572(7767):116-119. [FREE Full text] [CrossRef] [Medline]
- Fisher RA. The use of multiple measurements in taxonomic problems. Ann Eugen. 1936;7(2):179-188. [CrossRef]
- Breiman L. Random forests. Mach Learn. 2001;45(1):5-32. [CrossRef]
- Chen T, Guestrin C. XGBoost: a scalable tree boosting system. 2016. Presented at: KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; August 13-17, 2016; San Francisco, CA. [CrossRef]
- Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, et al. LightGBM: a highly efficient gradient boosting decision tree. In: Advances in Neural Information Processing Systems. Curran Associates, Inc; 2017. Presented at: NeurIPS; December 4-9, 2017; Long Beach Convention Center, Long Beach.
- Prokhorenkova L, Gusev G, Vorobev A, Dorogush AV, Gulin A. CatBoost: unbiased boosting with categorical features. ArXiv. Preprint posted online on June 28, 2017. [CrossRef]
- Haredasht FN, Vanhoutte L, Vens C, Pottel H, Viaene L, De Corte W. Validated risk prediction models for outcomes of acute kidney injury: a systematic review. BMC Nephrol. 2023;24(1):133. [FREE Full text] [CrossRef] [Medline]
- Deo SV, Deo V, Sundaram V. Survival analysis-part 2: Cox proportional hazards model. Indian J Thorac Cardiovasc Surg. 2021;37(2):229-233. [FREE Full text] [CrossRef] [Medline]
- Luo W, Phung D, Tran T, Gupta S, Rana S, Karmakar C, et al. Guidelines for developing and reporting machine learning predictive models in biomedical research: a multidisciplinary view. J Med Internet Res. 2016;18(12):e323. [FREE Full text] [CrossRef] [Medline]
- Lei G, Wang G, Zhang C, Chen Y, Yang X. Using machine learning to predict acute kidney injury after aortic arch surgery. J Cardiothorac Vasc Anesth. 2020;34(12):3321-3328. [CrossRef] [Medline]
- Ko S, Jo C, Chang CB, Lee YS, Moon Y, Youm JW, et al. A web-based machine-learning algorithm predicting postoperative acute kidney injury after total knee arthroplasty. Knee Surg Sports Traumatol Arthrosc. 2022;30(2):545-554. [CrossRef] [Medline]
- Bredt LC, Peres LAB, Risso M, Barros LCAL. Risk factors and prediction of acute kidney injury after liver transplantation: logistic regression and artificial neural network approaches. World J Hepatol. 2022;14(3):570-582. [FREE Full text] [CrossRef] [Medline]
- Cheng T, Wang X, Han Y, Hao J, Hu H, Hao L. The level of serum albumin is associated with renal prognosis and renal function decline in patients with chronic kidney disease. BMC Nephrol. 2023;24(1):57. [FREE Full text] [CrossRef] [Medline]
- Riley LK, Rupert J. Evaluation of patients with leukocytosis. Am Fam Physician. 2015;92(11):1004-1011. [FREE Full text] [Medline]
- Naughton CA. Drug-induced nephrotoxicity. Am Fam Physician. 2008;78(6):743-750. [CrossRef]
- Zhu W, Barreto EF, Li J, Lee HK, Kashani K. Drug-drug interaction and acute kidney injury development: a correlation-based network analysis. PLoS One. 2023;18(1):e0279928. [FREE Full text] [CrossRef] [Medline]
- Mikkelsen TB, Schack A, Oreskov JO, Gögenur I, Burcharth J, Ekeloef S. Acute kidney injury following major emergency abdominal surgery—a retrospective cohort study based on medical records data. BMC Nephrol. 2022;23(1):94. [FREE Full text] [CrossRef] [Medline]
- Hu L, Gao L, Zhang D, Hou Y, He LL, Zhang H, et al. The incidence, risk factors and outcomes of acute kidney injury in critically ill patients undergoing emergency surgery: a prospective observational study. BMC Nephrol. 2022;23(1):42. [FREE Full text] [CrossRef] [Medline]
- Jiang YJ, Xi XM, Jia HM, Zheng X, Wang M, Li W, et al. Risk factors, clinical features and outcome of new-onset acute kidney injury among critically ill patients: a database analysis based on prospective cohort study. BMC Nephrol. 2021;22(1):289. [FREE Full text] [CrossRef] [Medline]
- Mercado MG, Smith DK, Guard EL. Acute kidney injury: diagnosis and management. Am Fam Physician. 2019;100(11):687-694. [FREE Full text] [Medline]
- Cheruku SR, Raphael J, Neyra JA, Fox AA. Acute kidney injury after cardiac surgery: prediction, prevention, and management. Anesthesiology. 2023;139(6):880-898. [CrossRef] [Medline]
- Isaka Y, Hayashi H, Aonuma K, Horio M, Terada Y, Doi K, et al. Guideline on the use of iodinated contrast media in patients with kidney disease 2018. Jpn J Radiol. 2020;38(1):3-46. [CrossRef] [Medline]
- Kaliyaperumal Y, Sivadasan S, Aiyalu R. Contrast-induced nephropathy: an overview. Dr Sulaiman Al Habib Med J. 2023;5(4):118-127. [CrossRef]
- Paquette F, Bernier-Jean A, Brunette V, Ammann H, Lavergne V, Pichette V, et al. Acute kidney injury and renal recovery with the use of aminoglycosides: a large retrospective study. Nephron. 2015;131(3):153-160. [FREE Full text] [CrossRef] [Medline]
- Dalfino L, Puntillo F, Ondok MJM, Mosca A, Monno R, Coppolecchia S, et al. Colistin-associated acute kidney injury in severely ill patients: a step toward a better renal care? A prospective cohort study. Clin Infect Dis. 2015;61(12):1771-1777. [CrossRef] [Medline]
- Molitoris BA, Berns JS. Pathogenesis and Prevention of Aminoglycoside Nephrotoxicity and Ototoxicity. Waltham, MA. UpToDate; 2022.
- Yang X, Wu H, Li H. Dehydration-associated chronic kidney disease: a novel case of kidney failure in China. BMC Nephrol. 2020;21(1):159. [FREE Full text] [CrossRef] [Medline]
- Mohsenin V. Practical approach to detection and management of acute kidney injury in critically ill patient. J Intensive Care. 2017;5:57. [FREE Full text] [CrossRef] [Medline]
- Farrington DK, Surapaneni A, Matsushita K, Seegmiller JC, Coresh J, Grams ME. Discrepancies between cystatin C-based and creatinine-based eGFR. Clin J Am Soc Nephrol. 2023;18(9):1143-1152. [CrossRef] [Medline]
- Nakano FK, Åkesson A, de Boer J, Dedja K, D'hondt R, Haredasht FN, et al. Comparison between the EKFC-equation and machine learning models to predict glomerular filtration rate. Sci Rep. 2024;14(1):26383. [FREE Full text] [CrossRef] [Medline]
- Nateghi Haredasht F, Viaene L, Vens C, Callewaert N, De Corte W, Pottel H. Comparison between cystatin C- and creatinine-based estimated glomerular filtration rate in the follow-up of patients recovering from a stage-3 AKI in ICU. J Clin Med. 2022;11(24):7264. [FREE Full text] [CrossRef] [Medline]
Abbreviations
AI: artificial intelligence
AKD: acute kidney disease
AKI: acute kidney injury
ANTI: nephrotoxic antibiotic
aPTT: activated partial thromboplastin time
AUROC: area under the receiver operating characteristic curve
BUN: blood urea nitrogen
CAT: categorical boosting
CECT: contrast-enhanced computed tomography
CRP: C-reactive protein
DBP: diastolic blood pressure
eGFR: estimated glomerular filtration rate
HR: hazard ratio
ICU: intensive care unit
KDIGO: Kidney Disease: Improving Global Outcomes
SCr: serum creatinine
SHAP: Shapley additive explanations
STROBE: Strengthening the Reporting of Observational Studies in Epidemiology
WBC: white blood cell
Edited by A Mavragani; submitted 17.09.24; peer-reviewed by K Shawwa, F Nateghi Haredasht; comments to author 26.12.24; revised version received 10.01.25; accepted 14.02.25; published 18.03.25.
Copyright © Nam-Jun Cho, Inyong Jeong, Se-Jin Ahn, Hyo-Wook Gil, Yeongmin Kim, Jin-Hyun Park, Sanghee Kang, Hwamin Lee. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 18.03.2025.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.