Improving Prediction of Survival for Extremely Premature Infants Born at 23 to 29 Weeks Gestational Age in the Neonatal Intensive Care Unit: Development and Evaluation of Machine Learning Models

Background: Infants born at extremely preterm gestational ages are typically admitted to the neonatal intensive care unit (NICU) after initial resuscitation. The subsequent hospital course can be highly variable, and despite counseling aided by available risk calculators, there are significant challenges with shared decision-making regarding life support and transition to end-of-life care. Improving predictive models can help providers and families navigate these unique challenges.

Objective: Machine learning methods have previously demonstrated added predictive value for determining intensive care unit outcomes, and their use allows consideration of a greater number of factors that potentially influence newborn outcomes, such as maternal characteristics. Machine learning–based models were analyzed for their ability to predict the survival of extremely preterm neonates at initial admission.

Methods: Maternal and newborn information was extracted from the health records of infants born between 23 and 29 weeks of gestation in the Medical Information Mart for Intensive Care III (MIMIC-III) critical care database. Applicable machine learning models predicting survival during the initial NICU admission were developed and compared. The same type of model was also examined using only features that would be available prepartum for the purpose of survival prediction prior to an anticipated preterm birth. Features most correlated with the predicted outcome were determined when possible for each model.

Results: Of included patients, 37 of 459 (8.1%) expired. The resulting random forest model showed higher predictive performance than the frequently used Score for Neonatal Acute Physiology With Perinatal Extension II (SNAPPE-II) NICU model when considering extremely preterm infants of very low birth weight. Several other machine learning models were found to have good performance but did not show a statistically significant difference from previously available models in this study. Feature importance varied by model, and those of greater importance included gestational age; birth weight; initial oxygenation level; elements of the APGAR (appearance, pulse, grimace, activity, and respiration) score; and amount of blood pressure support. Important prepartum features also included maternal age, steroid administration, and the presence of pregnancy complications.

Conclusions: Machine learning methods have the potential to provide robust prediction of survival in the context of extremely preterm births and allow for consideration of additional factors such as maternal clinical and socioeconomic information. Evaluation of larger, more diverse data sets may provide additional clarity on comparative performance.


Background and objectives 3a
Explain the medical context (including whether diagnostic or prognostic) and rationale for developing or validating the multivariable prediction model, including references to existing models.

3b
Specify the objectives, including whether the study describes the development or validation of the model or both.

Source of data 4a
Describe the study design or source of data (e.g., randomized trial, cohort, or registry data), separately for the development and validation data sets, if applicable.
4b Specify the key study dates, including start of accrual; end of accrual; and, if applicable, end of follow-up.

Participants 5a
Specify key elements of the study setting (e.g., primary care, secondary care, general population) including number and location of centres.
5b Describe eligibility criteria for participants.
5c Give details of treatments received, if relevant.

Outcome 6a
Clearly define the outcome that is predicted by the prediction model, including how and when assessed.
6b Report any actions to blind assessment of the outcome to be predicted.

Predictors 7a
Clearly define all predictors used in developing or validating the multivariable prediction model, including how and when they were measured.
7b Report any actions to blind assessment of predictors for the outcome and other predictors.

Sample size 8
Explain how the study size was arrived at.

Missing data 9
Describe how missing data were handled (e.g., complete-case analysis, single imputation, multiple imputation) with details of any imputation method.
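
As an illustration of one option named in this item, the sketch below performs single imputation with the column median. It is a minimal example under stated assumptions, not the method used in the study: the function name `impute_median`, the field names, and all values are hypothetical and invented for illustration.

```python
from statistics import median


def impute_median(rows, columns):
    """Fill missing values (None) in `columns` with each column's observed median.

    Returns new rows plus the medians used, so the same values can be
    reapplied to held-out data instead of being re-estimated there.
    """
    medians = {}
    for col in columns:
        observed = [r[col] for r in rows if r[col] is not None]
        if not observed:
            raise ValueError(f"column {col!r} has no observed values")
        medians[col] = median(observed)
    filled = [
        {k: (medians[k] if k in medians and v is None else v) for k, v in r.items()}
        for r in rows
    ]
    return filled, medians


# Hypothetical records; values are invented for illustration only.
records = [
    {"gestational_age_weeks": 24, "birth_weight_g": 640},
    {"gestational_age_weeks": 27, "birth_weight_g": None},
    {"gestational_age_weeks": None, "birth_weight_g": 980},
    {"gestational_age_weeks": 26, "birth_weight_g": 850},
]
filled, medians = impute_median(records, ["gestational_age_weeks", "birth_weight_g"])
```

Returning the medians separately reflects a common design choice: imputation values are estimated on the development data only and then reused on validation data, which avoids leaking information between the two sets.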

Statistical analysis methods 10a
Describe how predictors were handled in the analyses.
10b Specify type of model, all model-building procedures (including any predictor selection), and method for internal validation.
10d Specify all measures used to assess model performance and, if relevant, to compare multiple models.

Risk groups 11
Provide details on how risk groups were created, if done.

Participants 13a
Describe the flow of participants through the study, including the number of participants with and without the outcome and, if applicable, a summary of the follow-up time. A diagram may be helpful.
13b Describe the characteristics of the participants (basic demographics, clinical features, available predictors), including the number of participants with missing data for predictors and outcome.

Model development 14a
Specify the number of participants and outcome events in each analysis.
14b If done, report the unadjusted association between each candidate predictor and outcome.

Model specification 15a
Present the full prediction model to allow predictions for individuals (i.e., all regression coefficients, and model intercept or baseline survival at a given time point).
15b Explain how to use the prediction model.

Model performance 16
Report performance measures (with CIs) for the prediction model.
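
One common way to report a performance measure with a CI is the area under the receiver operating characteristic curve (AUROC) with a percentile bootstrap interval. The sketch below is a minimal, standard-library illustration of that idea. It assumes AUROC is the measure of interest and that the positive class is survival; the function names, labels, and scores are hypothetical and are not taken from the study.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC, resampling patients with replacement."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap AUROC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class; AUROC is undefined
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical outcomes (1 = survived) and predicted survival probabilities.
labels = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
scores = [0.9, 0.8, 0.85, 0.6, 0.4, 0.7, 0.95, 0.55, 0.5, 0.75, 0.3, 0.65]
point = auroc(labels, scores)
lo, hi = bootstrap_ci(labels, scores, n_boot=500)
print(f"AUROC {point:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

With few outcome events, as is typical for rare outcomes, bootstrap intervals can be wide and some resamples may contain only one class; the sketch skips those resamples, which is one of several reasonable conventions.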

Discussion
Limitations 18 Discuss any limitations of the study (such as nonrepresentative sample, few events per predictor, missing data).

Interpretation 19b
Give an overall interpretation of the results, considering objectives, limitations, and results from similar studies, and other relevant evidence.
Implications 20 Discuss the potential clinical use of the model and implications for future research.

Other information
Supplementary information 21 Provide information about the availability of supplementary resources, such as study protocol, Web calculator, and data sets.
Funding 22 Give the source of funding and the role of the funders for the present study.
We recommend using the TRIPOD Checklist in conjunction with the TRIPOD Explanation and Elaboration document.