Summary
Objectives: Our goal was to develop predictive models for sepsis and in-hospital mortality using
electronic health records (EHRs). We demonstrated the effectiveness of these models
in patients diagnosed with pneumonia, a group that is highly susceptible to sepsis.
Methods: We retrospectively analyzed the Health Facts® (HF) dataset to develop models that predict
sepsis and in-hospital mortality from the data collected in the first few hours after admission.
In addition, we developed models that predict sepsis from the data collected in the last few
hours leading to sepsis onset. All models were built with the random forest classifier.
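As a rough illustration of the modeling step described in the Methods, the sketch below trains a random forest on features summarized over an early time window and reports accuracy and sensitivity. It assumes scikit-learn and NumPy are available. The feature matrix, the labels, the train/test split, and the hyperparameters are all hypothetical stand-ins generated for this example; none of them come from the HF dataset or from the study's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for per-patient features summarized over an early
# window (for example, the first 12 hours after admission). Not real data.
n_patients, n_features = 400, 6
X = rng.normal(size=(n_patients, n_features))

# Hypothetical binary outcome loosely tied to two features, plus noise.
noise = rng.normal(scale=1.0, size=n_patients)
y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)

# Hold out a stratified test set (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Random forest classifier; n_estimators is an illustrative choice.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
sensitivity = recall_score(y_test, y_pred)  # recall of the positive class
print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f}")
```

With real EHR data, the sporadic sampling would first have to be turned into fixed-length features, a step this sketch skips entirely.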
Results: The data collected in EHR systems are generally sporadic, which complicates feature extraction
and selection and reduces the accuracy of the models. Despite this,
the developed models can predict sepsis and in-hospital mortality with accuracies
of up to 65.26±0.33% and 68.64±0.48%, and sensitivities of up to 67.24±0.36% and 74.00±1.22%,
respectively, using only the data from the first 12 hours after admission. The accuracies
generally remain consistent for similar models developed using the data from the first
24 and 48 hours after admission. Lastly, the developed models can accurately identify
sepsis patients (with up to 98.63±0.17% accuracy and 99.74±0.13% sensitivity) using
the data collected within the last 12 hours before sepsis onset. The results suggest
that if such algorithms continuously monitor patients, they can identify sepsis patients
in a manner comparable to current screening tools, such as the rule-based Systemic
Inflammatory Response Syndrome (SIRS) criteria, while often allowing for early detection
of sepsis shortly after admission.
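To make the reported metrics concrete, the following minimal sketch computes accuracy and sensitivity from binary labels and formats them as mean±spread over several runs. It assumes that the ± values denote a standard deviation across repeated runs, which the summary does not state. The function names and the tiny label lists are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean, stdev


def accuracy_and_sensitivity(y_true, y_pred):
    """Return (accuracy, sensitivity) for binary labels where 1 = positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    # Sensitivity = TP / (TP + FN); undefined if there are no positives.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


def summarize(values):
    """Format fractions as 'mean±std%' (sample standard deviation)."""
    return f"{100 * mean(values):.2f}±{100 * stdev(values):.2f}%"


# Hypothetical (true, predicted) label lists for three repeated runs.
truth = [1, 1, 0, 0, 1, 0, 1, 0]
runs = [
    (truth, [1, 0, 0, 0, 1, 1, 1, 0]),
    (truth, [1, 1, 0, 1, 1, 0, 1, 0]),
    (truth, [0, 1, 0, 0, 1, 0, 0, 0]),
]

results = [accuracy_and_sensitivity(t, p) for t, p in runs]
accuracies = [acc for acc, _ in results]
sensitivities = [sens for _, sens in results]
print("accuracy:", summarize(accuracies))
print("sensitivity:", summarize(sensitivities))
```

Sensitivity is the metric that matters most for a screening tool, since a false negative corresponds to a missed sepsis case.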
Conclusions: The developed models showed promise for the early prediction of sepsis, providing an opportunity
to direct early intervention efforts to prevent or treat sepsis.
Keywords
Predictive analytics - sepsis - in-hospital mortality - electronic health records