CC BY-NC-ND 4.0 · Yearb Med Inform 2021; 30(01): 139-140
DOI: 10.1055/s-0041-1726517
Section 3: Clinical Information Systems
Best Paper Selection

Fabregat A, Magret M, Ferré JA, Vernet A, Guasch N, Rodríguez A, Gómez J, Bodí M. A Machine Learning decision-making tool for extubation in Intensive Care Unit patients. https://www.sciencedirect.com/science/article/abs/pii/S0169260720317028?via%3Dihub

Kempa-Liehr AW, Lin CYC, Britten R, Armstrong D, Wallace J, Mordaunt D, O’Sullivan M. Healthcare pathway discovery and probabilistic machine learning. https://www.sciencedirect.com/science/article/abs/pii/S1386505619308068?via%3Dihub

Li Y, Nair P, Lu XH, Wen Z, Wang Y, Dehaghi AAK, Miao Y, Liu W, Ordog T, Biernacka JM, Ryu E, Olson JE, Frye MA, Liu A, Guo L, Marelli A, Ahuja Y, Davila-Velderrain J, Kellis M. Inferring multimodal latent topics from electronic health records. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7242436/

Weemaes M, Martens S, Cuypers L, van Elslande J, Hoet K, Welkenhuysen J, Goossens R, Wouters S, Houben E, Jeuris E, Jeuris K, Laenen L, Bruyninckx K, Beuselinck K, André E, Depypere M, Desmet S, Lagrou K, Van Ranst M, Verdonck AKLC, Goveia J. Laboratory information system requirements to manage the COVID-19 pandemic: A report from the Belgian national reference testing center. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7197526/


Appendix: Content Summaries of Selected Best Papers for the IMIA Yearbook 2021 Section “Clinical Information Systems”

Fabregat A, Magret M, Ferré JA, Vernet A, Guasch N, Rodríguez A, Gómez J, Bodí M

A Machine Learning decision-making tool for extubation in Intensive Care Unit patients

Comput Methods Programs Biomed 2021;200:105869

Invasive mechanical ventilation (IMV) is central to treating patients who cannot maintain adequate pulmonary ventilation and oxygenation on their own, giving them time to recover. Although IMV can be life-saving, it also carries significant risks, such as ventilator-induced lung injury and infection, as well as long-term problems after recovery. One of the critical decisions regarding IMV is weaning, which includes, among other steps, removal of the endotracheal tube. Patients who need to be reintubated face several associated risks, including increased mortality (25%-50%). The goal of this work was to create a machine learning (ML) model that can increase the rate of successful extubation in adult Intensive Care Unit (ICU) patients. The model is based on data routinely collected in patients' health records. Included patients were admitted to an ICU in a Spanish hospital between 2015 and 2019 and received at least 12 consecutive hours of IMV. Variables used for prediction were categorized into four types: T1: time series data (averaged over 20 minutes) (e.g., heart rate); T2: variables derived from T1 (e.g., respiratory rate); T3: discrete event information (e.g., Glasgow Coma Scale (GCS)); T4: demographics and admission information (e.g., gender). In total, 20 predictors were used. The resulting dataset was strongly imbalanced in favor of successful extubations (1,108 versus 100). Therefore, randomly selected data points of the more frequent class (successful extubation) were removed from the training data set and/or weights were assigned to data points. Seven-fold cross-validation was deemed appropriate. Extubation and reintubation were identified primarily by finding gaps larger than 48 hours in the IMV monitor signal. Because several sources of error can affect these gaps, a comparison with medical records was necessary to correct the numbers (final dataset: 647 successful and 50 failed extubations).
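
The data-preparation steps described above (finding long gaps in the monitor signal and compensating for class imbalance) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the hypothetical signal, the choice to undersample to exactly equal class sizes, and the inverse-frequency weighting are all assumptions made for illustration. How a detected gap maps to a successful or failed extubation is left open here, as the summary notes that medical records were needed to correct the labels.

```python
import random
from collections import Counter
from datetime import datetime, timedelta


def find_signal_gaps(timestamps, min_gap=timedelta(hours=48)):
    """Return (last_seen, next_seen) pairs separated by more than min_gap."""
    ordered = sorted(timestamps)
    return [
        (previous, current)
        for previous, current in zip(ordered, ordered[1:])
        if current - previous > min_gap
    ]


def undersample_majority(labels, seed=0):
    """Keep the smallest class whole and an equally sized random subset of the rest.

    Returns the sorted indices of the retained items. Matching the class
    sizes exactly is an assumption made for this sketch.
    """
    counts = Counter(labels)
    target = min(counts.values())
    rng = random.Random(seed)
    kept = []
    for cls in counts:
        indices = [i for i, label in enumerate(labels) if label == cls]
        kept.extend(rng.sample(indices, target))
    return sorted(kept)


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (assumed scheme)."""
    counts = Counter(labels)
    return {cls: len(labels) / (len(counts) * n) for cls, n in counts.items()}


# Hypothetical monitor signal: 36 samples at 20-minute intervals,
# a 72-hour pause, then three more samples.
start = datetime(2019, 1, 1, 8, 0)
first_run = [start + timedelta(minutes=20 * i) for i in range(36)]
resume = first_run[-1] + timedelta(hours=72)
second_run = [resume + timedelta(minutes=20 * i) for i in range(3)]
gaps = find_signal_gaps(first_run + second_run)

# Class counts taken from the final dataset reported above.
labels = ["success"] * 647 + ["failure"] * 50
kept = undersample_majority(labels)
weights = balanced_class_weights(labels)
```

With the reported final counts, this weighting scheme would give the failure class a weight of 6.97 and the success class a weight of about 0.54, while undersampling would retain 50 records of each class.
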
Three ML classifiers were compared: a support vector machine (SVM) with a radial basis function kernel, a gradient boosting machine (GBM) with Bernoulli loss, and linear discriminant analysis (LDA). Mean accuracy and area under the receiver operating characteristic curve (AUROC) were used to measure performance. The following scores (accuracy and AUROC, respectively) were achieved: SVM 94.6% and 98.3%; GBM 87% and 96%; LDA 72% and 79%. The results suggest that the top five predictors, in descending order of importance, are time, GCS, body mass index, respiratory rate-oxygenation index, and plateau pressure. The least relevant predictors, in descending order of importance, are the Spanish Society of Intensive, Critical Medicine and Coronary Units classification code for ICU admission reason, gender, total cumulative dose, total given dose, and ventilation mode. The models should not be applied as a general-purpose predictor of success for programmed extubations or as a monitoring alarm system, but as a support tool to validate the medical staff's decision. With the predictive accuracy achieved, the rate of failed extubation (currently 9%) could theoretically be reduced to 1%. The results suggest that ML tools are especially well suited to supporting the decision-making protocol based on spontaneous breathing trials when deciding about extubation.
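
For readers less familiar with the two performance measures, the following Python sketch computes accuracy and AUROC from scratch. It is an illustration only: the toy labels and scores are invented, treating successful extubation as the positive class is an assumption, and the AUROC is computed with the pairwise (Mann-Whitney) formulation rather than whatever routine the authors used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy scores, purely illustrative.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
toy_accuracy = accuracy(y_true, y_pred)
toy_auroc = auroc(y_true, scores)

# Majority-class baseline on the reported final class counts (647 vs 50).
imbalanced_truth = [1] * 647 + [0] * 50
baseline_accuracy = accuracy(imbalanced_truth, [1] * 697)
baseline_auroc = auroc(imbalanced_truth, [0.5] * 697)
```

On a 647 versus 50 split, a classifier that always predicts success would reach about 92.8% accuracy while its AUROC stays at 0.5, which is one reason to read the two measures together on imbalanced data.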

Kempa-Liehr AW, Lin CYC, Britten R, Armstrong D, Wallace J, Mordaunt D, O'Sullivan M

Healthcare pathway discovery and probabilistic machine learning

Int J Med Inform 2020;137:104087

The success of electronic health records has also driven several other research areas such as knowledge management in healthcare, which essentially involves four steps: (1) data access; (2) knowledge discovery; (3) knowledge translation and interpretation; and (4) knowledge description, integration, and sharing. Healthcare pathways play an important role here: they incorporate the operational knowledge of a healthcare organization by defining the execution sequence of clinical activities as patients move through a treatment process. In many cases, these pathways result from clinician-led practice rather than explicit design, which leads to several problems (e.g., lack of updates). The study aims to combine healthcare pathway discovery with predictive models of individualized recovery times after appendicectomy. Particular emphasis is placed on models that are easy for clinicians to interpret. The predictive model takes the stochastic volatility of pathway performance indicators into account and can replicate the dominant mode as well as the fat tail of the empirical recovery time distribution. To mine the pathways, the ProM software was used. First, healthcare pathway variations were discovered and then reduced (by clustering, merging consecutive activities, and condensing repetitive patterns) to meaningful models. In a second step, the conformance of these models with actual patient traces was evaluated; incorporating new findings into the model leads to an iterative loop between pathway discovery and conformance analysis. The third step involves data enrichment, which comprises two stages: healthcare pathway performance evaluation and healthcare pathway performance analysis. The main objective of evaluating healthcare pathway performance is to understand the strengths and weaknesses of the current pathway design.
Analyzing the performance of healthcare pathways with respect to pathway variants and other possible influencing factors, such as demographics or patient-specific pathway characteristics (e.g., surgery duration), is the final step of the proposed process mining pipeline. For the appendicitis model, 13 pathway variants were discovered, of which the top four accounted for approximately 88% of the patient traces. Next, the authors analyzed whether the variants are relevant features or covariates for explaining the stochastic volatility of postoperative length of stay. To build two probabilistic machine learning models, 415 individual patient traces were used. Both models showed promising results in explaining the length of stay. In summary, the proposed process mining pipeline successfully constructed concise pathway models for the appendicitis case study and thereby supported the generation of probabilistic machine learning models.
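
The notion of pathway variants and their coverage of patient traces can be made concrete with a small Python sketch. It does not use ProM and is not the authors' pipeline: the activity names and traces are hypothetical, and collapsing consecutive repeats stands in for only one of the reduction steps mentioned above.

```python
from collections import Counter
from itertools import groupby


def collapse_repeats(trace):
    """Merge consecutive occurrences of the same activity into one."""
    return tuple(activity for activity, _ in groupby(trace))


def variant_coverage(traces):
    """Rank pathway variants by frequency with cumulative trace coverage.

    Returns a list of (variant, count, cumulative_share) tuples, most
    frequent variant first.
    """
    variants = Counter(collapse_repeats(trace) for trace in traces)
    total = sum(variants.values())
    ranked = []
    covered = 0
    for variant, count in variants.most_common():
        covered += count
        ranked.append((variant, count, covered / total))
    return ranked


# Hypothetical patient traces (ordered activity names per patient).
traces = [
    ["admission", "surgery", "ward", "ward", "discharge"],
    ["admission", "surgery", "ward", "discharge"],
    ["admission", "surgery", "ward", "discharge"],
    ["admission", "imaging", "surgery", "ward", "discharge"],
]
ranked = variant_coverage(traces)
```

Reading the cumulative share column from the top gives the kind of statement reported in the summary, such as a handful of variants accounting for most traces.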

Li Y, Nair P, Lu XH, Wen Z, Wang Y, Dehaghi AAK, Miao Y, Liu W, Ordog T, Biernacka JM, Ryu E, Olson JE, Frye MA, Liu A, Guo L, Marelli A, Ahuja Y, Davila-Velderrain J, Kellis M

Inferring multimodal latent topics from electronic health records

Nat Commun 2020;11(1):2536

Electronic health records (EHRs) are heterogeneous collections of patient health information that could support multiple uses such as risk prediction, clinical recommendations, or individual therapeutic concepts. However, raw EHR data is in many cases not directly processable, especially when building formal models. Non-standardized clinical notes, heterogeneous data types, missing standardization, and diagnosis-driven lab tests all pose challenges. Appropriate and effective computational methods have the potential to overcome those challenges and provide access to an encyclopedia of diseases, disorders, injuries, and other related health conditions, uncovering a modular phenotypic network. The paper introduces MixEHR to: (1) distill meaningful disease topics from otherwise highly sparse, biased, and heterogeneous EHR data; and (2) provide clinical recommendations by predicting undiagnosed patient phenotypes based on their disease mixture membership. MixEHR builds on collaborative filtering and latent topic modeling and can model various EHR categories with separate discrete distributions. A variational inference algorithm that scales to large EHR datasets was created. The model was applied to three EHR datasets: (1) Medical Information Mart for Intensive Care (MIMIC)-III (50,000 intensive care unit admissions); (2) a Mayo Clinic EHR dataset containing 187 patients (93 with bipolar disorder and 94 controls); (3) the Régie de l'assurance maladie du Québec Congenital Heart Disease Dataset (Quebec CHD Database; more than 80,000 patients with congenital heart disease). The authors followed a probabilistic joint matrix factorization approach. The high-dimensional and heterogeneous clinical record was projected onto a low-dimensional probabilistic meta-phenotype signature, reflecting the patient's mixed memberships across diverse latent disease topics. Factorization is carried out at two levels.
At the lower level, data-type-specific topic models were applied, learning a set of basis matrices for each data type. At the higher level, a common loading matrix connects the multiple data types for each patient. The approach was used, among other things, to define a disease comorbidity network, prioritize patients by risk, and predict EHR codes and mortality from the given datasets. Overall, MixEHR's accuracy ranked highest among the approaches compared. MixEHR can infer expected phenotypes of a patient conditioned only on a subset of clinical variables that are perhaps easier and cheaper to measure. Currently, data are represented in the model as a set of two-dimensional matrices of patients by measurements. To model higher-dimensional objects such as patient by lab test by diagnosis, MixEHR could be extended to a probabilistic tensor-decomposition framework.
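
The idea of predicting unobserved codes from a patient's mixed membership across latent topics can be sketched in a few lines of Python. This is a deliberately simplified illustration, not MixEHR itself: it covers a single data type, takes the topic mixture and the per-topic code distributions as given instead of inferring them by variational inference, and all codes and numbers are hypothetical.

```python
def predict_code_probabilities(topic_mixture, topic_code_probs):
    """Mix per-topic code distributions by the patient's topic weights.

    P(code | patient) = sum over topics k of mixture[k] * P(code | topic k).
    """
    if len(topic_mixture) != len(topic_code_probs):
        raise ValueError("one code distribution is needed per topic")
    predicted = {}
    for weight, code_probs in zip(topic_mixture, topic_code_probs):
        for code, prob in code_probs.items():
            predicted[code] = predicted.get(code, 0.0) + weight * prob
    return predicted


def rank_unobserved(predicted, observed):
    """Order codes not yet in the record by predicted probability."""
    candidates = {c: p for c, p in predicted.items() if c not in observed}
    return sorted(candidates, key=candidates.get, reverse=True)


# Two hypothetical disease topics over four hypothetical codes.
topics = [
    {"dx:heart_failure": 0.50, "lab:bnp_high": 0.40,
     "dx:bipolar": 0.05, "rx:lithium": 0.05},
    {"dx:heart_failure": 0.05, "lab:bnp_high": 0.05,
     "dx:bipolar": 0.50, "rx:lithium": 0.40},
]
patient_mixture = [0.8, 0.2]  # assumed memberships, summing to 1
predicted = predict_code_probabilities(patient_mixture, topics)
suggestions = rank_unobserved(predicted, observed={"lab:bnp_high"})
```

In this toy setting, a patient with only an elevated lab value on record and a mixture dominated by the first topic would have the heart failure code ranked first among the unobserved codes.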

Weemaes M, Martens S, Cuypers L, van Elslande J, Hoet K, Welkenhuysen J, Goossens R, Wouters S, Houben E, Jeuris E, Jeuris K, Laenen L, Bruyninckx K, Beuselinck K, André E, Depypere M, Desmet S, Lagrou K, Van Ranst M, Verdonck AKLC, Goveia J

Laboratory information system requirements to manage the COVID-19 pandemic: A report from the Belgian national reference testing center

J Am Med Inform Assoc 2020;27(8):1293–9

The paper describes the challenges faced by the Belgian National Reference Center for COVID-19 testing at the University Hospitals Leuven when demand exceeded the allocated surge capacity during the initial phases of the COVID-19 pandemic. It covers the requirements, design, and implementation of laboratory information system (LIS) functionality for managing increased test demand during the COVID-19 crisis. In particular, all phases of laboratory testing were streamlined: the pre-laboratory phase (test ordering, sample packaging, and shipping); the pre-analytical phase (sample registration, tracking, and test prioritization); and the post-analytical phase (automated reporting and facilitating data-driven policy-making). Apart from COVID-19 testing, the laboratory concerned performs more than 12,000,000 lab tests a year. The LIS is developed in-house and maintained by a dedicated team. The system includes a computerized physician order entry (CPOE) module for in-house test ordering, which is fully integrated into the electronic health record (EHR). All external orders were initially paper-based and required that request forms accompany the sample. In the course of the analysis, 17 major challenges were identified in the different phases of the testing process. Selected solutions included: a COVID-19-specific CPOE module linked to both the LIS and the EHR, allowing demographic information to be retrieved automatically, which dramatically improved metadata completeness; a “COVID-19 status” button displayed on the main page of each patient's EHR, showing the results of SARS-CoV-2 laboratory testing in real time; and a database with contact details and preferred reporting methods (e.g., fax, email, electronic mailbox system) of every laboratory in Belgium, compiled to enable automated test reporting (which resulted in more than 98% automated reporting). To successfully implement such changes in a short time, several prerequisites apply.
The authors therefore recommend that crisis management teams include not only staff focused on increasing analytical capacity but also information technology staff, and that change management frameworks be applied. To summarize, the most effective solutions reported were streamlining sample ordering through a CPOE system and streamlining reporting by developing a database with contact details of all laboratories in Belgium. In addition, the implementation of R/Shiny-based statistical tools facilitated epidemiological reporting and enabled explorative data mining.
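
The role of the contact database in automated reporting can be pictured with a short Python sketch. It is purely hypothetical and does not reflect the actual LIS: the laboratory identifiers, addresses, channel names, and the fallback to manual handling are all assumptions for illustration.

```python
# Channels assumed to be deliverable without manual work.
AUTOMATED_CHANNELS = {"fax", "email", "mailbox"}


def route_report(lab_id, contact_db):
    """Pick the delivery channel for a result, falling back to manual."""
    entry = contact_db.get(lab_id)
    if entry is None or entry.get("method") not in AUTOMATED_CHANNELS:
        return ("manual", None)
    return (entry["method"], entry["address"])


def automated_share(lab_ids, contact_db):
    """Fraction of reports that can be sent through an automated channel."""
    routed = [route_report(lab_id, contact_db) for lab_id in lab_ids]
    automated = sum(1 for channel, _ in routed if channel != "manual")
    return automated / len(routed)


# Hypothetical contact database keyed by requesting laboratory.
contact_db = {
    "lab_a": {"method": "email", "address": "results@lab-a.example"},
    "lab_b": {"method": "fax", "address": "+32 000 00 00"},
}
requests = ["lab_a", "lab_b", "lab_c", "lab_a"]  # lab_c is not in the database
share = automated_share(requests, contact_db)
```

In such a design, the share of automated reports depends directly on how complete the contact database is, which is consistent with the emphasis the summary places on compiling it.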


No conflict of interest has been declared by the author(s).

Publication History

Article published online:
03 September 2021

© 2021. IMIA and Thieme. This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany