Aims Nonsteroidal anti-inflammatory drugs (NSAIDs) are widely used for treating musculoskeletal
disorders but are associated with peptic ulcers (PUs). Predicting PU risk in NSAID
users is vital to minimize adverse effects. This study aims to develop and validate
predictive models for NSAID-induced PUs using longitudinal electronic health record
(EHR) data.
Methods We built a cohort from the EHR data of 737,826 patients who had been prescribed NSAIDs for at least
seven days. Laboratory tests, medication history, and demographic
information were used to train various machine learning (ML) and deep learning (DL)
models, including random forest, gradient boosting machine (GBM), recurrent neural
network (RNN), long short-term memory (LSTM), gated recurrent unit (GRU), and Transformer.
Endoscopy reports were used to ascertain PU incidence more accurately. Model performance
was evaluated using the area under the receiver operating characteristic curve (AUROC)
and the area under the precision-recall curve (AUPRC).
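
The two evaluation metrics named above can be illustrated with a minimal, dependency-free sketch. It is not the study's evaluation code: the function names `auroc` and `average_precision` are hypothetical, the labels and scores are made-up toy values, and AUPRC is approximated here by the step-wise average precision, which is one common estimator among several.

```python
from itertools import groupby


def auroc(y_true, y_score):
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Step-wise AUPRC estimate: precision at each distinct score
    threshold, weighted by the increase in recall at that threshold."""
    total_pos = sum(1 for y in y_true if y == 1)
    if total_pos == 0:
        raise ValueError("at least one positive case is required")
    ranked = sorted(zip(y_score, y_true), key=lambda t: t[0], reverse=True)
    tp = fp = 0
    ap = 0.0
    # Tied scores share one threshold, so they are processed as a group.
    for _, group in groupby(ranked, key=lambda t: t[0]):
        labels = [y for _, y in group]
        new_tp = sum(1 for y in labels if y == 1)
        tp += new_tp
        fp += len(labels) - new_tp
        ap += (tp / (tp + fp)) * (new_tp / total_pos)
    return ap


# Toy example (illustrative values only, not study data):
# 1 = PU occurred, 0 = no PU; scores are hypothetical model outputs.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(f"AUROC = {auroc(labels, scores):.3f}")
print(f"AUPRC = {average_precision(labels, scores):.3f}")
```

AUPRC is typically reported alongside AUROC when the outcome is rare, because precision reflects the proportion of flagged patients who actually develop the outcome.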
Results The GRU model achieved the highest performance with an AUROC of 0.941 for internal
validation and 0.964 for external validation. Significant predictors included hemoglobin
levels, duration of medication, and the use of aspirin. Risk score analysis showed
a sharp increase in risk two months before PU occurrence.
Conclusions We developed and validated robust predictive models for NSAID-induced PUs using longitudinal
EHR data. These models can aid in clinical decision-making for NSAID management and
PU prevention. Further studies are necessary to refine these models and extend their
application to diverse datasets.