Methods Inf Med 2023; 62(01/02): 049-059
DOI: 10.1055/s-0042-1760248
Original Article

Automatic Identification of Self-Reported COVID-19 Vaccine Information from Vaccine Adverse Events Reporting System

Jay S. Patel (1), Sonya Zhan (1), Zasim Siddiqui (2), Bari Dzomba (1), Huanmei Wu (1)

1   Health Services Administration and Policy, Temple University, College of Public Health, Philadelphia, Pennsylvania, United States
2   Pharmaceutical Systems and Policy, West Virginia University, Morgantown, West Virginia, United States

Abstract

Background The short time frame between the coronavirus disease 2019 (COVID-19) pandemic declaration and vaccine authorization led to public concern regarding the safety and efficacy of the vaccines. The Food and Drug Administration uses the Vaccine Adverse Events Reporting System (VAERS), where members of the general population can report vaccine side effects in a free-text box. This information could be used to characterize self-reported vaccine side effects.

Objective To develop a natural language processing (NLP) pipeline, combining supervised and unsupervised methods, that extracts self-reported COVID-19 vaccination side effects, the body locations of the side effects, medications, and possible false information/misinformation requiring further investigation, and outputs them in a structured format for analysis and reporting.

Methods We used the VAERS dataset of COVID-19 vaccine reports from November 2020 to August 2022, covering 725,246 individuals. First, we developed a gold-standard (GS) dataset of 1,500 randomly selected records. Second, the GS was split into training, testing, and validation sets. The training dataset was used to develop the NLP applications (supervised and unsupervised), and the testing and validation datasets were used to evaluate their performance.
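The sampling and splitting steps described in the Methods can be sketched as follows. This is a minimal illustration only: the abstract does not state the split proportions, so the 70/15/15 fractions, the random seeds, the function name `split_gold_standard`, and the use of integer record identifiers are all assumptions made for the example.

```python
import random


def split_gold_standard(record_ids, fractions=(0.7, 0.15, 0.15), seed=0):
    """Shuffle record IDs and split them into train/test/validation lists.

    The fractions are hypothetical; the source does not report them.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    ids = list(record_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * fractions[0])
    n_test = round(len(ids) * fractions[1])
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    validation = ids[n_train + n_test:]  # remainder goes to validation
    return train, test, validation


# Stand-in for the 725,246 reports: integer IDs instead of real records.
population = range(725_246)

# Draw 1,500 records at random (without replacement) as the gold standard.
gold_standard = random.Random(42).sample(population, 1500)

train, test, validation = split_gold_standard(gold_standard)
print(len(train), len(test), len(validation))
```

Fixing the seeds keeps the sample and the split reproducible, which matters when annotators and model developers need to refer to the same records.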

Results The NLP application automatically extracted vaccine side effects, body locations of the side effects, medications, and possible misinformation with moderate to high accuracy (84% sensitivity, 82% specificity, and 83% F1 measure). We found that 53% of people (386,270) reported arm soreness, 31% body swelling (226,208), 23% fatigue/body weakness (168,160), and 22% cold/flu-like symptoms (159,873). Most complications occurred in body locations such as the arm, back, chest, neck, face, and head. Over-the-counter pain medications such as Tylenol and ibuprofen and allergy medications such as Benadryl were the most frequently self-reported medications. Death due to COVID-19, changes in DNA, and infertility were possible false information/misinformation reported by people.
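For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and the F1 measure are conventionally computed from confusion-matrix counts. The counts used here are hypothetical, chosen only so that the outputs land near the reported 84%, 82%, and 83%; they are not taken from the study, and the function name `evaluation_metrics` is introduced for illustration.

```python
def evaluation_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, precision, and F1 from raw counts."""
    sensitivity = tp / (tp + fn)  # recall: share of true items that were found
    specificity = tn / (tn + fp)  # share of negatives correctly rejected
    precision = tp / (tp + fp)    # share of extracted items that were correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not from the paper).
metrics = evaluation_metrics(tp=84, fp=18, tn=82, fn=16)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that F1 depends on precision as well as sensitivity, so it cannot be derived from sensitivity and specificity alone without knowing the underlying counts.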

Conclusion Some self-reported side effects, such as syncope, arthralgia, and blood clotting, need further clinical investigation. Our NLP application may help extract information from large free-text electronic datasets to support decision making by policy makers and other researchers.

Ethical Approval Statement

The study was performed using publicly available, de-identified VAERS data; hence, it did not require ethics committee approval.


Publication History

Received: 26 June 2022

Accepted: 21 November 2022

Article published online: 09 January 2023

© 2023. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany