Abstract
Objective Clinical informatics researchers depend on the availability of high-quality data
from the electronic health record (EHR) to design and implement new methods and systems
for clinical practice and research. However, these data are frequently unavailable
or present in a format that requires substantial revision. This article reports the
results of a review of informatics literature published from 2010 to 2016 that addresses
these issues by identifying categories of data content that might be included or revised
in the EHR.
Materials and Methods We applied an iterative review process to 1,215 biomedical informatics research articles.
We placed them into generic categories, reviewed and refined the categories, and then
assigned additional articles, for a total of three iterations.
Results Our process identified eight categories of data content issues: Adverse Events, Clinician
Cognitive Processes, Data Standards Creation and Data Communication, Genomics, Medication
List Data Capture, Patient Preferences, Patient-reported Data, and Phenotyping.
Discussion These categories summarize discussions in biomedical informatics literature that
concern data content issues restricting clinical informatics research. These barriers
to research result from data that are either absent from the EHR or inadequate
(e.g., in narrative text form) for the downstream applications of the data. In light
of these categories, we discuss changes to EHR data storage that should be considered
in the redesign of EHRs, to promote continued innovation in clinical informatics.
Conclusion Based on published literature on clinical informaticians' reuse of EHR data, we characterize
eight types of data content that, if included in the next generation of EHRs, would
find immediate application in advanced informatics tools and techniques.
Keywords
electronic health records - information storage and retrieval - health system - clinical
informatics research - data quality