DOI: 10.1055/a-2441-3677
Discrepancies in Aggregate Patient Data between Two Sources with Data Originating from the Same Electronic Health Record: A Case Study
Funding None.
Abstract
Background Data exploration in modern electronic health records (EHRs) is often aided by user-friendly graphical interfaces that provide “self-service” tools, allowing end users to extract data for quality improvement, patient safety, and research without prior training in database querying. Other resources within the same institution, such as Honest Brokers, may extract data sourced from the same EHR but obtain different results, raising questions about data completeness and correctness.
Objectives Our objectives were to (1) examine the differences in aggregate output generated by a “self-service” graphical interface data extraction tool and our institution's clinical data warehouse (CDW), sourced from the same database, and (2) examine the causative factors that may have contributed to these differences.
Methods Aggregate demographic data on patients who received influenza vaccines at three static clinics and three drive-through clinics in similar locations between August 2020 and December 2020 were extracted separately from our institution's EHR data exploration tool and from our CDW by our organization's Honest Brokers System. We reviewed the aggregate outputs, sliced by demographics and vaccination sites, to identify differences between the two outputs. We also examined the underlying data model to identify the source of each database.
Results We observed discrepancies in patient volumes between the two sources, with variations in demographic information such as age, race, ethnicity, and primary language. These variations could influence research outcomes and interpretations.
Conclusion This case study underscores the need for a thorough examination of data quality and the implementation of comprehensive user education to ensure accurate data extraction and interpretation. Enhancing data standardization and validation processes is crucial for supporting reliable research and informed decision-making, particularly if demographic data may be used to support targeted efforts for a specific population in research or quality improvement initiatives.
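As an illustration of the comparison described in the Methods, the following minimal Python sketch counts patients per stratum in two extracts and reports the strata where the counts disagree. It is not the authors' procedure: the function names (`aggregate_counts`, `compare_aggregates`), the field names (`site`, `language`), and every row of sample data are hypothetical and made up for demonstration; none of the values come from the study.

```python
from collections import Counter


def aggregate_counts(rows, keys):
    """Count patients per combination of the given keys (e.g. site, language)."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def compare_aggregates(counts_a, counts_b):
    """Return {stratum: (count_a, count_b, count_a - count_b)} where the sources disagree."""
    discrepancies = {}
    for stratum in sorted(set(counts_a) | set(counts_b)):
        a = counts_a.get(stratum, 0)  # a stratum missing from a source counts as 0
        b = counts_b.get(stratum, 0)
        if a != b:
            discrepancies[stratum] = (a, b, a - b)
    return discrepancies


# Hypothetical, made-up rows for illustration only; NOT data from the study.
self_service_rows = [
    {"site": "static_1", "language": "English"},
    {"site": "static_1", "language": "Spanish"},
    {"site": "drive_1", "language": "English"},
]
cdw_rows = [
    {"site": "static_1", "language": "English"},
    {"site": "static_1", "language": "Unknown"},
    {"site": "drive_1", "language": "English"},
    {"site": "drive_1", "language": "English"},
]

keys = ("site", "language")
diffs = compare_aggregates(
    aggregate_counts(self_service_rows, keys),
    aggregate_counts(cdw_rows, keys),
)
for stratum, (a, b, delta) in diffs.items():
    print(stratum, "self-service:", a, "CDW:", b, "difference:", delta)
```

Taking the union of strata from both sources matters here: a category that exists in only one extract (for example, a language recorded as "Unknown" in one source) would be silently dropped by a comparison that iterates over a single source.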
Protection of Human and Animal Subjects
The studies were performed in compliance with the World Medical Association Declaration of Helsinki on Ethical Principles for Medical Research Involving Human Subjects. This study did not constitute human subject research and met the criteria for NHSR self-determination at the University of California, Irvine, CA.
Publication History
Received: 14 July 2024
Accepted: 04 September 2024
Article published online: 12 February 2025
© 2025. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany