CC BY-NC-ND 4.0 · Appl Clin Inform 2021; 12(04): 826-835
DOI: 10.1055/s-0041-1733847
Research Article

Linking a Consortium-Wide Data Quality Assessment Tool with the MIRACUM Metadata Repository

Lorenz A. Kapsner (1, 2), Jonathan M. Mang (1), Sebastian Mate (1), Susanne A. Seuchter (1), Abishaa Vengadeswaran (3), Franziska Bathelt (4), Noemi Deppenwiese (1), Dennis Kadioglu (3, 5), Detlef Kraska (1), Hans-Ulrich Prokosch (1, 6)

1  Medical Center for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany
2  Department of Radiology, Universitätsklinikum Erlangen, Friedrich-Alexander-University Erlangen-Nürnberg, Erlangen, Germany
3  Medical Informatics Group (MIG), Goethe University Frankfurt, University Hospital Frankfurt, Frankfurt am Main, Germany
4  Institute for Medical Informatics and Biometry, Carl Gustav Carus Faculty of Medicine, Technical University Dresden, Dresden, Germany
5  Data Integration Center, University Hospital Frankfurt, Frankfurt am Main, Germany
6  Department of Medical Informatics, Friedrich-Alexander-University Erlangen-Nürnberg (FAU), Erlangen, Germany
Funding This work was funded in part by the German Federal Ministry of Education and Research (BMBF) within the Medical Informatics Initiative (MIRACUM Consortium) under the Funding Numbers FKZ: 01ZZ1801A (Erlangen), 01ZZ1801C (Frankfurt), and 01ZZ1801L (Dresden).

Abstract

Background Many research initiatives aim at using data from electronic health records (EHRs) in observational studies. Participating sites of the German Medical Informatics Initiative (MII) established data integration centers to integrate EHR data within research data repositories to support local and federated analyses. To address concerns regarding possible data quality (DQ) issues of hospital routine data compared with data specifically collected for scientific purposes, we have previously presented a data quality assessment (DQA) tool providing a standardized approach to assess DQ of the research data repositories at the MIRACUM consortium's partner sites.

Objectives Major limitations of the former approach included manual interpretation of the results and hard coding of analyses, which made expanding them to new data elements and databases time-consuming and error-prone. Here we present an enhanced version of the DQA tool that is linked to common data element definitions stored in a metadata repository (MDR) and adopts the harmonized DQA framework from Kahn et al., and we describe its application within the MIRACUM consortium.

Methods Data quality checks were consistently aligned to a harmonized DQA terminology. Database-specific information was systematically identified and represented in an MDR. Furthermore, a structured representation of logical relations between data elements was developed to model plausibility statements in the MDR.
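
To make the idea of a structured plausibility statement more concrete, the following minimal Python sketch shows how a logical relation between two data elements might be stored as metadata and evaluated generically. It is an illustration only: the JSON field names (`filter`, `constraint`, `allowed_values`), the example rule, and the function `find_violations` are assumptions made here and do not reproduce the actual MDR schema or the DQAstats implementation.

```python
import json
import re

# Hypothetical MDR entry; all field names are illustrative assumptions,
# not the schema used in the MIRACUM MDR.
MDR_ENTRY = json.loads("""
{
  "data_element": "diagnosis_code",
  "plausibility": [
    {
      "name": "pregnancy-related ICD-10 codes only in female patients",
      "filter": {"data_element": "diagnosis_code", "regex": "^O[0-9]{2}"},
      "constraint": {"data_element": "sex", "allowed_values": ["f"]}
    }
  ]
}
""")


def find_violations(rule, records):
    """Return all records that match the rule's filter but break its constraint."""
    flt = rule["filter"]
    con = rule["constraint"]
    pattern = re.compile(flt["regex"])
    allowed = set(con["allowed_values"])
    violations = []
    for rec in records:
        value = rec.get(flt["data_element"])
        # Records outside the filter are not subject to this rule.
        if value is None or not pattern.search(value):
            continue
        if rec.get(con["data_element"]) not in allowed:
            violations.append(rec)
    return violations


# Made-up example records (already joined on the patient).
RECORDS = [
    {"patient_id": 1, "diagnosis_code": "O80", "sex": "f"},
    {"patient_id": 2, "diagnosis_code": "O80", "sex": "m"},
    {"patient_id": 3, "diagnosis_code": "I10", "sex": "m"},
]

VIOLATIONS = find_violations(MDR_ENTRY["plausibility"][0], RECORDS)
```

Because the rule lives in the metadata rather than in the checking code, adding another plausibility statement in this sketch only requires a new entry in the `plausibility` list.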

Results The MIRACUM DQA tool was linked to data element definitions stored in a consortium-wide MDR. Additional databases used within MIRACUM were linked to the DQ checks by extending the respective data elements in the MDR with the required information. The evaluation of DQ checks was automated. An adaptable software implementation is provided with the R package DQAstats.
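
As a rough illustration of how extending a data element with database-specific information can link a further database to an existing check, the sketch below runs one generic completeness check against two differently structured tables. The mapping layout, the table and column names, and the function `completeness` are hypothetical and chosen for this example; they are not taken from the MDR or from DQAstats.

```python
import sqlite3

# Hypothetical data element with one mapping per database.
# Table and column names are invented for illustration.
DATA_ELEMENT = {
    "name": "birth_date",
    "mappings": {
        "source_db": {"table": "patients_raw", "column": "dob"},
        "target_db": {"table": "person", "column": "birth_datetime"},
    },
}


def completeness(conn, element, db_key):
    """Count total and missing (NULL) values for the element in one database."""
    mapping = element["mappings"][db_key]
    table, column = mapping["table"], mapping["column"]
    # Identifiers come from curated metadata in this sketch, not from user input.
    query = f'SELECT COUNT(*), COUNT("{column}") FROM "{table}"'
    total, non_null = conn.execute(query).fetchone()
    return {"total": total, "missing": total - non_null}


# In-memory demo database standing in for both systems.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients_raw (id INTEGER, dob TEXT)")
conn.execute("CREATE TABLE person (person_id INTEGER, birth_datetime TEXT)")
conn.executemany(
    "INSERT INTO patients_raw VALUES (?, ?)",
    [(1, "1980-01-01"), (2, None), (3, "1975-05-05")],
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(1, "1980-01-01"), (2, None), (3, None)],
)

RESULTS = {
    key: completeness(conn, DATA_ELEMENT, key)
    for key in DATA_ELEMENT["mappings"]
}
```

In this sketch, supporting a third database would mean adding one more entry under `mappings`; the check itself stays unchanged.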

Conclusion The enhancements of the DQA tool facilitate the future integration of new data elements and make the tool scalable to other databases and data models. It has been provided to all ten MIRACUM partners and was successfully deployed and integrated into their respective data integration center infrastructure.

Protection of Human and Animal Subjects

Pseudonymized EHR data were used for developing and testing this software. No formal intervention was performed and no additional patient data were collected. The authors declare that this research was performed in compliance with the World Medical Association Declaration of Helsinki on Ethical Principles for Medical Research Involving Human Subjects.




Publication History

Received: 19 April 2021

Accepted: 27 June 2021

Publication Date:
25 August 2021 (online)

© 2021. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany