Data Integration for Future Medicine (DIFUTURE)

Fabian Prasser; Oliver Kohlbacher; Ulrich Mansmann; Bernhard Bauer; Klaus A. Kuhn

doi:10.3414/ME17-02-0022


CC BY-NC-ND 4.0 · Methods Inf Med 2018; 57(S 01): e57-e65
DOI: 10.3414/ME17-02-0022

Focus Theme – Original Articles

Schattauer GmbH

Data Integration for Future Medicine (DIFUTURE)

An Architectural and Methodological Overview

Fabian Prasser¹*, Oliver Kohlbacher²,³*, Ulrich Mansmann⁴*, Bernhard Bauer⁵*, Klaus A. Kuhn¹*

¹ Institute of Medical Informatics, Statistics and Epidemiology, University Hospital rechts der Isar, Technical University of Munich, Munich, Germany

² Department of Computer Science, Center for Bioinformatics and Quantitative Biology Center, Eberhard-Karls-Universität Tübingen, Tübingen, Germany

³ Max Planck Institute for Developmental Biology, Tübingen, Germany

⁴ Institute for Medical Information Processing, Biometry, and Epidemiology, Faculty of Medicine, Ludwig-Maximilians-University Munich, Munich, Germany

⁵ Department of Computer Science, University of Augsburg, Augsburg, Germany

The work of the DIFUTURE consortium during the conceptual phase was funded by the German Federal Ministry of Education and Research (BMBF) within the “Medical Informatics Funding Scheme” under reference numbers 01ZZ1603[A-D].

Further Information

Publication History

Received: 1 December 2017

Accepted: 17 April 2018

Publication date: 17 July 2018 (online)


Summary

Introduction: This article is part of the Focus Theme of Methods of Information in Medicine on the German Medical Informatics Initiative. Future medicine will be predictive, preventive, personalized, participatory and digital. Data and knowledge at comprehensive depth and breadth need to be available for research and at the point of care as a basis for targeted diagnosis and therapy. Data integration and data sharing will be essential to achieve these goals. For this purpose, the consortium Data Integration for Future Medicine (DIFUTURE) will establish Data Integration Centers (DICs) at university medical centers.

Objectives: The infrastructure envisioned by DIFUTURE will provide researchers with cross-site access to data and support physicians through innovative views of integrated data as well as decision support components for personalized treatments. The aim of our use cases is to show that this accelerates innovation, improves health care processes and results in tangible benefits for our patients. To realize our vision, numerous challenges have to be addressed. The objective of this article is to describe our concepts and solutions on the technical and the organizational level with a specific focus on data integration and sharing.

Governance and Policies: Data sharing implies significant security and privacy challenges. Therefore, state-of-the-art data protection, modern IT security concepts and patient trust play a central role in our approach. We have established governance structures and policies that safeguard data use and sharing through technical and organizational measures providing the highest levels of data protection. One of our central policies is that adequate methods of data sharing will be selected for each use case and project based on rigorous risk and threat analyses. Interdisciplinary groups have been established to manage change.

Architectural Framework and Methodology: The DIFUTURE Data Integration Centers will implement a three-step approach to integrating, harmonizing and sharing structured, unstructured and omics data as well as images from clinical and research environments. First, data is imported and technically harmonized using common data and interface standards (including various IHE profiles, DICOM and HL7 FHIR). Second, data is preprocessed, transformed, harmonized and enriched within a staging and working environment. Third, data is imported into common analytics platforms and data models (including i2b2 and tranSMART) and made accessible in a form compliant with the interoperability requirements defined on the national level. Secure data access and sharing will be implemented with innovative combinations of privacy-enhancing technologies (safe data, safe settings, safe outputs) and methods of distributed computing.
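
The three-step flow described above can be sketched, very roughly, as a small pipeline. This is an illustrative sketch only and not the DIFUTURE implementation: the record layout, the function names (`import_and_harmonize`, `stage_and_enrich`, `load_into_analytics_model`), the local-to-standard code mapping and the flat fact-row target are all assumptions made for this example. An actual Data Integration Center would rely on the standards and platforms named in the summary (IHE profiles, DICOM, HL7 FHIR, i2b2, tranSMART) rather than on plain dictionaries.

```python
from dataclasses import dataclass

# Hypothetical local-to-standard code mapping (illustrative values only).
LOCAL_TO_STANDARD = {"gluc": "GLUCOSE", "hb": "HEMOGLOBIN"}


@dataclass(frozen=True)
class Observation:
    patient_id: str
    code: str
    value: float
    unit: str


def import_and_harmonize(raw_rows):
    """Step 1: import source rows and map them onto one common structure."""
    observations = []
    for row in raw_rows:
        observations.append(
            Observation(
                patient_id=str(row["pid"]).strip(),
                code=str(row["test"]).strip().lower(),
                value=float(row["result"]),
                unit=str(row["unit"]).strip(),
            )
        )
    return observations


def stage_and_enrich(observations):
    """Step 2: transform and enrich in a staging area; skip unmapped codes."""
    staged = []
    for obs in observations:
        standard_code = LOCAL_TO_STANDARD.get(obs.code)
        if standard_code is None:
            continue  # unmapped codes would need curation in a real setting
        staged.append(
            Observation(obs.patient_id, standard_code, obs.value, obs.unit)
        )
    return staged


def load_into_analytics_model(staged):
    """Step 3: emit flat fact rows for a (hypothetical) analytics platform."""
    return [
        {"patient": o.patient_id, "concept": o.code, "value": o.value, "unit": o.unit}
        for o in staged
    ]


def run_pipeline(raw_rows):
    """Chain the three steps in the order given in the summary."""
    return load_into_analytics_model(stage_and_enrich(import_and_harmonize(raw_rows)))
```

Keeping each step as a separate function mirrors the separation between import, staging and analytics layers, so that a step can be replaced (for example, by a standards-based importer) without touching the others.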

Use Cases: From the perspective of health care and medical research, our approach is disease-oriented and use-case driven, i.e. it follows the needs of physicians and researchers and aims at measurable benefits for our patients. We will work on early diagnosis, tailored therapies and therapy decision tools with a focus on neurology, oncology and further disease entities. Our early use cases will serve as blueprints for subsequent ones, verifying that the infrastructure developed by DIFUTURE is able to support a variety of application scenarios.

Discussion: Our own previous work, the use of internationally successful open source systems and a state-of-the-art software architecture are cornerstones of our approach. During the conceptual phase of the initiative, we have already implemented and tested prototypes of the most important components of our architecture.

Keywords

Health information systems - data warehousing - information dissemination - data sharing - privacy

* for the DIFUTURE Consortium

