Methods Inf Med 2019; 58(06): 229-234
DOI: 10.1055/s-0040-1709158
FAIR principles in Health Research
Georg Thieme Verlag KG Stuttgart · New York

Applying FAIRness: Redesigning a Biomedical Informatics Research Data Management Pipeline

Marcel Parciak (1), Theresa Bender (1), Ulrich Sax (1), Christian Robert Bauer (1)
1   Department of Medical Informatics, University Medical Center Göttingen, Göttingen, Niedersachsen, Germany
Funding This work was supported by the German Federal Ministry of Education and Research (BMBF) within the framework of the research and funding concepts of the Medical Informatics Initiative (01ZZ1802B/HiGHmed) and MyPathSem (BMBF 031L0024A).

Publication History

30 July 2019

22 February 2020

Publication Date: 29 April 2020 (online)

Abstract

Background Managing research data in biomedical informatics research requires solid data governance rules to guarantee sustainable operation, as it generally involves several professions and multiple sites. As every discipline involved in biomedical research applies its own set of tools and methods, research data as well as applied methods tend to branch out into numerous intermediate and output data objects, making it very difficult to reproduce research results.

Objectives This article gives an overview of our progress in applying the Findability, Accessibility, Interoperability and Reusability (FAIR) Guiding Principles for scientific data management and stewardship to our research data management pipeline, focusing on the software tools in use.

Methods We analyzed our progress in FAIRifying the whole data management pipeline, from the processing of non-FAIR data through to data usage. We examined the software tools used for data integration, data storage, and data usage, and how the FAIR Guiding Principles helped us choose appropriate tools for each task.

Results We were able to advance the degree of FAIRness of our data integration and data storage solutions, but have not yet satisfied further FAIR Guiding Principles for data usage. Existing evaluation methods for the FAIR Guiding Principles (FAIRmetrics) were not applicable to our analysis of software tools.

Conclusion Using the FAIR Guiding Principles, we FAIRified relevant parts of our research data management pipeline, improving the findability, accessibility, interoperability, and reuse of datasets and research results. We aim to apply the FAIRmetrics to our data management infrastructure and, where required, to contribute to the FAIRmetrics for research data in the biomedical informatics domain as well as for software tools, in order to achieve a higher degree of FAIRness in our research data management pipeline.
