MAGICPL: A Generic Process Description Language for Distributed Pseudonymization Scenarios

Galina Tremper; Torben Brenner; Florian Stampe; Andreas Borg; Martin Bialke; David Croft; Esther Schmidt; Martin Lablans

doi:10.1055/s-0041-1731387


Methods Inf Med 2021; 60(01/02): 021-031

Original Article


Authors

Galina Tremper ¹ ²
Torben Brenner ¹ ²
Florian Stampe ¹
Andreas Borg ³
Martin Bialke ⁴
David Croft ¹ ²
Esther Schmidt ¹ ²
Martin Lablans ¹ ²

¹ Federated Information Systems, German Cancer Research Center (DKFZ), Heidelberg, Germany

² Complex Data Processing in Medical Informatics, University Medical Center Mannheim, Mannheim, Germany

³ Institute of Medical Biostatistics, Epidemiology and Informatics, Johannes Gutenberg-Universität Mainz, Universitätsmedizin, Mainz, Germany

⁴ Department Epidemiology of Health Care and Community Health, Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany

Funding The MAGIC consortium was supported by the Deutsche Forschungsgemeinschaft (DFG) under grant number LA 3859/1-1.


Abstract

Objectives Pseudonymization is an important aspect of projects dealing with sensitive patient data. Most projects build their own specialized, hard-coded solutions, even though these overlap in many aspects of their functionality. Because every re-implementation ties up resources, we propose a solution that facilitates and encourages the reuse of existing components.

Methods We analyzed already-established data protection concepts to gain an insight into their common features and the ways in which their components were linked together. We found that we could represent these pseudonymization processes with a simple descriptive language, which we have called MAGICPL, plus a relatively small set of components. We designed MAGICPL as an XML-based language, to make it human-readable and accessible to nonprogrammers. Additionally, a prototype implementation of the components was written in Java. MAGICPL makes it possible to reference the components using their class names, making it easy to extend or exchange the component set. Furthermore, there is a simple HTTP application programming interface (API) that runs the tasks and allows other systems to communicate with the pseudonymization process.
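The idea of referencing components by class name from an XML description can be illustrated with a minimal Java sketch. Everything in it is an assumption made for illustration: the `<process>` and `<step>` elements, the `class` attribute, and the `Trim` and `UpperCase` components are hypothetical placeholders and do not reproduce actual MAGICPL syntax or its real component set. The sketch only shows how a process description might be parsed and how each named class might be instantiated via reflection and applied in sequence.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.function.UnaryOperator;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ProcessSketch {

    /** Hypothetical placeholder component: strips surrounding whitespace. */
    public static class Trim implements UnaryOperator<String> {
        @Override
        public String apply(String s) {
            return s.trim();
        }
    }

    /** Hypothetical placeholder component: converts the value to upper case. */
    public static class UpperCase implements UnaryOperator<String> {
        @Override
        public String apply(String s) {
            return s.toUpperCase(Locale.ROOT);
        }
    }

    /** Builds an example process description (assumed format, not MAGICPL). */
    public static String exampleXml() {
        return "<process>"
             + "<step class=\"" + Trim.class.getName() + "\"/>"
             + "<step class=\"" + UpperCase.class.getName() + "\"/>"
             + "</process>";
    }

    /** Parses the description and applies each referenced component in order. */
    public static String run(String xml, String input) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList steps = doc.getDocumentElement().getElementsByTagName("step");

        String value = input;
        for (int i = 0; i < steps.getLength(); i++) {
            String className = ((Element) steps.item(i)).getAttribute("class");
            // Load the component by its class name and create an instance.
            @SuppressWarnings("unchecked")
            UnaryOperator<String> component = (UnaryOperator<String>)
                    Class.forName(className).getDeclaredConstructor().newInstance();
            value = component.apply(value);
        }
        return value;
    }

    public static void main(String[] args) throws Exception {
        // Prints "PATIENT-0815"
        System.out.println(run(exampleXml(), "  patient-0815  "));
    }
}
```

Because the description names classes rather than hard-coding calls, exchanging a component in this sketch only requires changing the `class` attribute, provided the new class is on the classpath and has a public no-argument constructor.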

Results MAGICPL has been used in at least three projects: the re-implementation of the pseudonymization process of the German Cancer Consortium, clinical data flows in a large-scale translational research network (National Network Genomic Medicine), and our own institute's pseudonymization service.

Conclusions Putting our solution into productive use at both our own institute and at our partner sites facilitated a reduction in the time and effort required to build pseudonymization pipelines in medical research.

Keywords

pseudonymization - process description language - data protection

Note

The research reported in this article is of a purely technical nature. Neither human nor animal subjects were involved.

Publication History

Received: September 22, 2020

Accepted: May 4, 2021

Article published online: July 5, 2021

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

