On Creating a Patient-centric Database from Multiple Hospital Information Systems
Received: 16 September 2010
Accepted: 16 May 2011
Published online: 20 January 2018
Background: The information present in Hospital Information Systems (HIS) is heterogeneous and is used primarily by health practitioners to support and improve patient care. Conducting clinical research, data analyses or knowledge discovery projects using electronic patient data in secondary care centres relies on accurate data collection, which is often an ad hoc process that is poorly described in the literature.
Objectives: This paper aims to facilitate and elaborate on the process of retrieving and collating patient-centric data from multiple HIS in order to create a research database. The development of a process roadmap for this purpose illustrates and exposes the constraints and drawbacks of undertaking such work in secondary care centres.
Methods: A data collection exercise was carried out using a combined approach based on segments of well-established data mining and knowledge discovery methodologies, previous work on clinical data integration and local expert consultation. A case study on prostate cancer was carried out at an English regional National Health Service (NHS) hospital.
Results: The process for data retrieval described in this paper allowed patient-centric data, pertaining to the case study on prostate cancer, to be successfully collected from multiple heterogeneous hospital sources, and collated in a format suitable for further clinical research.
Conclusions: The data collection exercise described in this paper exposes the lengthy and difficult journey of retrieving and collating patient-centric, multi-source data from a hospital. This is a non-trivial task, and one that will benefit greatly from further attention from researchers and hospital IT management.