Coronary Artery Disease Phenotype Detection in an Academic Hospital System Setting
Funding The project described was supported by the National Institute of General Medical Sciences, 2U54GM104942–02, and in part by funds from the National Science Foundation (NSF: # 1920920). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Background The United States, and West Virginia in particular, has a tremendous burden of coronary artery disease (CAD). Undiagnosed familial hypercholesterolemia (FH) is an important contributor to CAD in the U.S. Identifying a CAD phenotype is an initial step toward finding families with FH.
Objective We hypothesized that a CAD phenotype detection algorithm that uses discrete data elements from electronic health records (EHRs) can be validated from EHR information housed in a data repository.
Methods We developed an algorithm to detect a CAD phenotype which searched through discrete data elements, such as diagnosis, problem lists, medical history, billing, and procedure (International Classification of Diseases [ICD]-9/10 and Current Procedural Terminology [CPT]) codes. The algorithm was applied to two cohorts of 500 patients, each with varying characteristics. The second (younger) cohort consisted of parents from a school child screening program. We then determined which patients had CAD by systematic, blinded review of EHRs. Following this, we revised the algorithm by refining the acceptable diagnoses and procedures. We ran the second algorithm on the same cohorts and determined the accuracy of the modification.
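The code-matching step described in Methods can be sketched as follows. This is a minimal illustration, not the study's implementation: the record layout (`CodedEntry`), the function names, and the short code sets are all assumptions made for the example, and the published algorithm's actual list of accepted diagnoses and procedures is not reproduced here.

```python
from dataclasses import dataclass

# Illustrative code sets only (assumptions, NOT the study's accepted list).
# I25.10 (ICD-10-CM) and 414.01 (ICD-9-CM) are coronary atherosclerosis
# diagnosis codes; 33533 and 92928 are used here as examples of CPT codes
# for coronary revascularization procedures.
CAD_DIAGNOSIS_CODES = {"I25.10", "414.01"}
CAD_PROCEDURE_CODES = {"33533", "92928"}


@dataclass(frozen=True)
class CodedEntry:
    """One discrete EHR data element (hypothetical layout)."""
    patient_id: str
    source: str   # e.g. "diagnosis", "problem_list", "medical_history", "billing", "procedure"
    system: str   # "ICD-9", "ICD-10", or "CPT"
    code: str


def has_cad_phenotype(entries):
    """Return True if any entry carries an accepted CAD diagnosis or procedure code."""
    for entry in entries:
        if entry.system in ("ICD-9", "ICD-10") and entry.code in CAD_DIAGNOSIS_CODES:
            return True
        if entry.system == "CPT" and entry.code in CAD_PROCEDURE_CODES:
            return True
    return False


def flag_patients(entries):
    """Group entries by patient and flag each patient as CAD-positive or not."""
    by_patient = {}
    for entry in entries:
        by_patient.setdefault(entry.patient_id, []).append(entry)
    return {pid: has_cad_phenotype(items) for pid, items in by_patient.items()}


# Hypothetical records for three patients.
sample = [
    CodedEntry("p1", "problem_list", "ICD-10", "I25.10"),
    CodedEntry("p2", "billing", "ICD-10", "I10"),      # hypertension only
    CodedEntry("p3", "procedure", "CPT", "92928"),
]
flags = flag_patients(sample)
```

Refining the algorithm, as described above, would then amount to editing the accepted code sets and re-running the same matching over the cohorts.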
Results CAD phenotype Algorithm I was 89.6% accurate, 94.6% sensitive, and 85.6% specific for group 1. After the algorithm was revised (denoted CAD Algorithm II) and applied to the same groups 1 and 2, sensitivity was 98.2%, specificity 87.8%, and accuracy 92.4% for group 1; accuracy was 93% for group 2. The group 1 F1 score was 92.4%. Specific ICD-10 and CPT codes such as “coronary angiography through a vein graft” were more useful than generic terms.
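For readers less familiar with these measures, the sketch below shows how accuracy, sensitivity (recall), specificity, precision, and F1 score are computed from a confusion matrix that compares algorithm output against chart review. The counts are hypothetical and chosen only for easy arithmetic; they are not the study's data, and the function name is an assumption for the example.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic measures from confusion-matrix counts.

    tp: algorithm positive, chart review positive
    fp: algorithm positive, chart review negative
    fn: algorithm negative, chart review positive
    tn: algorithm negative, chart review negative
    """
    sensitivity = tp / (tp + fn)            # recall: share of true CAD cases detected
    specificity = tn / (tn + fp)            # share of non-CAD patients correctly cleared
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts (NOT taken from the study).
metrics = diagnostic_metrics(tp=80, fp=20, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because F1 combines precision and sensitivity, it can differ from accuracy when the two classes are unevenly sized.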
Conclusion We have created an algorithm, CAD Algorithm II, that detects CAD on a large scale with high accuracy and sensitivity (recall). It has proven useful among varied patient populations. Use of this algorithm can extend to monitoring a registry of patients in an EHR and/or to identifying a group such as those with likely FH.
Keywords clinical phenotype - clinical registry - coronary artery disease phenotype - accuracy - problem list - validation of algorithm - knowledge management - data validation and verification
Protection of Human and Animal Subjects
This study was performed in compliance with the World Medical Association Declaration of Helsinki on Ethical Principles for Medical Research Involving Human Subjects and was reviewed by the West Virginia University Institutional Review Board.
Received: 26 May 2020
Accepted: 09 October 2020
Published online: 06 January 2021
© 2021. Thieme. All rights reserved.
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany