DOI: 10.1055/a-2554-0043
Use of Present-on-Admission Indicators to Improve Accuracy of Pulmonary Embolism Identification from Electronic Health Record Data
Funding B.B. is supported by a Career Development Award from the American Heart Association and Vascular InterVentional Advances Physicians (#938814) for the PE-EHR+ study. G.P. received research grants from BMS/Pfizer, Janssen, Alexion, Bayer, Amgen, BSC, Esperion, Regeneron, and 1R01HL164717-01.

Validation of the accuracy of International Classification of Diseases (ICD)-10 codes for identifying pulmonary embolism (PE) in claims databases is essential for research purposes and quality improvement initiatives.[1] [2] While ICD-10 principal discharge diagnosis codes are highly specific, they lack sensitivity, missing 40% of cases.[1] Adding secondary discharge diagnosis codes to principal codes improves sensitivity but increases false positives, diminishing the positive predictive value (PPV) by more than 10%.[1] [3] Therefore, efforts are necessary to reduce false positive results from incorporating secondary codes.
We recently validated an algorithm combining ICD-10 principal codes or secondary codes plus imaging codes to identify patients with PE from electronic health record (EHR) data.[1] However, imaging codes may be unavailable or incomplete at the individual patient level in large claims databases or require substantive time for processing and linkage. The current report focuses on the evaluation of an alternative approach, combining present-on-admission (POA) indicators (i.e., claims codes distinguishing preexisting conditions from those arising during hospitalization[4]) and ICD-10 codes to enhance PE identification, compared with using ICD-10 principal or secondary codes alone.
The rationale and design features of this study (called PE-EHR+) were described previously.[3] Briefly, it included 1,712 adult patients hospitalized at Mass General Brigham Health System (MGB) between January 1, 2016, and December 31, 2021, to validate EHR-based tools for identifying patients with PE. Patients were selected in three equal-sized groups: those with a principal diagnosis code for PE, those with a secondary diagnosis code for PE, and those with no discharge diagnosis code for PE. ICD-10 discharge diagnosis codes for PE were I26 and its derivatives. Two physicians (A.B. and C.D.K.) independently reviewed medical charts using prespecified criteria as the reference standard; the review covered medical notes, vital signs, laboratory data, and imaging reports from computed tomography scans, high-probability ventilation/perfusion scans, ultrasound studies, and others as needed.[3] [5] Discrepancies were resolved by consulting with a third physician (B.B.).[3] [5] According to the prespecified criteria, patients were considered to have PE if an acute PE diagnosis was mentioned in medical notes such as discharge summaries and was verified by sufficient confirmatory findings for PE in radiology reports during the hospitalization (such as a filling defect on computed tomography pulmonary angiography, a high-probability ventilation/perfusion scan, direct verification of pulmonary thrombi/emboli on invasive angiography, or the presence of new proximal deep vein thrombosis in conjunction with symptoms and signs of PE).[3] [5] The investigators evaluated the imaging reports to differentiate acute PE from chronic-appearing emboli. PE could be located in the subsegmental, segmental, lobar, and/or central pulmonary arteries.[3] [5] [6]
POA indicators denote whether a diagnosis was present at the time of admission or occurred during hospitalization.[4] [7] POA can be reported as “Y” (diagnosis present on admission), “N” (diagnosis absent on admission), “U” (inadequate documentation and timing of diagnosis cannot be determined), “W” (adequate documentation but the timing of diagnosis is unclear due to clinical uncertainty), and “1” (diagnosis code is exempted from POA reporting; not applicable to PE).[4] [7]
A hybrid approach was tested, incorporating ICD-10 principal codes for PE, or secondary codes plus POA indicators “Y” or “N” for PE (i.e., excluding “U” or “W”) plus the absence of ICD-10 principal or secondary discharge diagnosis codes for PE within 30 days before the index hospitalization ([Fig. 1A]). The rationale for considering the POA indicator “Y” with secondary codes was to minimize false positive findings, as we hypothesized that those with both secondary codes and POA indicator “Y” for PE were more likely to have acute PE than those with secondary codes alone. We considered the POA indicator “N” plus secondary codes for PE to account for hospital-acquired PE. To eliminate patients with a history of recent PE who did not have acute PE in the index presentation, in this subset, we excluded individuals with ICD-10 discharge diagnosis codes for PE in any position within 30 days before the index hospitalization.
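The decision rule described above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the study's actual code: the `Hospitalization` record, its field names, and the function `hybrid_flags_pe` are hypothetical, and the sketch assumes each hospitalization has already been reduced to the four attributes shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hospitalization:
    # Hypothetical record layout; field names are illustrative only.
    principal_pe_code: bool       # ICD-10 I26.x in the principal position
    secondary_pe_code: bool       # ICD-10 I26.x in any secondary position
    poa_for_pe: Optional[str]     # POA indicator on the PE code: "Y", "N", "U", "W", or None
    pe_code_prior_30_days: bool   # PE discharge code (any position) within 30 days before admission


def hybrid_flags_pe(h: Hospitalization) -> bool:
    """Return True if the hybrid approach would classify the hospitalization as PE."""
    # Principal codes are accepted as they are, even after a recent PE-related stay.
    if h.principal_pe_code:
        return True
    # Without a principal code, a secondary code is required.
    if not h.secondary_pe_code:
        return False
    # Secondary codes count only with POA "Y" or "N" ("U" and "W" are excluded).
    if h.poa_for_pe not in ("Y", "N"):
        return False
    # Secondary codes following a recent PE-related hospitalization are excluded.
    return not h.pe_code_prior_30_days
```

Under this reading, a secondary code with POA “W”, or a secondary code with POA “Y” preceded by a PE-coded discharge in the prior 30 days, would not be counted as PE.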


We recognized that patients with ICD-10 codes for PE in either the principal or secondary discharge position are disproportionately represented in the unweighted sample compared with their actual prevalence in health care systems, as most patients do not have acute or prior PE.[3] To account for this and ensure an accurate estimation of diagnostic accuracy metrics, it was predetermined that the three equally sized groups—those with a principal diagnosis of PE, those with a secondary diagnosis of PE, and those without a PE diagnosis—should be appropriately weighted.[3] Therefore, weighted estimates were determined considering the total number of hospitalizations at MGB in the study period. From January 1, 2016, to December 31, 2021, there were 4,878 patients at MGB with principal codes for PE, 3,224 patients with secondary codes for PE, and 373,540 patients without codes for PE.[3] Sensitivity, specificity, PPV, and negative predictive values were ascertained via MedCalc.[8] F1 scores were calculated as a metric for the overall performance and compared using the chi-square test. F1 score combines sensitivity and PPV and is calculated using the following formulae: 2 × (PPV × sensitivity)/(PPV + sensitivity) or (2 × true positive)/(2 × true positive + false positive + false negative).[9]
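One way to read the weighting step is as simple stratum scaling: the proportion of chart-confirmed PE in each sampled group is applied to the total number of MGB hospitalizations in that group. The sketch below assumes that reading; the exact procedure is not spelled out in the text. The population sizes come from the paragraph above, the sample counts are taken or derived from Table 1 (576 is 1,712 minus the 1,136 patients with a PE code), and the names `POPULATION`, `SAMPLE`, and `weighted_counts` are hypothetical.

```python
# Hospitalizations at MGB per stratum, January 2016 to December 2021 (from the text).
POPULATION = {"principal": 4878, "secondary": 3224, "no_code": 373540}

# Unweighted sample per stratum: (patients sampled, chart-confirmed PE).
# Values taken or derived from Table 1; the no_code row is 1,712 - 1,136 = 576 patients, 2 with PE.
SAMPLE = {"principal": (568, 523), "secondary": (568, 338), "no_code": (576, 2)}


def weighted_counts(stratum: str) -> tuple[int, int]:
    """Scale one stratum's chart-review result up to the full population.

    Returns (patients with PE, patients without PE) in that stratum.
    """
    n_sampled, n_pe = SAMPLE[stratum]
    scale = POPULATION[stratum] / n_sampled
    with_pe = round(n_pe * scale)
    without_pe = POPULATION[stratum] - with_pe
    return with_pe, without_pe


if __name__ == "__main__":
    for name in POPULATION:
        print(name, weighted_counts(name))
```

With these inputs the scaled counts agree with the weighted true-positive and false-positive counts reported in Table 1 for the principal and secondary columns, and the three strata together give the 7,708 patients with PE used as the sensitivity denominator.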
In the unweighted sample of 1,712 patients (mean age 60.6 ± 17.8 years, 52.3% female), the hybrid approach combining POA indicators with ICD-10 discharge diagnosis codes resulted in a sensitivity of 97.7% and a specificity of 92.3% ([Fig. 1B] and [Table 1]). In weighted estimates, the hybrid approach incorporating POA indicators resulted in higher sensitivity (81.8% vs. 58.3%) and similar PPV (92.7% vs. 92.1%) compared with using only principal discharge diagnosis codes for PE. Compared with the method using principal or secondary discharge codes, the hybrid approach achieved comparable sensitivity (81.8% vs. 83.2%) and higher PPV (92.7% vs. 79.1%). The F1 score was significantly higher for the hybrid approach than for principal codes alone (0.87 vs. 0.71, p < 0.001) or for principal or secondary codes (0.87 vs. 0.81, p < 0.001), indicating its superior performance in identifying PE.
| | Principal discharge diagnosis codes | Secondary discharge diagnosis codes | Principal or secondary discharge diagnosis codes | Hybrid approach of discharge diagnosis codes and POA indicators |
|---|---|---|---|---|
| Unweighted sample | | | | |
| Overall, n | 568 | 568 | 1,136 | 908 |
| True positive, n | 523 | 338 | 861 | 843 |
| False negative, n | 340 | 525 | 2 | 20 |
| True negative, n | 804 | 619 | 574 | 784 |
| False positive, n | 45 | 230 | 275 | 65 |
| Total population | 1,712 | 1,712 | 1,712 | 1,712 |
| Sensitivity, % (95% CI) | 523/863 = 60.6 (57.3–63.9) | 338/863 = 39.1 (35.9–42.5) | 861/863 = 99.8 (99.2–99.9) | 843/863 = 97.7 (96.4–98.6) |
| Specificity, % (95% CI) | 804/849 = 94.7 (93.0–96.1) | 619/849 = 72.9 (69.8–75.9) | 574/849 = 67.6 (64.4–70.8) | 784/849 = 92.3 (90.4–94.0) |
| Weighted sample | | | | |
| Overall, n | 4,878 | 3,224 | 8,102 | 6,808 |
| True positive, n | 4,492 | 1,919 | 6,411 | 6,308 |
| False negative, n | 3,216 | 5,789 | 1,297 | 1,400 |
| True negative, n | 373,548 | 373,905 | 372,243 | 373,434 |
| False positive, n | 386 | 1,305 | 1,691 | 500 |
| Total population | 381,642 | 381,642 | 381,642 | 381,642 |
| Sensitivity, % (95% CI) | 4,492/7,708 = 58.3 (57.2–59.4) | 1,919/7,708 = 24.9 (23.9–25.9) | 6,411/7,708 = 83.2 (82.3–84.0) | 6,308/7,708 = 81.8 (81.0–82.7) |
| Specificity, % (95% CI) | 373,548/373,934 = 99.9 (99.9–99.9) | 373,905/375,210 = 99.7 (99.6–99.7) | 372,243/373,943 = 99.5 (99.5–99.6) | 373,434/373,934 = 99.9 (99.9–99.9) |
| PPV, % (95% CI) | 4,492/4,878 = 92.1 (91.3–92.8) | 1,919/3,224 = 59.5 (57.9–61.1) | 6,411/8,102 = 79.1 (78.3–79.9) | 6,308/6,808 = 92.7 (92.0–93.2) |
| NPV, % (95% CI) | 373,548/376,764 = 99.1 (99.1–99.2) | 373,905/378,418 = 98.5 (98.5–98.5) | 372,243/373,540 = 99.7 (99.6–99.7) | 373,434/374,834 = 99.6 (99.6–99.6) |
| F1 Score | 0.71 | 0.35 | 0.81 | 0.87[a] |
Abbreviations: CI, confidence interval; ICD, International Classification of Diseases; NPV, negative predictive value; PE, pulmonary embolism; POA, present-on-admission; PPV, positive predictive value.
a The F1 score for the hybrid approach was significantly higher than that of each of the other approaches, compared using the chi-square test (p < 0.001).
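As a worked check, the F1 scores in Table 1 can be recomputed from the weighted true-positive, false-positive, and false-negative counts using either of the two formulae given in the methods. The sketch below is only an arithmetic illustration; the function names and the `WEIGHTED` dictionary are hypothetical, and the counts are copied from the weighted rows of Table 1.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


def f1_from_rates(ppv: float, sensitivity: float) -> float:
    """F1 = 2 * (PPV * sensitivity) / (PPV + sensitivity)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Weighted (TP, FP, FN) per approach, copied from Table 1.
WEIGHTED = {
    "principal": (4492, 386, 3216),
    "secondary": (1919, 1305, 5789),
    "principal_or_secondary": (6411, 1691, 1297),
    "hybrid": (6308, 500, 1400),
}

if __name__ == "__main__":
    for name, (tp, fp, fn) in WEIGHTED.items():
        sensitivity = tp / (tp + fn)
        ppv = tp / (tp + fp)
        # Both formulae give the same value up to floating-point error.
        print(f"{name}: sensitivity={sensitivity:.3f} PPV={ppv:.3f} "
              f"F1={f1_from_counts(tp, fp, fn):.2f} / {f1_from_rates(ppv, sensitivity):.2f}")
```

Rounded to two decimals, the four approaches give 0.71, 0.35, 0.81, and 0.87, matching the F1 row of the table.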
This study demonstrated that using principal discharge codes or secondary discharge codes paired with POA indicators (“Y” or “N”) plus codes to verify no PE-related hospitalization in the past 30 days improved overall performance for PE identification (higher F1 scores) compared with methods using either principal discharge codes alone or a combination of principal or secondary codes.
Incorporating POA indicators into ICD-10 discharge diagnosis codes has been evaluated in cardiovascular diseases, such as myocardial infarction and heart failure.[10] However, their utility for identifying PE has not been widely studied. Prior studies mainly focused on hospital-acquired venous thromboembolism rather than all patients with acute PE.[4] [11]
A challenge in using ICD-10 secondary codes for PE identification is distinguishing a recent history of PE from acute PE during the hospitalization of interest.[1] In our unweighted sample, 15 out of 17 patients with secondary codes for PE and recent PE-related hospitalization did not have acute PE during index hospitalization, according to chart reviews. Therefore, our hybrid approach excluded these patients, improving PE identification accuracy. However, those patients with principal discharge diagnosis codes (compared with those with only secondary discharge diagnosis codes) are more likely to have PE despite recent PE-related hospitalization, as the PPV of principal codes is substantially higher than secondary codes (92.1% vs. 59.5%). Thus, patients with principal codes for PE and recent PE-related hospitalization were kept in the hybrid approach, as they likely represent recurrent PE.
This study had some limitations. The data were derived from several centers in the United States within the MGB Health Care System. The Centers for Medicare and Medicaid Services requires that medical diagnoses in hospital discharge records be labeled with a POA indicator using similar approaches.[4] Consequently, POA indicators have been widely utilized across the United States for various diagnoses, including PE.[4] [10] [12] [13] Globally, the World Health Organization (WHO) has recommended using a diagnosis-timing flag to improve the ability of coded hospital data to support outcomes research and quality improvement initiatives.[14] While the exact “POA” terminology is not universally adopted, similar practices exist in several other health care systems, such as in the United Kingdom,[15] South Korea,[16] Australia,[14] [17] and Canada.[14] Variations may exist in the use of the POA indicator in other U.S. health systems[13] and, more importantly, in countries other than the United States. Future studies are warranted to evaluate the validity of the proposed hybrid approach in other health care systems. Additionally, the hybrid approach using discharge diagnosis codes paired with POA indicators cannot be applied to the minority of patients with low-risk PE managed as outpatients. Furthermore, the current investigation did not explore approaches to validate the detection of PE-related outcomes (e.g., recurrent PE or PE-related death), which should be pursued in future studies.
In conclusion, a hybrid approach comprising ICD-10 principal codes for PE, or secondary codes plus POA indicators “Y” or “N” plus no recent PE-related hospitalization, can reliably identify PE without compromising sensitivity or PPV, making it useful for future research and quality improvement efforts based on claims data.
Publication History
Received: 24 January 2025
Accepted: 07 March 2025
Article published online: 24 June 2025
© 2025. Thieme. All rights reserved.
Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA
References
- 1 Bikdeli B, Khairani CD, Bejjani A. et al.; PE-EHR+ Investigators. Validating International Classification of Diseases Code 10th Revision algorithms for accurate identification of pulmonary embolism. J Thromb Haemost 2025; 23 (02) 556-564
- 2 Burles K, Innes G, Senior K, Lang E, McRae A. Limitations of pulmonary embolism ICD-10 codes in emergency department administrative data: let the buyer beware. BMC Med Res Methodol 2017; 17 (01) 89
- 3 Bikdeli B, Lo YC, Khairani CD. et al. Developing validated tools to identify pulmonary embolism in electronic databases: Rationale and design of the PE-EHR+ Study. Thromb Haemost 2023; 123 (06) 649-662
- 4 Khanna RR, Kim SB, Jenkins I. et al. Predictive value of the present-on-admission indicator for hospital-acquired venous thromboembolism. Med Care 2015; 53 (04) e31-e36
- 5 Bikdeli B, Khairani CD, Bejjani A. et al.; PE-EHR+ Investigators. Validating International Classification of Diseases Code 10th Revision algorithms for accurate identification of pulmonary embolism. J Thromb Haemost 2025; 23 (02) 556-564
- 6 Rashedi S, Bejjani A, Hunsaker AR. et al.; PE-EHR+ Investigators. Isolated subsegmental pulmonary embolism identification based on International Classification of Diseases (ICD)-10 codes and imaging reports. Thromb Res 2025; 247: 109271
- 7 ICD. ICD List. Appendix I - Present on Admission Reporting Guidelines. 2024. Accessed March 13, 2025 at: https://icdlist.com/icd-10/guidelines/appendix-i-present-on-admission-reporting-guidelines
- 8 MedCalc. Diagnostic test evaluation calculator. Accessed March 13, 2025 at: https://www.medcalc.org/calc/diagnostic_test.php
- 9 Mor Y. Diagnostic test evaluation. Translational Interventional Radiology. Academic Press; 2023: 221-224
- 10 Triche EW, Xin X, Stackland S. et al. Incorporating present-on-admission indicators in medicare claims to inform hospital quality measure risk adjustment models. JAMA Netw Open 2021; 4 (05) e218512
- 11 Khanna R, Maynard G, Sadeghi B. et al. Incidence of hospital-acquired venous thromboembolic codes in medical patients hospitalized in academic medical centers. J Hosp Med 2014; 9 (04) 221-225
- 12 Goldman LE, Chu PW, Bacchetti P, Kruger J, Bindman A. Effect of present-on-admission (POA) reporting accuracy on hospital performance assessments using risk-adjusted mortality. Health Serv Res 2015; 50 (03) 922-938
- 13 Goldman LE, Chu PW, Osmond D, Bindman A. The accuracy of present-on-admission reporting in administrative data. Health Serv Res 2011; 46 (6pt1): 1946-1962
- 14 Sundararajan V, Romano PS, Quan H. et al. Capturing diagnosis-timing in ICD-coded hospital data: recommendations from the WHO ICD-11 topic advisory group on quality and safety. Int J Qual Health Care 2015; 27 (04) 328-333
- 15 National Health Services (NHS) of the United Kingdom. NHS Data Model and Dictionary: Present On Admission Indicator. Accessed March 13, 2025 at: https://archive.datadictionary.nhs.uk/DD%20Release%20May%202024/attributes/present_on_admission_indicator.html?utm_source=chatgpt.com
- 16 Lee K, Hwang J, Lee CM. The usefulness of present-on-admission data as an indicator of healthcare quality evaluation using the Korean National Hospital Discharge in-Depth Injury Survey Data from 2006 to 2019. Risk Manag Healthc Policy 2023; 16: 2309-2320
- 17 Triep K, Beck T, Donzé J, Endrich O. Diagnostic value and reliability of the present-on-admission indicator in different diagnosis groups: pilot study at a Swiss tertiary care center. BMC Health Serv Res 2019; 19 (01) 23