Ambiguous and Incomplete: Natural Language Processing Reveals Problematic Reporting Styles in Thyroid Ultrasound Reports
Funding None.
Objective Natural language processing (NLP) systems convert unstructured text into analyzable data. Here, we describe the performance measures of NLP to capture granular details on nodules from thyroid ultrasound (US) reports and reveal critical issues with reporting language.
Methods We iteratively developed NLP tools using the clinical Text Analysis and Knowledge Extraction System (cTAKES) and thyroid US reports from 2007 to 2013. We incorporated nine nodule features for NLP extraction. Next, we evaluated the precision, recall, and accuracy of our NLP tools using a separate set of US reports from an academic medical center (A) and a regional health care system (B) during the same period. Two physicians manually annotated each test-set report, and a third physician adjudicated discrepancies. The adjudicated “gold standard” was then used to evaluate NLP performance on the test set.
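As an illustration of how element-level precision, recall, and accuracy can be computed against an adjudicated gold standard, consider the minimal Python sketch below. It is not the authors' evaluation code: the function name `evaluate`, the `(report_id, feature)` keying, the example values, and the convention that a wrong value counts as both a false positive and a false negative are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation: one entry per data element, keyed by
# (report_id, feature). None means "not described" (gold standard) or
# "nothing extracted" (NLP output).
Key = Tuple[str, str]
Annotations = Dict[Key, Optional[str]]


def evaluate(gold: Annotations, predicted: Annotations) -> Dict[str, float]:
    """Score NLP output against the gold standard, element by element."""
    keys = gold.keys() | predicted.keys()
    tp = fp = fn = correct = 0
    for key in keys:
        g, p = gold.get(key), predicted.get(key)
        if p == g:
            correct += 1  # includes "both absent" agreements
        if g is not None and p == g:
            tp += 1  # a value was present and extracted correctly
        else:
            if p is not None:
                fp += 1  # spurious or wrong extraction
            if g is not None:
                fn += 1  # missed or wrong extraction
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": correct / len(keys) if keys else 0.0,
    }


# Made-up example data, not taken from the study.
example_gold: Annotations = {
    ("r1", "laterality"): "left",
    ("r1", "size"): "1.2 cm",
    ("r1", "borders"): "irregular",
    ("r1", "vascularity"): None,
}
example_pred: Annotations = {
    ("r1", "laterality"): "left",
    ("r1", "size"): "1.2 cm",
    ("r1", "borders"): None,
    ("r1", "vascularity"): "increased",
}
example_scores = evaluate(example_gold, example_pred)
```

In this toy example, two of four elements match, one is missed, and one is spurious. Other scoring conventions (for example, treating mismatched values differently) would give different numbers.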
Results A total of 243 thyroid US reports contained 6,405 data elements. Inter-annotator agreement for all elements was 91.3%. Compared with the gold standard, overall recall of the NLP tool was 90%. NLP recall for thyroid lobe or isthmus characteristics was: laterality 96% and size 95%. NLP accuracy for nodule characteristics was: laterality 92%, size 92%, calcifications 76%, vascularity 65%, echogenicity 62%, contents 76%, and borders 40%. NLP recall for presence or absence of lymphadenopathy was 61%. Reporting style accounted for 18% of errors. For example, the word “heterogeneous” was used interchangeably to refer to nodule contents or echogenicity. While nodule dimensions and laterality were often described, US reports described contents, echogenicity, vascularity, calcifications, borders, and lymphadenopathy only 46, 41, 17, 15, 9, and 41% of the time, respectively. Most nodule characteristics were equally likely to be described at hospital A and at hospital B.
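To make the “heterogeneous” ambiguity concrete, the following Python sketch shows why a term that maps to more than one nodule feature cannot be resolved by vocabulary lookup alone. It is a toy illustration, not the cTAKES pipeline used in the study: the lexicon `LEXICON`, its entries other than “heterogeneous”, and the helper names are hypothetical.

```python
import re
from typing import Dict, List

# Hypothetical term-to-feature lexicon. Only the double mapping of
# "heterogeneous" comes from the abstract; the other entries are assumed.
LEXICON: Dict[str, set] = {
    "hypoechoic": {"echogenicity"},
    "cystic": {"contents"},
    "solid": {"contents"},
    "heterogeneous": {"contents", "echogenicity"},
    "microcalcifications": {"calcifications"},
}


def candidate_features(sentence: str) -> Dict[str, List[str]]:
    """Map each lexicon term found in the sentence to its possible features."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {w: sorted(LEXICON[w]) for w in words if w in LEXICON}


def ambiguous_terms(sentence: str) -> List[str]:
    """Return the terms that could refer to more than one nodule feature."""
    return [t for t, feats in candidate_features(sentence).items() if len(feats) > 1]


example_sentence = "Heterogeneous nodule in the right lobe with microcalcifications."
example_candidates = candidate_features(example_sentence)
example_ambiguous = ambiguous_terms(example_sentence)
```

Here “heterogeneous” is flagged because it has two candidate features, so a system must rely on surrounding context, which free-text reports do not always provide.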
Conclusions NLP can automate extraction of critical information from thyroid US reports. However, ambiguous and incomplete reporting language hinders performance of NLP systems regardless of institutional setting. Standardized or synoptic thyroid US reports could improve NLP performance.
Received: 15 June 2021
Accepted: 05 November 2021
Article published online: 06 January 2022
© 2022. Thieme. All rights reserved.
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany