Interactive NLP in Clinical Care: Identifying Incidental Findings in Radiology Reports

Gaurav Trivedi; Esmaeel R. Dadashzadeh; Robert M. Handzel; Wendy W. Chapman; Shyam Visweswaran; Harry Hochheiser

doi:10.1055/s-0039-1695791


Appl Clin Inform 2019; 10(04): 655-669
DOI: 10.1055/s-0039-1695791

Research Article

Georg Thieme Verlag KG Stuttgart · New York


Gaurav Trivedi¹, Esmaeel R. Dadashzadeh², Robert M. Handzel³, Wendy W. Chapman⁴, Shyam Visweswaran¹,⁵, Harry Hochheiser¹,⁵

¹Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, United States

²Department of Surgery and Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States

³Department of Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania, United States

⁴Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States

⁵Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States

Funding

The research reported in this publication was supported by the National Library of Medicine of the National Institutes of Health under award number R01LM012095 and a Provost’s Fellowship in Intelligent Systems at the University of Pittsburgh (awarded to G.T.). The content of the paper is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the University of Pittsburgh.


Publication History

Received: 25 April 2019

Accepted: 09 July 2019

Publication Date: 04 September 2019 (online)


Abstract

Background Despite advances in natural language processing (NLP), extracting information from clinical text remains expensive. Interactive tools that ease the construction, review, and revision of NLP models can reduce this cost and improve the utility of clinical reports for clinical and secondary use.

Objectives We present the design and implementation of an interactive NLP tool for identifying incidental findings in radiology reports, along with a user study evaluating the performance and usability of the tool.

Methods Expert reviewers provided gold standard annotations for 130 patient encounters (694 reports) at sentence, section, and report levels. We performed a user study with 15 physicians to evaluate the accuracy and usability of our tool. Participants reviewed encounters split into intervention (with predictions) and control conditions (no predictions). We measured changes in model performance, the time spent, and the number of user actions needed. The System Usability Scale (SUS) and an open-ended questionnaire were used to assess usability.

Results Starting from bootstrapped models trained on 6 patient encounters, we observed an average increase in F1 score from 0.31 to 0.75 for reports, from 0.32 to 0.68 for sections, and from 0.22 to 0.60 for sentences on a held-out test data set over an hour-long study session. We found that the tool significantly reduced the time spent reviewing encounters (134.30 vs. 148.44 seconds in the intervention and control conditions, respectively) while maintaining the overall quality of labels as measured against the gold standard. The tool was well received by the study participants, with a very good overall SUS score of 78.67.

Conclusion The user study demonstrated successful use of the tool by physicians for identifying incidental findings. These results support the viability of adopting interactive NLP tools in clinical care settings for a wider range of clinical applications.

Keywords

workflow; data display; data interpretation, statistical; medical records systems, computerized

Protection of Human and Animal Subjects

Our data collection and user-study protocols were approved by the University of Pittsburgh's Institutional Review Board (PRO17030447 and PRO18070517).


