Expanding the Diversity of Texts and Applications: Findings from the Section on Clinical Natural Language Processing of the International Medical Informatics Association Yearbook
29 August 2018 (online)
Objectives: To summarize recent research and present a selection of the best papers published in 2017 in the field of clinical Natural Language Processing (NLP).
Methods: A survey of the literature was performed by the two editors of the NLP section of the International Medical Informatics Association (IMIA) Yearbook. The bibliographic databases PubMed and the Association for Computational Linguistics (ACL) Anthology were searched for papers with a focus on NLP efforts applied to clinical texts or aimed at a clinical outcome. A total of 709 papers were automatically ranked and then manually reviewed based on title and abstract. A shortlist of 15 candidate best papers was selected by the section editors and peer-reviewed by independent external reviewers to arrive at the three best clinical NLP papers for 2017.
Results: The best clinical NLP papers make contributions that range from methodological studies to the application of research results in practical clinical settings. They draw on text genres as diverse as clinical narratives across hospitals and languages, and social media.
Conclusions: Clinical NLP continued to thrive in 2017, with the number of application-oriented contributions increasing relative to work on fundamental methods. Methodological work explores deep learning and system adaptation across language variants. Research results continue to translate into freely available tools and corpora, mainly for the English language.
- In the reference list below, papers that were shortlisted as best paper candidates are marked with a *.
- 1 Filannino M, Uzuner Ö. Advancing the State of the Art in Clinical NLP through Shared Tasks. Yearb Med Inform 2018; 184-92
- 2 * Pérez A, Weegar R, Casillas A, Gojenola K, Oronoz M, Dalianis H. Semi-supervised medical entity recognition: A study on Spanish and Swedish clinical corpora. J Biomed Inform 2017; Jul; 71: 16-30
- 3 * Tapi Nzali MD, Bringay S, Lavergne C, Mollevi C, Opitz T. What Patients Can Tell Us: Topic Analysis for Social Media on Breast Cancer. JMIR Med Inform 2017; Jul 31; 5 (03) e23
- 4 * Castro SM, Tseytlin E, Medvedeva O, Mitchell K, Visweswaran S, Bekhuis T, et al. Automated annotation and classification of BI-RADS assessment from radiology reports. J Biomed Inform 2017; May; 69: 177-87
- 5 Norman C, Leeflang M, Zweigenbaum P, Névéol A. Automating Document Discovery in the Systematic Review Process: How to Use Chaff to Extract Wheat. Language Resources and Evaluation Conference, LREC 2018
- 6 Gonzalez-Hernandez G, Sarker A, O’Connor K, Savova G. Capturing the Patient’s Perspective: a Review of Advances in Natural Language Processing of Health-Related Text. Yearb Med Inform 2017; 214-27
- 7 * Cheng Q, Li TM, Kwok CL, Zhu T, Yip PS. Assessing Suicide Risk and Emotional Distress in Chinese Social Media: A Text Mining and Machine Learning Study. J Med Internet Res 2017; Jul 10; 19 (07) e243
- 8 * Miller M, Banerjee T, Muppalla R, Romine W, Sheth A. What Are People Tweeting About Zika? An Exploratory Study Concerning Its Symptoms, Treatment, Transmission, and Prevention. JMIR Public Health Surveill 2017; Jun 19; 3 (02) e38
- 9 * Lu Y, Wu Y, Liu J, Li J, Zhang P. Understanding Health Care Social Media Use From Different Stakeholder Perspectives: A Content Analysis of an Online Health Community. J Med Internet Res 2017; Apr 7; 19 (04) e109
- 10 * Hao H, Zhang K, Wang W, Gao G. A tale of two countries: International comparison of online doctor reviews between China and the United States. Int J Med Inform 2017; Mar; 99: 37-44
- 11 * Amith M, Cunningham R, Savas LS, Boom J, Schvaneveldt R, Tao C, et al. Using Pathfinder networks to discover alignment between expert and consumer conceptual knowledge from online vaccine content. J Biomed Inform 2017; Oct; 74: 33-45
- 12 Prud’hommeaux E, van Santen J, Gliner D. Vector space models for evaluating semantic fluency in autism. Proc ACL 2017; 32-7
- 13 * Kang T, Zhang S, Tang Y, Hruby GW, Rusanov A, Elhadad N, Weng C. EliIE: An open-source information extraction system for clinical trial eligibility criteria. J Am Med Inform Assoc 2017; Nov 1; 24 (06) 1062-71
- 14 * Iqbal E, Mallah R, Rhodes D, Wu H, Romero A, Chang N, et al. ADEPt, a semantically-enriched pipeline for extracting adverse drug events from free-text electronic health records. PLoS One 2017; Nov 9; 12 (11) e0187121
- 15 Soysal E, Wang J, Jiang M, Wu Y, Pakhomov S, Liu H, Xu H. CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines. J Am Med Inform Assoc 2017; Nov 24
- 16 * Ye Y, Wagner MM, Cooper GF, Ferraro JP, Su H, Gesteland PH, et al. A study of the transferability of influenza case detection systems between two large healthcare systems. PLoS One 2017; Apr 5; 12 (04) e0174970
- 17 * Sohn S, Wang Y, Wi CI, Krusemark EA, Ryu E, Ali MH, et al. Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions. J Am Med Inform Assoc 2017; Nov 30
- 18 * Bejan CA, Angiolillo J, Conway D, Nash R, Shirey-Rice JK, Lipworth L, et al. Mining 100 million notes to find homelessness and adverse childhood experiences: 2 case studies of rare and severe social determinants of health in electronic health records. J Am Med Inform Assoc 2018; Jan 1; 25 (01) 61-71
- 19 * Boytcheva S, Angelova G, Angelov Z, Tcharaktchiev D. Mining comorbidity patterns using retrospective analysis of big collection of outpatient records. Health Inf Sci Syst 2017; Sep 28; 5 (01) 3
- 20 Wu S, Miller T, Masanz J, Coarr M, Halgrim S, Carrell D, et al. Negation’s not solved: generalizability versus optimizability in clinical natural language processing. PLoS One 2014; Nov 13; 9 (11) e112774
- 21 Kang T, Zhang S, Xu N, Wen D, Zhang X, Lei J. Detecting negation and scope in Chinese clinical notes using character and word embedding. Comput Methods Programs Biomed 2017; Mar; 140: 53-9
- 22 Garcelon N, Neuraz A, Benoit V, Salomon R, Burgun A. Improving a full-text search engine: the importance of negation detection and family history context to identify cases in a biomedical data warehouse. J Am Med Inform Assoc 2017; May 1; 24 (03) 607-13
- 23 * Alvaro N, Miyao Y, Collier N. TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations. JMIR Public Health Surveill 2017; May 3; 3 (02) e24
- 24 Gkotsis G, Oellrich A, Velupillai S, Liakata M, Hubbard TJ, Dobson RJ, et al. Characterisation of mental health conditions in social media using Informed Deep Learning. Sci Rep 2017; Mar 22; 7: 45141. Erratum in: Sci Rep 2017; May 16; 7: 46813