Yearb Med Inform 2015; 24(01): 194-198
DOI: 10.15265/IY-2015-035
Original Article
Georg Thieme Verlag KG Stuttgart

Clinical Natural Language Processing in 2014: Foundational Methods Supporting Efficient Healthcare

A. Névéol, P. Zweigenbaum
LIMSI CNRS UPR 3251, Orsay, France
Section Editors for the IMIA Yearbook Section on Clinical Natural Language Processing

Correspondence to:

Aurélie Névéol, Pierre Zweigenbaum
LIMSI CNRS UPR 3251
Rue John von Neumann
Campus Universitaire d’Orsay
91405 Orsay cedex, France

Publication History

13 August 2015

Publication Date:
10 March 2018 (online)

 

Summary

Objective: To summarize recent research and present a selection of the best papers published in 2014 in the field of clinical Natural Language Processing (NLP).

Method: The two section editors of the IMIA Yearbook NLP section performed a systematic review of the literature, searching bibliographic databases with a focus on NLP efforts applied to clinical texts or aimed at a clinical outcome. The section editors first selected a shortlist of candidate best papers, which was then peer-reviewed by independent external reviewers.

Results: The clinical NLP best paper selection shows that the field is tackling text analysis methods of increasing depth. The full review process highlighted five papers addressing foundational methods in clinical NLP. These papers used clinically relevant texts from online forums or encyclopedias and clinical texts from Electronic Health Records, and included studies specifically aiming at a practical clinical outcome. The increased access to clinical data made possible by recent progress in de-identification paved the way for the scientific community to address complex NLP problems such as word sense disambiguation, negation, temporal analysis, and the extraction of specific information nuggets. These advances in turn allowed for the efficient application of NLP to clinical problems such as cancer patient triage. Another line of research investigates clinically relevant online texts and offers insight into communication strategies for conveying health-related information.

Conclusions: The field of clinical NLP is thriving through the contributions of both NLP researchers and healthcare professionals interested in applying NLP techniques to concrete healthcare purposes. Clinical NLP is becoming mature enough for practical applications with a significant clinical impact.


