Summary
Objectives: We present a review of recent advances in clinical Natural Language Processing (NLP),
with a focus on semantic analysis and key subtasks that support such analysis.
Methods: We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing
recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant
referenced publications from the included papers.
Results: Significant articles published within this time-span were included and are discussed
from the perspective of semantic analysis. Three key clinical NLP subtasks that enable
such analysis were identified: 1) developing more efficient methods for corpus creation
(annotation and de-identification), 2) generating building blocks for extracting meaning
(morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical
utility (NLP applications and infrastructure for clinical use cases). Finally, we
reflect on the most recent developments and on potential areas for future NLP
development and applications.
Conclusions: Advances within key NLP subtasks that support semantic analysis have become
more frequent. The performance of NLP semantic analysis is, in many cases, close to the level of
agreement between humans. The creation and release of corpora annotated with complex
semantic information models have greatly supported the development of new tools and
approaches. Research on non-English languages is continuously growing. NLP methods
have sometimes been successfully employed in real-world clinical tasks. However, there
is still a gap between the development of advanced resources and their utilization
in clinical settings. A plethora of new clinical use cases is emerging, driven by established
health care initiatives and by additional patient-generated sources arising from the extensive
use of social media and other devices.
Keywords
Clinical Natural Language Processing - Semantics - Information Extraction - Annotation -
Domain Adaptation - Review