CC BY-NC-ND 4.0 · Methods Inf Med 2022; 61(S 01): e28-e34
DOI: 10.1055/s-0042-1742388
Original Article

Disambiguating Clinical Abbreviations Using a One-Fits-All Classifier Based on Deep Learning Techniques

Areej Jaber
1   Applied Computing Department, Palestine Technical University - Kadoorie, Tulkarem, Palestine
2   Department of Computer Science, Universidad Carlos III de Madrid, Leganés, Spain
Paloma Martínez
2   Department of Computer Science, Universidad Carlos III de Madrid, Leganés, Spain
Funding This work has been supported by the Madrid Government (Comunidad de Madrid-Spain) under the Multiannual Agreement with UC3M in the line of Excellence of University Professors (EPUC3M17), and in the context of the V PRICIT (Regional Programme of Research and Technological Innovation) and Palestine Technical University - Kadoorie (Palestine). The work was also supported by the PID2020-116527RB-I00 project.


Background Abbreviations are considered an essential part of the clinical narrative; they are used not only to save time and space but also to hide serious or incurable illnesses. Misinterpretation of clinical abbreviations can affect patients themselves as well as downstream services such as clinical support systems. There is no consensus in the scientific community on how to create new abbreviations, which makes them difficult to understand. Clinical abbreviation disambiguation aims to predict the exact meaning of an abbreviation from its context, a crucial step in understanding clinical notes.

Objectives Disambiguating clinical abbreviations is an essential task in information extraction from medical texts. Deep contextualized representation models have shown promising results in most word sense disambiguation tasks. In this work, we propose a one-fits-all classifier to disambiguate clinical abbreviations with deep contextualized representations from pretrained language models such as Bidirectional Encoder Representations from Transformers (BERT).

Methods A set of experiments with different pretrained clinical BERT models was performed to investigate fine-tuning methods for the disambiguation of clinical abbreviations. One-fits-all classifiers were used to improve the disambiguation of rare clinical abbreviations.
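To make the one-fits-all idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single label space built from the union of all senses, and it assumes that prediction is restricted to the candidate senses of the target abbreviation; the abstract does not state whether such a restriction is applied. The sense inventory, the function names `build_label_space` and `predict_sense`, and the score values are hypothetical. In a real system the scores would come from a classification head on top of a fine-tuned BERT encoder.

```python
from typing import Dict, List

# Hypothetical toy sense inventory (illustrative only, not from the data set).
SENSE_INVENTORY: Dict[str, List[str]] = {
    "RA": ["rheumatoid arthritis", "right atrium", "room air"],
    "PT": ["physical therapy", "prothrombin time"],
}


def build_label_space(inventory: Dict[str, List[str]]) -> Dict[str, int]:
    """Map every sense of every abbreviation to one shared class index."""
    labels = sorted({sense for senses in inventory.values() for sense in senses})
    return {sense: idx for idx, sense in enumerate(labels)}


def predict_sense(
    abbreviation: str,
    scores: List[float],
    inventory: Dict[str, List[str]],
    label_to_id: Dict[str, int],
) -> str:
    """Return the best-scoring sense among the abbreviation's candidates.

    `scores` holds one value per class in the shared label space. Restricting
    the argmax to the candidate senses is an assumption of this sketch.
    """
    candidates = inventory[abbreviation]
    return max(candidates, key=lambda sense: scores[label_to_id[sense]])


label_to_id = build_label_space(SENSE_INVENTORY)

# Stand-in scores for one input sentence, indexed by the shared label space.
example_scores = [0.9, 0.1, 0.2, 0.7, 0.3]
prediction = predict_sense("RA", example_scores, SENSE_INVENTORY, label_to_id)
```

The contrast with a per-abbreviation design is that only one output layer is trained, so a rare abbreviation shares the encoder and the classifier parameters with frequent ones instead of needing its own model.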

Results One-fits-all classifiers with deep contextualized representations from the Bioclinical, BlueBERT, and MS_BERT pretrained models improved accuracy on the University of Minnesota data set. The models achieved accuracies of 98.99, 98.75, and 99.13%, respectively. All three outperform the previous state of the art of around 98.39%, with the best accuracy obtained using the MS_BERT model.

Conclusion Deep contextualized representations obtained by fine-tuning pretrained language models proved sufficient for disambiguating clinical abbreviations; the approach could be robust for rare and unseen abbreviations and has the advantage of avoiding a separate classifier for each abbreviation. Transfer learning can improve the development of practical abbreviation disambiguation systems.

Ethical Approval

No human subjects were involved in this project, and institutional review board approval was not required.

Publication History

Received: 26 August 2021

Accepted: 29 October 2021

Article published online: 01 February 2022

© 2022. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany
