Methods Inf Med 2015; 54(04): 338-345
DOI: 10.3414/ME15-01-0010
Original Articles
Schattauer GmbH

An Evaluation of Patient Safety Event Report Categories Using Unsupervised Topic Modeling

A. Fong
1   MedStar Institute for Innovation – National Center for Human Factors in Healthcare, Washington, D.C., USA
R. Ratwani
1   MedStar Institute for Innovation – National Center for Human Factors in Healthcare, Washington, D.C., USA
2   Georgetown University School of Medicine, Washington, D.C., USA

Publication History

received: 14 January 2015

accepted: 27 February 2015

Publication Date:
22 January 2018 (online)

Summary

Objective: Patient safety event data repositories have the potential to dramatically improve safety if analyzed and leveraged appropriately. The safety event reports in these repositories often consist of both structured data, such as general event type categories, and unstructured data, such as free-text descriptions of the event. Analyzing these data, particularly the rich free-text narratives, can be challenging, especially when a repository holds tens of thousands of reports. To overcome the resource-intensive manual review of the free-text descriptions, we demonstrate the effectiveness of an unsupervised natural language processing approach.

Methods: An unsupervised natural language processing technique, called topic modeling, was applied to a large repository of patient safety event data to identify topics, or themes, from the free-text descriptions in the reports. Entropy measures were used to evaluate and compare these topics to the general event type categories that were originally assigned by the event reporter.
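As a rough illustration of the entropy comparison described above, the following sketch computes the Shannon entropy of the reporter-assigned event type categories within each topic. It is a minimal example under stated assumptions, not the authors' procedure: the summary does not say how entropy was computed, so the sketch assumes each report has been given a single dominant topic, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Sum of -p * log2(p) over every observed label.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def entropy_by_topic(assignments):
    """Map each topic to the entropy of the categories found within it.

    assignments: iterable of (topic_id, category) pairs, one per report.
    Assumption: each report carries exactly one dominant topic.
    """
    by_topic = defaultdict(list)
    for topic, category in assignments:
        by_topic[topic].append(category)
    return {topic: shannon_entropy(cats) for topic, cats in by_topic.items()}


# Hypothetical toy data: (dominant topic, reporter-assigned category).
reports = [
    (0, "Fall"), (0, "Fall"), (0, "Fall"), (0, "Fall"),
    (1, "Medication"), (1, "Lab"), (1, "Medication"), (1, "Lab"),
]
result = entropy_by_topic(reports)
```

Under this reading, a topic with entropy near zero (topic 0 here) lines up with a single reporter category, while a topic with higher entropy (topic 1 here) spans several categories and may point to a theme the existing categories do not capture.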

Results: Measures of entropy demonstrated that some topics generated by the unsupervised modeling approach aligned with the clinical general event type categories that were originally selected by the individual entering the report. Importantly, several new latent topics emerged that were not originally identified. The new topics provide additional insights into the patient safety event data that would otherwise be difficult to detect.

Conclusion: The topic modeling approach provides a method to identify topics or themes that may not be immediately apparent and has the potential to allow for automatic reclassification of events that are ambiguously classified by the event reporter.

 