DOI: 10.1055/a-2647-1142
The Effect of Ambient Artificial Intelligence Scribes on Trainee Documentation Burden
Funding None.

Abstract
Background
Ambient artificial intelligence scribes have become widespread commercial products in the era of generative artificial intelligence. While studies have examined the effect of these tools on the experience of attending physicians, little evidence is available regarding their use by resident physician trainees.
Objectives
To assess trainee experience with an ambient artificial intelligence scribe using measures of usability, acceptability, and documentation burden.
Methods
This prospective observational study enrolled 47 trainees in a 2-month pilot. Participants completed four survey instruments: the NASA Task Load Index (NASA-TLX; raw unweighted form; pre/post; cognitive load during documentation), the System Usability Scale (post; general usability), the Net Promoter Score (post; acceptability), and the AMIA TrendBurden Survey (pre/post; documentation burden). Electronic health record utilization metrics were obtained from Epic Signal for both the pilot period and a 6-month baseline.
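For readers unfamiliar with these instruments, the sketch below shows how three of them are conventionally scored. It follows the standard published scoring rules, not the authors' analysis code (which the abstract does not describe); the function names and sample responses are hypothetical and for illustration only.

```python
from statistics import mean


def raw_tlx(ratings):
    """Raw (unweighted) NASA-TLX: mean of the six subscale ratings (0-100 each)."""
    if len(ratings) != 6:
        raise ValueError("NASA-TLX has six subscales")
    return mean(ratings)


def sus_score(responses):
    """System Usability Scale: ten items rated 1-5.

    Odd-numbered items contribute (response - 1), even-numbered items
    contribute (5 - response); the sum is multiplied by 2.5 to give 0-100.
    """
    if len(responses) != 10:
        raise ValueError("SUS has ten items")
    total = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0 is item 1 (odd-numbered)
        for i, r in enumerate(responses)
    )
    return total * 2.5


def net_promoter_score(ratings):
    """Net Promoter Score on a 0-10 scale: % promoters (9-10) minus % detractors (0-6)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical responses, not study data.
example_tlx = raw_tlx([50, 60, 70, 40, 30, 50])
example_sus = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
example_nps = net_promoter_score([10, 10, 9, 3])
```

The raw NASA-TLX omits the pairwise weighting step of the original instrument, which is why a plain mean of the six subscales suffices here.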
Results
In total, 43 of 47 participants (91.5%) adopted the intervention in practice. NASA-TLX scores improved from 56.3 to 43.3 (p < 0.001), multiple items on the TrendBurden survey improved, and measures of acceptability were high. No significant difference in time spent on notes activity per note written was observed, with a median increase of 0.4 minutes (p = 0.568).
Conclusion
Trainee use of an ambient artificial intelligence scribe was associated with improvements in documentation burden. Additional research on the effect of this technology on trainee learning and expertise development is needed.
Keywords
generative artificial intelligence - artificial intelligence - graduate medical education - electronic health record - health information technology
Protection of Human and Animal Subjects
The study was determined to be exempt from review by the Yale University Institutional Review Board (approval no.: HIC 2000038118) before participant recruitment.
Publication History
Received: 21 March 2025
Accepted: 01 July 2025
Accepted Manuscript online: 02 July 2025
Article published online: 20 August 2025
© 2025. Thieme. All rights reserved.
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany