Open Access
CC BY 4.0 · Journal of Clinical Interventional Radiology ISVIR
DOI: 10.1055/s-0045-1809953
Original Article

Can ChatGPT Aid in Musculoskeletal Intervention?

Mohamed Ashiq Shazahan
1   Department of Orthopedics, Royal Orthopedic Hospital, Birmingham, United Kingdom
,
Saavi Reddy Pellakuru
2   Department of Musculoskeletal Radiology, Royal Orthopedic Hospital, Birmingham, United Kingdom
,
Sonal Saran
3   Department of Musculoskeletal Radiology, All India Institute of Medical Sciences, Rishikesh, India
,
Shashank Chapala
4   Department of Radiology, AIG Hospitals, Hyderabad, India
,
Sindhura Mettu
5   Department of Radiology, Himagiri Hospitals, Hyderabad, India
,
2   Department of Musculoskeletal Radiology, Royal Orthopedic Hospital, Birmingham, United Kingdom

Funding None.

Abstract

Objective

Radiology has evolved continuously, adopting cutting-edge technologies to improve patient care. It is a prime example of how technological innovation propels medical science forward.

Recently, artificial intelligence (AI) has played a crucial role in various technological advancements. Chat Generative Pre-trained Transformer (ChatGPT)-4, an AI language model focused on natural language understanding and generation, is increasingly used to retrieve medical information. This study explores the utility of ChatGPT-4o in aiding image-guided musculoskeletal interventions, detailing its advantages and limitations.

Methods

Two musculoskeletal radiologists assessed the information generated by ChatGPT-4o on common musculoskeletal interventions. They analyzed its overall utility in guiding these interventions by examining the procedure steps and the pre- and post-procedure details it provided. The assessments were recorded on a 5-point Likert scale and subjected to statistical analysis.

Results

Statistical analysis of the Likert scores from both readers showed moderate inter-rater agreement, with a Cohen's kappa of 0.54. Across categories, the modal Likert score ranged from 1 to 3 for both readers, indicating suboptimal performance. The lowest scores were for image quality assessment, whereas the highest were for post-procedure details.
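For readers unfamiliar with the agreement statistic reported above, the following is a minimal Python sketch of how Cohen's kappa and a modal Likert score can be computed for two raters. It assumes the unweighted form of kappa; the abstract does not state whether a weighted variant was used. The function names (`cohen_kappa`, `likert_mode`) and the example ratings are hypothetical and are not the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same, non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # and this sketch returns 1.0 by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def likert_mode(scores):
    """Most frequent score; ties resolve to the lowest score (an arbitrary choice)."""
    counts = Counter(scores)
    top = max(counts.values())
    return min(score for score, count in counts.items() if count == top)


# Hypothetical ratings on a 5-point Likert scale (illustrative only, not study data).
reader_1 = [1, 2, 3, 1, 2, 3]
reader_2 = [1, 2, 3, 1, 3, 2]
kappa = cohen_kappa(reader_1, reader_2)
```

With these invented ratings, observed agreement is 4/6 and chance agreement is 1/3, so kappa works out to 0.5. Because Likert scores are ordinal, a weighted kappa that penalizes near-misses less than large disagreements is a common alternative.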

Conclusion

ChatGPT-4o offers structured procedural guidance but falls short in complex, image-dependent tasks because of limited anatomical detail and contextual accuracy. It may aid education but is not suitable for clinical use without expert oversight. Domain-specific training, validation, and multidisciplinary collaboration are essential for safe and effective integration into practice.

Ethical Approval

Local ethical committee approval was not required.


Publication History

Article published online: 03 July 2025

© 2025. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution License, permitting unrestricted use, distribution, and reproduction so long as the original work is properly cited. (https://creativecommons.org/licenses/by/4.0/)

Thieme Medical and Scientific Publishers Pvt. Ltd.
A-12, 2nd Floor, Sector 2, Noida-201301 UP, India