DOI: 10.1055/s-0044-1782952
Reliability Concerns: Can AI Interpret Nuanced Medical and Ethical Scenarios in the Field of Gastroenterology?
Aims AI tools like ChatGPT and Google Bard are gaining traction in healthcare, notably in gastroenterology, offering benefits such as a broad knowledge base, swift responses, and easy access. However, their reliability in medical and ethical decision-making is uncertain. They provide information effectively but cannot fully emulate the nuanced understanding and empathy of human medical professionals. Especially in ethical decisions, which demand comprehension of personal and contextual factors, these AI tools should serve as adjuncts to expert human judgment, not replacements for it.
Methods The study evaluated the medical and ethical dependability of two widely used chatbots, ChatGPT and Google Bard, within gastroenterology. A questionnaire was administered to both chatbots, and their responses were rated on a 1-10 Likert scale on which 1 indicated exceptional accuracy. To reduce bias, two independent assessors analyzed each chatbot's answers. The goal was to systematically evaluate the chatbots' competencies and trustworthiness through this performance review. The use of dual evaluators and a fixed Likert scale was intended to mitigate potential bias and strengthen the validity of the findings.
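As an illustration only, the dual-assessor rating scheme described above could be tabulated along the following lines. The abstract does not state how the two assessors' ratings were combined, so the simple averaging, the disagreement measure, the function names, and the sample ratings below are all assumptions made for the sketch, not the authors' procedure or data.

```python
from statistics import mean


def mean_scores(rater_a, rater_b):
    """Average two assessors' 1-10 ratings question by question (assumed rule)."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both assessors must rate the same questions")
    for score in (*rater_a, *rater_b):
        if not 1 <= score <= 10:
            raise ValueError("ratings must lie on the 1-10 scale")
    return [(a + b) / 2 for a, b in zip(rater_a, rater_b)]


def mean_abs_disagreement(rater_a, rater_b):
    """Mean absolute difference between the two assessors' ratings."""
    return mean(abs(a - b) for a, b in zip(rater_a, rater_b))


# Hypothetical ratings for four questionnaire items (1 = exceptional accuracy).
# These numbers are invented for illustration and do not come from the study.
rater_a = [2, 4, 7, 3]
rater_b = [3, 4, 5, 3]

per_question = mean_scores(rater_a, rater_b)
overall = mean(per_question)
disagreement = mean_abs_disagreement(rater_a, rater_b)

print(per_question, overall, disagreement)
```

A small mean absolute disagreement would suggest that the two assessors applied the scale similarly; a formal agreement statistic could be substituted if the study design called for one.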
Results Our study compared the dependability of ChatGPT and Google Bard in medical management scenarios. Measured against standardized practices, ChatGPT scored 21% (p<0.01) and Google Bard scored 19% (p=0.022) for reliability. In the comparison between the two chatbots, ChatGPT scored higher than Google Bard (67% vs. 41%, p=0.034). However, both chatbots' reliability scores were inferior to standard practice. This underscores the importance of reliability in developing gastroenterology-focused chatbots and the need for ongoing research and improvements.
Conclusions Despite their potential benefits, AI tools like ChatGPT and Google Bard currently fall short in assisting medical and ethical decisions in gastroenterology, as shown by their lower reliability scores against standardized guidelines. Although ChatGPT outperformed Google Bard, both fail to match the nuanced understanding and empathy of human healthcare professionals. This underlines the crucial need for AI dependability and the importance of ongoing research to enhance these technologies, ensuring they support human judgment in decision-making rather than supplant it.
Conflicts of interest
The authors have no conflicts of interest to disclose.
Publication History
Article published online:
15 April 2024
© 2024. European Society of Gastrointestinal Endoscopy. All rights reserved.
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany