Klin Monbl Augenheilkd 2024; 241(10): 1140-1144
DOI: 10.1055/a-2327-8484
Clinical Study

Evaluation of Current Artificial Intelligence Programs on the Knowledge of Glaucoma

Eyupcan Sensoy, Mehmet Citirik
Ophthalmology, Ankara City Hospital, Ankara, Turkey

Abstract

Background To measure the success of three different artificial intelligence chatbots, ChatGPT, Bard, and Bing, in correctly answering questions about glaucoma types and treatment modalities, and to examine whether any of them is superior to the others.

Materials and Methods Thirty-two questions about glaucoma types and treatment modalities were posed to the ChatGPT, Bard, and Bing chatbots. Each response was graded as correct or incorrect, and the accuracy rates of the three chatbots were compared.

Results ChatGPT answered 56.3% of the questions correctly, Bard 78.1%, and Bing 59.4%. There was no statistically significant difference between the three artificial intelligence chatbots in the rate of correct and incorrect answers to the questions asked (p = 0.195).

Conclusion Artificial intelligence chatbots can be used as a tool to access information regarding glaucoma types and treatment modalities. However, the information obtained is not always accurate, and care should be taken when using it.
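
The reported comparison amounts to a 3 × 2 contingency analysis of correct versus incorrect answers. The sketch below shows one plausible way to run such a comparison in Python; the per-chatbot counts are back-calculated from the reported percentages of 32 questions (56.3% ≈ 18/32, 78.1% ≈ 25/32, 59.4% ≈ 19/32), and a Pearson chi-square test is assumed, since the abstract does not name the test used, so the resulting p-value need not match the reported p = 0.195 exactly.

```python
# Minimal sketch of a 3x2 contingency comparison of chatbot accuracy.
# Counts are back-calculated from the reported percentages of 32 questions
# (an assumption); the abstract does not state which test the authors used.
from scipy.stats import chi2_contingency

observed = [
    [18, 14],  # ChatGPT: 18/32 correct (56.3%), 14 incorrect
    [25, 7],   # Bard:    25/32 correct (78.1%),  7 incorrect
    [19, 13],  # Bing:    19/32 correct (59.4%), 13 incorrect
]

chi2, p, dof, _ = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p:.3f}")
```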

Summary

Background The aim was to measure the success of three different artificial intelligence chatbots – ChatGPT, Bard, and Bing – in correctly answering questions about glaucoma types and treatment modalities, and to examine whether any of them is superior to the others.

Methods Thirty-two questions about glaucoma types and treatment modalities were posed to the ChatGPT, Bard, and Bing chatbots. Responses were graded as correct or incorrect, and accuracy rates were compared.

Results ChatGPT answered 56.3% of the questions correctly, Bard 78.1%, and Bing 59.4%. There was no statistically significant difference among the three artificial intelligence chatbots in the rate of correct and incorrect answers to the questions asked (p = 0.195).

Conclusion Artificial intelligence chatbots can be used as a tool to access information about glaucoma types and treatment modalities. However, the information obtained is not always accurate, and care is advised when using it.

Conclusion Box

Already Known:

  • Chatbots are new applications that have emerged with the development of artificial intelligence programs.

  • Although artificial intelligence programs have been tested in various fields of ophthalmology, these three freely available chatbots had not previously been compared with one another regarding their ability to provide accurate information about glaucoma types and treatment methods.

Newly Described:

  • Although none of the three artificial intelligence programs was statistically superior to the others in answering the questions correctly, the more up-to-date Bard and Bing chatbots achieved higher accuracy rates.

  • Even current artificial intelligence programs such as Bard and Bing may encounter various obstacles (e.g., paid access) in reaching current and accurate information. Further development of artificial intelligence programs is needed to address these shortcomings.



Publication History

Received: 12 August 2023

Accepted: 12 May 2024

Article published online:
24 July 2024

© 2024. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

 