Open Access
CC BY 4.0 · Eur J Dent
DOI: 10.1055/s-0045-1809155
Original Article

Artificial Intelligence Chatbots as Sources of Implant Dentistry Information for the Public: Validity and Reliability Assessment

Authors

  • Tahani Mohammed Binaljadm

    1   Department of Substitutive Dental Sciences (Prosthodontics), College of Dentistry, Taibah University, Al Madinah, Saudi Arabia
  • Ahmed Yaseen Alqutaibi

    1   Department of Substitutive Dental Sciences (Prosthodontics), College of Dentistry, Taibah University, Al Madinah, Saudi Arabia
    2   Department of Prosthodontics, College of Dentistry, Ibb University, Ibb, Yemen
  • Esam Halboub

    3   Department of Maxillofacial Surgery and Diagnostic Science, College of Dentistry, Jazan University, Jazan, Saudi Arabia
  • Muhammad Sohail Zafar

    4   Department of Clinical Sciences, College of Dentistry, Ajman University, Ajman, United Arab Emirates
    5   Centre of Medical and Bio-allied Health Sciences Research, Ajman University, Ajman, United Arab Emirates
    6   School of Dentistry, Jordan University, Amman, Jordan
  • Samah Saker

    1   Department of Substitutive Dental Sciences (Prosthodontics), College of Dentistry, Taibah University, Al Madinah, Saudi Arabia

Abstract

Objectives

This study assessed the reliability and validity of responses from three chatbot systems (OpenAI's GPT-3.5, Google's Gemini, and Microsoft's Copilot) to patients' frequently asked questions (FAQs) about implant dentistry.

Materials and Methods

Twenty FAQs were posed to the three chatbots at three different time points via their respective application programming interfaces (APIs). The responses were assessed for validity (at low and high thresholds) and reliability by two prosthodontic consultants using a five-point Likert scale.
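
For illustration, a minimal Python sketch of this prompting workflow is given below, assuming the OpenAI Python SDK for the GPT-3.5 queries; the question strings, model name, and loop structure are illustrative assumptions, not the study's exact protocol.

```python
# Minimal sketch of the prompting workflow, assuming the OpenAI Python SDK
# ("openai" package, v1.x) for the GPT-3.5 queries. The FAQ strings below are
# illustrative placeholders, not the study's exact 20-item question set.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

faqs = [
    "What is a dental implant?",
    "How long do dental implants last?",
    # ...remaining FAQs from the study's question set
]

RUNS = 3  # each FAQ was posed at three different time points

responses = {q: [] for q in faqs}
for run in range(RUNS):
    for question in faqs:
        reply = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": question}],
        )
        responses[question].append(reply.choices[0].message.content)
```

Gemini and Copilot would be queried analogously through their own APIs, with the collected responses then scored by the two raters.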

Statistical Analysis

Normality was assessed using the Shapiro–Wilk test. Differences between the chatbots in the quantitative variables at a given (fixed) time point, and within the same chatbot across time points, were assessed using Friedman's two-way analysis of variance by ranks, followed by pairwise comparisons. All statistical analyses were conducted using SPSS (Statistical Package for the Social Sciences), version 26.0.
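
As a concrete illustration of this analysis pipeline, the sketch below reproduces the same tests in Python with SciPy rather than SPSS; the score arrays are hypothetical placeholders, and the Wilcoxon signed-rank post hoc test with Bonferroni correction is an assumed choice, since the abstract does not name the exact pairwise procedure.

```python
# Minimal sketch of the analysis in Python with SciPy instead of SPSS.
# The Likert scores are hypothetical placeholders (one 1-5 rating per
# FAQ for a single chatbot at three time points), not the study's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
t1, t2, t3 = (rng.integers(3, 6, size=20) for _ in range(3))

# Shapiro-Wilk test of normality for each time point's scores.
for label, scores in (("t1", t1), ("t2", t2), ("t3", t3)):
    w, p = stats.shapiro(scores)
    print(f"Shapiro-Wilk {label}: W = {w:.3f}, p = {p:.3f}")

# Friedman two-way ANOVA by ranks across the three related samples
# (the same chatbot rated at three time points).
chi2, p = stats.friedmanchisquare(t1, t2, t3)
print(f"Friedman: chi2 = {chi2:.3f}, p = {p:.3f}")

# Pairwise follow-up comparisons; Wilcoxon signed-rank tests with a
# Bonferroni correction are assumed here, as the abstract does not
# specify the post hoc procedure.
pairs = [("t1 vs t2", t1, t2), ("t1 vs t3", t1, t3), ("t2 vs t3", t2, t3)]
for name, a, b in pairs:
    stat, p = stats.wilcoxon(a, b)
    print(f"{name}: W = {stat:.3f}, p = {min(p * len(pairs), 1.0):.3f} (adjusted)")
```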

Results

GPT-3.5 provided the longest responses, while Gemini's were the most concise. All chatbots frequently advised consulting a dental professional. Validity was high under the low-threshold test but low under the high-threshold test, with Copilot scoring highest. Reliability was high for all three chatbots, with Gemini achieving perfect consistency.

Conclusion

The chatbots showed consistent and generally valid responses, with some variability in accuracy and detail. While they demonstrated a high degree of reliability, their validity, especially under the high-threshold criterion, remains limited. Improvements in accuracy and comprehensiveness are necessary for more effective use in providing information about dental implants.


Publication History

Article published online:
20 May 2025

© 2025. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution License, permitting unrestricted use, distribution, and reproduction so long as the original work is properly cited. (https://creativecommons.org/licenses/by/4.0/)

Thieme Medical and Scientific Publishers Pvt. Ltd.
A-12, 2nd Floor, Sector 2, Noida-201301 UP, India