Open Access
CC BY 4.0 · Brazilian Journal of Oncology 2025; 21
DOI: 10.1055/s-0045-1807868
INNOVATION IN HEALTHCARE
POSTER PRESENTATION

Assessment of ChatGPT's ability to answer frequently asked questions about cancer for the general population

Authors

  • Elisa Patiño Lima

  • Maria Carolina Bedran Ananias

  • Bruna Almeida

  • Erika Staib

  • Larissa Monteiro

  • Gloria Priscila Rodrigues

  • Thiago Elias Peres

  • Rafael Fonseca

  • Luis Fernando Bouzas

  • Victor Duarte

 

    Introduction: ChatGPT is a large language model (LLM) capable of producing natural-language responses on a wide range of topics. In this context, a new way for patients to come into contact with health information is emerging.

    Objective: To evaluate and quantify the reliability, accuracy, and quality of an LLM's cancer information in the context of non-medical internet users; to analyze discrepancies between ChatGPT's answers and medical sources of information; and to identify the emotional approach in the responses provided by the artificial intelligence.

    Methodology: To analyze the reliability and accuracy of the information provided by ChatGPT, 100 questions were developed from three sources: the FAQ section on cancer of INCA's website, the most searched topics identified with the Google Trends tool, and the personal experience of group members with common patient inquiries in hospitals and primary care settings. Both formal and informal language, non-technical terms, and subjective wording were used to test ChatGPT's ability to handle various question formulations. Each question was submitted on the platform in a newly created ChatGPT conversation, so that previous responses could not influence the next. This process was carried out with both versions 3.5 and 4.0 of ChatGPT. The responses from the two models were recorded in a Google Sheets table, separated, and sent individually to each member of the evaluation committee, which consisted of four group members. For the analysis, the committee members, supervised by an oncologist, were divided into pairs, each in charge of half of the responses. The evaluation criteria were whether the response was correct, complete, empathetic, and comprehensible, and whether it encouraged the user to seek a health professional. For each criterion, one point was given if the condition was met and zero if not, with no decimal values allowed. For the final evaluation, the average score from both pairs was considered.
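    The abstract indicates that the questions were submitted through the ChatGPT web interface. Purely as an illustration, the Python sketch below shows how the same protocol (one fresh conversation per question, responses logged to a table) could be reproduced with the OpenAI API; the model names and file names are assumptions, not part of the study.

        # Illustrative sketch only: the study used the ChatGPT web interface,
        # not the API. Model names and file paths are assumptions.
        import csv
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        MODELS = ["gpt-3.5-turbo", "gpt-4"]  # stand-ins for versions "3.5" and "4.0"

        def ask_fresh(model: str, question: str) -> str:
            """Send one question in a brand-new conversation, so that
            earlier answers cannot influence the next one."""
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": question}],
            )
            return resp.choices[0].message.content

        with open("questions.txt", encoding="utf-8") as f:
            questions = [line.strip() for line in f if line.strip()]

        with open("responses.csv", "w", newline="", encoding="utf-8") as out:
            writer = csv.writer(out)
            writer.writerow(["model", "question", "response"])
            for model in MODELS:
                for q in questions:
                    writer.writerow([model, q, ask_fresh(model, q)])

    Sending each question without any prior message history is the API equivalent of the authors' choice to open a fresh ChatGPT instance per message.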

    Results: Statistical evaluations revealed that, although both models had similar accuracy rates (87% for GPT-3.5 versus 91% for GPT-4.0), the latter version usually gave more complete answers (28% vs. 74%, respectively). Neither version 3.5 nor version 4.0 addressed emotional aspects in most answers (5% and 14%, respectively), but both answered in patient-friendly language (92% and 95%, respectively). In conclusion, it is possible that LLMs may be used in the future as a tool for patient education.
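    The abstract reports percentages over the 100 questions but does not name the statistical test used. As a hedged illustration only, a two-proportion z-test (one plausible choice, not necessarily the authors' method) could compare the reported rates, as in the sketch below.

        # Illustration only: the abstract does not specify the statistical test.
        # Counts are derived from the reported percentages over 100 questions.
        from statsmodels.stats.proportion import proportions_ztest

        n = 100  # questions per model

        # Completeness: 28% for GPT-3.5 vs. 74% for GPT-4.0
        stat, p_value = proportions_ztest(count=[28, 74], nobs=[n, n])
        print(f"completeness: z = {stat:.2f}, p = {p_value:.4f}")

        # Accuracy: 87% vs. 91%
        stat, p_value = proportions_ztest(count=[87, 91], nobs=[n, n])
        print(f"accuracy: z = {stat:.2f}, p = {p_value:.4f}")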

    Corresponding author: Elisa Patiño Lima (e-mail: elisapatinolima@icloud.com).


    No conflict of interest has been declared by the author(s).

    Publication History

    Article published online:
    06 May 2025

    © 2025. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution 4.0 International License, permitting copying and reproduction so long as the original work is given appropriate credit (https://creativecommons.org/licenses/by/4.0/)

    Thieme Revinter Publicações Ltda.
    Rua Rego Freitas, 175, loja 1, República, São Paulo, SP, CEP 01220-010, Brazil

    Bibliographical Record
    Elisa Patiño Lima, Maria Carolina Bedran Ananias, Bruna Almeida, Erika Staib, Larissa Monteiro, Gloria Priscila Rodrigues, Thiago Elias Peres, Rafael Fonseca, Luis Fernando Bouzas, Victor Duarte. Assessment of ChatGPT's ability to answer frequently asked questions about cancer for the general population. Brazilian Journal of Oncology 2025; 21.
    DOI: 10.1055/s-0045-1807868