AI-Assisted knowledge assessment: comparison of ChatGPT and Gemini on undescended testicle in children

Date

2025

Journal Title

Journal ISSN

Volume Title

Publisher

Aksaray Üniversitesi

Access Rights

info:eu-repo/semantics/openAccess

Abstract

This study aimed to evaluate the accuracy and completeness of ChatGPT-4 and Google Gemini in answering questions about undescended testis (UDT). These AI tools can provide information that seems accurate but is incorrect, which calls for caution in medical applications. Methods: Researchers independently created 20 questions and submitted the identical set to both ChatGPT-4 and Google Gemini. A pediatrician and a pediatric surgeon evaluated the responses for accuracy and completeness using the Johnson et al. scale (accuracy rated from 1 to 6 and completeness from 1 to 3). Responses that lacked content received a score of 0. Statistical analyses were performed using R Software (version 4.3.1) to assess differences in accuracy and consistency between the tools. Results: Both chatbots answered all questions. ChatGPT achieved a median accuracy score of 5.5 and a mean of 5.35, while Google Gemini achieved a median of 6 and a mean of 5.5. Completeness was similar, with ChatGPT scoring a median of 3 and Google Gemini showing comparable performance. Conclusion: ChatGPT and Google Gemini showed comparable accuracy and completeness; however, inconsistencies between accuracy and completeness suggest that these AI tools require refinement. Regular updates are essential to improve the reliability of AI-generated medical information on UDT and to ensure up-to-date, accurate responses.

Description

Keywords

ChatGPT, Gemini, Children, Undescended Testicle

Source

Aksaray Üniversitesi Tıp Bilimleri Dergisi

WoS Q Value

Scopus Q Value

Volume

5

Issue

3

Citation