Evaluation of the Performance of Generative AI Large Language Models ChatGPT, Google Bard, and Microsoft Bing Chat in Supporting Evidence-Based Dentistry: Comparative Mixed Methods Study

doi:10.2196/51580

Published on 28.Dec.2023 in Vol 25 (2023)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/51580, first published 04.Aug.2023.

Kostis Giannakopoulos¹; Argyro Kavadella¹; Anas Aaqel Salim¹; Vassilis Stamatopoulos²; Eleftherios G Kaklamanos¹,³,⁴

Cited by (193)

Journals

Balasanjeevi G, Surapaneni K. Comparison of ChatGPT version 3.5 & 4 for utility in respiratory medicine education using clinical case scenarios. Respiratory Medicine and Research 2024;85:101091
Goyanes M, Lopezosa C. ChatGPT en Ciencias Sociales: revisión de la literatura sobre el uso de inteligencia artificial (IA) de OpenAI en investigación cualitativa y cuantitativa. Anuario ThinkEPI 2024;18
Kamihara T, Tabuchi M, Omura T, Suzuki Y, Aritake T, Hirashiki A, Kokubo M, Shimizu A. Evolution of a Large Language Model for Preoperative Assessment Based on the Japanese Circulation Society 2022 Guideline on Perioperative Cardiovascular Assessment and Management for Non-Cardiac Surgery. Circulation Reports 2024;6(4):142
Wang L, Ma Y, Bi W, Lv H, Li Y. An Entity Extraction Pipeline for Medical Text Records Using Large Language Models: Analytical Study. Journal of Medical Internet Research 2024;26:e54580
Warrier A, Singh R, Haleem A, Zaki H, Eloy J. The Comparative Diagnostic Capability of Large Language Models in Otolaryngology. The Laryngoscope 2024;134(9):3997
Rahimli Ocakoglu S, Coskun B. The Emerging Role of AI in Patient Education: A Comparative Analysis of the Accuracy of Large Language Models for Pelvic Organ Prolapse. Medical Principles and Practice 2024;33(4):330
Andreadis K, Newman D, Twan C, Shunk A, Mann D, Stevens E. Mixed methods assessment of the influence of demographics on medical advice of ChatGPT. Journal of the American Medical Informatics Association 2024;31(9):2002
Moulaei K, Yadegari A, Baharestani M, Farzanbakhsh S, Sabet B, Reza Afrash M. Generative artificial intelligence in healthcare: A scoping review on benefits, challenges and applications. International Journal of Medical Informatics 2024;188:105474
Liu L, Qu S, Zhao H, Kong L, Xie Z, Jiang Z, Zou P. Global trends and hotspots of ChatGPT in medical research: a bibliometric and visualized study. Frontiers in Medicine 2024;11
Özbay Y. Evaluation of ChatGPT as a Multiple-Choice Question Generator in Dental Traumatology. Medical Records 2024;6(2):235
Rossettini G, Rodeghiero L, Corradi F, Cook C, Pillastrini P, Turolla A, Castellini G, Chiappinotto S, Gianola S, Palese A. Comparative accuracy of ChatGPT-4, Microsoft Copilot and Google Gemini in the Italian entrance test for healthcare sciences degrees: a cross-sectional study. BMC Medical Education 2024;24(1)
Naz R, Akacı O, Erdoğan H, Açıkgöz A. Can large language models provide accurate and quality information to parents regarding chronic kidney diseases? Journal of Evaluation in Clinical Practice 2024;30(8):1556
Danesh A, Danesh A, Danesh F. Innovating dental diagnostics: ChatGPT's accuracy on diagnostic challenges. Oral Diseases 2025;31(3):911
Khan M, O’Sullivan E. A comparison of the diagnostic ability of large language models in challenging clinical cases. Frontiers in Artificial Intelligence 2024;7
Zhui L, Yhap N, Liping L, Zhengjie W, Zhonghao X, Xiaoshu Y, Hong C, Xuexiu L, Wei R. Impact of Large Language Models on Medical Education and Teaching Adaptations. JMIR Medical Informatics 2024;12:e55933
Zhui L, Fenghe L, Xuehu W, Qining F, Wei R. Ethical Considerations and Fundamental Principles of Large Language Models in Medical Education: Viewpoint. Journal of Medical Internet Research 2024;26:e60083
Shah-Mohammadi F, Finkelstein J. Accuracy Evaluation of GPT-Assisted Differential Diagnosis in Emergency Department. Diagnostics 2024;14(16):1779
Akyon S, Akyon F, Camyar A, Hızlı F, Sari T, Hızlı Ş. Evaluating the Capabilities of Generative AI Tools in Understanding Medical Papers: Qualitative Study. JMIR Medical Informatics 2024;12:e59258
Is E, Menekseoglu A. Comparative performance of artificial intelligence models in rheumatology board-level questions: evaluating Google Gemini and ChatGPT-4o. Clinical Rheumatology 2024;43(11):3507
Tam T, Sivarajkumar S, Kapoor S, Stolyar A, Polanska K, McCarthy K, Osterhoudt H, Wu X, Visweswaran S, Fu S, Mathur P, Cacciamani G, Sun C, Peng Y, Wang Y. A framework for human evaluation of large language models in healthcare derived from literature review. npj Digital Medicine 2024;7(1)
Wu Z, Gan W, Xue Z, Ni Z, Zheng X, Zhang Y. Performance of ChatGPT on Nursing Licensure Examinations in the United States and China: Cross-Sectional Study. JMIR Medical Education 2024;10:e52746
Peng L, Liang R, Zhao A, Sun R, Yi F, Zhong J, Li R, Zhu S, Zhang S, Wu S. Amplifying Chinese physicians’ emphasis on patients’ psychological states beyond urologic diagnoses with ChatGPT – a multicenter cross-sectional study. International Journal of Surgery 2024;110(10):6501
Ezanno A, Fougerousse A, Pruvost-Balland C, Maccari F, Fite C. AI in Hidradenitis Suppurativa: Expert Evaluation of Patient-Facing Information. Clinical, Cosmetic and Investigational Dermatology 2024;Volume 17:2459
Tokgöz Kaplan T, Cankar M. Evidence‐Based Potential of Generative Artificial Intelligence Large Language Models on Dental Avulsion: ChatGPT Versus Gemini. Dental Traumatology 2025;41(2):178
Wang L, Wan Z, Ni C, Song Q, Li Y, Clayton E, Malin B, Yin Z. Applications and Concerns of ChatGPT and Other Conversational Large Language Models in Health Care: Systematic Review. Journal of Medical Internet Research 2024;26:e22769
Zheng J, Ding X, Pu J, Chung S, Ai Q, Hung K, Shan Z. Unlocking the Potentials of Large Language Models in Orthodontics: A Scoping Review. Bioengineering 2024;11(11):1145
Chatzopoulos G, Koidou V, Tsalikis L, Kaklamanos E. Large language models in periodontology: Assessing their performance in clinically relevant questions. The Journal of Prosthetic Dentistry 2025;134(6):2328
Diniz-Freitas M, López-Pintor R, Santos-Silva A, Warnakulasuriya S, Diz-Dios P. Assessing the accuracy and readability of ChatGPT-4 and Gemini in answering oral cancer queries—an exploratory study. Exploration of Digital Health Technologies 2024:334
Bulut S. Bibliometric Analysis of Studies on Chat GPT with Vosviewer. Black Sea Journal of Engineering and Science 2024;7(6):1194
Puleio F, Lo Giudice G, Bellocchio A, Boschetti C, Lo Giudice R. Clinical, Research, and Educational Applications of ChatGPT in Dentistry: A Narrative Review. Applied Sciences 2024;14(23):10802
Mavrych V, Ganguly P, Bolgova O. Using large language models (ChatGPT, Copilot, PaLM, Bard, and Gemini) in Gross Anatomy course: Comparative analysis. Clinical Anatomy 2025;38(2):200
Gosak L, Štiglic G, Pruinelli L, Vrbnjak D. PICOT questions and search strategies formulation: A novel approach using artificial intelligence automation. Journal of Nursing Scholarship 2025;57(1):5
Ho C, Tian T, Ayers A, Aaron R, Phillips V, Wolf R, Mathioudakis N, Dai T, Klonoff D. Qualitative metrics from the biomedical literature for evaluating large language models in clinical decision-making: a narrative review. BMC Medical Informatics and Decision Making 2024;24(1)
Guven Y, Ozdemir O, Kavan M. Performance of Artificial Intelligence Chatbots in Responding to Patient Queries Related to Traumatic Dental Injuries: A Comparative Study. Dental Traumatology 2025;41(3):338
Demir S. Evaluation of the reliability and readability of answers given by chatbots to frequently asked questions about endophthalmitis: A cross-sectional study on chatbots. Health Informatics Journal 2024;30(4)
Khosravi M, Mojtabaeian S, Demiray E, Sayar B. A Systematic Review of the Outcomes of Utilization of Artificial Intelligence Within the Healthcare Systems of the Middle East: A Thematic Analysis of Findings. Health Science Reports 2024;7(12)
Fanelli F, Saleh M, Santamaria P, Zhurakivska K, Nibali L, Troiano G. Development and Comparative Evaluation of a Reinstructed GPT‐4o Model Specialized in Periodontology. Journal of Clinical Periodontology 2025;52(5):707
Demir S. Investigating the role of large language models on questions about refractive surgery. International Journal of Medical Informatics 2025;195:105787
Farhadi Nia M, Ahmadi M, Irankhah E. Transforming dental diagnostics with artificial intelligence: advanced integration of ChatGPT and large language models for patient care. Frontiers in Dental Medicine 2025;5
Sridhar G, Gumpeny L. Prospects and perils of ChatGPT in diabetes. World Journal of Diabetes 2025;16(3)
Nitsch K, Ivatury S. Do people prefer AI-generated patient educational materials over traditional ones? Patient Education and Counseling 2025;134:108672
Taşkıran Tepe H, Aslantürk H. Comparative Efficacy of AI LLMs in Clinical Social Work: ChatGPT-4, Gemini, Copilot. Research on Social Work Practice 2025;35(8):968
Mustuloğlu Ş, Deniz B. Evaluation of Chatbots in the Emergency Management of Avulsion Injuries. Dental Traumatology 2025;41(4):437
Öztürk Z, Bal C, Çelikkaya B. Evaluation of Information Provided by ChatGPT Versions on Traumatic Dental Injuries for Dental Students and Professionals. Dental Traumatology 2025;41(4):427
Wang X, Ye H, Zhang S, Yang M, Wang X. Evaluation of the Performance of Three Large Language Models in Clinical Decision Support: A Comparative Study Based on Actual Cases. Journal of Medical Systems 2025;49(1)
Jin F, Peng X, Sun L, Song Z, Zhou K, Lin C. Knowledge (Co‐)Construction Among Artificial Intelligence, Novice Teachers, and Experienced Teachers in an Online Professional Learning Community. Journal of Computer Assisted Learning 2025;41(2)
Symeou L, Louca L, Kavadella A, Mackay J, Danidou Y, Raffay V. Development of Evidence‐Based Guidelines for the Integration of Generative AI in University Education Through a Multidisciplinary, Consensus‐Based Approach. European Journal of Dental Education 2025;29(2):285
Dermata A, Arhakis A, Makrygiannakis M, Giannakopoulos K, Kaklamanos E. Evaluating the evidence-based potential of six large language models in paediatric dentistry: a comparative study on generative artificial intelligence. European Archives of Paediatric Dentistry 2025;26(3):527
Koidou V, Chatzopoulos G, Tsalikis L, Kaklamanos E. Large Language Models in peri-implant disease: How well do they perform? The Journal of Prosthetic Dentistry 2025;134(6):2435
Şişman A, Acar A. Artificial intelligence-based chatbot assistance in clinical decision-making for medically complex patients in oral surgery: a comparative study. BMC Oral Health 2025;25(1)
Ozdemir Z, Yapici E. Evaluating the Accuracy, Reliability, Consistency, and Readability of Different Large Language Models in Restorative Dentistry. Journal of Esthetic and Restorative Dentistry 2025;37(7):1740
Terzi M, Yavuz M, Bicer T, Buyuk S. Evaluation of artificial intelligence robot’s knowledge and reliability on dental implants and peri-implant phenotype. Scientific Reports 2025;15(1)
Kim‐Berman H, Tarlie J, Herremans J. Student Performance on the American Board of Orthodontics Written Examination Following Flipped Classroom and Generative AI Approach. Journal of Dental Education 2025;89(S3):1781
Durmazpinar P, Ekmekci E. Comparing diagnostic skills in endodontic cases: dental students versus ChatGPT-4o. BMC Oral Health 2025;25(1)
Mine Y, Okazaki S, Taji T, Kawaguchi H, Kakimoto N, Murayama T. Benchmarking multimodal large language models on the dental licensing examination: Challenges with clinical image interpretation. Journal of Dental Sciences 2025;20(4):2427
Gökcek Taraç M. Evaluation of Artificial Intelligence Chatbots in the Management of Primary Tooth Traumas: A Comparative Analysis. Journal of International Dental Sciences 2025;11(1):22
Mao T, Zhao X, Jiang K, Xie Q, Yang M, Wang R, Gao F. A comparison of the responses between ChatGPT and doctors in the field of cholelithiasis based on clinical practice guidelines: a cross-sectional study. DIGITAL HEALTH 2025;11
Lafourcade C, Kérourédan O, Ballester B, Richert R. Accuracy, consistency, and contextual understanding of large language models in restorative dentistry and endodontics. Journal of Dentistry 2025;157:105764
Li S, Jiang J, Yang X. Preliminary assessment of large language models’ performance in answering questions on developmental dysplasia of the hip. Journal of Children's Orthopaedics 2025;19(3):207
Du Y, Ji C, Xu J, Wei M, Ren Y, Xia S, Zhou J. Performance of ChatGPT and Microsoft Copilot in Bing in answering obstetric ultrasound questions and analyzing obstetric ultrasound reports. Scientific Reports 2025;15(1)
AlFarabi Ali S, AlDehlawi H, Jazzar A, Ashi H, Esam Abuzinadah N, AlOtaibi M, Algarni A, Alqahtani H, Akeel S, Almazrooa S. The Diagnostic Performance of Large Language Models and Oral Medicine Consultants for Identifying Oral Lesions in Text-Based Clinical Scenarios: Prospective Comparative Study. JMIR AI 2025;4:e70566
Özbay Y, Erdoğan D, Dinçer G. Evaluation of the performance of large language models in clinical decision-making in endodontics. BMC Oral Health 2025;25(1)
Sarı M, Tufenkci P. Evaluation of the Competency of Large Language Models GPT-4o and Claude 3.5 Sonnet in Endodontic Emergencies. European Annals of Dental Sciences 2025;52(1):10
Yang X, Xiao Y, Liu D, Deng H, Huang J, Zhou Y, Dai C, Wu J, Liu D, Liang M, Xu C. Cross language transformation of free text into structured lobectomy surgical records from a multi center study. Scientific Reports 2025;15(1)
Dawadi R, Vu T, Tay J, Tran Ngoc Hoang P, Oya A, Yamamoto M, Watanabe N, Kuriya Y, Araki M. Large language models and questions from older adults: a human and machine-based evaluation study. Discover Artificial Intelligence 2025;5(1)
Llorente de Pedro M, Suárez A, Algar J, Díaz-Flores García V, Andreu-Vázquez C, Freire Y. Assessing ChatGPT’s Reliability in Endodontics: Implications for AI-Enhanced Clinical Learning. Applied Sciences 2025;15(10):5231
Khareedi R, Fernandez D. The Role of Chatbots in Enquiry‐Based Learning for Oral Health Students—An Exploratory Study. European Journal of Dental Education 2026;30(2):280
Alhazmi N, Alshehri A, BaHammam F, Philip M, Nadeem M, Khanagar S. Can Large Language Models Serve as Reliable Tools for Information in Dentistry? A Systematic Review. International Dental Journal 2025;75(4):100835
Chen Y, Lee S, Sheu H, Lin S, Hu C, Fu S, Yang C, Lin Y. Enhancing responses from large language models with role-playing prompts: a comparative study on answering frequently asked questions about total knee arthroplasty. BMC Medical Informatics and Decision Making 2025;25(1)
Zhang Y, Liu Z, Bai S, Xu T, Lau R. Reference decisions enhance LLM performance, amplified by source disclosure. DIGITAL HEALTH 2025;11
Freire Y, Santamaría Laorden A, Orejas Pérez J, Ortiz Collado I, Gómez Sánchez M, Thuissard Vasallo I, Díaz-Flores García V, Suárez A, Kolahi J. Evaluating the influence of prompt formulation on the reliability and repeatability of ChatGPT in implant-supported prostheses. PLOS One 2025;20(5):e0323086
Liu H, Peng J, Li L, Deng A, Huang X, Yin G, Luo H. Large Language Models as a Consulting Hotline for Patients With Breast Cancer and Specialists in China: Cross-Sectional Questionnaire Study. JMIR Medical Informatics 2025;13:e66429
Abouzeid E, Wassef R, Jawwad A, Harris P. Chatbots’ Role in Generating Single Best Answer Questions for Undergraduate Medical Student Assessment: Comparative Analysis. JMIR Medical Education 2025;11:e69521
Chatzopoulos G, Koidou V, Tsalikis L, Kaklamanos E. Clinical Applications of Artificial Intelligence in Periodontology: A Scoping Review. Medicina 2025;61(6):1066
Babaee Hemmati Y, Rasouli M, Falahchai M. Comparative Analysis of ChatGPT‐3.5 and GPT‐4 in Open‐Ended Clinical Reasoning Across Dental Specialties. European Journal of Dental Education 2026;30(2):526
Yin L, He H, Zhang H, Shang Y, Fu C, Wu S, Jin T. Revolution of AAV in Drug Discovery: From Delivery System to Clinical Application. Journal of Medical Virology 2025;97(6)
Chatzopoulos G, Koidou V, Tsalikis L, Kaklamanos E. Evaluation of Large Language Model Performance in Answering Clinical Questions on Periodontal Furcation Defect Management. Dentistry Journal 2025;13(6):271
Kim N, Ahn Y, Myers G, Bach B. How Good Is ChatGPT in Giving Advice on Your Visualization Design? ACM Transactions on Computer-Human Interaction 2025;32(5):1
Chau R, Thu K, Yu O, Hsung R, Wang D, Man M, Wang J, Lam W. Evaluation of Chatbot Responses to Text-Based Multiple-Choice Questions in Prosthodontic and Restorative Dentistry. Dentistry Journal 2025;13(7):279
Felemban D, Jazzar A, Mair Y, Alsharif M, Alsharif A, Kassim S. Evaluating the accuracy of CHATGPT models in answering multiple-choice questions on oral and maxillofacial pathologies and oral radiology. DIGITAL HEALTH 2025;11
Zheng X, Zou H, Wu L, Dong P, Yuan W, Chen Y. Generative artificial intelligence in cardiovascular specialty care: a scoping review. BMC Nursing 2025;24(1)
Shirani M, Emami M. Performance comparison of large language models in treatment planning for the restoration of endodontically treated teeth over time. Journal of Dentistry 2025;161:105998
Dede B, Çakar İ, Oğuz M, Alyanak B, Bağcıer F. Could a New Method of Acromiohumeral Distance Measurement Emerge? Artificial Intelligence vs. Physician. Journal of Imaging Informatics in Medicine 2025;39(2):1645
Dolu F, Ay O, Kupeli A, Karademir E, Büyükavcı M. Evaluation of ChatGPT-4 as an Online Outpatient Assistant in Puerperal Mastitis Management: Content Analysis of an Observational Study. JMIR Medical Informatics 2025;13:e68980
Ke Y, Yang Ong B, Jin L, Sim J, Chan C, Soh C, Wong D, Liu N, Sng B, Ting D, Yeo S, Ong M, Abdullah H. Clinical and economic impact of a large language model in perioperative medicine: a randomized crossover trial. npj Digital Medicine 2025;8(1)
Abuabara A, do Nascimento T, Trentini S, Costa Gonçalves A, Hueb de Menezes-Oliveira M, Madalena I, Beisel-Memmert S, Kirschneck C, Antunes L, Miranda de Araujo C, Baratto-Filho F, Küchler E. Evaluating the accuracy of generative artificial intelligence models in dental age estimation based on the Demirjian's method. Frontiers in Dental Medicine 2025;6
Li X, Zhang Y, Zheng T, Deng Y, Lu Y, Hu J, Chen S, Li Y, Wang K. Using large language models to generate child-friendly education materials on myopia. DIGITAL HEALTH 2025;11
Cong-Lem N. Rethinking evidence-informed policy and practice in the age of generative artificial intelligence. London Review of Education 2025;23(1)
Yıldız M, Topuz A, Polat H, Taşlıbeyaz E, Kurşun E, Yeşilyurt S. A comparative analysis of human and GenAI-generated feedback on EFL students’ argumentative writing performance. Educational Psychology 2026;46(1):52
Sciarra F, Caivano G, Cacioppo A, Messina P, Cumbo E, Di Vita E, Scardina G. Dentistry in the Era of Artificial Intelligence: Medical Behavior and Clinical Responsibility. Prosthesis 2025;7(4):95
Şahin N, Kaleli N, Ural Ç. Evaluation of color matching accuracy using artificial intelligence applications and a spectrophotometer: A photometric analysis. The Journal of Prosthetic Dentistry 2025;134(5):1955.e1
Salvi S, Vu G, Gurupur V, King C. Digital Convergence in Dental Informatics: A Structured Narrative Review of Artificial Intelligence, Internet of Things, Digital Twins, and Large Language Models with Security, Privacy, and Ethical Perspectives. Electronics 2025;14(16):3278
Jamil S, Alshathri N, Alsalamah S, Almansour N, Alsalamah F, Hameed T, Alqanatish J. Leveraging large language models to inform paediatric chronic condition care: a cross-sectional study. BMJ Paediatrics Open 2025;9(1):e003742
Mukhopadhyay A, Mukhopadhyay S, Biswas R. Evaluation of large language models in pediatric dentistry: a Bloom’s taxonomy-based analysis. Folia Medica 2025;67(4)
Kaya Kaçar H, Sarıkaya B. Assessing AI-based chatbots accuracy in caloric estimation: A focus on traditional Turkish foods. The European Research Journal 2025;11(5):922
Lee J, Choi S, Park S, Hwang S, Cho D. Evaluation of Six Large Language Models for Clinical Decision Support: Application in Transfusion Decision-making for RhD Blood-type Patients. Annals of Laboratory Medicine 2025;45(5):520
Bashah A, Salem A, Al-waqeerah A, Ghaleb E, Wahan N, Awad A, Al-tos O, Chen G. Evaluation of deepseek, gemini, ChatGPT-4o, and perplexity in responding to salivary gland cancer. BMC Oral Health 2025;25(1)
Hamada M, Kikuchi S, Akitomo T, Kusaka S, Iwamoto Y, Nomura R. Applications and potential of ChatGPT in dentistry: Scoping review of research perspectives. Journal of Dental Sciences 2026;21(1):1
Li K, Peng Y, Li L, Liu B, Huang Z. Evaluating ChatGPT’s Utility in Biologic Therapy for Systemic Lupus Erythematosus: Comparative Study of ChatGPT and Google Web Search. JMIR Formative Research 2025;9:e76458
Moșoi A, Maican C, Cazan A, Sumedrea S. Do students need to think hard? The interplay of AI and cognitive abilities in solving problems. Education and Information Technologies 2025;30(17):24337
Chatzopoulos G, Koidou V, Tsalikis L, Kaklamanos E. Artificial intelligence for detection and classification of furcation defects using radiographic imaging: A systematic review. Imaging Science in Dentistry 2025;55(4):322
Serindere G, Aktuna Belgin C, Gündüz K. Is ChatGPT a Sufficient and Readable Help Tool for the Most Frequently Asked Questions in General Dentistry? European Annals of Dental Sciences 2025;52(2):97
Benito López P, Adamo D, Caponio V, González‐Serrano J, dos Santos Silva A, de Pedro Herráez M, Albuquerque R, López Jornet M, Brailo V, Farag A, Diniz Freitas M, Noma N, Riordain R, Hernández G, López‐Pintor R. Can Large Artificial Intelligence‐Based Linguistic Models Help to Obtain Information About Burning Mouth Syndrome? Oral Diseases 2026;32(4):1149
Özyemişci N, Bal B, Güngör M, Öztürk E, Canvar A, Nemli S. Evaluation of information provided by artificial intelligence chatbots on extraoral maxillofacial prostheses. The Journal of Prosthetic Dentistry 2025;134(6):2623.e1
Shukla M, Pandey D, Agarwal M, Kaur S, Goyal A. Assessing the Capabilities of Artificial Intelligence (AI) Tools in Community Medicine: A Comparative Study of ChatGPT, Gemini, and Bing in Community-Based Clinico-Social Case Interpretation. Cureus 2025
Raj M, Ravindran V. A Comparative Study on Generative Artificial Intelligence by Evaluating Multiple Large Language Models for Guidance to Parents Toward Pediatric Dentistry: A Multimodal Comparative LLM Study. Journal of International Oral Health 2025;17(5):378
Villena F, Véliz C, García-Huidobro R, Aguayo S. Generative artificial intelligence in dentistry: A narrative review of current approaches and future challenges. Dentistry Review 2025;5(4):100160
Gokkurt Yilmaz B, Ozbey F, Yilmaz B. Evaluation of the performance of different large language models on head and neck anatomy questions in the dentistry specialization exam in Turkey. Surgical and Radiologic Anatomy 2025;47(1)
Balel Y. Comparative study of technical and patient-related question answering quality of DeepSeek-R1 and ChatGPT-4o in the field of oral and maxillofacial surgery. Oral and Maxillofacial Surgery 2025;29(1)
Sezer B, Okutan A. Evaluation of ChatGPT-4’s performance on pediatric dentistry questions: accuracy and completeness analysis. BMC Oral Health 2025;25(1)
Çekiç E, Tavşan O. Evaluating large language models using national endodontic specialty examination questions: are they ready for real-world dentistry? BMC Medical Education 2025;25(1)
Makrygiannakis M, Kaklamanos E. Assessment of AI software's diagnostic accuracy in identifying impacted teeth in panoramic radiographs. European Journal of Orthodontics 2025;47(5)
Demir Cicek B, Cicek O. Evaluating the Response of AI-Based Large Language Models to Common Patient Concerns About Endodontic Root Canal Treatment: A Comparative Performance Analysis. Journal of Clinical Medicine 2025;14(21):7482
Nana V, Marshall M. Generative Artificial Intelligence in Healthcare: A Bibliometric Analysis and Review of Potential Applications and Challenges. AI 2025;6(11):278
Ardila C, Pineda-Vélez E, Vivares-Builes A. Artificial Intelligence in Endodontic Education: A Systematic Review with Frequentist and Bayesian Meta-Analysis of Student-Based Evidence. Dentistry Journal 2025;13(11):489
Biswas R, Mukhopadhyay A, Mukhopadhyay S. Performance of large language models in fluoride-related dental knowledge: a comparative evaluation study of ChatGPT-4, Claude 3.5 Sonnet, Copilot, and Grok 3. Journal of Yeungnam Medical Science 2025;42:53
Kounatidis D, Raghunathan B. Investigation of Large Language Models’ Capabilities in Answering Clinical Questions Related to Periodontal Furcation Therapy. International Journal of Dental Research and Allied Sciences 2025;5(2):1
Larrea Eyzaguirre J, Bustillos Torrez W. La inteligencia artificial generativa en la educación odontológica: Una revisión narrativa de aplicaciones y desafíos / Generative artificial intelligence in dental education: A narrative review of applications and challenges. Revista de la Asociación Odontológica Argentina 2025:1
Nazir M, Carland J, Keep M, Janssen A. "...it saves so much time": A qualitative exploration of the use of Generative Artificial Intelligence by the health workforce. Health Policy and Technology 2026;15(1):101136
Ma Z, Du C, Lao Q, Xie X. Generative Artificial Intelligence: Applications and Future Prospects in Dentistry. International Dental Journal 2026;76(1):108276
Karaca B, Çakmak Y, Erkal D. Clinical Relevance of Large Language Models in Endodontics: Diagnostic Appropriateness Based on 50 Simulated Case Scenarios. Australian Endodontic Journal 2026;52(1):130
Austin G, Pe’er I, Korem T. Distributional bias compromises leave-one-out cross-validation. Science Advances 2025;11(48)
Pinto L, K․N․ S, John T, M․V․ J. Towards the theory for mitigating gradient issues and dead neurons in deep learning through a modified Gaussian activation function. Neural Networks 2026;196:108353
Stoco J, Araújo J, Peres A, Oliveira G, Peres G. Simulating Physician–Patient Interactions Using Generative Artificial Intelligence: An Educational Tool for Medical Students. Cureus Journal of Computer Science 2025
Aydin Varol E, Ozturk Z, Bal C, Karamuftuoglu N. Comparison of two different artificial intelligence chatbots that provide information to patients and parents about primary tooth pulpotomy treatments. BMC Oral Health 2025;26(1)
Kanmaz M, Agani Sabah G. Diagnostic accuracy of large language models in the classification of superior labial frenulum attachments. Odontology 2025
Yamaç B, Akçar R, Şarkan İ, Arslan C. Assessment of the information quality of chatbot technologies on orthodontic miniscrews. Dental Press Journal of Orthodontics 2025;30(5)
Abuabara A, Baratto-Filho F, Gallego G, Luiz L, Meger M, Scariot R, Beisel-Memmert S, de Araujo C, Küchler E, de Araujo B. Comparative evaluation of ChatGPT and Gemini in detecting external apical root resorption on panoramic radiographs of orthodontic patients. Journal of Orofacial Orthopedics / Fortschritte der Kieferorthopädie 2025
Al-Rahahleh A, Rizik M, Al-Ashwal F, Abu-Farha R, Zawiah M. Diagnostic performance of four AI tools in pharmacology MCQs: Accuracy, sensitivity, and specificity. PLOS One 2025;20(12):e0337688
Çakmak B, Sökmen T, Baloş Tuncer B. Artificial intelligence-powered chatbots’ responses to orthodontic questions from the dentistry specialization examination: Accuracy and source evaluation. Journal of Dental Sciences 2025
Keleş Ö, Arslan Z. Performance of artificial intelligence chatbots in the diagnosis and management of simulated dental trauma cases: an evaluation based on IADT guidelines. Clinical Oral Investigations 2025;30(1)
Usta Kutlu İ, Yıldırım A, Sarıkaya I. Is ChatGPT a reliable instrument for prosthetic dentistry? Gümüşhane Üniversitesi Sağlık Bilimleri Dergisi 2025;14(4):1372
Ahmad Satmi A, Reza N, Khamis M, Mohamad N, Abdullah J, Mohd Ali Hanafiah M, Madawana A. Educational aspects of artificial intelligence in oral and maxillofacial radiology: insights from a scoping review. BMC Medical Education 2025;26(1)
Pauwels R. Artificial Intelligence in Dental Diagnostics and Treatment Planning: General Principles, Current State, and Future Perspectives. Aktuel Nordisk Odontologi 2026;51(1):7
Shen S, Wang Z, Paul K, Li M, Huang X, Koizumi N. Evaluation of Large Language Models for Peer Review in Transplantation Research: Algorithm Validation Study. JMIR AI 2026;5:e84322
Phillips B. Towards evidence-based medicine for paediatricians. Archives of Disease in Childhood 2026;111(2):193.2
Camino-Moreno S. A RAG-Driven Conversational System for Citizen Information Requests at Ecuador's Tax Authority. Journal of Emerging & Sustainable Smart Engineering 2026;2(1):19
Makrygiannakis M, Giannakopoulos K, Kaklamanos E. Evidence-based potential of generative artificial intelligence large language models in orthodontics: a comparative study of ChatGPT, Google Bard, and Microsoft Bing. European Journal of Orthodontics 2025;48(1)
Sabah G, Kanmaz M. Comparative evaluation of ChatGPT, Gemini, and Grok in clinical decision-making and general knowledge assessment for impacted maxillary canines. Korean Journal of Orthodontics 2026;56(1):45
Kaplan T, Akyol G. The role of large language models in dental diagnosis, decision-making, and communication: A systematic review. Japanese Dental Science Review 2026;62:57
Kaplan T. A comparative evaluation of two large language models in pediatric dentistry. BMC Oral Health 2026;26(1)
Lledo B, Carbone P, Ortiz J, Morales R, Rodríguez-Arnedo A, Herrero L, Alvarez E, Ten J, Luque L, Castillo J, Suñol J, Racca A, Bernabeu A. Assessing the performance of generative AI chatbots in preimplantation genetic testing: a comparative study of expert evaluations. Reproductive BioMedicine Online 2026;52(3):105275
Alkabazi M, Tassoker M. Comparative performance of AI models on case-based oral medicine questions across Bloom’s taxonomy levels and subtopics. Odontology 2026
del Barrio M, Laos K, Lara M, García C, Menendez H. Size doesn’t matter: Assessing the trustworthiness of large language models in medical contexts: A focus on epidural information retrieval. Artificial Intelligence in Medicine 2026;175:103379
Pham T. A deep learning vision–language model for diagnosing pediatric dental diseases. Intelligence-Based Medicine 2026;13:100364
Özdal Zincir Ö, Çifçi Özkan E, Hatipoğlu Ş. Assessing the Quality of AI-Generated Responses to Botulinum Toxin Applications in Bruxism Therapy. Journal of Craniofacial Surgery 2026;37(3/4):658
Liu P, Watts D, Li X, Wang X, Li A. AI‐Augmented Leadership: How, Why, and When Leaders' Collaboration With AI Enhances Team Performance. Journal of Organizational Behavior 2026
Şimşek E, Büker M. Effect of language difference and time on the accuracy of artificial intelligence chatbots responses to questions about vital pulp therapy. BMC Oral Health 2026;26(1)
Bhamra S, Jathanna R, Manjanatha M. Comparative Analysis of Generative AI Language Models in Orthodontics: Evidence‐Based Insights Into Perplexity, iASK, and ChatGPT 4o Mini. The Scientific World Journal 2026;2026(1)
Kırıştıoğlu M, Yıldız M, İşleker S, Söğütlü Sarı E, Özmen A, Baykara M. Evaluating AI Chatbots for Pediatric Contact Lenses: A Study on Accuracy, Readability, and Reliability. Uludağ Üniversitesi Tıp Fakültesi Dergisi 2026;52:1780297
Şahap M, Gökbulut Özaslan N, Veizi E. An artificial intelligence chatbot as a substitute for physician-patient counselling on perioperative anaesthesia in knee-related surgeries. Kastamonu Medical Journal 2026;6(1):34
Du K, Li A, Zuo Q, Zhang C, Guo R, Chen P, Du W, Zuo Y, Li S. Comparing large language models and human experts in interpreting MRI reports for personalized patient education. International Journal of Medical Informatics 2026;214:106409
Pham T, Al-Hebshi S. Classification of pediatric dental diseases from panoramic radiographs using natural language transformer and deep learning models. Frontiers in Artificial Intelligence 2026;9
Denham D, Wang C, Maric E, Hinton L, Heniford B. Large Language Models in Surgery: Promise, Pitfalls, and Practical Use. Journal of Abdominal Wall Surgery 2026;5
Zhang R, Liu M, Liong O, Liu L, Guan S, Chen K, Li Y, Han X, Mei L. AI in the Chair: A Multi‐Centre Study of Doctor AI Answering Orthodontic Patient Questions. Journal of the Royal Society of New Zealand 2026;56(2) View
Karakaya G, Sirman A, Öreroğlu İ, Arıcan B. Are AI chatbots ready for endodontics? Evaluating their validity, consistency, and readability in patient-oriented responses. Odontology 2026 View
Chen S, Zhong R, Zhang W, Li Z, Su Y, Gao L, Hu K. What will be the next step of LLMs in TCM? A narrative review. Science of Traditional Chinese Medicine 2026;4(2):111 View
Yilanci H, Anjary A, Adleh M. Assessing the Credibility of ChatGPT on Temporomandibular Disorders. European Journal of Dental Education 2026 View
Kim M, Abrams D, Braun I, Case A, Davis M, Tanco K, Wallace M, Manuel C, Bruera E, Hui D. Chatbot Responses to Frequently Asked Questions About Cannabis and Its Use for Cancer Symptoms. Journal of Pain and Symptom Management 2026;72(1):1 View
Kuang E, Shen L, Jahangirzadeh Soure E, Fan M, Shinohara K. Standardizing the Evaluation of Usability Test Results: Criteria Development and Human-AI Collaborative Performance. International Journal of Human–Computer Interaction 2026:1 View
Metzger T, Hossain Z, Park K, Vu S, Dixon S, Taylor T. Assessment of Stage Two Hypertension Treatment Plans Written by Generative AI. Journal of Clinical Medicine 2026;15(8):3103 View
Wang H, Du W, Yang B, Liu M, Xu C, Zhang W, Xu C, He L, Zhang W, Yu Y, Lin J, Peng X. Evaluating open-source LLMs for dental EMR generation. BMC Oral Health 2026;26(1) View
Arslanparcasi H, Arslanparcasi M. Evaluation of the Accuracy of Responses Provided by AI-Based Conversational Systems to Patient Questions Regarding Endodontic Pain and Antibiotic Use. Cureus 2026 View
Sismanoglu S, Isik V, Kayahan M. Performance of large language models in endodontics: accuracy, consistency, and benchmarking with consensus guidelines. BMC Oral Health 2026;26(1) View
Shefov V, Orekhova L, Loboda E, Shefova A. Limitations of linear statistical methods for detecting associations between dental status and systemic patient’s health. Parodontologiya 2026;31(1):61 View
Herawati H, Kusnoto J, Gunardi I, Wirasto A, Astoeti T. Accuracy of Orthodontic Malocclusion Detection Using Multiple AI Models: A Comparative Study. Healthcare Informatics Research 2026;32(2):166 View
Dashti M, Khosraviani F, Meyari A, Amirzade-Iranaq M, Chaurasia A, Hefzi D, Ghadimi N, Tichy A, Khurshid Z, Schwendicke F. Accuracy of Large Language Models in Answering Dental Examination Questions: A Systematic Review and Meta-Analysis. International Dental Journal 2026;76(4):109609 View
Al‐Haj Ali S. Benchmarking Multi‐Modal Artificial Intelligence Models Against Student Performance: The Role of Question Characteristics in Objective Structured Practical Dental Examinations. European Journal of Dental Education 2026 View
Ramos R, Marques G, Rocha B, Simões F, Pithon M, Souza R, Freitas L. Confiabilidade do ChatGPT na avaliação de anomalias dentárias. Research, Society and Development 2026;15(5):e10015551135 View
Yildiz D, Bayram H, Bayram E. Educational potential of large language models in endodontics: A comparative analysis of responses to patient and exam-based questions. Endodontology 2026 View
Tepe H. Accuracy and Readability of Large Language Models’ Responses to Frequently Asked Questions on Social Assistance. Journal of the Society for Social Work and Research 2026;17(2):217 View
Gülhan Güner S, Tan Z, Gülpınar S. Comparative performance of artificial intelligence models in intensive care nursing questions: an evaluation of ChatGPT, DeepSeek, and Google Gemini. BMC Nursing 2026;25(1) View
Özüdoğru S, Tokgöz Kaplan T. Evaluation of the accuracy, quality, and readability of large language models on local anesthesia and general anesthesia-sedation in dentistry. BMC Anesthesiology 2026;26(1) View
Sehar U, Xiong J, Xia Z. Contemporary Applications of Artificial Intelligence in Dentistry: A Review. Machine Intelligence Research 2026 View
Zhang M, Yang Y, Lv Y, Yu Y, Hu S, Yang Z, Wang Z, Wu M. Generative Artificial Intelligence-driven orthodontic education practices. BMC Medical Education 2026;26(1) View
Kollayan B, Cebeci T. Evaluating the clinical safety of large language models in oral cancer-related patient communication: a repeated-prompt observational study. BMC Oral Health 2026;26(1) View
Kamalakidis S, Lee D, Vazouras K, Giannakopoulos K, Kaklamanos E. Evidence-based evaluation of large language models in advanced clinical decision-making for removable prosthodontics. The Journal of Prosthetic Dentistry 2026 View
Bayırlı A, Uytun M, Erdem R, Genç Y. Comparative Evaluation of ChatGPT, Gemini and Grok with and without Deep Research Mode in Answering Bone Augmentation Queries. Mugla Journal of Science and Technology 2026;12(1):85 View
Zhang R, Mei L, Liu M, Liong O, Liu L, Jiang L, Chen Q, Polonowita A, Guan G. Proficiency of Artificial Intelligence in Addressing Patient Frequently Asked Questions on Oral Lichen Planus. Oral Surgery, Oral Medicine, Oral Pathology and Oral Radiology 2026 View
Şık H. Comparison of the Readability of Artificial Intelligence Applications in Periodontal Treatment. Acta Medica Young Doctors 2026;2(4) View
Hunt D, Di Miceli M. Evaluating the performance of 3 large language models in higher education essay-like assessments in 2024 and 2026. American Journal of STEM Education 2026;25:279 View
Umer F, Shaikh M, Ur Rahman A. Evaluating the safety of large language models in healthcare and dentistry: adversarial testing approaches. BDJ Open 2026;12(1) View

Books/Policy Documents

Niazai H, Monib W. Teaching in the Age of Medical Technology. View
Pham T, Holmes S, Chatzopoulou D, Coulthard P. Artificial Intelligence in Facial Trauma, Oral Diseases, and Systemic Health. View
Yin L. Gene Therapy - From Molecular Tools to Clinical Applications and Future Horizons [Working Title]. View

Conference Proceedings

Huang J, Wei Y, Zhang L, Chen W. 2024 International Symposium on Educational Technology (ISET). Evaluating generative artificial intelligence in answering course-related open questions: A pilot study View
Kuang E. Adjunct Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology. Evaluating Usability Challenges in VR Games for Older Adults: A Comparison With and Without AI Assistance View
Dawadi R, Tay J, Martin-Morales A, Vu T, Ngoc Hoang P, Yamamoto M, Watanabe N, Kuriya Y, Araki M. Proceedings of the 2025 9th International Conference on Medical and Health Informatics. Comparing Large Language Models for Food-Microbiome Relation Extraction from Research Papers View
Desai C. 2025 8th International Conference on Emerging Technologies in Computer Engineering: Advances in Computing, Healthcare and Smart Systems (ICETCE). Generative Models in Dental Applications: A Survey of Emerging Trends View
Kuang E, Jahangirzadeh Soure E, Shen L, Goyal N, Fan M, Shinohara K. Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems. “It Became My Buddy, But I’m Not Afraid to Disagree”: A Multi-Session Study of UX Evaluators Collaborating with Conversational AI Assistants View
Liu Q, Ren H. Proceedings of the 2026 International Conference on Big Data and Informatization Education. Research on the Pathways of Integrating Generative Artificial Intelligence into Education: A Visualization Analysis Based on CiteSpace View

Citation

Please cite as:

Giannakopoulos K, Kavadella A, Aaqel Salim A, Stamatopoulos V, Kaklamanos EG
Evaluation of the Performance of Generative AI Large Language Models ChatGPT, Google Bard, and Microsoft Bing Chat in Supporting Evidence-Based Dentistry: Comparative Mixed Methods Study
J Med Internet Res 2023;25:e51580
doi: 10.2196/51580 PMID: 38009003 PMCID: 10784979

This paper is in the following e-collection/theme issue:

Generative Language Models Including ChatGPT (1422) e-Learning and Digital Medical Education (1535) Dental Sciences (231) Chatbots and Conversational Agents (1136)
