Published in Vol 25 (2023)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/51580.
Evaluation of the Performance of Generative AI Large Language Models ChatGPT, Google Bard, and Microsoft Bing Chat in Supporting Evidence-Based Dentistry: Comparative Mixed Methods Study


Journals

  1. Balasanjeevi G, Surapaneni K. Comparison of ChatGPT version 3.5 & 4 for utility in respiratory medicine education using clinical case scenarios. Respiratory Medicine and Research 2024;85:101091
  2. Goyanes M, Lopezosa C. ChatGPT en Ciencias Sociales: revisión de la literatura sobre el uso de inteligencia artificial (IA) de OpenAI en investigación cualitativa y cuantitativa. Anuario ThinkEPI 2024;18
  3. Kamihara T, Tabuchi M, Omura T, Suzuki Y, Aritake T, Hirashiki A, Kokubo M, Shimizu A. Evolution of a Large Language Model for Preoperative Assessment Based on the Japanese Circulation Society 2022 Guideline on Perioperative Cardiovascular Assessment and Management for Non-Cardiac Surgery. Circulation Reports 2024;6(4):142
  4. Wang L, Ma Y, Bi W, Lv H, Li Y. An Entity Extraction Pipeline for Medical Text Records Using Large Language Models: Analytical Study. Journal of Medical Internet Research 2024;26:e54580
  5. Warrier A, Singh R, Haleem A, Zaki H, Eloy J. The Comparative Diagnostic Capability of Large Language Models in Otolaryngology. The Laryngoscope 2024;134(9):3997
  6. Rahimli Ocakoglu S, Coskun B. The Emerging Role of AI in Patient Education: A Comparative Analysis of the Accuracy of Large Language Models for Pelvic Organ Prolapse. Medical Principles and Practice 2024;33(4):330
  7. Andreadis K, Newman D, Twan C, Shunk A, Mann D, Stevens E. Mixed methods assessment of the influence of demographics on medical advice of ChatGPT. Journal of the American Medical Informatics Association 2024;31(9):2002
  8. Moulaei K, Yadegari A, Baharestani M, Farzanbakhsh S, Sabet B, Reza Afrash M. Generative artificial intelligence in healthcare: A scoping review on benefits, challenges and applications. International Journal of Medical Informatics 2024;188:105474
  9. Liu L, Qu S, Zhao H, Kong L, Xie Z, Jiang Z, Zou P. Global trends and hotspots of ChatGPT in medical research: a bibliometric and visualized study. Frontiers in Medicine 2024;11
  10. Özbay Y. Evaluation of ChatGPT as a Multiple-Choice Question Generator in Dental Traumatology. Medical Records 2024;6(2):235
  11. Rossettini G, Rodeghiero L, Corradi F, Cook C, Pillastrini P, Turolla A, Castellini G, Chiappinotto S, Gianola S, Palese A. Comparative accuracy of ChatGPT-4, Microsoft Copilot and Google Gemini in the Italian entrance test for healthcare sciences degrees: a cross-sectional study. BMC Medical Education 2024;24(1)
  12. Naz R, Akacı O, Erdoğan H, Açıkgöz A. Can large language models provide accurate and quality information to parents regarding chronic kidney diseases? Journal of Evaluation in Clinical Practice 2024;30(8):1556
  13. Danesh A, Danesh A, Danesh F. Innovating dental diagnostics: ChatGPT's accuracy on diagnostic challenges. Oral Diseases 2025;31(3):911
  14. Khan M, O’Sullivan E. A comparison of the diagnostic ability of large language models in challenging clinical cases. Frontiers in Artificial Intelligence 2024;7
  15. Zhui L, Yhap N, Liping L, Zhengjie W, Zhonghao X, Xiaoshu Y, Hong C, Xuexiu L, Wei R. Impact of Large Language Models on Medical Education and Teaching Adaptations. JMIR Medical Informatics 2024;12:e55933
  16. Zhui L, Fenghe L, Xuehu W, Qining F, Wei R. Ethical Considerations and Fundamental Principles of Large Language Models in Medical Education: Viewpoint. Journal of Medical Internet Research 2024;26:e60083
  17. Shah-Mohammadi F, Finkelstein J. Accuracy Evaluation of GPT-Assisted Differential Diagnosis in Emergency Department. Diagnostics 2024;14(16):1779
  18. Akyon S, Akyon F, Camyar A, Hızlı F, Sari T, Hızlı Ş. Evaluating the Capabilities of Generative AI Tools in Understanding Medical Papers: Qualitative Study. JMIR Medical Informatics 2024;12:e59258
  19. Is E, Menekseoglu A. Comparative performance of artificial intelligence models in rheumatology board-level questions: evaluating Google Gemini and ChatGPT-4o. Clinical Rheumatology 2024;43(11):3507
  20. Tam T, Sivarajkumar S, Kapoor S, Stolyar A, Polanska K, McCarthy K, Osterhoudt H, Wu X, Visweswaran S, Fu S, Mathur P, Cacciamani G, Sun C, Peng Y, Wang Y. A framework for human evaluation of large language models in healthcare derived from literature review. npj Digital Medicine 2024;7(1)
  21. Wu Z, Gan W, Xue Z, Ni Z, Zheng X, Zhang Y. Performance of ChatGPT on Nursing Licensure Examinations in the United States and China: Cross-Sectional Study. JMIR Medical Education 2024;10:e52746
  22. Peng L, Liang R, Zhao A, Sun R, Yi F, Zhong J, Li R, Zhu S, Zhang S, Wu S. Amplifying Chinese physicians’ emphasis on patients’ psychological states beyond urologic diagnoses with ChatGPT – a multicenter cross-sectional study. International Journal of Surgery 2024;110(10):6501
  23. Ezanno A, Fougerousse A, Pruvost-Balland C, Maccari F, Fite C. AI in Hidradenitis Suppurativa: Expert Evaluation of Patient-Facing Information. Clinical, Cosmetic and Investigational Dermatology 2024;Volume 17:2459
  24. Tokgöz Kaplan T, Cankar M. Evidence‐Based Potential of Generative Artificial Intelligence Large Language Models on Dental Avulsion: ChatGPT Versus Gemini. Dental Traumatology 2025;41(2):178
  25. Wang L, Wan Z, Ni C, Song Q, Li Y, Clayton E, Malin B, Yin Z. Applications and Concerns of ChatGPT and Other Conversational Large Language Models in Health Care: Systematic Review. Journal of Medical Internet Research 2024;26:e22769
  26. Zheng J, Ding X, Pu J, Chung S, Ai Q, Hung K, Shan Z. Unlocking the Potentials of Large Language Models in Orthodontics: A Scoping Review. Bioengineering 2024;11(11):1145
  27. Chatzopoulos G, Koidou V, Tsalikis L, Kaklamanos E. Large language models in periodontology: Assessing their performance in clinically relevant questions. The Journal of Prosthetic Dentistry 2024
  28. Diniz-Freitas M, López-Pintor R, Santos-Silva A, Warnakulasuriya S, Diz-Dios P. Assessing the accuracy and readability of ChatGPT-4 and Gemini in answering oral cancer queries—an exploratory study. Exploration of Digital Health Technologies 2024:334
  29. Bulut S. Bibliometric Analysis of Studies on Chat GPT with Vosviewer. Black Sea Journal of Engineering and Science 2024;7(6):1194
  30. Puleio F, Lo Giudice G, Bellocchio A, Boschetti C, Lo Giudice R. Clinical, Research, and Educational Applications of ChatGPT in Dentistry: A Narrative Review. Applied Sciences 2024;14(23):10802
  31. Mavrych V, Ganguly P, Bolgova O. Using large language models (ChatGPT, Copilot, PaLM, Bard, and Gemini) in Gross Anatomy course: Comparative analysis. Clinical Anatomy 2025;38(2):200
  32. Gosak L, Štiglic G, Pruinelli L, Vrbnjak D. PICOT questions and search strategies formulation: A novel approach using artificial intelligence automation. Journal of Nursing Scholarship 2025;57(1):5
  33. Ho C, Tian T, Ayers A, Aaron R, Phillips V, Wolf R, Mathioudakis N, Dai T, Klonoff D. Qualitative metrics from the biomedical literature for evaluating large language models in clinical decision-making: a narrative review. BMC Medical Informatics and Decision Making 2024;24(1)
  34. Guven Y, Ozdemir O, Kavan M. Performance of Artificial Intelligence Chatbots in Responding to Patient Queries Related to Traumatic Dental Injuries: A Comparative Study. Dental Traumatology 2025;41(3):338
  35. Demir S. Evaluation of the reliability and readability of answers given by chatbots to frequently asked questions about endophthalmitis: A cross-sectional study on chatbots. Health Informatics Journal 2024;30(4)
  36. Khosravi M, Mojtabaeian S, Demiray E, Sayar B. A Systematic Review of the Outcomes of Utilization of Artificial Intelligence Within the Healthcare Systems of the Middle East: A Thematic Analysis of Findings. Health Science Reports 2024;7(12)
  37. Fanelli F, Saleh M, Santamaria P, Zhurakivska K, Nibali L, Troiano G. Development and Comparative Evaluation of a Reinstructed GPT‐4o Model Specialized in Periodontology. Journal of Clinical Periodontology 2025;52(5):707
  38. Demir S. Investigating the role of large language models on questions about refractive surgery. International Journal of Medical Informatics 2025;195:105787
  39. Farhadi Nia M, Ahmadi M, Irankhah E. Transforming dental diagnostics with artificial intelligence: advanced integration of ChatGPT and large language models for patient care. Frontiers in Dental Medicine 2025;5
  40. Sridhar G, Gumpeny L. Prospects and perils of ChatGPT in diabetes. World Journal of Diabetes 2025;16(3)
  41. Nitsch K, Ivatury S. Do people prefer AI-generated patient educational materials over traditional ones? Patient Education and Counseling 2025;134:108672
  42. Taşkıran Tepe H, Aslantürk H. Comparative Efficacy of AI LLMs in Clinical Social Work: ChatGPT-4, Gemini, Copilot. Research on Social Work Practice 2025
  43. Mustuloğlu Ş, Deniz B. Evaluation of Chatbots in the Emergency Management of Avulsion Injuries. Dental Traumatology 2025
  44. Öztürk Z, Bal C, Çelikkaya B. Evaluation of Information Provided by ChatGPT Versions on Traumatic Dental Injuries for Dental Students and Professionals. Dental Traumatology 2025
  45. Wang X, Ye H, Zhang S, Yang M, Wang X. Evaluation of the Performance of Three Large Language Models in Clinical Decision Support: A Comparative Study Based on Actual Cases. Journal of Medical Systems 2025;49(1)
  46. Jin F, Peng X, Sun L, Song Z, Zhou K, Lin C. Knowledge (Co‐)Construction Among Artificial Intelligence, Novice Teachers, and Experienced Teachers in an Online Professional Learning Community. Journal of Computer Assisted Learning 2025;41(2)
  47. Symeou L, Louca L, Kavadella A, Mackay J, Danidou Y, Raffay V. Development of Evidence‐Based Guidelines for the Integration of Generative AI in University Education Through a Multidisciplinary, Consensus‐Based Approach. European Journal of Dental Education 2025;29(2):285
  48. Dermata A, Arhakis A, Makrygiannakis M, Giannakopoulos K, Kaklamanos E. Evaluating the evidence-based potential of six large language models in paediatric dentistry: a comparative study on generative artificial intelligence. European Archives of Paediatric Dentistry 2025;26(3):527
  49. Koidou V, Chatzopoulos G, Tsalikis L, Kaklamanos E. Large Language Models in peri-implant disease: How well do they perform? The Journal of Prosthetic Dentistry 2025
  50. Şişman A, Acar A. Artificial intelligence-based chatbot assistance in clinical decision-making for medically complex patients in oral surgery: a comparative study. BMC Oral Health 2025;25(1)
  51. Ozdemir Z, Yapici E. Evaluating the Accuracy, Reliability, Consistency, and Readability of Different Large Language Models in Restorative Dentistry. Journal of Esthetic and Restorative Dentistry 2025;37(7):1740
  52. Terzi M, Yavuz M, Bicer T, Buyuk S. Evaluation of artificial intelligence robot’s knowledge and reliability on dental implants and peri-implant phenotype. Scientific Reports 2025;15(1)
  53. Kim‐Berman H, Tarlie J, Herremans J. Student Performance on the American Board of Orthodontics Written Examination Following Flipped Classroom and Generative AI Approach. Journal of Dental Education 2025
  54. Durmazpinar P, Ekmekci E. Comparing diagnostic skills in endodontic cases: dental students versus ChatGPT-4o. BMC Oral Health 2025;25(1)
  55. Mine Y, Okazaki S, Taji T, Kawaguchi H, Kakimoto N, Murayama T. Benchmarking multimodal large language models on the dental licensing examination: Challenges with clinical image interpretation. Journal of Dental Sciences 2025
  56. Gökcek Taraç M. Evaluation of Artificial Intelligence Chatbots in the Management of Primary Tooth Traumas: A Comparative Analysis. Journal of International Dental Sciences 2025;11(1):22
  57. Mao T, Zhao X, Jiang K, Xie Q, Yang M, Wang R, Gao F. A comparison of the responses between ChatGPT and doctors in the field of cholelithiasis based on clinical practice guidelines: a cross-sectional study. DIGITAL HEALTH 2025;11
  58. Lafourcade C, Kérourédan O, Ballester B, Richert R. Accuracy, consistency, and contextual understanding of large language models in restorative dentistry and endodontics. Journal of Dentistry 2025;157:105764
  59. Li S, Jiang J, Yang X. Preliminary assessment of large language models’ performance in answering questions on developmental dysplasia of the hip. Journal of Children's Orthopaedics 2025
  60. Du Y, Ji C, Xu J, Wei M, Ren Y, Xia S, Zhou J. Performance of ChatGPT and Microsoft Copilot in Bing in answering obstetric ultrasound questions and analyzing obstetric ultrasound reports. Scientific Reports 2025;15(1)
  61. AlFarabi Ali S, AlDehlawi H, Jazzar A, Ashi H, Esam Abuzinadah N, AlOtaibi M, Algarni A, Alqahtani H, Akeel S, Almazrooa S. The Diagnostic Performance of Large Language Models and Oral Medicine Consultants for Identifying Oral Lesions in Text-Based Clinical Scenarios: Prospective Comparative Study. JMIR AI 2025;4:e70566
  62. Özbay Y, Erdoğan D, Dinçer G. Evaluation of the performance of large language models in clinical decision-making in endodontics. BMC Oral Health 2025;25(1)
  63. Sarı M, Tufenkci P. Evaluation of the Competency of Large Language Models GPT-4o and Claude 3.5 Sonnet in Endodontic Emergencies. European Annals of Dental Sciences 2025;52(1):10
  64. Yang X, Xiao Y, Liu D, Deng H, Huang J, Zhou Y, Dai C, Wu J, Liu D, Liang M, Xu C. Cross language transformation of free text into structured lobectomy surgical records from a multi center study. Scientific Reports 2025;15(1)
  65. Dawadi R, Vu T, Tay J, Tran Ngoc Hoang P, Oya A, Yamamoto M, Watanabe N, Kuriya Y, Araki M. Large language models and questions from older adults: a human and machine-based evaluation study. Discover Artificial Intelligence 2025;5(1)
  66. Llorente de Pedro M, Suárez A, Algar J, Díaz-Flores García V, Andreu-Vázquez C, Freire Y. Assessing ChatGPT’s Reliability in Endodontics: Implications for AI-Enhanced Clinical Learning. Applied Sciences 2025;15(10):5231
  67. Khareedi R, Fernandez D. The Role of Chatbots in Enquiry‐Based Learning for Oral Health Students—An Exploratory Study. European Journal of Dental Education 2025
  68. Alhazmi N, Alshehri A, BaHammam F, Philip M, Nadeem M, Khanagar S. Can Large Language Models Serve as Reliable Tools for Information in Dentistry? A Systematic Review. International Dental Journal 2025;75(4):100835
  69. Chen Y, Lee S, Sheu H, Lin S, Hu C, Fu S, Yang C, Lin Y. Enhancing responses from large language models with role-playing prompts: a comparative study on answering frequently asked questions about total knee arthroplasty. BMC Medical Informatics and Decision Making 2025;25(1)
  70. Zhang Y, Liu Z, Bai S, Xu T, Lau R. Reference decisions enhance LLM performance, amplified by source disclosure. DIGITAL HEALTH 2025;11
  71. Freire Y, Santamaría Laorden A, Orejas Pérez J, Ortiz Collado I, Gómez Sánchez M, Thuissard Vasallo I, Díaz-Flores García V, Suárez A, Kolahi J. Evaluating the influence of prompt formulation on the reliability and repeatability of ChatGPT in implant-supported prostheses. PLOS One 2025;20(5):e0323086
  72. Liu H, Peng J, Li L, Deng A, Huang X, Yin G, Luo H. Large Language Models as a Consulting Hotline for Patients With Breast Cancer and Specialists in China: Cross-Sectional Questionnaire Study. JMIR Medical Informatics 2025;13:e66429
  73. Abouzeid E, Wassef R, Jawwad A, Harris P. Chatbots’ Role in Generating Single Best Answer Questions for Undergraduate Medical Student Assessment: Comparative Analysis. JMIR Medical Education 2025;11:e69521
  74. Chatzopoulos G, Koidou V, Tsalikis L, Kaklamanos E. Clinical Applications of Artificial Intelligence in Periodontology: A Scoping Review. Medicina 2025;61(6):1066

Books/Policy Documents

  1. Niazai H, Monib W. Teaching in the Age of Medical Technology.

Conference Proceedings

  1. Huang J, Wei Y, Zhang L, Chen W. 2024 International Symposium on Educational Technology (ISET). Evaluating generative artificial intelligence in answering course-related open questions: A pilot study