Published in Vol 26 (2024)
Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/60807.
Journals
- Liu C, Ho C, Wu T. Custom GPTs Enhancing Performance and Evidence Compared with GPT-3.5, GPT-4, and GPT-4o? A Study on the Emergency Medicine Specialist Examination. Healthcare 2024;12(17):1726
- Semeraro F. AI-Powered clinical assessments: GPT-4o’s role in standardizing CPR skill evaluations. Resuscitation 2024:110411
- Liu M, Okuhara T, Dai Z, Huang W, Gu L, Okada H, Furukawa E, Kiuchi T. Evaluating the Effectiveness of advanced large language models in medical Knowledge: A Comparative study using Japanese national medical examination. International Journal of Medical Informatics 2025;193:105673
- Taniguchi M, Lindsey J. Performance of chatbots in queries concerning fundamental concepts in photochemistry. Photochemistry and Photobiology 2024
- Yau J, Saadat S, Hsu E, Murphy L, Roh J, Suchard J, Tapia A, Wiechmann W, Langdorf M. Accuracy of Prospective Assessments of 4 Large Language Model Chatbot Responses to Patient Questions About Emergency Care: Experimental Comparative Study. Journal of Medical Internet Research 2024;26:e60291
- Liu M, Okuhara T, Huang W, Ogihara A, Nagao H, Okada H, Kiuchi T. Large Language Models in Dental Licensing Examinations: Systematic Review and Meta-Analysis. International Dental Journal 2025;75(1):213
- Chen Y, Huang X, Yang F, Lin H, Lin H, Zheng Z, Liang Q, Zhang J, Li X. Performance of ChatGPT and Bard on the medical licensing examinations varies across different cultures: a comparison study. BMC Medical Education 2024;24(1)
- Bongco E, Cua S, Hernandez M, Pascual J, Khu K. The performance of ChatGPT versus neurosurgery residents in neurosurgical board examination-like questions: a systematic review and meta-analysis. Neurosurgical Review 2024;47(1)
- Zong H, Wu R, Cha J, Wang J, Wu E, Li J, Zhou Y, Zhang C, Feng W, Shen B. Large Language Models in Worldwide Medical Exams: Platform Development and Comprehensive Analysis. Journal of Medical Internet Research 2024;26:e66114
- Ferraz-Costa G, Griné M, Oliveira-Santos M, Teixeira R. Performance of ChatGPT in the Portuguese National Residency Access Examination. Acta Médica Portuguesa 2024;38(3):170
- Sabaner M, Anguita R, Antaki F, Balas M, Boberg-Ans L, Ferro Desideri L, Grauslund J, Hansen M, Klefter O, Potapenko I, Rasmussen M, Subhi Y. Opportunities and Challenges of Chatbots in Ophthalmology: A Narrative Review. Journal of Personalized Medicine 2024;14(12):1165
- Camlet A, Kusiak A, Świetlik D. Application of Conversational AI Models in Decision Making for Clinical Periodontology: Analysis and Predictive Modeling. AI 2025;6(1):3
- Yang H, Hu M, Most A, Hawkins W, Murray B, Smith S, Li S, Sikora A. Evaluating accuracy and reproducibility of large language model performance on critical care assessments in pharmacy education. Frontiers in Artificial Intelligence 2025;7
- Qiu Y, Liu C. Capable exam-taker and question-generator: the dual role of generative AI in medical education assessment. Global Medical Education 2025
- Erdat E, Kavak E. Benchmarking LLM chatbots’ oncological knowledge with the Turkish Society of Medical Oncology’s annual board examination questions. BMC Cancer 2025;25(1)
- Chu H, Pasion E, Yeh S, Chu G. Assessing the Ethical and Professional Capabilities of AI: A Study of ChatGPT and Google Gemini versus PREview (Situational Judgement Test) for Medical Student Applicant. Journal of Clinical Question 2024;1(3):82
- Meyer A, Wetsch W, Steinbicker A, Streichert T. Through ChatGPT’s Eyes: The Large Language Model’s Stereotypes and what They Reveal About Healthcare. Journal of Medical Systems 2025;49(1)
- Waaler P, Hussain M, Molchanov I, Bongo L, Elvevåg B. Prompt Engineering an Informational Chatbot for Education on Mental Health Using a Multiagent Approach for Enhanced Compliance With Prompt Instructions: Algorithm Development and Validation. JMIR AI 2025;4:e69820
- Zhu J, Jiang Y, Chen D, Lu Y, Huang Y, Lin Y, Fan P. High identification and positive‐negative discrimination but limited detailed grading accuracy of ChatGPT‐4o in knee osteoarthritis radiographs. Knee Surgery, Sports Traumatology, Arthroscopy 2025;33(5):1911
- Tseng L, Lu Y, Tseng L, Chen Y, Chen H. Performance of ChatGPT-4 on Taiwanese Traditional Chinese Medicine Licensing Examinations: Cross-Sectional Study. JMIR Medical Education 2025;11:e58897
- Wang J, Shue K, Liu L, Hu G. Preliminary evaluation of ChatGPT model iterations in emergency department diagnostics. Scientific Reports 2025;15(1)
- Kopka M, von Kalckreuth N, Feufel M. Accuracy of online symptom assessment applications, large language models, and laypeople for self-triage decisions. npj Digital Medicine 2025;8(1)
- Kim K. Technology-enhanced learning in medical education in the age of artificial intelligence. Forum for Education Studies 2025;3(2):2730
- Rodrigues Alessi M, Gomes H, Oliveira G, Lopes de Castro M, Grenteski F, Miyashiro L, do Valle C, Tozzini Tavares da Silva L, Okamoto C. Comparative Performance of Medical Students, ChatGPT-3.5 and ChatGPT-4.0 in Answering Questions From a Brazilian National Medical Exam: Cross-Sectional Questionnaire Study. JMIR AI 2025;4:e66552
- Al Barajraji M, Barrit S, Ben-Hamouda N, Harel E, Torcida N, Pizzarotti B, Massager N, Lechien J. AI-Driven Information for Relatives of Patients with Malignant Middle Cerebral Artery Infarction: A Preliminary Validation Study Using GPT-4o. Brain Sciences 2025;15(4):391
- Bolgova O, Shypilova I, Mavrych V. Large Language Models in Biochemistry Education: Comparative Evaluation of Performance. JMIR Medical Education 2025;11:e67244
- Krumsvik R. GPT-4’s capabilities for formative and summative assessments in Norwegian medicine exams—an intrinsic case study in the early phase of intervention. Frontiers in Medicine 2025;12
- Luo D, Liu M, Yu R, Liu Y, Jiang W, Fan Q, Kuang N, Gao Q, Yin T, Zheng Z. Evaluating the performance of GPT-3.5, GPT-4, and GPT-4o in the Chinese National Medical Licensing Examination. Scientific Reports 2025;15(1)
- Yang X, Xiao Y, Liu D, Deng H, Huang J, Zhou Y, Dai C, Wu J, Liu D, Liang M, Xu C. Cross language transformation of free text into structured lobectomy surgical records from a multi center study. Scientific Reports 2025;15(1)
- Hanss K, Sarma K, Glowinski A, Krystal A, Saunders R, Halls A, Gorrell S, Reilly E. Assessing the Accuracy and Reliability of Large Language Models in Psychiatry Using Standardized Multiple-Choice Questions: Cross-Sectional Study. Journal of Medical Internet Research 2025;27:e69910
- Fushimi A, Terada M, Tahara R, Nakazawa Y, Iwase M, Shibayama T, Kotti S, Yamashita N, Iesato A. Assessing the quality of Japanese online breast cancer treatment information using large language models: a comparison of ChatGPT, Claude, and expert evaluations. Breast Cancer 2025
- He F, Yang M, Liu J, Gong T, Ma J, Yang T, Zhao D, Li S, Tian D. Quality and reliability of pediatric pneumonia related short videos on mainstream platforms: cross-sectional study. BMC Public Health 2025;25(1)
- Huang S, Wen C, Bai X, Li S, Wang S, Wang X, Yang D. Exploring the Application Capability of ChatGPT as an Instructor in Skills Education for Dental Medical Students: Randomized Controlled Trial. Journal of Medical Internet Research 2025;27:e68538
- Kuribara T, Hirayama K, Hirata K. Performance evaluation of large language models for the national nursing examination in Japan. DIGITAL HEALTH 2025;11
- Fallah H, Biazar E, Rezaei M. Artificial Intelligence in Dental Education. The Journal of the American Dental Association 2025;156(6):434
- Tan Y, Nah S, Saw S, Rajandram R, Ong T. Evaluating the performance of artificial intelligence chatbots in answering urology questions derived from guidelines or board examinations: A systematic review. Urological Science 2025
- Kim M, Hwang G, Jang J, Chang S, Roh H, Park R. Performance of Open-Source Large language Models in Psychiatry: A Comparative Analysis of Non-English Records and English Translations (Preprint). Journal of Medical Internet Research 2024
- Mert S, Muir L, Fuchs B, Lucksch V, Vollbach F, Haas-Lützenberger E, Giunta R, Thierfelder N, Demmer W. Can artificial intelligence pass the written European Board of Hand Surgery exam? Hand Surgery and Rehabilitation 2025:102197