Published in Vol 26 (2024)

This is a member publication of University of California, Irvine, Emergency Medicine, Orange, California

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/60291.
Accuracy of Prospective Assessments of 4 Large Language Model Chatbot Responses to Patient Questions About Emergency Care: Experimental Comparative Study

Journals

  1. Kusaka S, Akitomo T, Hamada M, Asao Y, Iwamoto Y, Tachikake M, Mitsuhata C, Nomura R. Usefulness of Generative Artificial Intelligence (AI) Tools in Pediatric Dentistry. Diagnostics 2024;14(24):2818
  2. Tabanli A, Demirkiran N. Comparing ChatGPT 3.5 and 4.0 in Low Back Pain Patient Education: Addressing Strengths, Limitations, and Psychosocial Challenges. World Neurosurgery 2025;196:123755
  3. Soddu M, De Vito A, Madeddu G, Nicolosi B, Provenzano M, Ivziku D, Curcio F. Assessing the Accuracy, Completeness and Safety of ChatGPT-4o Responses on Pressure Injuries in Infants: Clinical Applications and Future Implications. Nursing Reports 2025;15(4):130
  4. Kocak B, Ponsiglione A, Romeo V, Ugga L, Huisman M, Cuocolo R. Radiology AI and sustainability paradox: environmental, economic, and social dimensions. Insights into Imaging 2025;16(1)
  5. AlFarabi Ali S, AlDehlawi H, Jazzar A, Ashi H, Esam Abuzinadah N, AlOtaibi M, Algarni A, Alqahtani H, Akeel S, Almazrooa S. The Diagnostic Performance of Large Language Models and Oral Medicine Consultants for Identifying Oral Lesions in Text-Based Clinical Scenarios: Prospective Comparative Study. JMIR AI 2025;4:e70566
  6. Hegedűs M, Dadkhah M, Dávid L. Benchmarking AI chatbots: assessing their accuracy in identifying hijacked medical journals. Diagnosis 2025
  7. Shen M, Li Z, Wu J. A commentary on “Bots in white coats: are large language models the future of patient education? A multicenter cross-sectional analysis”. International Journal of Surgery 2025;111(6):4149
  8. Sungur U, Arıkan Y, Türkay A, Polat H. The Responses of Artificial Intelligence to Questions About Urological Emergencies: A Comparison of 3 Different Large Language Models. The New Journal of Urology 2025;20(2):89
  9. Aghajani M, Maye E, Burrell K, Kok C, Frew J. Evaluating the quality and readability of online information about hidradenitis suppurativa: a systematic review. Clinical and Experimental Dermatology 2025;50(10):1937
  10. Shenoy D, Lindsay C, Martin A, Snyderman R. Perspectives on navigating the use of artificial intelligence by patients for their surgical care management. The American Journal of Surgery 2025:116540
  11. Ozer V, Bulbul O, Pasli S, Karakullukcu S, Kazzi Z, Turedi S. Performance of GPT‐4o in the management of toxicological exposures: A comparative analysis with emergency medicine residents. Hong Kong Journal of Emergency Medicine 2025;32(4)
  12. Tilton A, Caplan B, Cole B. Generative AI in consumer health: leveraging large language models for health literacy and clinical safety with a digital health framework. Frontiers in Digital Health 2025;7
  13. Birkun A, Kosova Y, Rudenko A. Generative artificial intelligence-mediated counselling on first aid for seizures: The performance of publicly available chatbot versus its customised version. Epilepsy & Behavior 2025;171:110680
  14. DeTemple D, Meine T. Comparison of the readability of ChatGPT and Bard in medical communication: a meta-analysis. BMC Medical Informatics and Decision Making 2025;25(1)
  15. Tanas Y, Gasper G, Rashidi K, Swed S. Evaluating large language models in patient education on facial plastic surgery: a standardized protocol. International Journal of Surgery Protocols 2025;29(3):108
  16. Jeon S, Lee S, Kim E, Eun J, Lee K, Lim H, Lee J. Generative AI Chatbot for Diabetes Management: Formative 2-Part Qualitative Study Using DTalksBot Involving Patients and Clinicians. JMIR Formative Research 2025;9:e72553
  17. Skamagki G, Savadi A, Greaves C, Fenton S, Stathi A. From information to action: a co-created evaluation of digital resources for musculoskeletal disorders. BMC Musculoskeletal Disorders 2025;26(1)
  18. Tarhan M, Sahin Ozdemir M. Comparison of the accuracy and reliability of ChatGPT-4o and Gemini in answering HIV-related questions. BMC Infectious Diseases 2025;25(1)
  19. Brzeziński J, Watros K, Mańczak M, Owoc J, Jeziorski K, Olszewski R. Readability and source transparency of AI-generated health information on human metapneumovirus: A comparative evaluation of five chatbots. Journal of Public Health 2025
  20. Sivakumar I, Arunachalam S, Gadde P, Sharan J. Performance of AI chatbots in responding to geriatric patient questions on denture issues: A mixed method study of accuracy and empathy. The Journal of Prosthetic Dentistry 2025
  21. Schuss P, Gonschorek A, Kämper M, Lemcke J, Meisel H, Rogge W, Schaan M, Schwenkreis P, Strowitzki M, Wohlfahrt K, Schmehl I. Artificial Intelligence Chatbot Responses to Patient Queries on Traumatic Brain Injury: An Expert Assessment of Reliability and Accuracy. Journal of Neurotrauma 2025