Assessing the Accuracy and Reliability of Large Language Models in Psychiatry Using Standardized Multiple-Choice Questions: Cross-Sectional Study

doi:10.2196/69910

Published on 20.May.2025 in Vol 27 (2025)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/69910, first published 11.Dec.2024.




Kaitlin Hanss¹, MD, MPH; Karthik V Sarma¹, MD, PhD; Anne L Glowinski¹, MD, MPE; Andrew Krystal¹, MD; Ramotse Saunders¹, MD; Andrew Halls¹, MD; Sasha Gorrell¹, PhD; Erin Reilly¹, PhD

¹ Department of Psychiatry and Behavioral Sciences, University of California, San Francisco, San Francisco, CA, United States

Corresponding Author:

Kaitlin Hanss, MD, MPH
Department of Psychiatry and Behavioral Sciences
University of California, San Francisco
675 18th Street, Box 3134
San Francisco, CA 94143
United States
Phone: 1 415 476-7000
Fax: 1 415-502-6361
Email: Kaitlin.Hanss@ucsf.edu

Citation

Please cite as:

Hanss K, Sarma KV, Glowinski AL, Krystal A, Saunders R, Halls A, Gorrell S, Reilly E
Assessing the Accuracy and Reliability of Large Language Models in Psychiatry Using Standardized Multiple-Choice Questions: Cross-Sectional Study
J Med Internet Res 2025;27:e69910
doi: 10.2196/69910 PMID: 40392576 PMCID: 12134693


This paper is in the following e-collection/theme issue:

Artificial Intelligence; Psychiatry; Generative Language Models Including ChatGPT; AI Language Models in Health Care
