Published in Vol 26 (2024)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/53724.
Evaluating the Diagnostic Performance of Large Language Models on Complex Multimodal Medical Cases

Journals

  1. Shmilovitch A, Katson M, Cohen-Shelly M, Peretz S, Aran D, Shelly S. GPT-4 as a Clinical Decision Support Tool in Ischemic Stroke Management: Evaluation Study. JMIR AI 2025;4:e60391
  2. Takita H, Kabata D, Walston S, Tatekawa H, Saito K, Tsujimoto Y, Miki Y, Ueda D. A systematic review and meta-analysis of diagnostic performance comparison between generative AI and physicians. npj Digital Medicine 2025;8(1)
  3. Lafourcade C, Kérourédan O, Ballester B, Richert R. Accuracy, consistency, and contextual understanding of large language models in restorative dentistry and endodontics. Journal of Dentistry 2025;157:105764