%0 Journal Article
%@ 1438-8871
%I JMIR Publications
%V 27
%N
%P e71521
%T Evaluating ChatGPT in Qualitative Thematic Analysis With Human Researchers in the Japanese Clinical Context and Its Cultural Interpretation Challenges: Comparative Qualitative Study
%A Sakaguchi,Kota
%A Sakama,Reiko
%A Watari,Takashi
%+ Integrated Clinical Education Center, Kyoto University Hospital, Shogoin Kawaramachi 54, Sakyo-ku, Kyoto, 606-8506, Japan, 81 075 751 4839, wataritari@gmail.com
%K ChatGPT
%K large language models
%K qualitative research
%K sacred moment(s)
%K thematic analysis
%D 2025
%7 24.4.2025
%9 Original Paper
%J J Med Internet Res
%G English
%X Background: Qualitative research is crucial for understanding the values and beliefs underlying individual experiences, emotions, and behaviors, particularly in social sciences and health care. Traditionally reliant on manual analysis by experienced researchers, this methodology requires significant time and effort. The advent of artificial intelligence (AI) technology, especially large language models such as ChatGPT (OpenAI), holds promise for enhancing qualitative data analysis. However, existing studies have predominantly focused on AI’s application to English-language datasets, leaving its applicability to non-English languages, particularly structurally and contextually complex languages such as Japanese, insufficiently explored. Objective: This study aims to evaluate the feasibility, strengths, and limitations of ChatGPT-4 in analyzing qualitative Japanese interview data by directly comparing its performance with that of experienced human researchers. Methods: A comparative qualitative study was conducted to assess the performance of ChatGPT-4 and human researchers in analyzing transcribed Japanese semistructured interviews. The analysis focused on thematic agreement rates, interpretative depth, and ChatGPT’s ability to process culturally nuanced concepts, particularly for descriptive and socio-culturally embedded themes. This study analyzed transcripts from 30 semistructured interviews conducted between February and March 2024 in an urban community hospital (Hospital A) and a rural university hospital (Hospital B) in Japan. Interviews centered on the theme of “sacred moments” and involved health care providers and patients. Transcripts were digitized using NVivo (version 14; Lumivero) and analyzed using ChatGPT-4 with iterative prompts for thematic analysis. The results were compared with a reflexive thematic analysis performed by human researchers. Furthermore, to assess the adaptability and consistency of ChatGPT in qualitative analysis, Charmaz’s grounded theory and Pope’s five-step framework approach were applied. Results: ChatGPT-4 demonstrated high thematic agreement rates (>80%) with human researchers for descriptive themes such as “personal experience of a sacred moment” and “building relationships.” However, its performance declined for themes requiring deeper cultural and emotional interpretation, such as “difficult to answer, no experience of sacred moments” and “fate.” For these themes, agreement rates were approximately 30%, revealing significant limitations in ChatGPT’s ability to process context-dependent linguistic structures and implicit emotional expressions in Japanese. Conclusions: ChatGPT-4 demonstrates potential as an auxiliary tool in qualitative research, particularly for efficiently identifying descriptive themes within Japanese-language datasets. However, its limited capacity to interpret cultural and emotional nuances highlights the continued necessity of human expertise in qualitative analysis. These findings emphasize the complementary role of AI-assisted qualitative research and underscore the importance of further advancements in AI models tailored to non-English linguistic and cultural contexts. Future research should explore strategies to enhance AI’s interpretability, expand multilingual training datasets, and assess the applicability of emerging AI models in diverse cultural settings. In addition, ethical and legal considerations in AI-driven qualitative analysis require continued scrutiny.
%R 10.2196/71521
%U https://www.jmir.org/2025/1/e71521
%U https://doi.org/10.2196/71521