TY - JOUR
AU - Mylläri, Sanna
AU - Saarni, Suoma Eeva
AU - Ritola, Ville
AU - Joffe, Grigori
AU - Stenberg, Jan-Henry
AU - Solbakken, Ole André
AU - Czajkowski, Nikolai Olavi
AU - Rosenström, Tom
PY - 2022
DA - 2022/11/9
TI - Text Topics and Treatment Response in Internet-Delivered Cognitive Behavioral Therapy for Generalized Anxiety Disorder: Text Mining Study
JO - J Med Internet Res
SP - e38911
VL - 24
IS - 11
KW - iCBT
KW - CBT
KW - psychotherapy
KW - internet therapy
KW - anxiety
KW - topic modeling
KW - natural language processing
AB - Background: Text mining methods such as topic modeling can offer valuable information on how and to whom internet-delivered cognitive behavioral therapies (iCBT) work. Although iCBT treatments provide convenient data for topic modeling, it has rarely been used in this context. Objective: Our aims were to apply topic modeling to written assignment texts from iCBT for generalized anxiety disorder and explore the resulting topics’ associations with treatment response. As predetermining the number of topics presents a considerable challenge in topic modeling, we also aimed to explore a novel method for topic number selection. Methods: We defined 2 latent Dirichlet allocation (LDA) topic models using a novel data-driven approach and a more commonly used interpretability-based approach to topic number selection. We used multilevel models to associate the topics with continuous-valued treatment response, defined as the rate of per-session change in GAD-7 sum scores throughout the treatment. Results: Our analyses included 1686 patients. We observed 2 topics that were associated with better than average treatment response: “well-being of family, pets, and loved ones” from the data-driven LDA model (B=–0.10 SD/session/∆topic; 95% CI –0.16 to –0.03) and “children, family issues” from the interpretability-based model (B=–0.18 SD/session/∆topic; 95% CI –0.31 to –0.05). Two topics were associated with worse treatment response: “monitoring of thoughts and worries” from the data-driven model (B=0.06 SD/session/∆topic; 95% CI 0.01 to 0.11) and “internet therapy” from the interpretability-based model (B=0.27 SD/session/∆topic; 95% CI 0.07 to 0.46). Conclusions: The 2 LDA models differed in their interpretability and the breadth of their topics, but both contained topics that were associated with treatment response in an interpretable manner. Our work demonstrates that topic modeling is well suited for iCBT research and has potential to expose clinically relevant information in vast text data.
SN - 1438-8871
UR - https://www.jmir.org/2022/11/e38911
UR - https://doi.org/10.2196/38911
UR - http://www.ncbi.nlm.nih.gov/pubmed/36350678
DO - 10.2196/38911
ID - info:doi/10.2196/38911
ER -