Published in Vol 26 (2024)

This is a member publication of Imperial College London (Jisc)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/48168.
Attrition in Conversational Agent–Delivered Mental Health Interventions: Systematic Review and Meta-Analysis


Review

1Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore

2Future Health Technologies, Singapore-ETH Centre, Campus for Research Excellence And Technological Enterprise, Singapore, Singapore

3Department of Neuroscience, Monash University, Melbourne, Australia

4Centre for Healthy and Sustainable Cities, Wee Kim Wee School of Communication and Information, Nanyang Technological University Singapore, Singapore, Singapore

5Department of Primary Care and Public Health, School of Public Health, Imperial College London, London, United Kingdom

Corresponding Author:

Lorainne Tudor Car, MD, PhD

Lee Kong Chian School of Medicine

Nanyang Technological University Singapore

11 Mandalay Road, Level 18

Singapore, 308232

Singapore

Phone: 65 69041258

Email: lorainne.tudor.car@ntu.edu.sg


Background: Conversational agents (CAs) or chatbots are computer programs that mimic human conversation. They have the potential to improve access to mental health interventions through automated, scalable, and personalized delivery of psychotherapeutic content. However, digital health interventions, including those delivered by CAs, often have high attrition rates. Identifying the factors associated with attrition is critical to improving future clinical trials.

Objective: This review aims to estimate the overall and differential rates of attrition in CA-delivered mental health interventions (CA interventions), evaluate the impact of study design and intervention-related aspects on attrition, and describe study design features aimed at reducing or mitigating study attrition.

Methods: We searched PubMed, Embase (Ovid), PsycINFO (Ovid), Cochrane Central Register of Controlled Trials, and Web of Science, and conducted a gray literature search on Google Scholar in June 2022. We included randomized controlled trials that compared CA interventions against control groups and excluded studies that lasted for 1 session only or used Wizard of Oz interventions. We also assessed the risk of bias in the included studies using the Cochrane Risk of Bias Tool 2.0. Random-effects proportional meta-analysis was applied to calculate the pooled dropout rates in the intervention groups. Random-effects meta-analysis was used to compare the attrition rate in the intervention groups with that in the control groups. We used a narrative synthesis to summarize the findings.

Results: The systematic search retrieved 4566 records from peer-reviewed databases and citation searches, of which 41 (0.90%) randomized controlled trials met the inclusion criteria. The meta-analytic overall attrition rate in the intervention group was 21.84% (95% CI 16.74%-27.36%; I2=94%). Short-term studies that lasted ≤8 weeks showed a lower attrition rate (18.05%, 95% CI 9.91%-27.76%; I2=94.6%) than long-term studies that lasted >8 weeks (26.59%, 95% CI 20.09%-33.63%; I2=93.89%). Intervention group participants were more likely to attrit than control group participants for short-term (log odds ratio 1.22, 95% CI 0.99-1.50; I2=21.89%) and long-term studies (log odds ratio 1.33, 95% CI 1.08-1.65; I2=49.43%). Intervention-related characteristics associated with higher attrition include stand-alone CA interventions without human support, not having a symptom tracker feature, no visual representation of the CA, and comparing CA interventions with waitlist controls. No participant-level factor reliably predicted attrition.

Conclusions: Our results indicated that approximately one-fifth of the participants will drop out of CA interventions in short-term studies. High heterogeneity made it difficult to generalize the findings. Our results suggested that future CA interventions should adopt a blended design with human support, use symptom tracking, compare CA intervention groups against active controls rather than waitlist controls, and include a visual representation of the CA to reduce the attrition rate.

Trial Registration: PROSPERO International Prospective Register of Systematic Reviews CRD42022341415; https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42022341415

J Med Internet Res 2024;26:e48168

doi:10.2196/48168


Description of the Problem

Mental health disorders are among the largest contributors to the global disease burden, affecting 1 in every 8 people, or 970 million people around the world [1,2]. However, access to evidence-based interventions for the prevention and treatment of mental health disorders is limited [3,4]. This is due to various factors such as a lack of mental health services and professionals, poor mental health literacy, fear of stigma, and low perceived need for treatment [5-10]. There is a need for scalable and accessible mental health services. Digital technologies such as smartphones or websites are increasingly being used for the delivery of mental health interventions and have the potential to improve access to mental health care. Digital mental health interventions allow for the scalable delivery of diverse therapeutic approaches such as cognitive behavioral therapy and mindfulness for the treatment of mental health conditions such as depression, anxiety, substance abuse, and eating disorders [11-16].

Description of the Intervention

Conversational agents (CAs) or chatbots are a more recent type of digital intervention, and they are becoming a popular method to deliver mental health interventions. CAs can be defined as computer algorithms designed to simulate human conversations textually or via speech through an interface [17]. CA-delivered mental health interventions (CA interventions) combine the delivery of psychotherapeutic content with an automated dialogue system that simulates the interaction between a mental health expert and the user [18]. These interventions provide an alternative avenue of psychotherapy to individuals who are not able to access mental health services owing to issues regarding time, location, or availability of resources [19]. CAs can also be a useful addition to traditional in-person therapy [20,21]. The presence of a CA can further contribute to improved therapeutic alliances with users to enhance adherence to the intervention [22,23]. Evidence for the efficacy of CAs in delivering mental health support is growing steadily. A recent meta-analysis showed that CA-delivered psychotherapy in adults significantly improved depressive symptoms with a medium effect size [19]. Providing self-guided therapy remotely via CAs may help address barriers to mental health access such as cost, long waiting time, and stigma [24]. Although the impact of mental health interventions delivered by CAs seems promising, studies evaluating such interventions also suggest high study attrition among participants [19]. Attrition or dropout occurs when participants do not complete the randomized controlled trial (RCT) assessments or the research protocol.

Digital health interventions typically report rapid and high attrition [13,25]. The overall attrition rate quantifies the level of attrition for the whole sample in a clinical trial, and the differential attrition rate refers to the level of attrition in the intervention group compared with that in the comparison group [26]. Attrition in clinical trials may introduce bias by modifying the random composition of the trial groups, limiting the generalizability of the study, and reducing the study power owing to reduced sample size [13,27]. To improve the quality of future clinical trials on CA interventions, there is a need to determine the attrition rates and the factors contributing to attrition in CA interventions.

Why Is It Important to Conduct This Review?

There is scant evidence on the possible factors associated with attrition in CA interventions for mental health and health care in general. The review conducted by Lim et al [19] on the effectiveness of CA interventions for depression and anxiety symptoms indicated that almost a fifth of the participants (19%) attrited throughout the trials without exploring factors associated with the attrition. This was comparable with other reviews on digital health and digital mental health interventions reporting attrition rates that ranged from 24.1% to 47.8% after adjusting for publication bias [13,28]. In general, factors shown to be associated with attrition in trials of digital health interventions include poor user experience, a lack of perceived value, and privacy concerns [28,29]; for example, studies on mental health apps reported technical issues and errors that might affect users’ overall experience [15,30]. Qualitative findings further suggested that factors such as a lack of human interactions in digital health interventions and users’ technological competence also played a role in participants’ attrition [31].

In addition, for smartphone-based mental health interventions, providing monetary compensation and reminders to engage were associated with significantly lower attrition rates [13]. Conversely, participants in the intervention condition were more likely to drop out than the waitlist participants [13,32]. These reviews focused only on smartphone-delivered interventions and included studies published before 2020, omitting several more recently published studies on CA interventions. To fully harness CA interventions, there is a need to better understand the factors associated with both overall attrition as well as differential attrition in these interventions.

To this end, we aimed to (1) estimate the overall and differential rates of attrition in CA interventions, (2) evaluate the impact of study design- and intervention-related aspects on the overall and differential attrition rates in CA interventions, and (3) map and describe study design features aimed at reducing or mitigating study attrition in the trials.


Overview

We performed a systematic review of attrition rates in RCTs of CA health interventions in line with the Cochrane gold standard methodology [33] and the meta-analysis approach outlined by Linardon and Fuller-Tyszkiewicz [13]. We reported this review in line with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [34]. The PRISMA checklist is included in Multimedia Appendix 1.

Criteria for Study Selection

Types of Studies

We included RCTs, cluster RCTs, crossover RCTs, quasi-RCTs, and pilot RCTs published in English. We included these variations of RCTs because the field is still nascent, and findings from different forms of RCTs could be beneficial for understanding the attrition rate in CA interventions. The publication types included peer-reviewed journal articles and conference papers that were published up to June 2022.

Types of Participants

Participants’ characteristics included healthy participants and participants with subclinical or clinically diagnosed mental health disorders such as depression, anxiety, attention-deficit/hyperactivity disorder, and bipolar disorder. Participants of any age were included so long as they personally interacted with the CA.

Types of Interventions

We included studies reporting a synchronous 2-way exchange with the participants via a CA. We excluded studies where the CA dialogues were wholly or partially delivered by human operators (Wizard of Oz) and studies with asynchronous response systems.

The interventions either delivered psychotherapeutic content or provided training to improve mental well-being, reduce the symptoms of mental health conditions, or both. This included studies that aimed to reduce the symptoms of depression in clinical and subclinical populations and studies implementing mindfulness training for healthy adults. Detailed inclusion and exclusion criteria are outlined in Multimedia Appendix 2 [13,17,33,35-39].

Types of Outcome Measures

The primary outcome was the reported number of dropouts and the attrition rate, calculated as the weighted risk of attrition (participants assigned to the CA intervention who then discontinued the study) against the sample size of each study. This included the overall attrition rate and the differential attrition rate between the intervention and comparison groups.

Search Methods for the Identification of Studies

The search strategy included index terms and keywords that describe CAs, such as “conversational agent,” “embodied conversational agent,” “virtual assistant,” and “virtual coach” (Multimedia Appendix 3). The search strategy was previously developed for our scoping reviews [40,41] and was updated to include studies from 2020 to 2022 with the assistance of a medical librarian. We conducted the searches in the following databases on June 6, 2022: PubMed, Embase (Ovid), PsycINFO (Ovid), Cochrane Central Register of Controlled Trials, and Web of Science. In addition, we conducted a gray literature search on the first 200 entries from Google Scholar as suggested by the literature for the most optimal combination of databases [42,43]. We did not include any filter terms in the search. We also performed citation chasing by searching for all records citing ≥1 of the included studies (forward citation chasing) and all records referenced in the included studies (backward citation chasing) using Citationchaser [44]. The tables of excluded studies are presented in Multimedia Appendix 4.

Data Collection and Analysis

Selection of Studies

On updating the searches from 2020 to 2022, we imported all identified references from the different electronic databases into a single file. Duplicate records were removed using the revtools package in R [35] and manually on Zotero (Corporation for Digital Scholarship). One reviewer (AIJ) performed the title and abstract screening using ASReview [36], an open-source machine learning tool. The tool uses an active learning algorithm to sort and re-sort the records, prioritizing the most relevant records first based on the user's inclusion and exclusion decisions. The title and abstract screening steps are detailed in Multimedia Appendix 2.

Three reviewers (AIJ, XL, and LM) were engaged in the full-text review. One reviewer (AIJ) retrieved the full text of the studies, and 2 reviewers (AIJ and XL) assessed their eligibility independently and in parallel. Any disagreements were discussed and resolved between the reviewers and with a third reviewer (LM) acting as the arbiter. Studies that were identified in our previous reviews (up to 2020) [41] and met the inclusion criteria of this review were included based on discussions among the 3 reviewers (AIJ, XL, and LM).

Data Extraction and Management

Data were extracted using a data extraction form on Microsoft Excel. The data extraction form was piloted by 2 reviewers (AIJ and XL) on the same papers (5/41, 12%) and amended in line with the feedback. We also added fields, as required, based on the data extraction form of Linardon and Fuller-Tyszkiewicz [13]. Four reviewers (AIJ, XL, GS, and Nileshni Fernando) extracted the data independently and in parallel.

We extracted the year of publication; study design; the type of comparison group (active or inactive); the type of intervention; and details of the CAs, including the type of CA (rule based or artificial intelligence [AI] enhanced), the personality of the CA [17], the level of human support, the presence of a reminder mechanism, and the input and output modalities of the CA. In addition, we extracted information on the study duration, compensation paid to study participants, and any other mechanism included specifically to increase user engagement. Any disagreements among the reviewers were resolved by discussion.

Assessment of the Risk of Bias in Included Studies

Four reviewers (AIJ, XL, GS, and Nileshni Fernando) independently assessed the risk of bias in the included studies using the Cochrane Risk of Bias Tool 2.0 [33] and visualized using robvis [45]. The risk of bias assessment was piloted with 10 (24%) of the 41 studies for consistency and clarity of judgment by 2 reviewers (AIJ and XL). The steps involved in the assessment of the risk of bias are detailed in Multimedia Appendix 2, and we have provided a summary, along with a table, in Multimedia Appendix 5. We requested clarification or more data from the authors of 1 (2%) of the 41 studies but did not receive any response even after sending a reminder 2 weeks later. The assessment of publication bias was reported via funnel plots and the Egger test for publication bias [37].
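Egger's test, referenced above, regresses each study's standardized effect on its precision; an intercept far from 0 suggests funnel-plot asymmetry. The following is a minimal sketch with invented effect sizes and standard errors, not the review's actual data or analysis code:

```python
import math

# Hypothetical study effects (log odds ratios) and their standard errors
effects = [0.8, 0.5, 1.1, 0.2, 0.9, 0.4]
ses = [0.40, 0.25, 0.55, 0.15, 0.50, 0.20]

# Egger's regression: standardized effect (effect/SE) regressed on precision (1/SE)
z = [e / s for e, s in zip(effects, ses)]
prec = [1 / s for s in ses]

# Ordinary least squares fit by hand (slope and intercept)
n = len(z)
mx, my = sum(prec) / n, sum(z) / n
sxx = sum((p - mx) ** 2 for p in prec)
sxy = sum((p - mx) * (v - my) for p, v in zip(prec, z))
slope = sxy / sxx
intercept = my - slope * mx

# A nonzero intercept indicates small-study effects (possible publication bias)
print(f"Egger intercept: {intercept:.2f}, slope: {slope:.3f}")
```

In practice, a significance test on the intercept (with its standard error) is reported alongside the funnel plot, as done in this review.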

Data Analysis

The meta-analysis was conducted based on the approach outlined by Linardon and Fuller-Tyszkiewicz [13] and the Cochrane Handbook for Systematic Reviews of Interventions (version 6.3) [33]. We defined attrition as the number of participants who dropped out of the study during the intervention period by not completing the postintervention assessment. We did not include the follow-up period [13]. For crossover design studies, we only included the data before the crossover following the aforementioned definition. The second part of the crossover was not considered as the follow-up period.

The study's overall attrition rate was estimated by calculating the weighted pooled event rate using random-effects models based on a meta-proportional approach [33] with the Freeman-Tukey double arcsine transformed proportion [38]. This indicated the relative risk of attrition against the sample size of the studies for participants assigned to the CA intervention group. The aim of this overall attrition analysis was to compute the overall rate of attrition in the intervention group after controlling for the different sample sizes of the included studies. Event rates were then converted to percentages of events per 100 participants and calculated for all included studies as well as separately for short-term studies (≤8 wk from baseline) and long-term studies (>8 wk from baseline). We used the short-term and long-term groupings to facilitate a comparison between our results and those of the previous study on attrition in smartphone-delivered interventions for mental health problems [13].
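As a rough illustration of this pooling approach, the sketch below computes a Freeman-Tukey transformed, DerSimonian-Laird random-effects pooled proportion. The study counts are invented, and the simple sine-based back-transformation is a simplification of the inversion used by meta-analysis software:

```python
import math

# Hypothetical per-study data: (dropouts, sample size) for the intervention arm
studies = [(5, 50), (20, 120), (12, 40), (30, 200)]

def ft_transform(x, n):
    # Freeman-Tukey double arcsine transform stabilizes the variance of a proportion
    return math.asin(math.sqrt(x / (n + 1))) + math.asin(math.sqrt((x + 1) / (n + 1)))

ts = [ft_transform(x, n) for x, n in studies]
vs = [1.0 / (n + 0.5) for _, n in studies]  # approximate variance of the transform
ws = [1.0 / v for v in vs]                  # inverse-variance (fixed-effect) weights

# DerSimonian-Laird estimate of between-study variance (tau^2)
t_fixed = sum(w * t for w, t in zip(ws, ts)) / sum(ws)
q = sum(w * (t - t_fixed) ** 2 for w, t in zip(ws, ts))
df = len(studies) - 1
c = sum(ws) - sum(w ** 2 for w in ws) / sum(ws)
tau2 = max(0.0, (q - df) / c)

# Random-effects pooled transform and a crude back-transformation to a proportion
ws_re = [1.0 / (v + tau2) for v in vs]
t_pooled = sum(w * t for w, t in zip(ws_re, ts)) / sum(ws_re)
pooled_rate = math.sin(t_pooled / 2) ** 2  # Miller's inversion is used in practice
print(f"Pooled attrition rate: {pooled_rate:.2%}")
```

With the invented counts above, the pooled rate lands between the smallest and largest individual study rates, as expected of a weighted average.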

The differential attrition rate was calculated as the odds ratio (OR) of the likelihood to attrit between the CA intervention condition and the comparison condition. The aim of the differential attrition analysis was to understand the odds of attrition compared with the control group. The ORs were calculated using random-effect models separately for short-term and long-term studies weighted by their inverse variance. Studies with 0 events in both arms were weighted as 0, and a correction of 0.5 was added to the arm with 0 events as a continuity correction. A log OR of >1 indicated a higher likelihood of attrition in the CA intervention groups compared with the control groups. We also conducted subgroup analyses to explore the sources of heterogeneity in both overall and differential meta-analyses. The detailed meta-analysis procedure and subgroup analyses conducted are specified in Multimedia Appendix 2. We also performed post hoc sensitivity analyses for subgroups with <5 studies because the estimate of tau-square might be imprecise [39]. In addition, we conducted exploratory analyses of all included studies regardless of the intervention duration using the same set of prespecified subgroup analyses on the overall and differential meta-analyses.
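The differential attrition computation can be sketched as follows, with hypothetical arm counts (not data from any included trial); a positive log OR, ie, OR >1, corresponds to higher odds of attrition in the intervention arm:

```python
import math

def log_odds_ratio(e1, n1, e2, n2):
    """Log odds ratio of attrition (intervention vs control), applying a 0.5
    continuity correction when a cell of the 2x2 table would be zero."""
    if e1 == 0 or e2 == 0 or e1 == n1 or e2 == n2:
        e1, e2 = e1 + 0.5, e2 + 0.5
        n1, n2 = n1 + 1.0, n2 + 1.0
    odds1 = e1 / (n1 - e1)
    odds2 = e2 / (n2 - e2)
    lor = math.log(odds1 / odds2)
    # Approximate variance of the log OR: sum of reciprocal cell counts
    var = 1 / e1 + 1 / (n1 - e1) + 1 / e2 + 1 / (n2 - e2)
    return lor, var

# 12 of 60 intervention participants vs 6 of 58 controls dropped out
lor, var = log_odds_ratio(12, 60, 6, 58)
print(f"log OR = {lor:.2f}, OR = {math.exp(lor):.2f}")
```

In the full meta-analysis, the per-study log ORs are then pooled with random-effects weights equal to the inverse of `var` plus the between-study variance.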

Meta-analysis was not conducted on the participant-level factors and the predictors of attrition owing to variability in reporting. We also identified additional factors significantly associated with attrition (P<.05) in the included RCTs. We collated and narratively presented these factors associated with attrition as reported by the studies.


Overview

The updated search strategy retrieved 2228 records from peer-reviewed databases and 2319 from citation searching (4547 records in total). After removing 351 duplicates from the database search and 147 duplicates from citation searching, and excluding 19 records identified through other methods, 4030 (88.63%) records were screened on ASReview. Of these 4030 records, 179 (4.4%) were considered for full-text screening, and 11 (6.1%) of the 179 were included. We further identified 2 systematic reviews on CA interventions and included 14 studies that were not identified by our search strategy [19,46]. These studies used the Deprexis and electronic problem-solving treatment (ePST) interventions, which did not explicitly identify themselves as CAs; for instance, both Deprexis and ePST described themselves as simulated dialogues that tailored their responses based on users' input [47,48]. Subsequently, we followed up with an additional search on PubMed using "Deprexis OR ePST" as the search term and included 3 additional studies. We also included 13 studies identified in our previous review [41]. Thus, the total number of included studies is 41 (studies included in the previous scoping review: n=13, 32%; new studies included from databases: n=11, 27%; new studies included via other methods: n=17, 41%). Figure 1 presents the study selection process.

Figure 1. PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flowchart. CA: conversational agent; ePST: electronic problem-solving treatment; RCT: randomized controlled trial. a[19]; b[46].

Study Characteristics

Of the 41 studies included in this review, 22 (54%) were published before 2020 [15,47-67], and 19 (46%) were published in 2020 or later [14,68-85] (Table 1). Most of the studies were conducted in the United States (13/41, 32%) [14,15,48,56,58,59,64-66,68,70,82,85] and Germany (13/41, 32%) [47,50,52-55,60-62,69,75], with some studies (2/41, 5%) conducted in multiple countries such as Switzerland and Germany [51] and Switzerland, Germany, and Austria [57]. Most of the interventions were short-term interventions and lasted ≤8 weeks (26/41, 63%) [14,15,48,49,52,56,58,62-68,70-74,76-81,84], whereas some of the studies (15/41, 37%) lasted >8 weeks [47,50,51,53-55,57,59-61,69,75,82,83,85].

Table 1. Characteristics of included studies (n=41). Values are n (%).

Year of publication
  Before 2020: 22 (54)
  2020 or later: 19 (46)

Country
  United States: 13 (32)
  Germany^a,b: 13 (32)
  South Korea: 3 (7)
  Switzerland^a,b: 2 (5)
  United Kingdom: 2 (5)
  Other (European Union)^b,c: 5 (12)
  Other^d: 6 (15)

Type of study design
  RCT^e: 29 (71)
  Pilot RCT: 8 (20)
  Crossover RCT: 4 (10)

Study duration
  ≤8 wk: 26 (63)
  >8 wk: 15 (37)

Sample population
  General population: 11 (27)
  Population considered at risk: 18 (44)
  Clinical population: 12 (29)

Target clinical outcome
  Treatment and monitoring: 17 (41)
  Education and training: 24 (59)

Target disorder
  Depression only: 17 (41)
  Mental well-being: 9 (22)
  Co-occurring depression and anxiety: 5 (12)
  Other^f: 10 (24)

Type of control
  Waitlist control: 18 (44)
  Active control: 15 (37)
  Treatment as usual: 8 (20)

Enrollment method
  Remote enrollment option only: 23 (56)
  In-person enrollment option provided: 16 (39)
  Not specified: 2 (5)

Session type
  Defined session length: 29 (71)
  User-determined session length: 12 (29)

Attrition range (%)
  0-10: 13 (32)
  11-20: 6 (15)
  21-30: 11 (27)
  31-40: 2 (5)
  41-50: 3 (7)
  >50: 6 (15)

^a Conducted in both Switzerland and Germany.

^b Conducted in Switzerland, Germany, and Austria.

^c Ireland, Sweden, Italy, and the Netherlands.

^d Japan, Ukraine, Argentina, New Zealand, China, and Russia.

^e RCT: randomized controlled trial.

^f Anxiety only, panic disorder, height phobia, gambling disorder, substance abuse, attention-deficit/hyperactivity disorder, irritable bowel syndrome, eating disorder, and personality disorder.

Psychoeducation and training were the focus of 24 (59%) of the 41 studies [47-57,59-62,64,66,69,70,75,77-79,84]. In almost half of the studies (18/41, 44%), participants were screened for mental health symptoms before the start of the study [14,50,52-56,59,62,63,66,68,72-74,80,82,83], and more than half of the studies (23/41, 56%) enrolled participants remotely using the web or the telephone [14,15,47,49-53,56,57,62,64,65,68-70,72,75,77,81-84]. More than one-third of the studies (17/41, 41%) focused on depression as the target disorder [47,48,50-56,58-62,67,69,83]. Of the 41 studies, 18 (44%) used waitlist control group participants [14,48-54,56,58,62,63,66,68,72,77,78,82], and 15 (37%) used an active control that included information or self-help (n=10, 67%) [15,60,65,71,74-76,80,81,83], alternative or comparable treatments such as stress-management cognitive behavioral therapy without a digital component (n=3, 20%) [73,84,86], or symptoms rating only (n=2, 13%) [64,70].

In the intervention group, of the 41 studies, 13 (32%) reported attrition between 0% and 10% [15,48,49,51,58,59,63,65,70,71,73,79,80], 6 (15%) reported attrition between 11% and 20% [14,57,67,74,78,83], 11 (27%) reported attrition between 21% and 30% [47,52,54-56,60,61,66,69,72,85], 2 (5%) reported attrition between 31% and 40% [53,64], 3 (7%) reported attrition between 41% and 50% [68,75,81], and 6 (15%) reported >50% attrition [50,62,76,77,82,84].

Risk-of-Bias Assessment

The most common risk of bias in the included studies was the bias in the measurement of the outcomes because they were all self-reported outcomes (Multimedia Appendix 5). The second most common bias was due to the selection of the reported results because many of the studies (18/41, 44%) [15,48-51,58,59,62,64,66,70,71,74,77,80,81,83,84] did not cite the RCT protocol or statistical analysis plan. Most of the studies (29/41, 71%) reported an intention-to-treat analysis. Figure 2 shows the summary plot of the risk of bias.

Figure 2. Summary plot for the risk-of-bias assessment of the included studies.

CA Characteristics

Most of the CAs were rule based (29/41, 71%), and 12 (29%) were AI enhanced using natural language processing or other AI-based algorithms (Table 2). More than one-third of the studies (15/41, 37%) did not describe any specific visual representation of the CA. These were mainly studies that included Deprexis or Deprexis-based interventions (14/15, 93%) because they did not specifically identify themselves as CAs but used dialogue-based interventions. Of the 41 studies, 14 (34%) included an avatar or a static visual representation of the CA, and 8 (20%) represented the CA using an embodied CA (ECA). With regard to the ECAs, of the 8 studies, 4 (50%) used relational agents, 3 (38%) used 3D-generated renders, and 2 (25%) used prerecorded videos. The CAs mostly presented a coach-like personality characterized by encouraging and nurturing personalities (19/41, 46%), followed by a factual personality characterized by being nonjudgmental and offering responses based on facts and observations (14/41, 34%). Of the 41 studies, in 5 (12%), the CA was designed to look like a physician or a health care professional, and in 3 (7%), the CA conversed with users using informal language characterized by exclamations, abbreviations, and emoticons in the dialogue. More than one-third of the interventions were delivered via web-based applications (15/41, 37%), followed by those delivered by a stand-alone smartphone app (11/41, 27%).

Table 2. Characteristics of conversational agents (CAs; n=41). Values are n (%).

Type of CA
  No avatar or no visual representation: 15 (37)
  Avatar only: 14 (34)
  ECA^a: 8 (20)
  Not specified: 4 (10)

Delivery channel
  Web-based application: 15 (37)
  Stand-alone smartphone app: 11 (27)
  Computer-, laptop-, or tablet-embedded program: 7 (17)
  Messaging app based^b: 7 (17)
  Not specified: 1 (2)

Dialogue modality
  Rule based: 29 (71)
  AI^c enhanced: 12 (29)

Personality
  Coach like: 19 (46)
  Factual: 14 (34)
  Health care professional like: 5 (12)
  Informal: 3 (7)

^a ECA: embodied CA.

^b Slack, Facebook Messenger, or WeChat.

^c AI: artificial intelligence.

Study Attrition Rates

Overview

The overall weighted attrition rate for the intervention groups in all included studies was 21.84% (95% CI 16.74%-27.36%; I2=94%), whereas the differential attrition analysis showed a significant difference between the groups (log OR 1.28, 95% CI 1.10-1.48; I2=34.6%), indicating that the participants who received CA interventions were more likely to attrit than the control group participants. Figure 3 [14,15,47-85] shows the attrition rates for all included studies.

Figure 3. Overall attrition rates for the intervention group in conversational agent–delivered mental health interventions.
Short-Term Studies (≤8 Wk)
Overview

In the short-term studies, the overall weighted attrition rate in the intervention groups was 18.05% (95% CI 9.91%-27.76%), and there was evidence of high trial heterogeneity (I2=94.6%, 95% CI 93.05%-95.74%). The high heterogeneity reflected the wide variation among the studies in terms of symptoms, types of interventions, and study populations. Of the 26 studies, 5 (19%) reported 0% attrition in the intervention group [48,49,70,73,86]. The lowest nonzero attrition rate was 6.12% (95% CI 1.48%-17.15%) [63], and the highest was 71.05% (95% CI 63.38%-77.69%) [77].

The differential attrition rate did not differ significantly between the groups (log OR 1.22, 95% CI 0.99-1.50), indicating that the attrition rates were similar across the intervention and control groups.

The heterogeneity was low to moderate (I2=21.89%, 95% CI 0%-54.6%). Of the 26 studies, 1 (4%) was identified as a potential outlier [15]. Removing this study from the model substantially reduced heterogeneity (I2=1.49%, 95% CI 0%-49.68%), and the differential attrition rate became significant (log OR 1.27, 95% CI 1.04-1.54). This indicated that the attrition rate in the intervention group was higher than that in the control group after removing the outlying study. Multimedia Appendix 6 [14,15,47-85] shows the forest plot for the differential attrition meta-analysis.

Publication Bias for Short-Term Studies (≤8 Wk)

For the overall attrition rate, the Egger test was significant (intercept −4.70, 95% CI −8.12 to −1.28; P=.01), indicating possible publication bias. A closer look at the funnel plot showed missing studies toward the bottom right of the plot, which suggested possible publication bias for small sample–sized studies with high attrition rates (Figure 4A [14,15,48,49,52,56,58,62-68,70-74,76-81,84]). For the differential attrition rate, the funnel plot indicated evidence of plot symmetry, and the Egger test was not significant (intercept −4.85, 95% CI −1.56 to 0.58; P=.39; Figure 4B [14,15,48,49,52,56,58,62-68,70-74,76-81,84]).

Figure 4. Funnel plots and the Egger test for publication bias for (A) overall attrition and (B) differential attrition in meta-analyses of the short-term studies.
Subgroup Analyses of the Attrition Rates in Short-Term Studies (≤8 Wk)

Subgroup analyses were conducted for both overall attrition rate (Table 3) and differential attrition rate (Multimedia Appendix 6) in the CA-intervention groups compared with the control groups.

Table 3. Subgroup analyses for overall attrition rate in conversational agent (CA)–delivered mental health interventions. For each subgroup, values are the number of interventions (n); the attrition rate, % (95% CI); and I² (%), reported separately for short-term (≤8 wk) and long-term (>8 wk) studies; P values are from the subgroup comparison tests.

Risk of bias (short term: P=.21; long term: P=.22)
  High risk of bias: short term n=8, 26.18 (15.69-38.13), I²=89.6; long term n=4, 22.31 (17.85-27.10), I²=0
  Low risk of bias: short term n=18, 14.76 (4.27-29.25), I²=95.6; long term n=11, 28.35 (20.25-37.20), I²=95.4

Funding source (short term: P=.71; long term: P=.38)
  Industry funding: short term n=9, 20.43 (8.78-35.06), I²=92.4; long term n=4, 32.36 (18.14-48.46), I²=97.0
  Public funding only: short term n=17, 16.56 (6.35-29.79), I²=95.1; long term n=11, 23.19 (15.92-33.65), I²=92.1

Duration (wk) (short term: P=.43; long term: P=.68)
  0-4: short term n=17, 15.57 (5.10-29.92), I²=95.3; long term N/A^a
  5-8: short term n=9, 23.15 (11.13-37.67), I²=93.4; long term N/A
  9-12: short term N/A; long term n=11, 25.30 (19.35-31.75), I²=90.3
  >13: short term N/A; long term n=4, 30.04 (10.75-53.88), I²=96.1

Study design (short term: P=.65; long term: P=.19)
  RCT^b: short term n=19, 19.68 (10.34-30.95), I²=95.2; long term n=14, 26.00 (19.33-33.27), I²=94.3
  Pilot RCT: short term n=7, 12.73 (0-38.45), I²=91.8; long term n=1, N/A, I²=N/A

Type of disorder (short term: P=.80; long term: P=.32)
  Depression: short term n=6, 17.89 (5.44-34.60), I²=91.5; long term n=11, 24.03 (17.74-30.93), I²=91.4
  Depression and anxiety: short term n=5, 7.68 (0-40.93), I²=97.1; long term N/A
  Mental well-being: short term n=9, 24.29 (8.37-44.57), I²=94.2; long term N/A
  Other^c: short term n=4, 20.02 (10.61-31.36), I²=81.9; long term n=3, 33.77 (17.01-52.90), I²=94.9

With CBT^d (short term: P=.41; long term: P=.65)
  Yes: short term n=17, 21.35 (11.26-33.41), I²=95.1; long term n=12, 27.51 (20.16-35.31), I²=94.5
  No: short term n=9, 11.64 (0.21-33.29), I²=93.8; long term n=3, 22.99 (7.94-42.72), I²=92

With mindfulness component (short term: P=.02; long term: P=.27)
  Yes: short term n=12, 30.24 (17.02-45.27), I²=95.5; long term n=11, 23.89 (17.67-30.71), I²=91.5
  No: short term n=14, 8.66 (0.89-21.20), I²=92.8; long term n=4, 34.35 (18.00-52.84), I²=93.8

Personalization (short term: P=.63; long term: P=.53)
  No personalization: short term n=6, 17.04 (5.08-33.33), I²=84.7; long term n=2, 38.10 (10.38-70.94), I²=97.1
  Minimal personalization: short term n=2, 44.57 (2.50-92.50), I²=96.0; long term N/A
  Substantial personalization: short term n=9, 21.56 (12.52-32.12), I²=89.0; long term n=12, 25.25 (19.16-31.85), I²=91.3
  Major personalization: short term n=9, 10.98 (0-35.05), I²=96.5; long term n=1, 19.51 (8.61-33.23), I²=N/A

CA algorithm (short term: P=.42; long term: P=.49)
  Rule based: short term n=16, 21.53 (11.72-33.12), I²=93.1; long term n=13, 25.02 (19.30-31.19), I²=90.6
  AI^e enhanced: short term n=10, 13.25 (1.29-32.79), I²=96.2; long term n=2, 37.16 (8.05-72.63), I²=95

Type of CA (short term: P=.14; long term: P=.16)
  No avatar: short term n=3, 33.36 (15.39-54.24), I²=94.36; long term n=12, 29.21 (21.81-37.19), I²=94.6
  ECA^f: short term n=6, 8.03 (1.37-18.04), I²=59; long term n=2, 15.35 (5.05-29.63), I²=80.3
  Avatar: short term n=13, 20.13 (6.67-37.97), I²=96.4; long term n=1, 19.51 (8.61-33.23), I²=N/A
  Not specified: short term n=4, 15.15 (1.79-35.93), I²=86.4; long term N/A

With rewards component (short term: P=.74; long term: P<.001^g)
  Yes: short term n=15, 16.63 (9.93-27.41), I²=92.4; long term n=1, 54.83 (49.60-60.00), I²=89.9
  No: short term n=11, 20.07 (5.77-39.39), I²=96; long term n=14, 24.68 (19.20-30.59)

Reminder (short term: P=.20; long term: P=.98)
  With reminder: short term n=14, 23.96 (12.19-37.99), I²=94.8; long term n=7, 26.42 (14.80-39.96), I²=95.7
  Without reminder: short term n=12, 11.43 (1.76-26.27), I²=94.7; long term n=8, 26.51 (17.85-36.16), I²=92.1

Delivery channel (short term: P=.02^g; long term: P<.001^g)
  Web based: short term n=4, 30.74 (15.26-48.70), I²=91.5; long term n=11, 26.96 (20.73-33.66), I²=90.8
  Computer- or laptop computer– or tablet computer–embedded program: short term n=6, 4.72 (0.06-13.49), I²=61.5; long term n=1, 9.33 (3.62-17.12), I²=N/A
  Smartphone app: short term n=10, 14.32 (5.71-25.56), I²=87.5; long term n=1, 54.83 (49.60-60.00), I²=N/A
  Messaging app based (Slack, Facebook Messenger, or WeChat based): short term n=6, 33.27 (10.58-60.80), I²=96.9; long term n=1, 19.51 (8.61-33.23), I²=N/A
  Not specified: short term N/A; long term n=1, 22.11 (14.27-31.06), I²=N/A

With blended component (short term: P=.38; long term: P=.02)
  Yes: short term n=3, 19.12 (10.38-29.56), I²=95.1; long term n=6, 18.65 (12.86-25.20), I²=75.4
  No: short term n=23, 9.89 (0.35-26.31), I²=38.8; long term n=9, 32.54 (23.01-42.84), I²=94.9

Enrollment method (short term: P=.15; long term: P=.12)
  Remote options only: short term n=14, 26.63 (15.19-39.82), I²=95.4; long term n=9, 30.18 (20.38-40.95), I²=95.6
  With in-person option: short term n=10, 9.68 (0.15-27.14), I²=92; long term n=6, 21.28 (16.69-26.26), I²=53.7
  Not specified: short term n=2, 5.02 (0-33.08), I²=88; long term N/A

Study population (short term: P=.05^g; long term: P=.41)
  At risk: short term n=11, 19.92 (1.53-29.81), I²=90.1; long term n=7, 30.10 (17.09-44.93), I²=96.4
  Clinical: short term n=4, 3.53 (0-13.05), I²=36; long term n=8, 23.87 (18.55-29.61), I²=76.5
  General: short term n=11, 22.05 (6.73-42.30), I²=96.1; long term N/A

Session length (short term: P=.61; long term: P=.34)
  Defined session length: short term n=15, 20.26 (9.93-32.79), I²=94.1; long term n=14, 27.05 (20.28-34.38), I²=94.3
  User determined: short term n=11, 15.31 (3.17-33.12), I²=95.4; long term n=1, 19.51 (8.61-33.23), I²=N/A

With symptom trackers component (short term: P=.99; long term: P=.003)
  Yes: short term n=14, 17.77 (6.44-32.52), I²=93.9; long term n=6, 16.36 (10.32-23.39), I²=64.1
  No: short term n=12, 18.22 (7.21-32.44), I²=95.3; long term n=9, 33.48 (24.91-42.62), I²=95.4

^a N/A: not applicable.

^b RCT: randomized controlled trial.

^c Anxiety only, panic disorder, height phobia, gambling disorder, substance abuse, attention-deficit/hyperactivity disorder, irritable bowel syndrome, eating disorder, and personality disorder.

^d CBT: cognitive behavioral therapy.

^e AI: artificial intelligence.

^f ECA: embodied CA.

^g Subgroup analyses were not significant after dropping subgroups with <5 studies.

For the overall attrition rate, there were significant differences in the attrition rates in short-term studies depending on the inclusion of mindfulness content (χ²₁=5.1; P=.02). Interventions that included mindfulness content (n=12) showed a higher attrition rate in the intervention group (30.24%, 95% CI 17.02%-42.27%) than interventions without mindfulness content (n=14; 8.66%, 95% CI 0.89%-21.2%). There were also significant differences depending on the population type, delivery channel, and type of disorder; however, these differences were not significant after excluding subgroups with <5 studies.

Subgroup analysis of the differential attrition rates showed that attrition in the intervention group differed significantly from that in the control group depending on the study duration subgroup (χ²₁=5.8; P=.02), the study population (χ²₂=9.3; P=.009), and the type of control (χ²₁=4.7; P=.03). The relative risk of attrition in studies that lasted between 5 and 8 weeks (n=9; log OR 1.61, 95% CI 1.22-2.13) was significantly different from that in studies that lasted <5 weeks (n=17; log OR 0.99, 95% CI 0.75-1.31). Studies that recruited populations considered to be at risk (n=9) showed significantly higher attrition rates in the intervention group than in the control group (log OR 1.65, 95% CI 1.26-2.15) when compared with general populations (n=7; log OR 0.96, 95% CI 0.71-1.30) and clinical populations (n=3; log OR 0.47, 95% CI 0.13-1.66). The subgroup analysis remained significant when only the general population and the group considered to be at risk were compared (χ²₁=6.9; P=.03). Finally, attrition rates in the intervention groups were higher relative to the control groups in studies that used waitlist controls (n=11; log OR 1.52, 95% CI 1.18-1.95) than in those that used active controls (n=7; log OR 0.96, 95% CI 0.69-1.54). Only 1 (2%) of the 41 studies used treatment as usual as the control group [67]. No other comparisons were significant.

Long-Term Studies (>8 Wk)
Overview

The weighted attrition rate for the intervention group in long-term studies was 26.59% (95% CI 20.09%-33.63%), and there was evidence of high trial heterogeneity (I²=93.89%, 95% CI 91.43%-95.64%). The lowest attrition rate was 6% (95% CI 1.44%-16.84%) [51], and the highest was 54.83% (95% CI 49.61%-59.95%) [77].
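Heterogeneity throughout these analyses is summarized with I², which is derived from Cochran's Q statistic across the k pooled studies as I² = max(0, (Q − df)/Q) × 100, with df = k − 1. A minimal sketch with illustrative inputs (not the review's data), using fixed-effect inverse-variance weights:

```python
def i_squared(effects, ses):
    """I² heterogeneity statistic (in %) computed from Cochran's Q."""
    w = [1 / s ** 2 for s in ses]  # inverse-variance weights
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

# Widely scattered effects with tight SEs give a high I²
i2 = i_squared([0.1, 1.8, 0.5, 2.5, 0.9], [0.1, 0.1, 0.1, 0.1, 0.1])
```

Values near 0% indicate that the observed spread is consistent with sampling error alone, whereas values above roughly 75%, as in the long-term studies here, indicate substantial between-study heterogeneity.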

The differential attrition rate differed from 0% (log OR 1.33, 95% CI 1.08-1.65), indicating that the attrition rates in the intervention group were higher than those in the control group. The heterogeneity was moderate (I²=49.43%, 95% CI 0.083%-72.11%). However, 1 (2%) of the 41 studies was identified as a potential outlier [50]. Removing this study from the model substantially reduced the I² value (I²=24.22%, 95% CI 0%-59.80%), and the differential attrition rate still differed from 0% (log OR 1.22, 95% CI 1.05-1.42); again, this indicated that the attrition rates in the intervention group were significantly larger than those in the control group even after removing the outlying study. The outlying study [50] used a weighted randomization method in which 80% of the participants were allocated to the intervention group. The subgroup analyses were conducted without the outlier because the study seemed to explain >20% of the heterogeneity in the model. However, sensitivity analyses conducted with and without the outlying study did not change the results of the subgroup analysis.

Publication Bias in Long-Term Studies (>8 Wk)

For the overall attrition rate, the funnel plot indicated evidence of plot asymmetry, but the Egger test was not significant (intercept −0.79, 95% CI −4.34 to 2.75; P=.67), suggesting a low likelihood of publication bias. For the differential attrition rate, the funnel plot indicated evidence of plot symmetry, and the Egger test was not significant (intercept 0.46, 95% CI −0.66 to 1.58; P=.43; Figure 5 [14,15,47-85]).

Figure 5. Funnel plots and the Egger test for publication bias for (A) overall attrition and (B) differential attrition in meta-analyses of the long-term studies.
Subgroup Analyses of the Attrition Rates in Long-Term Studies (>8 Wk)

For overall attrition, there were significant differences in the attrition rates in the intervention groups of long-term studies depending on whether the study had a blended design (χ²₁=4.7; P=.03) and whether it included symptom trackers or mood monitoring (χ²₁=9.0; P=.003). Interventions with blended designs (n=6) showed significantly lower attrition rates (20.41%, 95% CI 15.49%-25.81%) than those without (n=9; 33.3%, 95% CI 23.01%-44.44%). Interventions that included symptom trackers (n=6) showed significantly lower attrition rates (16.36%, 95% CI 10.32%-23.39%) than those without (n=9; 33.48%, 95% CI 24.91%-42.62%; Table 3). No other comparisons were significant.

The differential attrition rates were significantly different between studies that included mindfulness content and those without (χ²₁=5.0; P=.03) and between studies that targeted depression symptoms and those that targeted other mental health symptoms (χ²₁=8.6; P=.003). Studies without mindfulness content showed higher attrition rates in the intervention groups than in the controls (n=10; log OR 1.56, 95% CI 1.25-1.96) compared with studies that included mindfulness content (n=4; log OR 1.11, 95% CI 1.05-1.42). Studies that targeted depression symptoms specifically showed relatively similar attrition rates in the intervention and control groups (n=10; log OR 1.09, 95% CI 0.96-1.22) compared with studies that targeted other mental health symptoms such as gambling disorder and substance abuse (n=4; log OR 1.63, 95% CI 1.28-2.08). No other comparisons were significant.

Exploratory Subgroup Analyses

The overall weighted attrition rate for the intervention group in all included studies was 21.84% (95% CI 16.74%-27.36%; I²=94%). Exploratory subgroup analyses using the prespecified subgroups were conducted to explain the heterogeneity of the included studies regardless of the duration of the interventions. We have included the full findings in Multimedia Appendix 7 and only highlight findings that differed from our prespecified analysis here. For overall attrition, as in our prespecified analysis, there were significant differences in the attrition rates in the intervention groups depending on the inclusion of mindfulness content (χ²₁=4.0; P=.05). However, we did not find significant differences between blended and nonblended interventions or between interventions with and without a symptom tracker. In addition, we found significant differences depending on the type of CA used (χ²₃=13.1; P=.005), the CA delivery channel (χ²₄=21.3; P<.001), and the study enrollment method (χ²₂=7.4; P=.02). Studies that did not use any identifiable avatar reported the highest rate of attrition (n=15; 30%, 95% CI 23.44%-37.01%), followed by studies that did not specify the use of an avatar or a visual representation of the CA (n=14; 20.12%, 95% CI 7.29%-36.82%), studies that used a static avatar (n=4; 15.15%, 95% CI 1.79%-35.93%), and studies that used an ECA (n=8; 10.3%, 95% CI 4.29%-18.04%). Interventions delivered via messaging apps (eg, Slack, Facebook Messenger, or WeChat) showed the highest rate of attrition (n=7; 31.19%, 95% CI 10.68%-56.28%), followed by those delivered via web-based applications (n=15; 27.9%, 95% CI 22.35%-33.78%) and those delivered via stand-alone smartphone apps (n=11; 17.36%, 95% CI 6.54%-31.48%). CAs installed on a computer, a laptop computer, or a tablet computer showed the lowest rate of attrition (n=7; 5.61%, 95% CI 1.09%-12.3%). Finally, studies that offered remote onboarding only (n=23) showed a higher attrition rate (28.42%, 95% CI 21.30%-36.1%) than studies that offered an in-person onboarding option (n=16; 15.01%, 95% CI 8.46%-22.82%).

The differential attrition rate differed from 0% (log OR 1.28, 95% CI 1.10-1.48; I²=34.6%), indicating that the participants who received CA interventions were more likely to attrit than the control group participants.

For differential attrition, our findings mostly concurred with our prespecified analysis. Unlike in the prespecified analysis, there was a significant difference for studies that included symptom trackers (χ²₁=5.0; P=.02). Studies that included symptom trackers (n=17) showed relatively lower attrition in the intervention group than in the control group (log OR 1.02, 95% CI 0.81-1.29) compared with studies without symptom trackers (n=18; log OR 1.44, 95% CI 1.19-1.74).

Additional Factors Associated With Attrition

Of the 41 studies, 16 (39%) assessed the association of different study features with participants’ attrition (short-term studies: n=8, 50%; long-term studies: n=8, 50%). We grouped the findings into two categories: (1) demographic predictors and (2) baseline measurement predictors or symptom severity.

The associations between participants’ demographics and attrition were assessed and reported in 10 (63%) of the 16 studies (short-term studies: n=4, 40%; long-term studies: n=6, 60%). Only 1 (25%) of the 4 short-term studies found age to be significantly associated with attrition: participants who dropped out were significantly younger than those who completed the whole intervention [62]. The other demographic factors assessed in these 10 studies, including sex, years of education, marital status, employment, actively receiving therapy or medication, and current diagnosis, were not associated with attrition in either short-term or long-term studies.

Of the 41 studies, 12 (29%) explored the association between baseline predictors or symptom severity and attrition (short-term studies: n=6, 50%; long-term studies: n=6, 50%). More severe baseline symptoms were associated with attrition in some of the short-term studies (2/6, 33%) but not in the long-term studies. Higher anxiety-related symptoms measured using the General Anxiety Disorder-7 questionnaire [62] and the Visceral Sensitivity Index [68] were significantly related to attrition. Other factors found to be associated with higher attrition included lower quality of life measured using the World Health Organization Quality of Life Brief Questionnaire [52], higher Fear of Food Questionnaire score [68], higher severity of gambling pathology measured using the Pathological Gambling Adaptation of the Yale-Brown Obsessive-Compulsive Scale [62], and lower self-esteem [81]. Of the 10 studies, 1 (10%) reported that participants who attrited had significantly greater positive affect, measured using the Positive and Negative Affect Schedule, than those who completed the study [77].


Principal Findings

To our knowledge, this systematic review and meta-analysis is the first to examine attrition in RCTs of CA interventions for mental health. A total of 41 RCTs met the inclusion criteria. Our findings showed that approximately one-fifth of the participants (18.05%) dropped out in short-term studies, and approximately a quarter (26.59%) dropped out in long-term studies. Participants who received CA interventions were more likely to attrit than the control group participants for both long-term and short-term studies. Several study-level moderators were identified. For short-term studies, higher overall attrition rates were found in intervention groups that included mindfulness content and those that included participants from the general population. Compared with the control group participants, participants in the short-term CA interventions were also more likely to attrit in studies that lasted >1 month, those that included participants considered to be at risk, and studies in which intervention group participants were compared against waitlist controls as opposed to alternative active controls. For long-term studies, higher overall attrition rates were found in interventions that did not include human support and studies that did not include a symptom tracker. Exploratory subgroup analyses conducted on all included studies regardless of the study duration largely supported the analysis except for the inclusion of blended support. In addition, exploratory analyses found that studies that used an ECA, delivered via a computer or smartphones, and provided in-person enrollment options were associated with lower attrition rates.

Comparison With Prior Research

Overall Attrition

The overall attrition rates for short-term and long-term studies in our review are lower than the attrition estimates of short-term and long-term studies on smartphone-delivered interventions (24.1% and 35.5%, respectively) [13] and electronic mental health treatments for eating disorders (21%) [87]. Our findings are comparable with attrition rates in studies evaluating face-to-face mental health treatments such as interpersonal psychotherapy (20.6%) [88] and individual psychotherapy for depressive disorders (19.9%) [89] and generalized anxiety disorders (17%) [90]. When focusing only on studies evaluating smartphone-delivered CAs in our review (n=11), we found that only 14.32% of the participants dropped out of the short-term studies and 17.36% dropped out of all included studies; these rates are lower than the estimated attrition rate previously reported for smartphone-delivered mental health interventions [13]. Taken together, these findings suggest that the delivery channel may indirectly influence study attrition [13]. Although we found lower attrition in studies delivered via programs installed on a computer or a laptop computer than in those using other delivery channels, these studies [48,58,59,63,79] were conducted in a laboratory setting rather than in the participants’ own environment, which might have influenced the retention rate.

Factors Associated With Attrition

CA interventions used as adjuncts to psychotherapy sessions or with human support may aid in retaining participants, particularly in long-term studies. A closer look at the long-term studies that included human support revealed that several of these studies (6/15, 40%) used the CA interventions as an adjunctive tool, with no specific instruction given to the primary therapist on how to support participants’ journey through the CA interventions. This suggests that the presence of the therapist alone could suffice to retain participation in the study over a longer period. Although participants may have stayed for the primary therapist without engaging with the CA intervention, a study with both therapist-guided and unguided groups found no significant differences in the time spent with the CA intervention [51]. It is also possible that participants consulted their primary therapist about the CA interventions, which was not reported by the studies. A recent scoping review reported that many studies that included human support did not consistently report the type of support provided [91]. We observed this in our study as well because none of the included studies mentioned whether the therapist discussed the CA intervention during the participants’ routine sessions. Furthermore, it is also possible that participants within clinical settings are more likely to stay with the intervention, as was found in our results, unrelated to the blended support provided. Therefore, it is difficult to draw conclusions regarding the impact of using CA interventions adjunctively on retaining participants within a study. This finding should also be taken with caution because we found no significant differences in the attrition rates between studies with blended and nonblended approaches when we included all studies regardless of the study duration, possibly because few studies (9/41, 22%) included a blended approach overall.

In terms of the intervention content, short-term studies that included mindfulness content had higher overall attrition rates than those without mindfulness content. This was contrary to 2 previous meta-analyses that showed that the inclusion of the mindfulness component had no impact on attrition rates [13,28]. However, the attrition rate was similar to that in a systematic review of self-guided mindfulness interventions that reported an unweighted mean attrition rate of 31% (range 0%-79%) [92]. Future interventions may need to pay closer attention to participants’ engagement when including mindfulness content as part of a CA intervention. Factors such as using symptom trackers and personalized feedback may increase the engagement rate [93]. This is aligned with our findings and prior meta-analyses that suggest that including feedback may improve the retention rate [28].

Interestingly, our results also found relatively lower differential attrition rates in the intervention groups of long-term studies that included mindfulness compared with studies without. However, this finding was not replicated when we analyzed all studies regardless of the study duration. A recent review suggested that longer mindfulness practice sessions may be associated with the development of mindfulness skills [92]. Therefore, this result should be interpreted with caution because the relationship between the frequency and the duration of mindfulness practice sessions is still unclear [92].

Our study also found that including any form of visual representation of a CA may be associated with lower attrition rates than no visualization at all. This is aligned with many studies on the design of CAs that stressed the importance of visual design in creating positive perceptions of the CA [94]. However, a recent scoping review reported that the visual representation of the CA showed mixed or no association with subjective user experience [95]. A closer look at the studies included in that review suggested that most of them (35/41, 85%) lasted only 1 session [95]. It is possible that the absence of any CA visualization affects user experience over time, as alluded to by our findings. Future studies should explore the relationship between CA visual representation, user experience, and study adherence over a longer duration.

In terms of the study sample population, sample populations considered to be at risk were more likely to attrit than samples from general and clinical sample populations. However, other meta-analyses of digital interventions for mental health issues found no difference in attrition rates across sample populations [13,28]. This finding is difficult to interpret because there could be multiple factors that may affect this relationship, such as symptom severity and other demographic factors not included in our analysis. Future studies may explore this relationship further to better understand this association.

Factors Not Associated With Attrition

Providing monetary incentives did not affect the attrition rates significantly. This finding is similar to that of a previous meta-analysis focusing on smartphone apps specifically for depressive symptoms [28] but contrary to that of a study focusing on smartphone interventions for mental health in general [13]. Time spent within the study may be a greater driver for attrition, as can be seen in the higher attrition rates for longer-term studies found in our results. However, research on the impact of monetary incentives on participants’ retention in digital health interventions is still in its infancy [96]. More needs to be done to understand how monetary incentives affect participants’ retention as well as effective engagement in the intervention.

Finally, several features such as providing reminders and the level of personalization provided by the CA also did not affect attrition rates. This is noteworthy because a personalized intervention that is responsive to users’ inputs is related to better engagement with the intervention [93]. There may be further nuances between attrition and effective engagement; for instance, factors that lower the risk of attrition might not be directly related to factors that promote study adherence [28].

Strengths and Limitations

We conducted a comprehensive literature search that included both peer-reviewed databases and gray literature sources; in addition, we conducted backward and forward citation searches. As this is a nascent area, we prioritized the sensitivity of our search to capture the various representations of CAs.

However, our study has some limitations. First, some unpublished literature presented at niche conferences and meetings may have been omitted. In addition, some studies might have escaped our search strategy, as evidenced by the inclusion of Deprexis, the Deprexis-based interventions, and the ePST intervention, which did not explicitly mention terms related to conversational agents. Second, the heterogeneity of the mental health conditions addressed by the CA interventions made it difficult to generalize the findings to a specific disorder. The recommendations provided here should be taken as general suggestions to improve retention rates in CA interventions for mental health rather than for a specific mental health disorder. Third, our results indicated possible publication bias in short-term studies. A closer look at the funnel plot suggested a lack of small studies (n<20) with high attrition rates in the intervention groups. It is possible that these studies were not published because they were too experimental and small in scale; alternatively, their findings may have been reported at niche conferences or through internal sharing and were therefore missed by our search strategy. Fourth, our results may not directly translate into understanding the factors that may increase engagement with the interventions. Although we recognize that engagement is interlinked with attrition [97], the lack of consensus on and reporting of engagement metrics limits our understanding of this relationship. A recent review identified several patient-, intervention-, and system-level factors associated with engagement [98]. However, many of these associations were not consistent across different digital mental health interventions, and the reporting of the metrics was inconsistent. We echo others’ recommendations to include and standardize the reporting of engagement metrics to better understand the indicators of nonuse attrition [41,93,97,98]. Our findings can inform future researchers of the potential factors for attrition in CA interventions and may serve as a basis for making informed decisions on the required sample size or for conducting further studies on the specific mechanisms that may or may not motivate attrition. Fifth and last, our subgroup analyses for all studies regardless of the intervention duration were exploratory post hoc analyses and should be interpreted as such.

Implications and Recommendations

Several implications and recommendations emerged from our findings. First, researchers may want to account for the attrition of approximately one-third of the participants when designing RCTs involving CA interventions. This number may need to be further adjusted depending on the sample population, delivery modes, and comparison group used in the intervention to minimize the potential threats to the external and internal validity of studies that evaluate the efficacy of CA interventions for mental health. Second, researchers may want to consider including active controls in RCTs. Our results and the findings from other similar reviews on attrition in digital health research [13,28] suggest that comparing digital interventions with waitlist controls might not be ideal because participants in the comparison group may be more motivated to remain in the study than those in the intervention group [13]. Control interventions consisting of periodic mood assessments via an app or a nonconversational version of the app may be more appropriate for the assessment of CA effectiveness. Third, the inclusion of a visual representation of the CA may help create a more positive perception of the CA and reduce attrition. A recent review suggested that design considerations such as a humanlike representation and medical attire for the CA may help reduce attrition [95]. Fourth, CA interventions could be offered as adjuncts to ongoing therapy sessions. Although the association between attrition and the use of blended support may be inconclusive, the use of CA interventions may further enrich participants’ experience between sessions and provide ongoing support to practice the skills learned during regular sessions. Fifth and last, clinicians interested in implementing CA interventions in their practice should be aware of the high attrition rate and should closely monitor patients’ progress.

Conclusions

According to our findings, at least one-fifth of the intervention group participants in RCTs of CA interventions will drop out of the trials within the intervention period. This attrition rate is comparable with that of face-to-face mental health interventions and lower than that of other digital health interventions. Furthermore, intervention group participants were more likely to drop out than control group participants. Relative to the control groups, attrition in the intervention groups was higher in longer studies, in studies that included participants considered to be at risk, and in studies in which intervention group participants were compared against waitlist controls as opposed to alternative active controls. In addition, not including mindfulness content and including symptom trackers were found to be associated with a smaller risk of attrition. Future studies may benefit from delivering CA interventions in a blended setting, with symptom screening; comparing the CA interventions against active controls such as symptom tracking only without the CA component; and including a visual representation of the CA to reduce attrition rates in the intervention group.

Acknowledgments

The authors would like to acknowledge Ms Yasmin Ally, Lee Kong Chian School of Medicine librarian, for her assistance in translating and executing the search strategy. The authors would also like to acknowledge Ms Nileshni Fernando for her assistance in the data extraction and risk-of-bias assessment. This research was supported by the Singapore Ministry of Education under the Singapore Ministry of Education Academic Research Fund Tier 1 (RG36/20). The research was conducted as part of the Future Health Technologies program, which was established collaboratively between ETH Zürich and the National Research Foundation, Singapore. This research was also supported by the National Research Foundation, Prime Minister’s Office, Singapore, under its Campus for Research Excellence and Technological Enterprise program.

Data Availability

This systematic review and meta-analysis does not contain primary data. The data sets generated and analyzed during this study are available from the corresponding author upon reasonable request.

Authors' Contributions

LTC conceptualized the study and provided supervision at all steps of the research. LTC and AIJ designed the study. AIJ, XL, and GS extracted the data. AIJ conducted the analysis and wrote the original manuscript. LTC, LM, XL, GS, and YLT provided a critical review of the manuscript. All authors approved the final version of the manuscript and take accountability for all aspects of the work.

Conflicts of Interest

LTC is an associate editor of JMIR Medical Education. The other authors reported no conflict of interest.

Multimedia Appendix 1

PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 checklist.

PDF File (Adobe PDF File), 198 KB

Multimedia Appendix 2

Criteria for study selection and extended meta-analysis method.

DOCX File, 51 KB

Multimedia Appendix 3

Search strategy for PubMed.

DOCX File, 22 KB

Multimedia Appendix 4

Tables of excluded studies.

DOCX File, 184 KB

Multimedia Appendix 5

Risk-of-bias assessment of the included studies.

DOCX File, 242 KB

Multimedia Appendix 6

Differential attrition analysis for the short-term and long-term studies.

DOCX File, 87 KB

Multimedia Appendix 7

Extended results for the exploratory subgroup analysis.

DOCX File, 50 KB

  1. GBD 2019 Mental Disorders Collaborators. Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Psychiatry. Feb 2022;9(2):137-150. [FREE Full text] [CrossRef] [Medline]
  2. Vigo D, Jones L, Atun R, Thornicroft G. The true global disease burden of mental illness: still elusive. Lancet Psychiatry. Feb 2022;9(2):98-100. [CrossRef] [Medline]
  3. Silverman AL, Teachman BA. The relationship between access to mental health resources and use of preferred effective mental health treatment. J Clin Psychol. Jun 2022;78(6):1020-1045. [CrossRef] [Medline]
  4. Thornicroft G, Chatterji S, Evans-Lacko S, Gruber M, Sampson N, Aguilar-Gaxiola S, et al. Undertreatment of people with major depressive disorder in 21 countries. Br J Psychiatry. Feb 2017;210(2):119-124. [FREE Full text] [CrossRef] [Medline]
  5. Chong SA, Abdin E, Vaingankar JA, Kwok KW, Subramaniam M. Where do people with mental disorders in Singapore go to for help? Ann Acad Med Singap. Apr 2012;41(4):154-160. [FREE Full text] [Medline]
  6. Gulliver A, Griffiths KM, Christensen H. Perceived barriers and facilitators to mental health help-seeking in young people: a systematic review. BMC Psychiatry. Dec 30, 2010;10:113. [FREE Full text] [CrossRef] [Medline]
  7. Ku BS, Li J, Lally C, Compton MT, Druss BG. Associations between mental health shortage areas and county-level suicide rates among adults aged 25 and older in the USA, 2010 to 2018. Gen Hosp Psychiatry. May 2021;70:44-50. [FREE Full text] [CrossRef] [Medline]
  8. Oladeji BD, Gureje O. Brain drain: a challenge to global mental health. BJPsych Int. Aug 2016;13(3):61-63. [FREE Full text] [CrossRef] [Medline]
  9. Clement S, Schauman O, Graham T, Maggioni F, Evans-Lacko S, Bezborodovs N, et al. What is the impact of mental health-related stigma on help-seeking? A systematic review of quantitative and qualitative studies. Psychol Med. Jan 2015;45(1):11-27. [CrossRef] [Medline]
  10. Weisel KK, Fuhrmann LM, Berking M, Baumeister H, Cuijpers P, Ebert DD. Standalone smartphone apps for mental health-a systematic review and meta-analysis. NPJ Digit Med. 2019;2:118. [CrossRef] [Medline]
  11. Richards D, Richardson T. Computer-based psychological treatments for depression: a systematic review and meta-analysis. Clin Psychol Rev. Jun 2012;32(4):329-342. [CrossRef] [Medline]
  12. Linardon J, Hindle A, Brennan L. Dropout from cognitive-behavioral therapy for eating disorders: a meta-analysis of randomized, controlled trials. Int J Eat Disord. May 01, 2018;51(5):381-391. [CrossRef] [Medline]
  13. Linardon J, Fuller-Tyszkiewicz M. Attrition and adherence in smartphone-delivered interventions for mental health problems: a systematic and meta-analytic review. J Consult Clin Psychol. Jan 2020;88(1):1-13. [CrossRef] [Medline]
  14. Prochaska JJ, Vogel EA, Chieng A, Kendra M, Baiocchi M, Pajarito S, et al. A therapeutic relational agent for reducing problematic substance use (Woebot): development and usability study. J Med Internet Res. Mar 23, 2021;23(3):e24850. [FREE Full text] [CrossRef] [Medline]
  15. Fitzpatrick KK, Darcy A, Vierhile M. Delivering cognitive behavior therapy to young adults with symptoms of depression and anxiety using a fully automated conversational agent (Woebot): a randomized controlled trial. JMIR Ment Health. Jun 06, 2017;4(2):e19. [FREE Full text] [CrossRef] [Medline]
  16. Inkster B, Sarda S, Subramanian V. An empathy-driven, conversational artificial intelligence agent (Wysa) for digital mental well-being: real-world data evaluation mixed-methods study. JMIR Mhealth Uhealth. Nov 23, 2018;6(11):e12106. [FREE Full text] [CrossRef] [Medline]
  17. Tudor Car L, Dhinagaran DA, Kyaw BM, Kowatsch T, Joty S, Theng YL, et al. Conversational agents in health care: scoping review and conceptual analysis. J Med Internet Res. Aug 07, 2020;22(8):e17158. [FREE Full text] [CrossRef] [Medline]
  18. Bendig E, Erb B, Schulze-Thuesing L, Baumeister H. The next generation: chatbots in clinical psychology and psychotherapy to foster mental health – a scoping review. Verhaltenstherapie. Aug 20, 2019;32(Suppl. 1):64-76. [CrossRef]
  19. Lim SM, Shiau CW, Cheng LJ, Lau Y. Chatbot-delivered psychotherapy for adults with depressive and anxiety symptoms: a systematic review and meta-regression. Behav Ther. Mar 2022;53(2):334-347. [CrossRef] [Medline]
  20. Beilharz F, Sukunesan S, Rossell SL, Kulkarni J, Sharp G. Development of a positive body image chatbot (KIT) with young people and parents/carers: qualitative focus group study. J Med Internet Res. Jun 16, 2021;23(6):e27807. [FREE Full text] [CrossRef] [Medline]
  21. Salamanca-Sanabria A, Jabir AI, Lin X, Alattas A, Kocaballi AB, Lee J, et al. Exploring the perceptions of mHealth interventions for the prevention of common mental disorders in university students in Singapore: qualitative study. J Med Internet Res. Mar 20, 2023;25:e44542. [FREE Full text] [CrossRef] [Medline]
  22. Scholten MR, Kelders SM, Van Gemert-Pijnen JE. Self-guided web-based interventions: scoping review on user needs and the potential of embodied conversational agents to address them. J Med Internet Res. Nov 16, 2017;19(11):e383. [FREE Full text] [CrossRef] [Medline]
  23. Nißen M, Rüegger D, Stieger M, Flückiger C, Allemand M, V Wangenheim F, et al. The effects of health care chatbot personas with different social roles on the client-chatbot bond and usage intentions: development of a design codebook and web-based study. J Med Internet Res. Apr 27, 2022;24(4):e32630. [FREE Full text] [CrossRef] [Medline]
  24. Miner AS, Milstein A, Hancock JT. Talking to machines about personal mental health problems. JAMA. Oct 03, 2017;318(13):1217-1218. [CrossRef] [Medline]
  25. Eysenbach G. The law of attrition. J Med Internet Res. Mar 31, 2005;7(1):e11. [CrossRef] [Medline]
  26. What Works Clearinghouse™ standards handbook, version 4.1. Institute of Education Sciences. URL: https://ies.ed.gov/ncee/wwc/Docs/referenceresources/WWC-Standards-Handbook-v4-1-508.pdf [accessed 2023-12-07]
  27. Li P, Stuart EA, Allison DB. Multiple imputation: a flexible tool for handling missing data. JAMA. Nov 10, 2015;314(18):1966-1967. [FREE Full text] [CrossRef] [Medline]
  28. Torous J, Lipschitz J, Ng M, Firth J. Dropout rates in clinical trials of smartphone apps for depressive symptoms: a systematic review and meta-analysis. J Affect Disord. Feb 15, 2020;263:413-419. [CrossRef] [Medline]
  29. Torous J, Nicholas J, Larsen ME, Firth J, Christensen H. Clinical review of user engagement with mental health smartphone apps: evidence, theory and improvements. Evid Based Ment Health. Aug 2018;21(3):116-119. [FREE Full text] [CrossRef] [Medline]
  30. Bendig E, Erb B, Meißner D, Bauereiß N, Baumeister H. Feasibility of a software agent providing a brief intervention for self-help to uplift psychological wellbeing ("SISU"). A single-group pretest-posttest trial investigating the potential of SISU to act as therapeutic agent. Internet Interv. Feb 24, 2021;24:100377. [FREE Full text] [CrossRef] [Medline]
  31. Sanders GJ, Cooke C, Gately P. Exploring reasons for attrition among vulnerable and under-served sub-groups across an online integrated healthy lifestyles service during COVID-19. SAGE Open Med. Oct 22, 2021;9:20503121211054362. [FREE Full text] [CrossRef] [Medline]
  32. Goldberg SB, Bolt DM, Davidson RJ. Data missing not at random in mobile health research: assessment of the problem and a case for sensitivity analyses. J Med Internet Res. Jun 15, 2021;23(6):e26749. [FREE Full text] [CrossRef] [Medline]
  33. Higgins JP, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, et al. Cochrane Handbook for Systematic Reviews of Interventions version 6.3. London, UK. The Cochrane Collaboration; 2022.
  34. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. Jul 21, 2009;6(7):e1000097. [FREE Full text] [CrossRef] [Medline]
  35. Westgate MJ. revtools: an R package to support article screening for evidence synthesis. Res Synth Methods. Dec 2019;10(4):606-614. [CrossRef] [Medline]
  36. van de Schoot R, de Bruin J, Schram R, Zahedi P, de Boer J, Weijdema F, et al. An open source machine learning framework for efficient and transparent systematic reviews. Nat Mach Intell. Feb 01, 2021;3(2):125-133. [CrossRef]
  37. Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ. Sep 13, 1997;315(7109):629-634. [FREE Full text] [CrossRef] [Medline]
  38. Harrer M, Cuijpers P, Furukawa TA, Ebert DD. Doing Meta-Analysis with R: A Hands-On Guide. Boca Raton, FL. CRC Press; Sep 14, 2021.
  39. Borenstein M, Hedges LV, Higgins JP, Rothstein HR. Introduction to Meta-Analysis. Hoboken, NJ. Wiley; Mar 12, 2009.
  40. Martinengo L, Jabir AI, Goh WW, Lo NY, Ho MH, Kowatsch T, et al. Conversational agents in health care: scoping review of their behavior change techniques and underpinning theory. J Med Internet Res. Oct 03, 2022;24(10):e39243. [FREE Full text] [CrossRef] [Medline]
  41. Jabir AI, Martinengo L, Lin X, Torous J, Subramaniam M, Tudor Car L. Evaluating conversational agents for mental health: scoping review of outcomes and outcome measurement instruments. J Med Internet Res. Apr 19, 2023;25:e44548. [FREE Full text] [CrossRef] [Medline]
  42. Bramer WM, Rethlefsen ML, Kleijnen J, Franco OH. Optimal database combinations for literature searches in systematic reviews: a prospective exploratory study. Syst Rev. Dec 06, 2017;6(1):245. [FREE Full text] [CrossRef] [Medline]
  43. Haddaway NR, Collins AM, Coughlin D, Kirk S. The role of Google scholar in evidence reviews and its applicability to grey literature searching. PLoS One. Sep 17, 2015;10(9):e0138237. [FREE Full text] [CrossRef] [Medline]
  44. Haddaway NR, Grainger MJ, Gray CT. Citationchaser: a tool for transparent and efficient forward and backward citation chasing in systematic searching. Res Synth Methods. Jul 2022;13(4):533-545. [CrossRef] [Medline]
  45. McGuinness LA, Higgins JP. Risk-of-bias VISualization (robvis): an R package and Shiny web app for visualizing risk-of-bias assessments. Res Synth Methods. Jan 2021;12(1):55-61. [CrossRef] [Medline]
  46. Twomey C, O'Reilly G, Meyer B. Effectiveness of an individually-tailored computerised CBT programme (Deprexis) for depression: a meta-analysis. Psychiatry Res. Oct 2017;256:371-377. [CrossRef] [Medline]
  47. Meyer B, Bierbrodt J, Schröder J, Berger T, Beevers CG, Weiss M, et al. Effects of an internet intervention (Deprexis) on severe depression symptoms: randomized controlled trial. Internet Interv. Mar 2015;2(1):48-59. [CrossRef]
  48. Cartreine JA, Locke SE, Buckey JC, Sandoval L, Hegel MT. Electronic problem-solving treatment: description and pilot study of an interactive media treatment for depression. JMIR Res Protoc. Sep 25, 2012;1(2):e11. [FREE Full text] [CrossRef] [Medline]
  49. Ly KH, Ly AM, Andersson G. A fully automated conversational agent for promoting mental well-being: a pilot RCT using mixed methods. Internet Interv. Dec 2017;10:39-46. [FREE Full text] [CrossRef] [Medline]
  50. Meyer B, Berger T, Caspar F, Beevers CG, Andersson G, Weiss M. Effectiveness of a novel integrative online treatment for depression (Deprexis): randomized controlled trial. J Med Internet Res. May 11, 2009;11(2):e15. [FREE Full text] [CrossRef] [Medline]
  51. Berger T, Hämmerli K, Gubser N, Andersson G, Caspar F. Internet-based treatment of depression: a randomized controlled trial comparing guided with unguided self-help. Cogn Behav Ther. 2011;40(4):251-266. [CrossRef] [Medline]
  52. Moritz S, Schilling L, Hauschildt M, Schröder J, Treszl A. A randomized controlled trial of internet-based therapy in depression. Behav Res Ther. Aug 2012;50(7-8):513-521. [CrossRef] [Medline]
  53. Schröder J, Brückner K, Fischer A, Lindenau M, Köther U, Vettorazzi E, et al. Efficacy of a psychological online intervention for depression in people with epilepsy: a randomized controlled trial. Epilepsia. Dec 2014;55(12):2069-2076. [FREE Full text] [CrossRef] [Medline]
  54. Fischer A, Schröder J, Vettorazzi E, Wolf OT, Pöttgen J, Lau S, et al. An online programme to reduce depression in patients with multiple sclerosis: a randomised controlled trial. Lancet Psychiatry. Mar 2015;2(3):217-223. [CrossRef] [Medline]
  55. Klein JP, Berger T, Schröder J, Späth C, Meyer B, Caspar F, et al. Effects of a psychological internet intervention in the treatment of mild to moderate depressive symptoms: results of the EVIDENT study, a randomized controlled trial. Psychother Psychosom. 2016;85(4):218-228. [FREE Full text] [CrossRef] [Medline]
  56. Beevers CG, Pearson R, Hoffman JS, Foulser AA, Shumake J, Meyer B. Effectiveness of an internet intervention (Deprexis) for depression in a United States adult sample: a parallel-group pragmatic randomized controlled trial. J Consult Clin Psychol. Apr 2017;85(4):367-380. [CrossRef] [Medline]
  57. Berger T, Urech A, Krieger T, Stolz T, Schulz A, Vincent A, et al. Effects of a transdiagnostic unguided Internet intervention ('velibra') for anxiety disorders in primary care: results of a randomized controlled trial. Psychol Med. Jan 2017;47(1):67-80. [FREE Full text] [CrossRef] [Medline]
  58. Sandoval LR, Buckey JC, Ainslie R, Tombari M, Stone W, Hegel MT. Randomized controlled trial of a computerized interactive media-based problem solving treatment for depression. Behav Ther. May 2017;48(3):413-425. [FREE Full text] [CrossRef] [Medline]
  59. Shamekhi A, Bickmore T, Lestoquoy A, Gardiner P. Augmenting group medical visits with conversational agents for stress management behavior change. In: Proceedings of the 12th International Conference, PERSUASIVE 2017. 2017. Presented at: 12th International Conference, PERSUASIVE 2017; April 4-6, 2017; Amsterdam, The Netherlands. [CrossRef]
  60. Zwerenz R, Becker J, Knickenberg RJ, Siepmann M, Hagen K, Beutel ME. Online self-help as an add-on to inpatient psychotherapy: efficacy of a new blended treatment approach. Psychother Psychosom. 2017;86(6):341-350. [CrossRef] [Medline]
  61. Berger T, Krieger T, Sude K, Meyer B, Maercker A. Evaluating an e-mental health program ("deprexis") as adjunctive treatment tool in psychotherapy for depression: results of a pragmatic randomized controlled trial. J Affect Disord. Feb 2018;227:455-462. [CrossRef] [Medline]
  62. Bücker L, Bierbrodt J, Hand I, Wittekind C, Moritz S. Effects of a depression-focused internet intervention in slot machine gamblers: a randomized controlled trial. PLoS One. Jun 08, 2018;13(6):e0198859. [FREE Full text] [CrossRef] [Medline]
  63. Freeman D, Reeve S, Robinson A, Ehlers A, Clark D, Spanlang B, et al. Virtual reality in the assessment, understanding, and treatment of mental health disorders. Psychol Med. Oct 2017;47(14):2393-2400. [FREE Full text] [CrossRef] [Medline]
  64. Greer S, Ramo D, Chang YJ, Fu M, Moskowitz J, Haritatos J. Use of the chatbot "Vivibot" to deliver positive psychology skills and promote well-being among young people after cancer treatment: randomized controlled feasibility trial. JMIR Mhealth Uhealth. Oct 31, 2019;7(10):e15018. [FREE Full text] [CrossRef] [Medline]
  65. Fulmer R, Joerin A, Gentile B, Lakerink L, Rauws M. Using psychological artificial intelligence (Tess) to relieve symptoms of depression and anxiety: randomized controlled trial. JMIR Ment Health. Dec 13, 2018;5(4):e64. [CrossRef] [Medline]
  66. Sidner CL, Bickmore T, Nooraie B, Rich C, Ring L, Shayganfar M, et al. Creating new technologies for companionable agents to support isolated older adults. ACM Trans Interact Intell Syst. Jul 24, 2018;8(3):1-27. [CrossRef]
  67. Burton C, Szentagotai Tatar A, McKinstry B, Matheson C, Matu S, Moldovan R, et al. Pilot randomised controlled trial of Help4Mood, an embodied virtual agent-based system to support treatment of depression. J Telemed Telecare. Sep 2016;22(6):348-355. [CrossRef] [Medline]
  68. Hunt M, Miguez S, Dukas B, Onwude O, White S. Efficacy of Zemedy, a mobile digital therapeutic for the self-management of irritable bowel syndrome: crossover randomized controlled trial. JMIR Mhealth Uhealth. May 20, 2021;9(5):e26152. [FREE Full text] [CrossRef] [Medline]
  69. Gräfe V, Moritz S, Greiner W. Health economic evaluation of an internet intervention for depression (deprexis), a randomized controlled trial. Health Econ Rev. Jun 16, 2020;10(1):19. [FREE Full text] [CrossRef] [Medline]
  70. Narain J, Quach T, Davey M, Park HW, Breazeal C, Picard R. Promoting wellbeing with sunny, a chatbot that facilitates positive messages within social groups. In: Proceedings of the Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems. 2020. Presented at: CHI EA '20; April 25-30, 2020; Honolulu, HI. [CrossRef]
  71. Oh J, Jang S, Kim H, Kim JJ. Efficacy of mobile app-based interactive cognitive behavioral therapy using a chatbot for panic disorder. Int J Med Inform. Aug 2020;140:104171. [CrossRef] [Medline]
  72. So R, Furukawa TA, Matsushita S, Baba T, Matsuzaki T, Furuno S, et al. Unguided chatbot-delivered cognitive behavioural intervention for problem gamblers through messaging app: a randomised controlled trial. J Gambl Stud. Dec 2020;36(4):1391-1407. [CrossRef] [Medline]
  73. Danieli M, Ciulli T, Mousavi SM, Riccardi G. A conversational artificial intelligence agent for a mental health care app: evaluation study of its participatory design. JMIR Form Res. Dec 01, 2021;5(12):e30053. [CrossRef] [Medline]
  74. Jang S, Kim JJ, Kim SJ, Hong J, Kim S, Kim E. Mobile app-based chatbot to deliver cognitive behavioral therapy and psychoeducation for adults with attention deficit: a development and feasibility/usability study. Int J Med Inform. Jun 2021;150:104440. [CrossRef] [Medline]
  75. Klein JP, Hauer-von Mauschwitz A, Berger T, Fassbinder E, Mayer J, Borgwardt S, et al. Effectiveness and safety of the adjunctive use of an internet-based self-management intervention for borderline personality disorder in addition to care as usual: results from a randomised controlled trial. BMJ Open. Sep 08, 2021;11(9):e047771. [FREE Full text] [CrossRef] [Medline]
  76. Klos MC, Escoredo M, Joerin A, Lemos VN, Rauws M, Bunge EL. Artificial intelligence-based chatbot for anxiety and depression in university students: pilot randomized controlled trial. JMIR Form Res. Aug 12, 2021;5(8):e20678. [FREE Full text] [CrossRef] [Medline]
  77. Lavelle J, Dunne N, Mulcahy HE, McHugh L. Chatbot-delivered cognitive defusion versus cognitive restructuring for negative self-referential thoughts: a pilot study. Psychol Rec. Aug 24, 2021;72(2):247-261. [CrossRef]
  78. Loveys K, Sagar M, Pickering I, Broadbent E. A digital human for delivering a remote loneliness and stress intervention to at-risk younger and older adults during the COVID-19 pandemic: randomized pilot trial. JMIR Ment Health. Nov 08, 2021;8(11):e31586. [FREE Full text] [CrossRef] [Medline]
  79. Park S, Thieme A, Han J, Lee S, Rhee W, Suh B. “I wrote as if I were telling a story to someone I knew.”: designing chatbot interactions for expressive writing in mental health. In: Proceedings of the 2021 ACM Designing Interactive Systems Conference. 2021. Presented at: DIS '21; June 28-July 2, 2021; Virtual Event. [CrossRef]
  80. Romanovskyi O, Pidbutska N, Knysh A. Elomia chatbot: the effectiveness of artificial intelligence in the fight for mental health. In: Proceedings of the 5th International Conference on Computational Linguistics and Intelligent Systems. 2021. Presented at: COLINS-2021; April 22-23, 2021; Kharkiv, Ukraine. URL: https://ceur-ws.org/Vol-2870/paper89.pdf
  81. Troitskaya O, Batkhina A. Mobile application for couple relationships: results of a pilot effectiveness study. Fam Process. Jun 14, 2022;61(2):625-642. [CrossRef] [Medline]
  82. Fitzsimmons-Craft EE, Chan WW, Smith AC, Firebaugh ML, Fowler LA, Topooco N, et al. Effectiveness of a chatbot for eating disorders prevention: a randomized clinical trial. Int J Eat Disord. Mar 2022;55(3):343-353. [CrossRef] [Medline]
  83. Liu H, Peng H, Song X, Xu C, Zhang M. Using AI chatbots to provide self-help depression interventions for university students: a randomized trial of effectiveness. Internet Interv. Mar 2022;27:100495. [FREE Full text] [CrossRef] [Medline]
  84. Medeiros L, Bosse T, Gerritsen C. Can a chatbot comfort humans? Studying the impact of a supportive chatbot on users’ self-perceived stress. IEEE Trans Hum Mach Syst. Jun 2022;52(3):343-353. [CrossRef]
  85. Rubin A, Livingston NA, Brady J, Hocking E, Bickmore T, Sawdy M, et al. Computerized relational agent to deliver alcohol brief intervention and referral to treatment in primary care: a randomized clinical trial. J Gen Intern Med. Jan 2022;37(1):70-77. [FREE Full text] [CrossRef] [Medline]
  86. Park S, Choi J, Lee S, Oh C, Kim C, La S, et al. Designing a chatbot for a brief motivational interview on stress management: qualitative case study. J Med Internet Res. Apr 16, 2019;21(4):e12231. [FREE Full text] [CrossRef] [Medline]
  87. Linardon J, Shatte A, Messer M, Firth J, Fuller-Tyszkiewicz M. E-mental health interventions for the treatment and prevention of eating disorders: an updated systematic review and meta-analysis. J Consult Clin Psychol. Nov 2020;88(11):994-1007. [CrossRef] [Medline]
  88. Linardon J, Fitzsimmons-Craft EE, Brennan L, Barillaro M, Wilfley DE. Dropout from interpersonal psychotherapy for mental health disorders: a systematic review and meta-analysis. Psychother Res. Oct 2019;29(7):870-881. [CrossRef] [Medline]
  89. Cooper AA, Conklin LR. Dropout from individual psychotherapy for major depression: a meta-analysis of randomized clinical trials. Clin Psychol Rev. Aug 2015;40:57-65. [CrossRef] [Medline]
  90. Gersh E, Hallford DJ, Rice SM, Kazantzis N, Gersh H, Gersh B, et al. Systematic review and meta-analysis of dropout rates in individual psychotherapy for generalized anxiety disorder. J Anxiety Disord. Dec 2017;52:25-33. [CrossRef] [Medline]
  91. Bernstein EE, Weingarden H, Wolfe EC, Hall MD, Snorrason I, Wilhelm S. Human support in app-based cognitive behavioral therapies for emotional disorders: scoping review. J Med Internet Res. Apr 08, 2022;24(4):e33307. [FREE Full text] [CrossRef] [Medline]
  92. Taylor H, Strauss C, Cavanagh K. Can a little bit of mindfulness do you good? A systematic review and meta-analyses of unguided mindfulness-based self-help interventions. Clin Psychol Rev. Nov 2021;89:102078. [CrossRef] [Medline]
  93. Jakob R, Harperink S, Rudolf AM, Fleisch E, Haug S, Mair JL, et al. Factors influencing adherence to mHealth apps for prevention or management of noncommunicable diseases: systematic review. J Med Internet Res. May 25, 2022;24(5):e35371. [FREE Full text] [CrossRef] [Medline]
  94. Ter Stal S, Sloots J, Ramlal A, Op den Akker H, Lenferink A, Tabak M. An embodied conversational agent in an eHealth self-management intervention for chronic obstructive pulmonary disease and chronic heart failure: exploratory study in a real-life setting. JMIR Hum Factors. Nov 04, 2021;8(4):e24110. [FREE Full text] [CrossRef] [Medline]
  95. Curtis RG, Bartel B, Ferguson T, Blake HT, Northcott C, Virgara R, et al. Improving user experience of virtual health assistants: scoping review. J Med Internet Res. Dec 21, 2021;23(12):e31737. [FREE Full text] [CrossRef] [Medline]
  96. Boucher EM, Ward HE, Mounts AC, Parks AC. Engagement in digital mental health interventions: can monetary incentives help? Front Psychol. Nov 18, 2021;12:746324. [FREE Full text] [CrossRef] [Medline]
  97. Pham M, Do HM, Su Z, Bishop A, Sheng W. Negative emotion management using a smart shirt and a robot assistant. IEEE Robot Autom Lett. Apr 2021;6(2):4040-4047. [CrossRef]
  98. Lipschitz JM, Pike CK, Hogan TP, Murphy SA, Burdick KE. The engagement problem: a review of engagement with digital mental health interventions and recommendations for a path forward. Curr Treat Options Psych. Aug 25, 2023;10(3):119-135. [CrossRef]


AI: artificial intelligence
CA: conversational agent
ECA: embodied conversational agent
ePST: electronic problem-solving treatment
OR: odds ratio
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
RCT: randomized controlled trial


Edited by T de Azevedo Cardoso; submitted 25.04.23; peer-reviewed by X B, Y Xi, E Kim, C Muñoz; comments to author 24.07.23; revised version received 21.09.23; accepted 04.12.23; published 27.02.24.

Copyright

©Ahmad Ishqi Jabir, Xiaowen Lin, Laura Martinengo, Gemma Sharp, Yin-Leng Theng, Lorainne Tudor Car. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 27.02.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.