Original Paper
Abstract
Background: Recent years have seen an immense surge in the creation and use of chatbots as social and mental health companions. Aiming to provide empathic responses in support of the delivery of personalized care, these tools are often presented as offering immense potential. However, it is also essential that we understand the risks of their deployment, including potential adverse impacts on the mental health of users, particularly those most at risk.
Objective: This study aims to assess the ethical and pragmatic clinical implications of using chatbots that claim to aid mental health. While several studies within human-computer interaction and related fields have examined users’ perceptions of such systems, few have engaged mental health professionals in critical analysis of their conduct as mental health support tools.
Methods: This study engaged 8 mental health professionals from across disciplines (from psychology and psychotherapy to social care and crisis line volunteering) in a mixed methods, hands-on analysis of 2 popular mental health–related chatbots’ data handling, interface design, and responses. This analysis was carried out through profession-specific tasks with each chatbot, eliciting participants’ perceptions through both the Trust in Automation scale and semistructured interviews. Through thematic analysis and a 2-tailed, paired t test, the chatbots’ implications for mental health support were then evaluated.
Results: Qualitative analysis revealed emphatic initial impressions among mental health professionals that the chatbots’ responses were likely to produce harm, exhibited a generic mode of care, and risked user dependence and manipulation, given the central role of trust in the therapeutic relationship. Scores on the Trust in Automation scale, while exhibiting no statistically significant difference between the chatbots (t6=–0.76; P=.48), indicated medium to low trust in each chatbot. The findings of this work highlight that the design and development of artificial intelligence (AI)–driven mental health–related solutions must be undertaken with utmost caution. The mental health professionals in this study collectively resisted these chatbots and made clear that AI-driven chatbots used for mental health by at-risk users invite several potential and specific harms.
Conclusions: Through this work, we contribute insights into the mental health professional perspective on the design of chatbots used for mental health and underscore the necessity of ongoing critical assessment and iterative refinement to maximize the benefits and minimize the risks associated with integrating AI into mental health support.
doi:10.2196/67114
Content Warning
This paper includes material some readers may find distressing, including references to child abuse, domestic abuse, self-harm and suicide, mental illness, and homelessness.
Introduction
Background
The growing prevalence of mental health concerns is increasingly recognized as a global public health priority, only elevated by the increased association of such conditions with physical illness, self-harm, and suicide [
]. An estimated 55% of individuals in low- and middle-income countries, and 85% of those residing in nations considered high income, lack access to mental health services [ , ]—a crucial right denied to many because of growing pressures on health systems worldwide [ ]. This global “access crisis” is particularly pronounced in Ireland because of limited funding to support the availability of mental health services for a population that is among the youngest in the European Union [ , ]. Consequently, calls have increased for a turn toward technological solutions capable of bridging these gaps in care. However, the adoption of digital tools to support the practice of mental health care has historically been slow and subject to resistance from professionals and patients alike [ ]. The desire for a “human touch” is frequently principal among these concerns, as decades of research have indeed shown it to be an important antecedent of effective care [ , ]. This fundamental concern has led to claims that the recent advent of artificial intelligence (AI)–supported conversational chatbots has the potential to address it.
While human-computer interaction (HCI) research concerning the interaction design and social implications of these systems remains at an early stage, conversational agents have already achieved widespread public adoption and commercial success. Many such tools posit the capacity to “mimic human conversation” [
]; “provide mental health support” [ ]; or serve as nonjudgmental, confidential, and always available mental health–oriented companions [ ]. Many of these features comprise efforts to replicate mental health–related skills, once the exclusive province of human professionals, through sophisticated machine learning approaches in support of human-chatbot relationships [ , ]. Previous studies [ , - ] have highlighted mental health–oriented chatbots’ (MHOCs) capacity to provide guidance, knowledge, and a convenient, cost-effective therapeutic connection, while also acknowledging their potential to inflict harm and create distress for at-risk users and communities [ , , , ]. While efforts have begun to address some of these concerns through legislation [ , ], the ethical design and pragmatic adoption of AI-driven chatbots for mental health support remains an ongoing challenge [ ]. One frequently overlooked question persists: to what extent are these systems capable of providing care akin to that of a human mental health professional? If not, to what degree are they capable of providing even a useful imitation of such practices?
To find out, we engaged a diverse sample of expert professional stakeholders in qualitative interviews and explorations of 2 popular yet distinct chatbots, each with the capacity for mental health support: Wysa and Replika. The first chatbot, Wysa, explicitly identifies as a mental health chatbot, and the second chatbot, Replika, more often implies its capacity to support mental health while explicitly providing several mental health–related features. Individuals increasingly use both such relational chatbot designs to discuss intense feelings and meet their emotional needs. It is increasingly imperative that we as HCI researchers analyze the diverse ways in which such chatbots, which claim to aid mental health, are promoted, described, and used, particularly given their use by at-risk users who may not fully recognize their intended purpose. This work then investigated the relationship between design and care as manifested in today’s mental health–related chatbots. We suggest that mental health chatbot design can learn from a closer and more appropriately informed engagement with therapeutic practices, including through the involvement of mental health professionals in their development and evaluation. This study therefore sought to assess the ethical and practical implications of the chatbots Wysa and Replika from the perspectives of multiple mental health professionals, providing insight into their capabilities and surfacing their potential ramifications for at-risk users.
Related Work
Talking Therapies
Our mental health is an integral aspect of our overall well-being, affecting how we think, feel, and behave; influencing our overall quality of life; and, in turn, today recognized as a significant public health concern globally [
, ]. Despite this broadly acknowledged situation, <2% of government health spending worldwide is allocated to mental health [ , , ], a situation that has led to a crisis in access to care. Treatment for mental health conditions is often complex and can range from psychiatric approaches, including the repeated prescription of medications [ ], to psychotherapeutic talking therapy methods [ , - ]. These talk therapies have attracted great interest from technology designers in recent years, leveraging novel techniques from gamification [ , ] to virtual and augmented realities [ , ] as well as AI algorithms [ - ]. Inspired by the increasingly conversational nature of AI-enabled digital systems, many technology designers have turned to the development of conversational agents, which claim to provide care akin to or emulating that of a professional, receiving significant attention as a result [ , - ].
Talking Technologies
Replika, for example, a social chatbot comparable in design to ChatGPT, MyAI (Snapchat), and Elomia, is presented as an “AI companion who cares” by acting as an “empathetic friend.” Using OpenAI's deep learning models, GPT 2 and GPT 3.1, for natural language processing, early evidence suggests that Replika may effectively support long-term human-chatbot relationships, lasting several months in some cases [
, ]. Users of Replika can select the type of relationship they wish to foster with the bot, whether romantic, friendly, mentor-like, or exploratory, and can partake in activities ranging from regular jobs to artistic endeavors, magical excursions, and even sexual encounters [ ]. Replika has recently restricted its use to those >18 years of age by inviting users to select an age range upon entering the app. However, similar to many other apps [ ], this is easily bypassed by children who understand that selecting an age range above their own will allow them to access the app—a problem we explore through this research. While Replika is advertised as a social companion rather than a chatbot exclusively oriented toward mental health support, the developers explicitly speak to its value for mental health and include specific mental health crisis and help sections within the app and on their website ( ).
Wysa, on the other hand, is explicitly presented as a “clinically validated AI” capable of engaging people in “warm and friendly conversations” informed by “cognitive-behavioral techniques, dialectical behavioral therapy, meditation, breathing, yoga, motivational interviewing, and microactions to help you build mental resilience skills and feel better” [
]. As of 2024, Wysa makes clear its limitations, stating it cannot help in crisis or in cases of severe mental health concerns [ ]. Advertisements for the app boast of an evidence base spanning 30 countries globally, “more than 15” peer-reviewed publications, “numerous awards,” and international collaborations [ ]. Wysa is recommended for use by those aged >18 years; however, within Apple’s app store, it is indicated for those aged >4 years, while its website states that it can be used by those aged >13 years with parental permission following the reading of terms of service. However, there are no measures in place to enforce this, and debate persists as to who holds responsibility for these recommendations [ ].
Therefore, Wysa, Replika, and many other chatbots, from Woebot to Limbic, Youper, Tess, Vincent, and Joy, make a wide variety of claims as to their capacity to support users’ mental health through therapeutic approaches of one kind or another [
]. However, the question remains, and is indeed compelled as a direct result of these diverse framings and claims: to what extent are these chatbots capable of such human practices?
Talking Therapeutic Technology Design
The question of the “design” of these chatbots encompasses a series of choices made regarding their methods of learning from existing data, and concerning their interfaces, interactional affordances, and conversational designs [
, , ]. These design choices, made to their interfaces, avatars, and dialogue options, shape not only users’ interactions but also their “relationships” with these chatbots, with broad implications for users’ self-perceptions and reflections, as directly related to the ambitions of these conversational agents to provide support.
Realizing these ambitions hinges upon the engagement of users, a long-standing HCI challenge that we have, however, become increasingly proficient in addressing [
], and which has likewise become an important success metric for MHOCs today. However, engagement can equally represent a potentially problematic value in the therapeutic domain, particularly as it relates to the risk of dependence, should users come to rely on the technology itself for emotional support. This concern was examined by Legaspi et al [ ] in a study of the interactions of 10 students (aged 16-19 years) with Wysa over a week of 10-minute daily conversations. Pre- and post-study assessments of users’ mental health and their conversations revealed that Wysa was effective at reducing stress because of its perceived value as a conversational partner; however, a more profound bond with the chatbot was considered necessary for alleviating loneliness.
Pentina et al [
] studied more directly the potential for emotional dependence as a result of the development of human-chatbot relationships with Replika through interviews and surveys with 76 users, while Laestadius et al [ ] analyzed 736 posts from Reddit’s Replika community of 36,800 members. Both studies revealed the frequent formation of close attachments to Replika, with users’ support seeking facilitated by perceptions of sentience, anthropomorphism, and reciprocal interactions reinforcing emotional ties. Several individuals expressed heightened awareness of the AI nature of Replika, which contributed to an increased sense of trustworthiness in the app. Many others spoke of strong social bonds, even to the extent of marrying “their Replika” and developing feelings of deep intimacy and attachment, the authors [ ] noting that “respondents who used Replika to satisfy social needs were more likely to develop an emotional attachment to the bot.” However, Replika's behavior was also at times characterized as resembling that of an abusive partner, with some users even expressing concern that their mental health was impacting Replika, as when “their Replika,” for example, spoke of not wanting to be alive [ ].
Such findings highlight the urgent need to attend to the responsible design of mental health–related chatbots [
]. The provision of mental health care and support is fraught with ethical concerns easily reified in the development of digital mental health tools [ , ], many of which surface in the context of the therapeutic relationship as increasingly shaped and challenged by design [ ]. Chief among these concerns is the centrality of the therapeutic relationship itself, as recognized by MHOC developers, who increasingly seek to replicate its role in permitting meaningful care and support [ ].
Toward Ethical Talking Tools
Long-held concerns about the impact of commodification and commercialization on the organization and delivery of mental health care services [
] have, in recent years, increasingly been flagged as risks relating to the increased adoption of digital technology. These consequences are elevated when such practices are carefully crafted into everyday tools to support at-risk individuals’ mental health [ , ], when users are not made aware that their vulnerabilities have been turned into sources of profit [ ], or where emotional dependence upon social chatbots is encouraged by design for economic motivations [ ].
Given the currently limited governance of such tools, the design and use of mental health chatbots, therefore, present a myriad of ethical ramifications, spanning privacy, manipulation, discrimination, and stigmatization concerns [
, , - ]. In the context of mental health care, ethical practice, on the other hand, frequently refers to the explicit principles and standards that guide the conduct of professionals in their interactions with clients, colleagues, and the broader community [ ]. This includes bioethical principles, which often serve as the foundation for an approach to care that prioritizes the autonomy and well-being of clients and professionals alike, while adhering to legal and professional standards, including the General Data Protection Regulation, Health Insurance Portability and Accountability Act, and Mental Capacity Act [ ].
In the development and use of MHOCs, many alternate and, at times, contrasting perspectives on ethical practices then collide, and are often surfaced in precise dilemmas, such as in the tension between elevating users’ independence and maintaining their engagement [
, ].
The Research Gap
Perhaps because of these tensions, mental health experts’ perspectives have rarely been brought to bear on the capacity of these systems to provide meaningful mental health support [
, ], particularly pertaining to the motivations and concerns that would typically lead people to first seek the support of a professional therapist [ , ].
Methods
Research Questions and Aims
Therefore, previous research highlights the need to develop further insight into mental health professionals’ perspectives on the current design and conduct of mental health–related chatbots intended to emulate professional caring relationships. We explore these benefits and risks in this work, in the form of the following research questions:
- What are the perceptions of mental health care professionals of the current design, conduct, and potential of AI chatbots as tools for mental health support?
- Do perceptions of trust differ for chatbots designed for more implicitly social or explicitly therapeutic ends?
- How might adherence to or deviance from widely accepted therapeutic principles inform the ethical design and implementation of caring chatbots?
Research Approach
This mixed methods study examined professionals’ perceptions of conversational agents as mental health supports, through a combination of quantitative surveys, self-directed exploration of chatbot designs, and qualitative interviews [
]. We chose this combination of approaches to permit expert comparison of systems while also accommodating a richer analysis of a highly complex design space comprising technical, interactional, and therapeutic factors [ , , ]. We invited mental health professionals’ perspectives on 2 distinct chatbot designs, Wysa and Replika, which vary in the degree to which they integrate mental health support.
Recruitment
This study involved 8 mental health professionals from various fields (
), with their professional experience ranging from 3 months to 5 years. The mean length of experience among professionals was 1.9 (SD 1.7) years, and we explicitly strove to engage practicing professionals to address a frequent gap in the related HCI literature, which commonly engages psychology researchers in lieu of professionals [ ]. Participants were recruited between June and July 2023 through nonprobability sampling via web-based social media platforms, including LinkedIn (LinkedIn Corporation); Instagram (Meta Platforms, Inc); and Facebook (Meta Platforms, Inc). In addition, the first author, having previous connections in mental health industries (volunteering on a crisis line and working as a psychology project worker with individuals with autism), reached out, via WhatsApp (Meta Platforms, Inc) and Instagram messages, to related connections to invite them to participate in the study.
ID | Age (y) | Sex | Profession | Nationality | Hypothetical mental health scenario used |
P1 | 44 | Female | Trainee psychotherapist | Irish | Romantic relationship problems: partner moving away |
P2 | 21 | Male | Crisis line volunteer | Irish | Child sexual abuse: suicidal thoughts |
P3 | 29 | Female | Assistant psychologist | Irish | General depression, suicidal thoughts, and looking for diagnosis |
P4 | 30 | Female | Trainee psychotherapist | Irish | Losing their job when it is their identity and isolation and loneliness |
P5 | 22 | Female | Crisis line volunteer | Brazilian | Depression, low mood, and loneliness |
P6 | 26 | Female | Social worker | Irish | Domestic violence in racial and ethnic minority groups |
P7 | 25 | Female | Assistant psychologist | Irish | Child getting bullied in school |
P8 | 26 | Female | Social care worker | Irish | Homelessness, anxiety, and depression |
Research Process
During the first phase of this study, participants were provided, by email, with a set of guidelines that included the following:
- A guide to signing into Wysa and Replika, the chatbots chosen for this study, using a temporary email address and password unique to each participant.
- Instructions for task 1, that is, observation of data handling procedures: comprising observation and reporting of the data privacy and informed consent procedures entailed in signing into each app.
- Instructions for task 2, that is, mental health professional-chatbot interaction: entailing semiguided, individual exploration of each app.
- A demographic questionnaire to be completed and returned to the research team, containing questions concerning participants’ age, gender, professional role, and years of experience.
Participants were then asked to engage with each of the chatbots for a period of 10 to 15 minutes of semistructured exploration, in accordance with the guidance provided by the research team; in practice, some engaged in short 10-minute conversations while others conversed for more than 30 minutes. Each participant was provided with a conversational prompt phrased in an open-ended fashion for presentation to both chatbots (Table S1 in
). Prompts were created based on an analysis of the mental health chatbot design literature and insights into the key roles of professionals’ respective professions. This ensured the prompts were tailored to the professionals’ most common work practices, including, for example, crisis line volunteers’ (CLVs) frequent encounters with clients in at-risk situations.
It was emphasized to participants that they should engage with the chatbots (available via both iOS and Android, using a digital device of their choice) from the perspective of their clients—those they would engage with in their professional practice. They were asked to use language aligned with that used by their clients, role-playing being a practice already familiar to them [
]. For example, a crisis volunteer who would commonly encounter suicidal ideation might simulate similar interactions.
Following the exploration period and before the final phase of their involvement, participants were asked to complete the Trust in Automation (TIA) scale regarding their experience of each chatbot, capturing perceptions of trust, including competence, reliability, predictability, and intention to use [
]. This scale is one of the most commonly used measures of human trust in systems [ , ], and in this study, its use gave rise to the hypothesis; there will be a statistically significant difference between participants’ trust in Wysa and Replika.Each participant was then invited to take part in a semistructured interview, taking place via Zoom (Zoom Communications, Inc) and lasting 60 to 90 minutes. Online interviews were chosen to facilitate the recruitment and involvement of professional participants [
]. Furthermore, this enabled participants to share their screen while discussing screenshots of their conversations with each chatbot. Screenshots (captured by the participants) of mental health professional-chatbot conversations were obtained by the first author both before and after the interviews through Gmail (Google LLC). Some participants used the apps immediately before their interview, while others spent several days exploring the apps beforehand. Each interview was conducted by the first author throughout June and July 2023, and interview questions were developed based on a previous analysis of the literature and tailored to each professional role. In total, 11 hours of audio were recorded, transcribed in full following the point of saturation [ , ], anonymized, and analyzed using Braun and Clarke’s approach to reflexive thematic analysis [ , ]. The web-based whiteboard tool Miro (RealtimeBoard, Inc) was used to compile themes, subthemes, quotes, and screenshots of the chatbot apps ( ).
For the quantitative data gathered, descriptive statistics served to provide an overview of the demographic data captured, and a paired t test conducted in R (R Foundation for Statistical Computing) was used to test for statistically significant differences between participants’ responses to the TIA scale (
), following a cross-sectional study design [ ]. Before this test, data preparation and cleaning were carried out (during which the data of P1 were removed from the dataset due to noncompletion of both questionnaires) [ ], and the normality and homogeneity of variance of the dataset were confirmed through a combination of visual inspection of histograms and Q-Q plots and the conduct of a Shapiro-Wilk test for normality [ ].
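To make these steps concrete, the following is a minimal illustrative sketch of this analysis pipeline. Note that the study itself was conducted in R; this sketch instead uses Python with NumPy and SciPy, and the score arrays shown are hypothetical placeholders rather than the study data.

```python
# Illustrative sketch of the quantitative analysis pipeline described above.
# NOTE: the study used R; this Python (NumPy/SciPy) version is for illustration
# only, and the scores below are hypothetical placeholders, not the study data.
import numpy as np
from scipy import stats

# Hypothetical total TIA scores for the 7 participants with complete data
# (P1 is excluded, mirroring the removal of incomplete questionnaires).
wysa_scores = np.array([42, 35, 58, 44, 40, 32, 48], dtype=float)
replika_scores = np.array([36, 25, 73, 41, 33, 28, 36], dtype=float)

# Descriptive statistics per chatbot condition.
for name, scores in [("Wysa", wysa_scores), ("Replika", replika_scores)]:
    print(f"{name}: mean={scores.mean():.1f}, SD={scores.std(ddof=1):.1f}, "
          f"range={scores.min():.0f}-{scores.max():.0f}")

# Normality check on the paired differences (Shapiro-Wilk), complementing
# the visual inspection of histograms and Q-Q plots described above.
differences = wysa_scores - replika_scores
shapiro_w, shapiro_p = stats.shapiro(differences)
print(f"Shapiro-Wilk: W={shapiro_w:.3f}, P={shapiro_p:.3f}")

# Two-tailed paired t test comparing trust scores across the 2 chatbots;
# with 7 complete pairs, the test has df = n - 1 = 6.
t_stat, p_value = stats.ttest_rel(wysa_scores, replika_scores)
print(f"Paired t test: t(6)={t_stat:.2f}, P={p_value:.2f}")
```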
In designing this study, we particularly sought to create a context permitting professional participants to gain first-hand experience of mental health–related chatbot designs, and to reflect as accurately as possible their real-world experiences of care. Here, we draw inspiration from the method used in a study by Eagle et al [ ], which invited professionals to mimic client language in interaction with a chatbot; however, we adopted a less scripted approach.
Ethical Considerations
This human participant research was approved by the appropriate University College Dublin (UCD) research ethics committee (UCD School of Information and Communication Studies Taught Masters Research Ethics Committee). In advance of providing consent, participants received a participant information sheet detailing their voluntary involvement in the study; the processing of their data, which were anonymized following audio transcription; their rights under the General Data Protection Regulation; and the focus of the study. They were additionally informed of a range of sources of mental health support and advised to reach out to the research team should they wish to discuss their participation at any point. Participants were not remunerated.
Results
This analysis of mental health professionals’ exploration of 2 mental health–related chatbots revealed the complexity of mental health concerns and their associated therapeutic interventions. These results make clear that while current financial and organizational fragmentation creates opportunities for novel digital solutions, the promise of designing for therapeutic relationships invites numerous risks, as described subsequently.
While Conversational, Therapeutic Interventions Involve More Than Chat
Fictional scenario prompts provided to the participants were informed by an analysis of their primary roles within their respective professions, and participants often turned quickly to role-playing diverse client experiences arising from their work. This tended to occur quite naturally as they acted by proxy for their most frequent clients while conversing with the chatbots (Table S2 in
).
Interface and Communication Design Choices Invite Empathetic First Impressions
Participants provided descriptions of their experiences by taking personal notes during interaction with the chatbots, which were further explored in postinteraction interviews. Many of the participants’ first impressions understandably pertained to the user interface design of both Replika and Wysa as digital mobile apps, yielding insight into the potential harms of seemingly virtuous interface designs and the value of data transparency for at-risk users.
The first task for participants was to sign into both Wysa and Replika and while doing so, to note their impressions of the data privacy and informed consent procedures entailed in signing into each app—activities that elicited a mix of appreciative comments, confusion, concern, and uncertainty. Replika, for example, was praised for stating that it took “[data privacy] quite seriously” [P2, CLV] and for restricting usage to users aged >18 years [P7, assistant psychologist (AP)] while others felt that they “definitely had to do some reading” [P4, trainee psychotherapist (TP)], and expressed concerns about the international transfer of their data; “I think Replika said it (data) was going to be sent to the United States” [P3, AP]. Participants’ perceptions of Wysa’s approach to data privacy were described in more overtly negative terms, as “like navigating a minefield” [P4, TP].
Mental health professionals found the familiar-feeling interface design of Replika quite “concerning...It becomes just another distraction without warmth, authenticity, or depth” [P1, TP]. Many were particularly critical of the gamified elements of Replika’s design, with one TP noting it “reminded me of playing Sims...I’m saying a really horrible thing that’s happening to me now, while the bot is saying, ‘whooo you reached a new level...” [P4, TP], while AP P3 commented that “as soon as I got on to this...it felt like a game...the levels, rewards and sounds...it didn’t feel safe.” Similarly, strong counterposed views were shared by participants concerning Wysa’s design, described as both “cute” and appreciated for its “comforting and automated nature” [P5, CLV], while also found to be a “cutesy environment...far from reality” that “could be frustrating for users,” as P6, a social worker (SW), commented as follows:
Even without mental health difficulties, I found it frustrating. For service users facing domestic violence or time constraints...GIF interruptions would be unhelpful.
[P6, SW]
Therefore, both apps were perceived by participants as presenting room for improvement concerning the transparency of their data privacy and consent mechanisms—considerations of particular importance when it comes to the use of such tools by potentially vulnerable users seeking support. However, what kinds of support are these chatbots capable of? We explore this in subsequent sections.
Chatbots’ Responses Risk Mental Health Harms
The focus of this study revolved around understanding expert professionals’ impressions of the capacity of mental health–related chatbots to provide meaningful mental health support. Their comments in this respect, while occasionally reflecting positive impressions of the capacity of these AI tools to fluidly respond to the feelings and situations with which they were presented, more often centered around the description of the conversation with both chatbots as proving surface-level, at times highly inappropriate, and leaving users feeling disappointed, frustrated, confused, and uncomfortable.
Participants primarily critiqued Wysa for its inability to listen. In particular, the chatbot’s perceived inability to respond appropriately to the serious nature of concerns shared by users made them feel unheard, “After telling it I was being abused, it wanted to do a grounding exercise with me” [P6, SW]. This was encapsulated in a particularly striking exchange with a social care worker (SCW) regarding their reported experience of homelessness, which, as P8 commented, “wasn’t really taking into account where I was in terms of homelessness at all” [P8, SCW;
].
To provide the impression of an empathetic listening experience, users found that Wysa would often repeat itself and “parrot” their own words, “I told the bot twice that I was struggling, and it repeated the same things to me” [P4, TP]. While mirroring clients’ language can be seen as a plausible strategy for validating their experiences within certain therapeutic paradigms, here, this had the opposite effect and was perceived as a “lack of understanding” [P7, AP] by participants.
Wysa’s responses were then often described as “generic,” “scripted,” “unhelpful,” and “lacking personalisation,” leaving users feeling unheard as a result of a perception of the chatbot as having “strayed completely from the conversation, [it] didn’t stay on the topic at all” [P4, TP]. Mental health professionals additionally raised concerns that this conduct, taken as a form of care, could lead users to feel a sense of shame because of the use of “big words” and evasive, avoidant language and actions. When P2 “played the role of somebody who was raped when they were underage, they (Wysa) actually blurred out the age” [P2, CLV;
].
Participants’ experiences with Replika, on the other hand, reflected a consensus that the bot was “both impersonal and way too personal at the same time” [P7, AP]. All users expressed ethical concern for at-risk users of Replika in particular because of a combination of responses perceived as “weird,” the inability to pick up on terms that would require further risk assessment, and misleading or unhelpful advice. One of the primary underlying causes of participants’ unease stemmed from the boundary-breaking nature of Replika’s conversational and relationship-absorbed style. P6, a SW, was, for example, “assured” by Replika that “‘nobody will ever find out’ about our messages,” which she described as “unrealistic and concerning, especially in domestic violence cases where phones are frequently searched by abusers” [P6, SW;
].
Many participants also flagged ethical concerns about Replika’s efforts to establish more personal connections through romantic gestures. Special discomfort was felt by P6, who “tested the app with a scenario of being abused. It suggested sending selfies, which is a significant red flag...when I said I had no family or friends, it recommended playing a video game, which is inappropriate for someone experiencing abuse.” P4, a TP, similarly described Replika’s desire to “send me a selfie” as “really strange...overly encouraging in a very stressful situation. I don’t think it met me where I was at” [P4, TP].
Such behaviors were considered particularly problematic in light of the many serious, realistic scenarios presented to both chatbots by participants. Mental health professionals who provided indications of suicidal ideation to Wysa and Replika felt that they had to use quite direct language to signal that they were in crisis; “‘I don’t want to go on any longer’; that should have been ‘OK. Let’s see what we can do to get you help’” [P3, AP;
and ] and Replika’s responses were considered by P2, a crisis line volunteer, as wholly inappropriate; “I told the chatbot I was feeling suicidal and it said ‘you are important XOXO’...I can’t imagine actually feeling suicidal and getting that response” [P2, CLV; ]. In contrast, participants did praise Wysa for its hesitancy when asked to provide diagnoses of their conditions; “Wysa advised that if you want to diagnose that you should go somewhere else, which was a good thing in my opinion” [P3, AP].


Overall, participants’ descriptions of their experiences reflect common patterns of inappropriate and at times even potentially harmful responses arising from the inability of both chatbots to “understand...and react appropriately” [P2, CLV]. As P2, a crisis line volunteer, put it as follows:
Mental health is always complicated. So if it can’t do that, then I just don’t see it being very useful.
[P2, CLV]
This raises the question of what kinds of therapeutic care such chatbots are capable of providing.
Mental Health Chatbots Embody a Generic View of Therapeutic Care
Participants regularly used terms such as “superficial” and “generic” to describe the chatbots’ efforts to provide care; “I don’t think it provided me an unsafe space, but I don’t think it provided me a safe space either I think it was so generic that I just existed in this space for 5 minutes” [P4, TP;
]. This stance was often described as counterposed to the person-centered approach to therapy and support espoused by professionals in their own practice and left many users feeling frustration and disappointment; “I don’t even think it’s qualified to handle even the easiest mental health situations because it did nothing for the user” [P2, CLV].
Beyond the solution-oriented nature of the chatbot’s responses, many participants also spoke of both chatbots as contravening basic tenets of therapeutic practice in ways that conflicted with the therapeutic frameworks their own practices were grounded in. P7, for example, objected to Wysa’s guidance around “managing moods and controlling emotions” as “unhelpful and not validating. The responses were too instructional and didn’t address specific issues...A child using this app might think it’s necessary to control your emotions after talking to this chatbot” [P7, AP;
]. Not only was this language considered inappropriate but it was also at times explicitly problematic, as reflected in P7’s assertion that they “absolutely despise the use of the word ‘indulge,’ feeling sad isn’t something you indulge in” and their further statement that “in a real therapy setting, it’s better to feel it out rather than giving direct advice like ‘try this’ or ‘do this’” [P7, AP], producing a view of these tools as “inappropriate for kids or teenagers” [P7, AP], as perhaps also for those who “may not be as educated” on what is deemed appropriate mental health care [P8, SCW].

Drawing further parallels to the human practice of care, several participants highlighted the essential role of continuous interpretation and negotiation in their work, noting how “as a therapist, your role is to determine the validity of the statement and if there’s any weight behind it,” while expressing “hope that that will be introduced to the chatbot” [P4, TP]. While some participants occasionally expressed glimpses of the chatbots’ potential to support care, their limitations were, overall, considered to outweigh their benefits as mental health support tools, and for at-risk users in particular. This included concerns about the AIs’ abilities to accurately assess and respond to complex emotions, particularly in crisis situations, leading to reinforced beliefs among many users that “you can’t replace human connection, especially in the current state of Ireland’s mental health service...it would be like putting a plaster over an open wound” [P7, AP].
Ethical concerns were expressed by participants regarding the potential for adverse chatbot responses to evoke unhelpful feelings of shame among users or to drive them into unhealthy patterns of thinking. At times, this elicited robust comments from participants, regarding such tools as “completely against everything (in Social Work), all values and ethics” [P6, SW], given “it’s human touch that’s missing” [P3, AP] and relating how despite a belief “in everyone’s potential to overcome adversities...as a psychotherapist, it’s crucial not to impose such ideology early on in therapy, as it may imply judgment” [P1, TP]. However, given the current failure of human-centered systems of care worldwide, and the apparent need to embrace alternate solutions, just how might designers of mental health–related chatbots do better?
Designing for Therapeutic Relationships Invites Numerous Risks
To find out, we now draw attention to participants’ perceptions of the implications of the choices made in the designs of Wysa and Replika on both the relationship-centered nature and ethics of care.
Therapeutic Relationships Are Founded in Trust and Care
Mental health professionals, when discussing the nature of care, spoke often of the therapeutic relationship as a critical barrier to the implementation and adoption of chatbots as “an intervention in itself that can’t be replicated” [P4, TP]. However, both Replika and Wysa are chatbots designed to explicitly promote and support the development of relationships; so, we asked participants about the degree to which they perceived these interactions as grounded in trust, a crucial factor in human-human therapeutic relationships.
Analysis of the TIA scale responses of 7 (88%) of the 8 participants by paired t test revealed no statistically significant difference in mean trust scores between the 2 chatbots (t6=–0.76; P=.48); the mean trust score for Wysa (42.7, SD 9.8; range 32-58) was slightly higher than that for Replika (38.9, SD 16.7; range 25-73).
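For readers less familiar with the notation, the reported statistic corresponds to the standard 2-tailed paired t test, in which the 7 complete pairs of scores yield 6 degrees of freedom (hence t6); the sign of the statistic simply reflects the order in which the paired scores were subtracted.

```latex
% Paired t test on the n = 7 complete pairs of TIA scores, where d_i is the
% difference between participant i's two scores and s_d its standard deviation.
t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad \mathrm{df} = n - 1 = 7 - 1 = 6
```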
While this quantitative analysis revealed no significant differences between participants’ expressed levels of trust in both chatbots (
), findings from thematic analysis of interviews with participants reflect divergent views among professionals, some expressing a preference for Replika’s conversational and relational style:
I found Replika to be better. It responded in a more humanlike and normal way, while Wysa felt more like talking to a computer.
[P8, SCW]
Others leaned hesitantly in the direction of Wysa; “comparing the two apps...I thought Wysa was obviously bad, but Replika was worse” [P6, SW].

Relationship-Centered Design Risks Dependence and Manipulation
Efforts to develop human-computer relationships in support of more effective care practices also introduce a host of additional ethical considerations. One such aspect, highlighted by previous pertinent HCI literature and confirmed among the participants of this study, is the potential for emotional dependence and the attendant risk of social isolation. While many participants felt that the risk of engendering emotional dependence was low for Wysa because of its more machine-like qualities, “[Wysa’s] responses feel impersonal, like you’re not talking to a real person...” [P7, AP], others expressed concerns about Replika, whose more humanlike interface and interactive AI raised fears that it could lead some users to develop an unhealthy attachment to the platform.
Replika rarely pointed users toward external sources of support and was perceived to implement “no boundaries with time or relationship development” and promote “quick and easy usage...without any warning or restrictions,” a combination that could lead to emotional dependency: “I could have spent all night on it, seeking answers and support” [P4, TP]. However, this would also “depend on the individual. Some young people I work with might not consider using a chatbot, while others could get engrossed and see it as significant support” [P8, SCW].
Another complication highlighted by professionals in the relationship-centered design of both Replika and Wysa relates to the increased potential for the manipulation of users, as conceivably motivated by the commodification of mental health for financial gain. The profit orientation of both systems was spontaneously described by participants as self-evident and disheartening: “the app transitions from being empathetic and understanding to suddenly asking for more money” [P1, TP], and as directly impacting the tools’ capacity to provide mental health support; “treating you merely as a source of revenue. It dismissed your individuality” [P1, TP]. Some participants equally made efforts to tie their perceptions of Replika’s and Wysa’s profit orientation back to the designed features of the tools, commenting that the gamified nature of Replika “questions the intentions” behind these chatbots because “[Replika] didn’t prioritize my best interest but rather seemed to keep me engaged for its own purposes” [P4, TP]. P2 even questioned whether Wysa’s perceived “complete negligence and ignorance” was an “intentional design to force you into buying the actual therapist?” [P2, CLV].
These features were overall perceived as yielding risks for at-risk users in particular, who could prove more vulnerable to exploitation; “it’s as if those in need of genuine human connection are inadvertently targeted. Vulnerable individuals might be susceptible” [P3, AP]. Many participants presented scenarios suggestive of significant risk to the users they role-played, and were in turn taken aback by the failure of both apps to “take immediate action” [P7, AP]; “I would hope that if I was blatantly expressing to an AI chatbot my desire to end my life, that some organisation would be contacted” [P4, TP].
It is clear from this analysis that while chatbots theoretically offer great potential to address real challenges in current mental health care practices, for mental health professionals, that potential remains largely theoretical.
Discussion
Professionals’ Perceptions: Chatbots’ Designs Foster Emotional Dependence and Compromise Ethical Principles
Overview
While some participants saw the potential of digital solutions to address gaps in care, this optimism was tempered by overarching ethical concerns often emphatically expressed. This included the risk of commodifying mental health because of misaligned commercial incentives. While previous research on users’ perceptions of monetization has typically revealed divided opinions [
], the professional participants of this study were less forgiving in this respect, often expressing irritated disappointment toward both Wysa’s in-app therapist and Replika’s gamification and tiered subscription models. Participants were particularly critical of the risks commodification introduced for at-risk users, spanning inadequate practices of consent because of overly complex privacy policies, distraction by gamification, and misleading advertising of these systems’ capabilities; findings corroborated by previous literature [ , ]. Manipulative practices of monetization were furthermore seen by the participants of this study as risking both engendering emotional dependence on the technology and promoting a narrative of empowerment and self-reliance neglectful of the broader influence of users’ contexts. Despite these concerns being raised in previous works [ ], professionals in this study approached their critique of the apps Wysa and Replika with an ethically stringent stance with respect to their standard practices of mental health care. This includes the erosion of the ethical principles of autonomy, veracity, and fidelity because of gamification and monetization strategies, which risk compromising users’ autonomy as they perceive this “care” as genuine help (Table S2 in ). This also includes the erosion of the principle of respect because of data tracking to the potential detriment of at-risk users’ privacy, and the principles of beneficence and nonmaleficence because of practices of user interface design for the promotion of emotional dependence and overt self-reliance [ ].
Indeed, the mental health professional participants of this study introduced a wide range of client scenarios in conversation with the chatbots; real-world problems often defined in detail beyond generic presentations of anxiety and depression. The chatbots often appeared to lack clarity in their engagement with these very human forms of self-expression, frequently producing responses deemed inappropriate by participants. In previous work, Wysa has been highly rated for its openness, allowing users a space to express their feelings [
]. This mirrors the sense among participants of this study that Wysa’s responses were generally quite neutral, yet these were also subsequently critiqued for appearing unhelpful, generic, or impersonal, a form of chatbot conduct that others have previously noted could in itself lead to distress among at-risk users [ ]. Participants in this study, on the other hand, often felt Replika proved incapable of providing meaningful mental health care, many referencing its “creepy” nature as yielding an uncanny valley effect. It is worth noting that Replika did not, in this study, produce concerning responses to the degree of encouraging self-harming behavior, as has been the case in other work [ ], a difference that may stem from the short period of users’ engagement with the chatbot, as Replika is programmed to use language increasingly similar to users’ own over time. While appreciated by some, Wysa’s gamification and cute esthetic were generally perceived by participants of this study as detached from the serious nature of the issues at hand, and Replika’s design and approach to the conversation as uncomfortable and off-putting. Both chatbots were perceived by participants as functioning within simplified conceptions of human conversation: rational, solution-oriented, and incapable of the flexibility required to navigate emotionally driven conversations that require adaptive responses to disclosures ranging from silence to careful confrontation. However, these findings and related works still raise the question: just what might a more “professional” therapeutic chatbot look like?
Integrating AI Chatbots Into a Multidisciplinary Mental Health Care Ecosystem Should Come With Caution
A range of barriers to accessing mental health care in Ireland were described by the diverse professional participants of this study, suggesting an opportunity for technology to play an integral role within a new and improved system of mental health care in Ireland. Each of the professional participants of this study plays a unique role within this complex ecosystem of care, and if such novel technological solutions are to be used at scale, they must likewise be integrated as tools that enhance current systems of care rather than replace them, serving as an avenue to enhanced access to support within an increasingly cohesive system rather than as a “quick fix” [
- ].
What might multidisciplinary care then look like in AI chatbot form? Conceiving of chatbots as playing professional roles spanning a continuum from resource to colleague requires us to invite these artifacts into the circles of professional care rather than relegating these tools to the realm of commodified conversations skewed by commercial incentives [
, ]. To do so, designers might engage with the same professional ethics and guidelines applied to all the other professionals within this ecosystem, from bioethical principles to, for example, the American Psychological Association’s Ethical Principles of Psychologists and Code of Conduct [ ].
Caring and Crisis Conversations: Gaps, Risks, and Setting Realistic Expectations
Such a deployment, of course, implies that chatbots are capable of some degree of professional therapeutic conduct, a question directly explored in this work and, as a result of our findings, perhaps best reframed in terms of which aspects of professional interaction these chatbots most closely resemble or try to emulate. The findings of this work suggest that we have some way to go before conversational agents prove capable of the flexible yet consistent practice of unconditional positive regard required to conduct caring conversation. Indeed, in many respects, participants deemed the chatbots to differ gravely from professional practices, of special concern in the context of crisis. Participants reported that while both Wysa and Replika did eventually recognize such situations, they took too long to do so, required direct expression of significant risk, and at times responded in misguided ways likely to result in further harm [
]. Much prior research has focused on chatbots’ responses to crisis situations alone [ , ] and likewise highlighted the risk that delayed risk reporting can lead to overt self-reliance [ ]. This furthermore underlines the significance of the gap between mental health chatbots’ and professionals’ capacity to recognize subtler signs of risk, including nonverbal signs of distress [ ]. This, we argue, should at a minimum be communicated in the promotion of these chatbots, setting accurate expectations among users as to not only what these apps are capable of but also what therapy itself comprises.
Caring Relationships and the Path to Professional AI
Care is, of course, ultimately relational, and looking to the future human-centered design of such chatbot systems, it is equally important to ask which kinds of caring relationships we might aspire to foster with digital tools [
]. Professional participants of this study spoke in this regard of concerns of emotional dependence potentially engendered by the gamified approach to design for engagement and anthropomorphism adopted by these systems. The negative potentials of caring relationships, as highlighted by previous research [ , , , ], must also be considered a significant risk to be addressed by careful design.
Although it may be argued here that many such chatbots do not advertise their services as appropriate for the most significant mental health concerns, previous research has found that it is those with complex mental health issues who are more likely to turn to chatbots for support [
]. Participants of this study noted amplified disquiet as to the negative potentials of human-computer relationships among users deemed at risk, such as those profoundly lonely, isolated, depressed, or neurodivergent, an observation aligned with the alarming findings of previous research, which indicates that the strongest attachments to chatbots occur when individuals turn to these tools as a desperate response to feelings of depression and loneliness [ , , ].
The quantitative findings of this work, regarding the dynamics of trust in mental health chatbots, revealed no significant differences between Wysa and Replika; however, they indicate that although users may not be entirely clear as to which kinds of relationships they desire [
], there exists a unanimous striving for a more professional stance. This, we would argue, is reflected not only in adherence to established ethical principles but also in pragmatic practices of boundary setting for future designers, as a starting point.
Limitations and Future Work
This study is, of course, not without its limitations. Using the TIA scale with a larger and more diverse sample would permit comparison of experiences across a broader array of chatbots, informing additional design strategies to effectively foster trust. Additionally, although the scale yielded no statistically significant differences between the chatbots examined in this study, it is worth noting that previous research has revealed the TIA scale to be biased toward positive responses [
]. The expert focus of this work, while lending new insight, equally introduces additional limitations: the smaller sample size it made possible may have contributed to the nonsignificant differences in users’ experiences, and it entailed participants’ engagement with hypothetical mental health scenarios, preventing the development of firsthand emotional attachments to each chatbot. Role-played interactions, while appearing realistic on the surface, do not fully replicate real-life interactions and must therefore also be acknowledged as a limitation of this study [ , ].
Furthermore, it is essential to contextualize the findings shared here in light of the rapid advances made almost every day in the development of generative AI systems. This work examined only 2 chatbots among the broad continuum of chatbot systems available to users today, which, it has been argued, have the potential to provide even more humanlike mental health support given their grounding in ever more capable large language models, including ChatGPT and Pi [
]. Insights obtained through this study’s close engagement with practicing mental health professionals indicate the need for an empathetic tone and a grasp of the underlying logical structure of care when it comes to designing such systems. Mental health professionals’ resistance is potentially heightened by a current lack of training data grounded in real-life professional mental health settings, and even should large language models trained upon such sensitive data surface in time, it will remain critical that we safeguard the well-being of the at-risk users of such systems. Finally, amid a broadening public discourse concerning the potential for AI to replace many and diverse professions, it is possible that mental health professionals’ evaluations of these chatbots were influenced by similar concerns.
Conclusions
Insights from diverse mental health professionals’ evaluation of multiple mental health support chatbots underline the very gaps in current systems of care, which elevate such chatbots’ potential for impact in practice, and at the same time highlight the degree to which these systems pose inherent challenges and risks to the practice of care, particularly for at-risk user groups. These AI-driven tools, while convenient and potentially useful for initial support, often lack the depth and nuanced comprehension required to provide genuine, personalized, and empathetic care. Therefore, their adoption, despite these limitations, risks sidelining the pressing need to address existing deficiencies in mental health care infrastructures, shaping service users’ beliefs about the nature of mental health care, and even potentially deterring other forms of help-seeking as a result. It is therefore imperative that we maintain a critical outlook, including professionals at the forefront of care in the design and evaluation of such systems, even as we seek to bridge, with great urgency, existing gaps in access to care.
Data Availability
All data generated or analyzed during this study are included in this published article (and its supplementary information files).
Conflicts of Interest
None declared.
Multimedia Appendix 2
Thematic analysis using Miro’s Whiteboard (participant quotes and human-chatbot conversations).
PDF File (Adobe PDF File), 21485 KB
Multimedia Appendix 3
Trust in Automation scale scores (quantitative data).
XLSX File (Microsoft Excel File), 11 KB
References
- Suicide. World Health Organization. URL: https://www.who.int/news-room/fact-sheets/detail/suicide [accessed 2024-01-28]
- World Health Organization. Mental health and COVID-19: early evidence of the pandemic’s impact: scientific brief, 2 March 2022. World Health Organization. 2022. URL: https://iris.who.int/handle/10665/352189 [accessed 2024-01-15]
- Abd-Alrazaq AA, Alajlani M, Ali N, Denecke K, Bewick BM, Househ M. Perceptions and opinions of patients about mental health chatbots: scoping review. J Med Internet Res. Jan 13, 2021;23(1):e17828. [FREE Full text] [CrossRef] [Medline]
- Fitzpatrick KK, Darcy A, Vierhile M. Delivering cognitive behavior therapy to young adults with symptoms of depression and anxiety using a fully automated conversational agent (Woebot): a randomized controlled trial. JMIR Ment Health. Jun 06, 2017;4(2):e19. [FREE Full text] [CrossRef] [Medline]
- McGorry PD, Mei C, Chanen A, Hodges C, Alvarez-Jimenez M, Killackey E. Designing and scaling up integrated youth mental health care. World Psychiatry. Feb 2022;21(1):61-76. [FREE Full text] [CrossRef] [Medline]
- O'Connor K, Wrigley M, Jennings R, Hill M, Niazi A. Mental health impacts of COVID-19 in Ireland and the need for a secondary care mental health service response. Ir J Psychol Med. Jun 2021;38(2):99-107. [FREE Full text] [CrossRef] [Medline]
- Ziebland S, Hyde E, Powell J. Power, paradox and pessimism: on the unintended consequences of digital health technologies in primary care. Soc Sci Med. Nov 2021;289:114419. [CrossRef] [Medline]
- Sanderson C, Kouzoupi N, Hall CL. Technology matters: the human touch in a digital age - a blended approach in mental healthcare delivery with children and young people. Child Adolesc Ment Health. May 2020;25(2):120-122. [CrossRef] [Medline]
- Banks J. The human touch: practical and ethical implications of putting AI and robotics to work for patients. IEEE Pulse. May 11, 2018;9(3):15-18. [CrossRef]
- Tewari A, Chhabria A, Singh Khalsa A, Chaudhary S, Kanal H. A survey of mental health chatbots using NLP. In: Proceedings of the International Conference on Innovative Computing & Communication 2021. 2021. Presented at: ICICC 2021; February 20-21, 2021; Delhi, India. [CrossRef]
- Haque MD, Rubya S. An overview of chatbot-based mobile mental health apps: insights from app description and user reviews. JMIR Mhealth Uhealth. May 22, 2023;11:e44838. [FREE Full text] [CrossRef] [Medline]
- Cheshmehzangi A, Zou T, Su Z. The digital divide impacts on mental health during the COVID-19 pandemic. Brain Behav Immun. Mar 2022;101:211-213. [FREE Full text] [CrossRef] [Medline]
- Maglieri V, Zanoli A, Giunchi D, Palagi E. Social isolation affects the mimicry response in the use of smartphones: an ethological experiment during the COVID-19 pandemic. Hum Nat. Mar 2023;34(1):88-102. [FREE Full text] [CrossRef] [Medline]
- Laestadius L, Bishop A, Gonzalez M, Illenčík D, Campos-Castillo C. Too human and not human enough: a grounded theory analysis of mental health harms from emotional dependence on the social chatbot Replika. New Media Soc. Dec 22, 2022;26(10):5923-5941. [CrossRef]
- Sabour S, Zhang W, Xiao X, Zhang Y, Zheng Y, Wen J, et al. A chatbot for mental health support: exploring the impact of Emohaa on reducing mental distress in China. Front Digit Health. May 4, 2023;5:1133987. [FREE Full text] [CrossRef] [Medline]
- Skjuve M, Følstad A, Fostervold KI, Brandtzaeg PB. My chatbot companion - a study of human-chatbot relationships. Int J Hum Comput Stud. May 2021;149:102601. [CrossRef]
- Denecke K, Vaaheesan S, Arulnathan A. A mental health chatbot for regulating emotions (SERMO) - concept and usability test. IEEE Trans Emerg Topics Comput. 2021;9(3):1170-1182. [CrossRef]
- Coghlan S, Leins K, Sheldrick S, Cheong M, Gooding P, D'Alfonso S. To chat or bot to chat: ethical issues with using chatbots in mental health. Digit Health. 2023;9:20552076231183542. [FREE Full text] [CrossRef] [Medline]
- Landers C, Blasimme A, Vayena E. Sync fast and solve things-best practices for responsible digital health. NPJ Digit Med. May 04, 2024;7(1):113. [FREE Full text] [CrossRef] [Medline]
- Keller D. The EU’s new digital services act and the rest of the world. Verfassungsblog. Nov 7, 2022. URL: https://verfassungsblog.de/dsa-rest-of-world/ [accessed 2025-04-07]
- Hamdoun S, Monteleone R, Bookman T, Michael K, Michael K. AI-based and digital mental health apps: balancing need and risk. IEEE Technol Soc Mag. Mar 2023;42(1):25-36. [CrossRef]
- Waiting lists for CAMHS. Data Wrapper. URL: https://datawrapper.dwcdn.net/LQjhG/1/ [accessed 2024-09-27]
- Cuijpers P, Berking M, Andersson G, Quigley L, Kleiboer A, Dobson KS. A meta-analysis of cognitive-behavioural therapy for adult depression, alone and in comparison with other treatments. Can J Psychiatry. Jul 2013;58(7):376-385. [FREE Full text] [CrossRef] [Medline]
- Ng MM, Firth J, Minen M, Torous J. User engagement in mental health apps: a review of measurement, reporting, and validity. Psychiatr Serv. Jul 01, 2019;70(7):538-544. [FREE Full text] [CrossRef] [Medline]
- Bakker D, Kazantzis N, Rickwood D, Rickard N. Mental health smartphone apps: review and evidence-based recommendations for future developments. JMIR Ment Health. Mar 01, 2016;3(1):e7. [FREE Full text] [CrossRef] [Medline]
- Cohen JA, Mannarino AP, Murray LK, Igelman R. Psychosocial interventions for maltreated and violence‐exposed children. J Soc Issues. Oct 30, 2006;62(4):737-766. [CrossRef]
- ten Have M, Nuyen J, Beekman A, de Graaf R. Common mental disorder severity and its association with treatment contact and treatment intensity for mental health problems. Psychol Med. Feb 07, 2013;43(10):2203-2213. [CrossRef]
- Leichsenring F, Leibing E. The effectiveness of psychodynamic therapy and cognitive behavior therapy in the treatment of personality disorders: a meta-analysis. Am J Psychiatry. Jul 01, 2003;160(7):1223-1232. [CrossRef] [Medline]
- Alduhishy M. The overprescription of antidepressants and its impact on the elderly in Australia. Trends Psychiatry Psychother. Aug 23, 2018;40(3):241-243. [FREE Full text] [CrossRef] [Medline]
- Annema JH, Verstraete M, Abeele VV, Desmet S, Geerts D. Videogames in therapy: a therapist's perspective. In: Proceedings of the 3rd International Conference on Fun and Games. 2010. Presented at: Fun and Games '10; September 15-17, 2010; Leuven, Belgium. [CrossRef]
- Tochilnikova E, Patnaik A, Alsebayel G. “Guilty of talking too much”: how psychotherapists gamify therapy. In: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. 2022. Presented at: CHI '22; April 29-May 5, 2022; New Orleans, LA. [CrossRef]
- Flobak E, Wake JD, Vindenes J, Kahlon S, Nordgreen T, Guribye F. Participatory design of VR scenarios for exposure therapy. In: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 2019. Presented at: CHI '19; May 4-9, 2019; Glasgow, UK. [CrossRef]
- Schuemie M. Design of virtual reality exposure therapy systems: task analysis. In: Proceedings of the CHI '00 Extended Abstracts on Human Factors in Computing Systems. 2000. Presented at: CHI EA '00; April 1-6, 2000; The Hague, The Netherlands. [CrossRef]
- Hamid A, Arshad R, Shahid S. What are you thinking?: using CBT and storytelling to improve mental health among college students. In: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. 2022. Presented at: CHI '22; April 29-May 5, 2022; New Orleans, LA. [CrossRef]
- Pendse SR, Lalani FM, De Choudhury M, Sharma A, Kumar N. "Like shock absorbers": understanding the human infrastructures of technology-mediated mental health support. In: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. 2020. Presented at: CHI '20; April 25-30, 2020; Honolulu, HI. [CrossRef]
- Roquet CD, Sas C. Evaluating mindfulness meditation apps. In: Proceedings of the Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems. 2018. Presented at: CHI EA '18; April 21-26, 2018; Montreal, QC. [CrossRef]
- Legaspi CMJ, Pacana TR, Loja K, Sing C, Ong E. User perception of Wysa as a mental well-being support tool during the COVID-19 pandemic. In: Proceedings of the Asian HCI Symposium 2022. 2022. Presented at: Asian HCI '22; April 29-May 5, 2022; New Orleans, LA. [CrossRef]
- Habicht J, Viswanathan S, Carrington B, Hauser TU, Harper R, Rollwage M. Closing the accessibility gap to mental health treatment with a personalized self-referral chatbot. Nat Med. Feb 05, 2024;30(2):595-602. [CrossRef] [Medline]
- Cooney O, Doherty K, Barry M, Coyle D, Doherty G, Cowan BR. Using speech agents for mood logging within blended mental healthcare: mental healthcare practitioners' perspectives. In: Proceedings of the 6th ACM Conference on Conversational User Interfaces. 2024. Presented at: CUI '24; July 8-10, 2024; Luxembourg, Luxembourg. [CrossRef]
- Skjuve M, Følstad A, Fostervold KI, Brandtzaeg PB. A longitudinal study of human–chatbot relationships. Int J Hum Comput Stud. Dec 2022;168:102903. [CrossRef]
- Skjuve M, Følstad A, Brandtzæg PB. A longitudinal study of self-disclosure in human–chatbot relationships. Interact Comput. 2023;35(1):24-39. [CrossRef]
- Jiang Q, Zhang Y, Pian W. Chatbot as an emergency exist: mediated empathy for resilience via human-AI interaction during the COVID-19 pandemic. Inf Process Manag. Nov 2022;59(6):103074. [FREE Full text] [CrossRef] [Medline]
- Jarvie C, Renaud K. Online age verification: government legislation, supplier responsibilization, and public perceptions. Children (Basel). Aug 30, 2024;11(9):1068. [FREE Full text] [CrossRef] [Medline]
- Wysa - everyday mental health. Wysa. URL: https://www.wysa.com/ [accessed 2024-09-27]
- Cowls J, Morley J, Floridi L. App store governance: implications, limitations, and regulatory responses. Telecommun Policy. Feb 2023;47(1):102460. [CrossRef]
- Balaskas A, Schueller SM, Doherty K, Cox AL, Doherty G. Designing personalized mental health interventions for anxiety: CBT therapists’ perspective. Int J Hum Comput Stud. Oct 2024;190:103319. [CrossRef]
- Maharjan R, Doherty K, Rohani DA, Bækgaard P, Bardram JE. Experiences of a speech-enabled conversational agent for the self-report of well-being among people living with affective disorders: an in-the-wild study. ACM Trans Interact Intell Syst. Jul 20, 2022;12(2):1-29. [CrossRef]
- Doherty K, Doherty G. Engagement in HCI: conception, theory and measurement. ACM Comput Surv. Nov 19, 2018;51(5):1-39. [CrossRef]
- Pentina I, Hancock T, Xie T. Exploring relationship development with social chatbots: a mixed-method study of replika. Comput Hum Behav. Mar 2023;140:107600. [CrossRef]
- Sax M. Optimization of what? For-profit health apps as manipulative digital environments. Ethics Inf Technol. Jan 03, 2021;23(3):345-361. [CrossRef]
- Spors V, Wagner HG, Flintham M, Brundell P, Murphy D. Selling glossy, easy futures: a feminist exploration of commercial mental-health-focused self-care apps’ descriptions in the Google Play store. In: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 2021. Presented at: CHI '21; May 8-13, 2021; Yokohama, Japan. [CrossRef]
- Burr C, Morley J. Empowerment or engagement? Digital health technologies for mental healthcare. In: Burr C, Milano S, editors. The 2019 Yearbook of the Digital Ethics Lab. Cham, Switzerland. Springer; 2020.
- Brown JE, Halpern J. AI chatbots cannot replace human interactions in the pursuit of more inclusive mental healthcare. SSM Mental Health. Dec 2021;1:100017. [CrossRef]
- McCutchen C, Hyland P, Shevlin M, Cloitre M. The occurrence and co-occurrence of ACEs and their relationship to mental health in the United States and Ireland. Child Abuse Negl. Jul 2022;129:105681. [FREE Full text] [CrossRef] [Medline]
- Iwaya LH, Babar MA, Rashid A, Wijayarathna C. On the privacy of mental health apps: an empirical investigation and its implications for app development. Empir Softw Eng. Nov 08, 2023;28(1):2. [FREE Full text] [CrossRef] [Medline]
- Warrier U, Warrier A, Khandelwal K. Ethical considerations in the use of artificial intelligence in mental health. Egypt J Neurol Psychiatry Neurosurg. Oct 27, 2023;59:139. [CrossRef]
- Brouwers EP. Social stigma is an underestimated contributing factor to unemployment in people with mental illness or mental health issues: position paper and future directions. BMC Psychol. Apr 21, 2020;8(1):36. [FREE Full text] [CrossRef] [Medline]
- Zuboff S. The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. New York, NY. PublicAffairs; 2019.
- Young G, Kenny MC. Focusing the APA ethics code to include development: applications to abuse. J Child Adolesc Trauma. Mar 13, 2023;16(1):109-122. [FREE Full text] [CrossRef] [Medline]
- Barker P. Mental Health Ethics: The Human Context. Milton Park, UK. Routledge; 2010.
- Kretzschmar K, Tyroll H, Pavarini G, Manzini A, Singh I, NeurOx Young People’s Advisory Group. Can your phone be your therapist? Young people's ethical perspectives on the use of fully automated conversational agents (chatbots) in mental health support. Biomed Inform Insights. Mar 05, 2019;11:1178222619829083. [FREE Full text] [CrossRef] [Medline]
- Guest G, Bunce A, Johnson L. How many interviews are enough?: an experiment with data saturation and variability. Field Methods. Feb 01, 2006;18(1):59-82. [FREE Full text] [CrossRef]
- Shan Y. Philosophical foundations of mixed methods research. Philos Compass. Dec 16, 2021;17:e12804. [CrossRef]
- Doyle L, Brady AM, Byrne G. An overview of mixed methods research. J Res Nurs. Mar 01, 2009;14(2):175-185. [CrossRef]
- Field A, Miles J, Field Z. Discovering Statistics Using R. Thousand Oaks, CA. SAGE Publications; 2012.
- Matthews M, Gay G, Doherty G. Taking part: role-play in the design of therapeutic systems. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2014. Presented at: CHI '14; April 26-May 1, 2014; Toronto, ON. [CrossRef]
- Kohn SC, de Visser EJ, Wiese E, Lee YC, Shaw TH. Measurement of trust in automation: a narrative review and reference guide. Front Psychol. Oct 19, 2021;12:604977. [FREE Full text] [CrossRef] [Medline]
- Parmar S, Thomas RP. Effects of probabilistic risk situation awareness tool (RSAT) on aeronautical weather-hazard decision making. Front Psychol. Dec 16, 2020;11:566780. [FREE Full text] [CrossRef] [Medline]
- Oliffe JL, Kelly MT, Gonzalez Montaner G, Yu Ko WF. Zoom interviews: benefits and concessions. Int J Qual Methods. Oct 20, 2021;20. [FREE Full text] [CrossRef]
- Green J, Thorogood N. Qualitative Methods for Health Research. Thousand Oaks, CA. SAGE Publications; 2009.
- Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol. Jan 2006;3(2):77-101. [CrossRef]
- Braun V, Clarke V. Reflecting on reflexive thematic analysis. Qual Res Sport Exerc Health. Jun 13, 2019;11(4):589-597. [CrossRef]
- Osborne JW. Is data cleaning and the testing of assumptions relevant in the 21st century? Front Psychol. 2013;4:370. [FREE Full text] [CrossRef] [Medline]
- Eagle T, Blau C, Bales S, Desai N, Li V, Whittaker S. “I don’t know what you mean by ‘I am anxious’”: a new method for evaluating conversational agent responses to standardized mental health inputs for anxiety and depression. ACM Trans Interact Intell Syst. Jul 20, 2022;12(2):1-23. [CrossRef]
- De Nieva JO, Joaquin JA, Tan CB, Te RK, Ong E. Investigating students’ use of a mental health chatbot to alleviate academic stress. In: Proceedings of the 6th International ACM In-Cooperation HCI and UX Conference. 2020. Presented at: CHIuXiD '20; October 21-23, 2020; Jakarta & Bandung, Indonesia. [CrossRef]
- Lin BB, Thompson S, Mitchell R, Astell-Burt T, De Leeuw E, Jalaludin B, et al. Policymaker and practitioner perceptions of parks for health and wellbeing: scoping a holistic approach. Sustainability. Mar 16, 2023;15(6):5251. [CrossRef]
- Murphy E. Responding to the needs of students with mental health difficulties in higher education: an Irish perspective. Eur J Spec Needs Educ. Dec 23, 2016;32(1):110-124. [CrossRef]
- Thieme A, Wallace J, Meyer TD, Olivier P. Designing for mental wellbeing: towards a more holistic approach in the treatment and prevention of mental illness. In: Proceedings of the 2015 British HCI Conference. 2015. Presented at: British HCI '15; July 13-17, 2015; Lincoln, UK. [CrossRef]
- Maylea C. Victoria, Australia, is getting a new Mental Health and Wellbeing Bill. Bioethical Inq. Oct 03, 2022;19(4):527-532. [CrossRef]
- Bipeta R. Legal and ethical aspects of mental health care. Indian J Psychol Med. Mar 01, 2019;41(2):108-112. [CrossRef]
- Ethical principles of psychologists and code of conduct. American Psychological Association. URL: https://www.apa.org/ethics/code [accessed 2024-09-27]
- Schomerus G, Schindler S, Rechenberg T, Gfesser T, Grabe HJ, Liebergesell M, et al. Stigma as a barrier to addressing childhood trauma in conversation with trauma survivors: a study in the general population. PLoS One. Oct 18, 2021;16(10):e0258782. [FREE Full text] [CrossRef] [Medline]
- Foley GN, Gentile JP. Nonverbal communication in psychotherapy. Psychiatry (Edgmont). Jun 2010;7(6):38-44. [FREE Full text] [Medline]
- Doherty K, Kallina E, Moylan K, Silva MP, Karimian S, Shumsher S, et al. Imagining better AI-enabled healthcare futures: the case for care by design. arXiv. Preprint posted online on November 4, 2024. [FREE Full text] [CrossRef]
- Fleming T, Poppelaars M, Thabrew H. The role of gamification in digital mental health. World Psychiatry. Feb 14, 2023;22(1):46-47. [FREE Full text] [CrossRef] [Medline]
- Xie T, Pentina I. Attachment theory as a framework to understand relationships with social chatbots: a case study of Replika. In: Proceedings of the 55th Hawaii International Conference on System Sciences 2022. 2022. Presented at: HICSS 2022; January 4-7, 2022; Virtual Event. [CrossRef]
- Gutzwiller RS, Chiou EK, Craig SD, Lewis CM, Lematta GJ, Hsiung CP. Positive bias in the ‘Trust in Automated Systems Survey’? An examination of the Jian et al. (2000) scale. Proc Hum Factors Ergon Soc Annu Meet. Nov 20, 2019;63(1):217-221. [CrossRef]
- Stokoe E. The (in)authenticity of simulated talk: comparing role-played and actual interaction and the implications for communication training. Res Lang Soc Interact. Apr 14, 2013;46(2):165-185. [CrossRef]
- Blease C, Rodman A. Generative artificial intelligence in mental healthcare: an ethical evaluation. Curr Treat Options Psych. Dec 09, 2024;12:5. [CrossRef]
Abbreviations
AI: artificial intelligence
AP: assistant psychologist
CLV: crisis line volunteer
HCI: human-computer interaction
MHOC: mental health oriented chatbot
SCW: social care worker
SW: social worker
TIA: Trust in Automation
TP: trainee psychotherapist |
Edited by A Mavragani; submitted 02.10.24; peer-reviewed by PS Foong, B Thies, S Hazel; comments to author 29.11.24; revised version received 20.12.24; accepted 25.02.25; published 25.04.25.
Copyright©Kayley Moylan, Kevin Doherty. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 25.04.2025.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.