Engagement and Participant Experiences With Consumer Smartwatches for Health Research: Longitudinal, Observational Feasibility Study

Background Wearables provide opportunities for frequent health data collection and symptom monitoring. The feasibility of using consumer cellular smartwatches to provide information on both symptoms and contemporaneous sensor data has not yet been investigated.
Objective This study aimed to investigate the feasibility and acceptability of using cellular smartwatches to capture multiple patient-reported outcomes per day alongside continuous physical activity data over a 3-month period in people living with knee osteoarthritis (OA).
Methods For the KOALAP (Knee OsteoArthritis: Linking Activity and Pain) study, a novel cellular smartwatch app for health data collection was developed. Participants (age ≥50 years; self-diagnosed knee OA) received a smartwatch (Huawei Watch 2) with the KOALAP app. When worn, the watch collected sensor data and prompted participants to self-report outcomes multiple times per day. Participants were invited for a baseline and follow-up interview to discuss their motivations and experiences. Engagement with the watch was measured using daily watch wear time and the percentage completion of watch questions. Interview transcripts were analyzed using grounded thematic analysis.
Results A total of 26 people participated in the study. Good use and engagement were observed over 3 months: most participants wore the watch on 75% (68/90) of days or more, for a median of 11 hours. The number of active participants declined over the study duration, especially in the final week. Among participants who remained active, neither watch wear time nor question completion percentage declined over time. Participants were mainly motivated to learn about their symptoms and enjoyed the self-tracking aspects of the watch. Barriers to full engagement were battery life limitations, technical problems, and unfulfilled expectations of the watch. Participants reported that they would have liked to report symptoms more than 4 or 5 times per day.
Conclusions This study shows that capture of patient-reported outcomes multiple times per day with linked sensor data from a smartwatch is feasible over at least a 3-month period.
International Registered Report Identifier (IRRID) RR2-10.2196/10238


Introduction
Background
Wearables, such as activity trackers, provide opportunities for frequent monitoring of chronic diseases. Their sensors can record behaviors of interest at high temporal and spatial resolution [1-4]. Wearables are widely used: in 2016 there were 325 million connected wearable devices worldwide, with half of the owners wearing their device every day [5,6]. In health care, sensor data from wearables would be even more relevant if combined with simultaneously collected patient-reported outcomes. This would enable symptom monitoring, adding context to the sensor outputs, and may aid clinical decision making and empower patients [7].

Consumer Cellular Smartwatches
A new technical innovation enables collection of sensor data alongside patient-reported outcomes. In 2017, the first cellular smartwatches came to market. Cellular smartwatches combine the functionalities of smartphones (touch screen, SIM card and cellular connection, and possibility to develop and install apps) with wearables (passive collection of sensor data, wrist-worn). This enables frequent collection of patient-reported outcomes (via touchscreen) alongside accurate and objective information on behavior or exposure (from sensors). Furthermore, these data can be collected in real time and automatically uploaded to remote servers without the need for pairing with a smartphone or another device for connectivity.

Physical Activity and Knee Osteoarthritis
An example of a clinical disease area where the pairing of symptoms and sensor data, especially on physical activity, could significantly advance research is arthritis [8]. Knee osteoarthritis (OA) is one of the most common types of arthritis: it affects 19% to 28% of men and women older than 45 years and is characterized by disabling knee pain and a reduction in mobility [9]. Physical activity is beneficial in reducing long-term pain severity and disability [10] and has cardiovascular and other benefits. However, the relationship between pain and activity is complex: pain can limit the amount of physical activity that is possible, while increasing physical activity beyond a certain level may further increase pain severity [11]. Wearable devices have been used to track physical activity in OA research [12], but frequent symptoms are rarely collected in parallel. Furthermore, proprietary algorithms from consumer fitness trackers are less accurate in arthritis patients because gait characteristics differ between healthy people and those with musculoskeletal conditions [13]. Understanding the interplay between physical exercise and symptoms would be an important step in helping to develop and target personalized interventions to support an appropriate level of physical activity, for example, by encouraging more physical activity within an individual's personal threshold. This requires frequent, accurate, and granular data on pain symptoms and activity, which cellular smartwatches may be able to provide.

Objectives
The feasibility of collecting such data through cellular smartwatches remains uncertain. Knowledge of barriers and enablers of engaging with cellular smartwatches long term could inform the design of future studies. To address this, we conducted the KOALAP (Knee OsteoArthritis: Linking Activity and Pain) study. We developed a cellular smartwatch app for collection of patient-reported outcomes (multiple times a day) alongside continuous sensor data. The aim of this feasibility study was to investigate engagement patterns and acceptability of collecting health and behavior data using consumer cellular smartwatches daily for 3 months. Specifically, the study objectives were to report participant engagement, to investigate participant views and experiences, and to identify barriers and enablers to collecting data through cellular smartwatches.

Subjects and Data Collection From the Smartwatch
Men and women older than 50 years with self-reported knee OA were recruited in September 2017 for participation in a 90-day observational study. Detailed methods have been reported elsewhere [14].
In brief, participants received a Huawei Watch 2 preinstalled with the KOALAP study app, which was developed by the study team and Google Android Wear; all other features and apps were disabled. Participants were instructed to wear the watches for 90 days, from waking until going to bed, and answer the watch questions when prompted (Figure 1). At baseline, participants reported age, gender, and previous experience with health technology (see [14]), and, for the watch questions, the activity that caused most knee pain and the activity that was most important for them to do without knee pain. At study completion, participants received a Web-based questionnaire with questions about their experiences with the watch, for example, "I often forgot to charge the watch or wear it again after charging." The full questionnaire is available as an appendix to the published study protocol [14].
The KOALAP app triggered 4 or 5 questions on knee pain and quality of life per day. Each question had to be answered within a specific time window and took around 10 seconds to answer.
When a participant was wearing the watch, the KOALAP app collected raw sensor data. When participants took off the watch, off-body detection stopped sensor data collection. On the home screen of the watch, participants could see their heart rate and step count as calculated by the Android operating system. During recharging, data were uploaded to the study servers and deleted from the watch. If participants were abroad or had poor cellular signal at the charging location, the data upload failed, and the watch stopped collecting sensor data.

Participant Interviews
All participants were invited to take part in 2 separate interviews (1 shortly after baseline and 1 on completion of the study). A semistructured interview schedule (see Multimedia Appendix 1) was developed from the sociological research literature on self-tracking [16-21]. This literature is split between a techno-utopian approach and a critical approach. The techno-utopian approach suggests that self-tracking can empower and motivate individuals to adopt a healthy lifestyle. The critical approach has focused more on the implications of self-tracking for privacy, personal responsibility, surveillance, and changing views of the body and health. The interview schedule was also informed by factors known to affect attrition in digital health studies, including usability, feedback, perceived advantages of participation, time required, user experience, and external events such as health [22]. The interviews at baseline explored participants' experiences of living with OA, motivations and expectations of using a smartwatch, and previous experiences with health technology. We were also interested in whether previous engagement with devices had changed health-related behaviors. At follow-up, interviews explored participants' experiences of using the watch and being monitored and whether the knowledge gained from using the watch (if any) had significant implications for their understanding of knee OA.

Engagement With Smartwatch
The primary measures of engagement were the number of active participants per day, hours of wear time, and completeness of watch questions. Active participants were defined as participants who wore the watch for at least 30 min on a given day. Wear time per participant-day was defined as the total hours of available sensor data, rounded to the nearest hour. The completeness of watch questions was defined as the percentage of watch questions completed (per specific watch question over the study duration; per participant-day). For each study day, we calculated mean wear time and mean completeness of watch questions across all participants and across all active participants.
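The per-day engagement measures defined above can be sketched in code. The following Python is a minimal illustration, not the study's analysis code: the record layout (`DayRecord`), its field names, the tie-breaking rule for rounding, and the example values are all assumptions made for this sketch. Only the 30-minute activity threshold and the definitions of wear time and completeness come from the text.

```python
from dataclasses import dataclass
from statistics import mean

ACTIVE_THRESHOLD_MIN = 30  # from the text: active = at least 30 min of wear


@dataclass
class DayRecord:
    """One participant-day (hypothetical layout, for illustration only)."""
    participant: str
    study_day: int
    sensor_minutes: float     # minutes of sensor data received that day
    questions_triggered: int  # watch questions prompted that day
    questions_answered: int   # watch questions completed that day


def is_active(rec):
    return rec.sensor_minutes >= ACTIVE_THRESHOLD_MIN


def wear_hours(rec):
    # Hours of available sensor data, rounded to the nearest hour.
    # Assumption: exact half hours round up.
    return int(rec.sensor_minutes / 60 + 0.5)


def completeness(rec):
    # Percentage of triggered watch questions that were completed.
    if rec.questions_triggered == 0:
        return 0.0
    return 100.0 * rec.questions_answered / rec.questions_triggered


def summarize(recs):
    if not recs:
        return None
    return {
        "mean_wear_hours": mean(wear_hours(r) for r in recs),
        "mean_completeness": mean(completeness(r) for r in recs),
    }


def daily_summary(records, study_day):
    """Engagement on one study day, across all and across active participants."""
    day = [r for r in records if r.study_day == study_day]
    active = [r for r in day if is_active(r)]
    return {
        "n_active": len(active),
        "all": summarize(day),
        "active": summarize(active),
    }


# Made-up example data for a single study day.
records = [
    DayRecord("P01", 1, 660, 4, 4),  # 11 h of data, all questions answered
    DayRecord("P02", 1, 20, 4, 1),   # below the 30-min threshold: not active
    DayRecord("P03", 1, 540, 5, 4),  # 9 h of data, 4 of 5 questions answered
]
summary = daily_summary(records, study_day=1)
print(summary["n_active"])  # → 2
```

Reporting both the "all" and the "active" summaries mirrors the distinction drawn in the text: the first reflects attrition and engagement together, whereas the second isolates engagement among those still wearing the watch.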
In addition, we determined temporary and permanent nonusage attrition and the mean clock time that participants put on and took off the watch. Temporary nonusage attrition refers to participants who are not active for a period (ie, do not wear the watch for at least 30 min on one or more days but later resume wearing the watch). Permanent nonusage attrition refers to participants who are not active and never again wear the watch [22]. Per participant over the study period, we determined the average clock time of the first and last sensor data record. When participants took off and put on the watch multiple times, we only considered the longest continuous episode per day for calculating these clock times.
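The attrition and clock-time definitions above can be illustrated in the same way. This is a hedged sketch under stated assumptions, not the study's code: the function names, the input layouts (a list of daily wear minutes; wear episodes as start and end minutes since midnight), and the example values are hypothetical. The sketch also assumes that inactive days before a participant's first active day count as temporary nonusage, and that wear episodes do not cross midnight, so clock times can be averaged arithmetically.

```python
ACTIVE_THRESHOLD_MIN = 30  # from the text: active = at least 30 min of wear


def classify_nonusage(daily_minutes):
    """Label each study day of one participant.

    daily_minutes: wear minutes per study day (hypothetical input layout).
    Returns "active", "temporary" (inactive, but active again later), or
    "permanent" (inactive and never active again) for each day.
    """
    active = [m >= ACTIVE_THRESHOLD_MIN for m in daily_minutes]
    # Index of the last active day, or -1 if the watch was never worn.
    last_active = max((i for i, a in enumerate(active) if a), default=-1)
    labels = []
    for i, a in enumerate(active):
        if a:
            labels.append("active")
        elif i < last_active:
            labels.append("temporary")
        else:
            labels.append("permanent")
    return labels


def longest_episode(episodes):
    """Longest continuous wear episode of a day.

    episodes: list of (start_minute, end_minute), minutes since midnight.
    """
    return max(episodes, key=lambda e: e[1] - e[0])


def mean_clock_time(minutes):
    """Average clock time, formatted hh.mm as in the text."""
    avg = round(sum(minutes) / len(minutes))
    return f"{avg // 60:02d}.{avg % 60:02d}"


# Made-up example: six study days of wear minutes for one participant.
labels = classify_nonusage([600, 0, 10, 540, 0, 0])
print(labels)

# Made-up example: two days; on day 1 the watch was taken off and put on again.
days = [
    [(480, 720), (780, 1260)],  # 08.00-12.00 and 13.00-21.00
    [(450, 1200)],              # 07.30-20.00
]
longest = [longest_episode(d) for d in days]
start = mean_clock_time([s for s, _ in longest])
stop = mean_clock_time([e for _, e in longest])
print(start, stop)  # → 10.15 20.30
```

Because only the longest episode per day enters the clock-time averages, a participant who recharges the watch during the day and puts it back on can have a total daily wear time that exceeds the span between the average start and stop times.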

End-of-Study Survey: Participant Experiences
Descriptive statistics were used to summarize responses to the baseline and end-of-study surveys.

Participant Interviews
Interviews were audio-recorded, transcribed verbatim, and coded using NVivo (QSR International). Transcripts were analyzed thematically, drawing on some of the key techniques of grounded theory [23], including open coding, constant comparison, and memo writing. Verbatim quotes that illustrate the key themes were selected.

Case Studies
To illustrate how interview themes relate to individual levels of engagement, 2 case studies of participants were analyzed, combining quantitative engagement data with interview quotes.
This study underwent full review by the University of Manchester Research Ethics Committee (#0165) and University Information Governance (#IGRR000060).

Subjects and Baseline Survey
A total of 26 subjects took part in the study. Their mean age was 64 years, and 50% (13/26) were female. Before enrollment, 9 participants had used a smartphone only (n=3), a wearable only (n=3), or both (n=3) for health or activity monitoring.
In total, 6894 watch questions and 643 gigabytes of sensor data were received over the 90-day study period. Participants wore the watch on 73% (66/90) of days. Over time, the number of active participants decreased (Figure 2): from 25 on the first day to 11 on the last day. Until the last study week, the main form of attrition was temporary nonusage attrition (participants not wearing the watch but later re-engaging). Permanent nonusage attrition (participants not wearing the watch again during the study) was low in the first 2 months: 1 participant stopped in the first month and 1 in the second month. In the last study month, 13 participants stopped using the watch (12 of them in the last week); 1 of these was lost to follow-up after day 84 (ie, did not fill in the end-of-study questionnaire and did not return the watch).

Engagement With Smartwatch
The median daily wear time among active participants was 11 hours 12 min (interquartile range 9 hours 27 min to 12 hours 6 min). The mean time of day at which sensor data collection started and stopped varied between participants: start times ranged from 07.48 to 13.48 and stop times from 16.00 to 21.18 (Figure 3). For most participants, this covered the trigger times for all watch questions, from the first (12.22) to the last (18.22). For 1 participant, average wear time started after the first trigger time, and for 8 participants average wear time stopped on or before the last trigger time. Some participants (eg, participant 14) recharged the watch and put it on again, resulting in a median wear time much higher than the duration implied by the clock times of the longest episode.
The median watch question completion rates and hours of sensor data decreased over the study duration (dark blue diamonds in Figure 4). Engagement of active participants remained roughly constant through time (light blue squares in Figure 4).
Figure 3. Each bar corresponds to a participant. The bar starts at the average time of day that sensor data collection started and ends at the average time of day that sensor data collection ceased (duration in hours shown in middle of bar). On the right: the median wear time and number of days the participant was active. Median wear time can be higher than clock time duration of the longest wearing episode, as some participants recharged the watch and put it on multiple times.

End-of-Study Survey: Participant Experiences
A total of 23 participants completed the Web-based end-of-study questionnaire (Figure 5). Most participants found the watch comfortable, found it easy to enter pain levels on the watch, and found the timing of the various questions convenient. Only 1 participant found the survey frequency too high, and 1 participant found that the watch disrupted their normal activities.
Figure 5. End-of-study survey: comfort, convenience of prompts, and self-tracking. Proportion of participants (100%=23) that chose a specific answer; bars to the right of zero reflect positive experiences.

Participant Interviews
A total of 19 participants were interviewed at baseline, and 18 of these completed an end-of-study interview 3 months later (mean age 64 years; 10/19, 53% female). Analysis of the transcribed interviews identified themes around context, motivations, and expectations (section 1), interaction with the watch, experiences, and usability (section 2), and self-tracking (section 3).

Context, Motivations, and Expectations
The opportunity to learn more about the relationship between pain and activity was the primary motivating factor for the majority of interviewed participants (n=14). Some participants expected that participation would help them develop strategies to manage their pain better (Textbox 1, quotes 1.1 and 1.2). Others were motivated by the prospect of helping others and contributing toward improving knee OA research (Textbox 1, quotes 1.3 and 1.4).
Textbox 1. At baseline, most participants' motivation was to learn about their condition, whereas some enrolled to contribute to research.

Interaction With the Watch: Experience and Usability
Although some participants expressed concerns in the preliminary interviews about successfully operating the smartwatch, all participants stated at follow-up that they found the watch easy to use (Textbox 2, quote 2.1). Participants did not consider answering the twice daily questions a burden. In fact, many participants suggested pain data should be collected more frequently, for example, also in the evening (Textbox 2, quote 2.2). Participants were enthusiastic about recording pain levels, and some suggested adding a "pain button" to record pain in real time (Textbox 2, quote 2.3). Participants explained that their engagement with the watch was affected by other activities. Sometimes they were too busy to answer the question at the trigger time, or they removed the watch for activities (eg, writing, gardening, and swimming) and forgot to put it back on.
Battery life significantly influenced patterns of engagement, particularly for those participants who worked full days (Textbox 2, quote 2.4). Sometimes, participants missed the evening questions because of limited battery life. There appeared to be an expectation that the battery should last from early in the morning to late in the evening without recharging (at least 15 hours).
Step count was automatically reset to 0 after recharging, frustrating some participants (see case study 2).
Textbox 2. Participant experience with smartwatch and usability (quotes from follow-up interview).

Self-Tracking
Participants stated that being involved in the study had helped them to focus more on their activity levels and challenge existing assumptions regarding their activity and pain (Textbox 3, quote 3.1). For example, 1 participant had previously avoided walking long distances as she associated walking with pain. Once she started tracking her steps, she became aware of how far she could walk without causing pain, which led to greater confidence and increased activity (Textbox 3, quote 3.2). Not all participants found the step counter or heart rate monitor useful. Participants already interested in self-tracking had concerns regarding the accuracy of the watch as the watch data were different from those of their personal devices (Textbox 3, quote 3.3). As the majority of participants were primarily motivated by the opportunity to learn more about their condition, feedback was an important issue. Some participants mentioned that they would have preferred to be more active in analyzing their own data on a daily basis. However, although participants had doubts concerning the accuracy of the sensor data, the majority of participants remained in the study for the 3-month period.
Textbox 3. Participant experience with self-tracking and watch feedback (quotes from follow-up interview).

It's made me very aware on a daily basis of where I am, what things I'm doing...because you are actually charting progress, lack of progress and whatever that may be, that journey that you're on. And the fact that you're focusing three or four times a day; it really does say yeah I felt okay today...I just really enjoyed the experience and I felt I was getting something out of it.

Case Studies
To illustrate how these themes interrelate to determine individual levels of engagement, we present 2 case studies: (1) a participant who was highly engaged despite having no interest in self-tracking and (2) a participant who, in spite of an interest in self-tracking, dropped out early. Each demonstrates how engagement results from a balance of the themes highlighted above and is not always driven by the commonest themes.

Case Study 1-Highly Engaged
The participant was diagnosed with OA over 10 years ago. He wore the watch for 88 days and answered on average 73% (79/249 days; Figure 6) of the watch questions. He had no previous experience with self-tracking or wearables, meaning his high engagement in the study was not explained by any prestudy interest in self-tracking. In fact, at baseline, he considered self-tracking to be "narcissistic" (Textbox 4, quote 4.1). The participant had no expectations that the study would benefit him personally. His reasons to participate were altruistic in nature as he emphasized that while the study may not benefit him, it could benefit others (Textbox 4, quote 4.2).
He missed questions on some days. He explained that this was because of his daily routine: he sometimes worked late and would forget to recharge the battery. He reported that he was glad to finish the study, as he did not particularly like the look of the watch (Textbox 4, quote 4.3).
Irrespective of a negative attitude toward self-tracking and the look of the watch, the participant had the highest level of engagement in the study. This paradox may be explained by the fact this participant had no expectations of personal benefit from participating in the study; hence, negative aspects of the watch did not disappoint him.

Case Study 2-Early Study Withdrawal
Some participants were highly engaged for the first part of the study but then started using the watch significantly less, or in some cases not at all. This participant was diagnosed with OA within recent years. She wore the watch for 27 days, answering 79% (26/88 days; Figure 7) of watch questions on average. At the beginning of the study, she was fully engaged and answered all the daily questions. In contrast to case study 1, this participant was motivated to participate in the study to learn more about how to deal with her pain and potentially avoid having surgery (Textbox 5, quote 5.1). She enjoyed the process of answering the daily questions at the beginning of the study as focusing on her pain challenged her previous assumptions regarding activity levels and pain (Textbox 5, quote 5.2). She had not done self-tracking before the study, but she did enjoy this at the beginning. Over time, she increasingly became skeptical and frustrated with the technical aspects of the watch: the step counter reset when she recharged it, and she was not convinced of its accuracy (Textbox 5, quote 5.3). She was also concerned that nonambulatory triggers for her pain, such as standing and bending, were not captured by the watch (Textbox 5, quote 5.4). She found the size of the watch to be too big, and she mentioned that it got in her way when writing (Textbox 5, quote 5.5).
After this series of disappointments, the participant experienced technical problems. The watch would frequently turn itself off and eventually stopped recharging. The study team offered to send a new watch, but because of her overall frustrations with the watch, she declined this (Textbox 5, quote 5.6).
In conclusion, concerns about the data quality and problems with the size of the watch caused disengagement. Owing to the sequence of disappointments by the time the watch broke, she could not be persuaded to remain in the study.
Textbox 5. Case study 2: experiences of a participant that enjoyed learning from the self-tracking but dropped out after a series of unmet expectations (quote 5.1 from baseline interview; 5.2 to 5.6 from follow-up interview).
The case studies show that an expectation of personal gain, or of learning about one's own condition, does not by itself explain engagement. Case study 1 shows that participants motivated by altruism may stay engaged even if they have little interest in self-tracking. Case study 2 shows that those with high interest in self-tracking also may have higher expectations, which, if not met, can be a reason for disengagement.

Principal Findings
This study succeeded in collecting frequent sensor data and patient-reported outcomes using cellular smartwatches for 90 days. Participants wore their smartwatch on most days, with engagement declining most notably after week 12 as participants approached the study end date. Among participants who remained active, data completion remained high, and neither watch wear time nor completion rate of watch questions declined significantly over time. Most participants joined the study to learn more about the link between their pain and activity, in line with known benefits from symptom tracking [24]. They found the watch app technically easy to use even though most had no previous experience with self-tracking. Several participants found the watch somewhat big or cumbersome.
Participant interviews showed that the main barriers to wearing the watch were battery life limitations, technical problems, and unfulfilled expectations of, or doubts about, the watch performance. The first case study illustrated that an interest in self-tracking (one of the commonest motivators) is not an essential requirement for high engagement. The second case study showed that being interested in self-tracking does not automatically lead to high engagement: this participant dropped out after recurrent small disappointments where the watch did not meet her expectations.

Strengths and Limitations
The study has a number of strengths. To our knowledge, it is the first to develop an app to collect both patient-reported outcomes and sensor data from this new generation of consumer cellular smartwatches. The combination of quantitative methods and qualitative methods provides important insights into motivations and barriers for participant engagement with the new technology. The case studies show how these motivations and barriers are weighed against one another. Although the app is not publicly available, lessons learnt about engagement are transferable to future consumer cellular smartwatch studies.
The study also has limitations. First, the sample was small and self-selected. Participants had volunteered for the study, which means that they may have been more motivated than the nonvolunteering population with OA, resulting in higher engagement. Second, we may have underestimated watch wear time. We could not directly record wear time but instead defined this as "minutes of sensor data received." This definition excludes time during which a participant wore the watch while it was out of battery or out of internal memory. Third, we cannot draw conclusions about data collection for longer periods of time, as our feasibility study was limited to a 3-month period.

Comparison With Prior Work
High attrition rates are often a characteristic of mobile health studies [22] and even of activity trackers for personal use [25]. Until the last week of our study, attrition was relatively low. In a 6-week study of Fitbit activity trackers, most participants dropped out (75% attrition after 4 weeks, compared with 4% in our study) [26]. Despite asking participants to report symptoms 4 or 5 times per day, we retained higher completion rates than studies that requested information fewer times per day from OA patients [10,11] or other patient groups [22,27,28]. The possible burden of a higher number of questions may have been offset by the speed of data entry per question: responding on a wrist-worn device took less than 10 seconds, compared with taking out a device or diary in other studies. The workload and time required to enter data are known to influence attrition [22], but it remains uncertain where the balance lies between frequency of entry and duration required per entry. A total of 12 participants stopped wearing the watch in the last week of the study but before their end date. Enrollment was staggered over 12 days, but participants received instructions for returning the watch on the day that the first participants had completed 90 days. This may have confused late enrollers, who may have thought that the study had already ended for them.
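The effect of staggered enrollment on late enrollers can be made concrete with a small date calculation. The sketch below is illustrative only and rests on assumptions not stated in the text: the calendar date of first enrollment is hypothetical (the text gives only September 2017), enrollment "over 12 days" is read as offsets of 0 to 11 days after the first enrollee, and the return instructions are assumed to arrive on the first enrollee's study day 90.

```python
from datetime import date, timedelta

STUDY_LENGTH_DAYS = 90  # from the text


def study_day_on(enrolled, on):
    """1-based study day of a participant enrolled on `enrolled`, at date `on`."""
    return (on - enrolled).days + 1


# Hypothetical date: the text only says recruitment was in September 2017.
first_enrolled = date(2017, 9, 1)
# Assumption: instructions arrive on the first enrollee's study day 90.
instructions_sent = first_enrolled + timedelta(days=STUDY_LENGTH_DAYS - 1)

rows = []
for offset in (0, 6, 11):  # assumed enrollment offsets in days
    enrolled = first_enrolled + timedelta(days=offset)
    day = study_day_on(enrolled, instructions_sent)
    rows.append((offset, day, STUDY_LENGTH_DAYS - day))
    print(f"offset {offset:2d}: study day {day}, {STUDY_LENGTH_DAYS - day} days left")
```

Under these assumptions, a participant enrolled on the last enrollment day would receive return instructions on study day 79, with 11 days of data collection still to go, which is consistent with the suggestion that some late enrollers stopped early by mistake.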
Barriers and motivators that were identified in the interviews largely correspond with previous research. Our participants were primarily motivated to learn about their condition, a common motivation to engage with digital health apps [28]. Most barriers to engagement have also been described in other studies: forgetting to charge or put on the watch [26], physical design and aesthetics [26], issues with (expectations of) data accuracy [26,28], and preferring a competing intervention [22].
Receiving feedback from a digital health app has previously been identified as a motivator [28]. In our study, participants perceived wearing the watch as beneficial, even though they did not receive decision support and could not look back into previous pain or step count values. Many stated that using the watch still led to a better understanding of the relation between their pain and activity. We did not find "lack of previous experience with digital devices and health tracking" [28] as an important barrier to engagement, possibly because participants with limited digital literacy also found the watch and our app intuitive and easy to use.
This study focused on usage of consumer cellular smartwatches for research only, rather than for self-management or clinical care. Self-tracking using consumer devices has advantages for self-management such as giving participants a better understanding of their condition (an advantage also observed in this study) and identifying triggers [24]. Our app did not display visual feedback about recently tracked symptoms, which may have limited such benefits. Self-tracking also has the potential to transform clinical consultations by providing a clearer picture of symptoms while at home, improving shared decision making [29]. Integrating data from smartwatches into electronic health records in the future may well deliver similar advantages.

Recommendations For Future Studies
Future studies may increase engagement in a number of ways. Our interviews indicate that unrealistic expectations of watch performance (eg, battery life) and doubts about the accuracy of the device or the ability of researchers to derive relevant metrics caused participants to disengage. Better participant information upon enrollment might mitigate this source of attrition. Visualization of the participants' own data may increase engagement further, especially given that participants' primary motivation was to better understand the relationship between their physical activity and pain. However, such a change needs to be balanced against concerns that feedback may influence subsequent reporting. Improvements in the technology, including longer battery life and lighter, more comfortable watches, may further reduce attrition.

Conclusions
This study suggests that it is feasible to use cellular smartwatches for collection of patient-reported outcomes 4 or more times per day alongside continuous sensor data collection. Indeed, participants felt self-reported data collection could be even more frequent than the 4 or 5 times per day in this study. Learning about symptoms was a prime motivator to use the watch, even though most participants had never self-tracked before. Technical issues rather than participant attitudes more commonly limited engagement with the smartwatches. Overall, cellular smartwatches were an acceptable and feasible new data collection tool to support health research.