%0 Journal Article %@ 1438-8871 %I JMIR Publications %V 27 %N %P e63252 %T Introducing Novel Methods to Identify Fraudulent Responses (Sampling With Sisyphus): Web-Based LGBTQ2S+ Mixed-Methods Study %A MacKinnon,Kinnon Ross %A Khan,Naail %A Newman,Katherine M %A Gould,Wren Ariel %A Marshall,Gin %A Salway,Travis %A Pullen Sansfaçon,Annie %A Kia,Hannah %A Lam,June SH %+ School of Social Work, York University, 4700 Keele Street, Toronto, ON, M3J 1P3, Canada, 1 416 736 2100, kinnonmk@yorku.ca %K sampling %K bots %K transgender %K nonbinary %K detransition %K lesbian, gay, bisexual, and transgender %K mobile phone %D 2025 %7 17.3.2025 %9 Original Paper %J J Med Internet Res %G English %X Background: The myth of Sisyphus teaches about resilience in the face of life challenges. Detransition after an initial gender transition is an emerging experience that requires sensitive and community-driven research. However, there are significant complexities and costs that researchers must confront to collect reliable data to better understand this phenomenon, including the lack of a uniform definition and challenges with recruitment. Objective: This paper presents the sampling and recruitment methods of a new study on detransition-related phenomena among lesbian, gay, bisexual, transgender, queer, and 2-spirit (LGBTQ2S+) populations. It introduces a novel protocol for identifying and removing bot, scam, and ineligible responses from survey datasets and presents preliminary descriptive sociodemographic results of the sample. This analysis does not present gender-affirming health care outcomes. Methods: To attract a large and heterogeneous sample, 3 different study flyers were created in English, French, and Spanish. Between December 1, 2023, and May 1, 2024, these flyers were distributed to >615 sexual and gender minority organizations and gender care providers in the United States and Canada, and paid advertisements totaling >CAD $7400 (US $5551) were promoted on 5 different social media platforms. Although many social media promotions were rejected or removed, the advertisements reached >7.7 million accounts. Study website visitors were directed from 35 different traffic sources, with the top 5 being Facebook (3,577,520/7,777,218, 46%), direct link (2,255,393/7,777,218, 29%), Reddit (1,011,038/7,777,218, 13%), Instagram (466,633/7,777,218, 6%), and X (formerly known as Twitter; 233,317/7,777,218, 3%). A systematic protocol was developed to identify scam, nonsense, and ineligible responses and to conduct web-based Zoom video platform screening with select participants. Results: Of the 1377 completed survey responses, 957 (69.5%) were deemed eligible and included in the analytic dataset after applying the exclusion protocol and conducting 113 virtual screenings. The mean age of the sample was 25.87 (SD 7.77; median 24, IQR 21-29 years). A majority of the participants were White (Canadian, American, or of European descent; 748/950, 78.7%), living in the United States (704/957, 73.6%), and assigned female at birth (754/953, 79.1%). Many participants reported having a sexual minority identity, with more than half the sample (543/955, 56.8%) indicating plurisexual orientations, such as bisexual or pansexual identities. A minority of participants (108/955, 11.3%) identified as straight or heterosexual. 
When asked about their gender-diverse identities after stopping or reversing gender transition, 33.2% (318/957) identified as nonbinary, 43.2% (413/957) as transgender, and 40.5% (388/957) as detransitioned. Conclusions: Despite challenges encountered during the study promotion and data collection phases, a heterogeneous sample of >950 eligible participants was obtained, presenting opportunities for future analyses to better understand these LGBTQ2S+ experiences. This study is among the first to introduce an innovative strategy to sample a hard-to-reach and equity-deserving group, and to present an approach to remove fraudulent responses. %M 40096683 %R 10.2196/63252 %U https://www.jmir.org/2025/1/e63252 %U https://doi.org/10.2196/63252 %U http://www.ncbi.nlm.nih.gov/pubmed/40096683 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 27 %N %P e58450 %T Identifying Fraudulent Responses in a Study Exploring Delivery Options for Pregnancies Impacted by Gestational Diabetes: Lessons Learned From a Web-Based Survey %A Ruby,Emma %A Ramlawi,Serine %A Bowie,Alexa Clare %A Boyd,Stephanie %A Dingwall-Harvey,Alysha %A Rennicks White,Ruth %A El-Chaâr,Darine %A Walker,Mark %+ Clinical Epidemiology Program, Ottawa Hospital Research Institute, 501 Smyth Road, Centre for Practice Changing Research, Box 241, Ottawa, ON, K1H8L6, Canada, 1 613 737 8899 ext 73840, mwalker@toh.ca %K research fraud %K anonymous online research %K data integrity %K fraudulent responses %K web-based survey %K internet research %K perinatal health %K social media %K patient participation %K provider participation %K fraudulent %K fraud %K pregnancy %K gestational diabetes %K diabetes %K data analysis %K survey %K diabetes mellitus %K patient %K evidence-based %D 2025 %7 20.1.2025 %9 Viewpoint %J J Med Internet Res %G English %X Current literature is unclear on the safety and optimal timing of delivery for pregnant individuals with gestational diabetes mellitus, which inspired our study team to conduct a web-based survey study exploring patient and provider opinions on delivery options. However, an incident of fraudulent activity with survey responses prompted a shift in the focus of the research project. Unfortunately, despite the significant rise of web-based surveys used in medical research, there remains very limited evidence on the implications of and optimal methods to handle fraudulent web-based survey responses. Therefore, the objective of this viewpoint paper was to highlight our approach to identifying fraudulent responses in a web-based survey study, in the context of clinical perinatal research exploring patient and provider opinions on delivery options for pregnancies with gestational diabetes mellitus. Initially, we conducted cross-sectional web-based surveys across Canada with pregnant patients and perinatal health care providers. Surveys were available through Research Electronic Data Capture, and recruitment took place between March and October 2023. A change to recruitment introduced a US $5 gift card incentive to increase survey engagement. In mid-October 2023, an incident of fraudulent activity was reported, after which the surveys were deactivated. Systematic guidelines were developed by the study team in consultation with information technology services and the research ethics board to filter fraudulent from true responses. Between October 14 and 16, 2023, an influx of almost 2500 responses (393 patients and 2047 providers) was recorded in our web-based survey.
Systematic filtering flagged numerous fraudulent responses. We identified fraudulent responses based on criteria including, but not limited to, identical timestamps and responses, responses with slight variations in wording and similar timestamps, and fraudulent email addresses. Therefore, the incident described in this viewpoint paper highlights the importance of preserving research integrity by using methodologically sound practices to extract true data for research findings. These fraudulent events continue to threaten the credibility of research findings and future evidence-based practices. %M 39832359 %R 10.2196/58450 %U https://www.jmir.org/2025/1/e58450 %U https://doi.org/10.2196/58450 %U http://www.ncbi.nlm.nih.gov/pubmed/39832359 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e63032 %T Comparing Health Survey Data Cost and Quality Between Amazon’s Mechanical Turk and Ipsos’ KnowledgePanel: Observational Study %A Herman,Patricia M %A Slaughter,Mary E %A Qureshi,Nabeel %A Azzam,Tarek %A Cella,David %A Coulter,Ian D %A DiGuiseppi,Graham %A Edelen,Maria Orlando %A Kapteyn,Arie %A Rodriguez,Anthony %A Rubinstein,Max %A Hays,Ron D %+ RAND, 1776 Main Street, Santa Monica, CA, 90407, United States, 1 3103930411 ext 7129, pherman@rand.org %K data collection %K probability panel %K convenience sample %K data quality %K weighting %K back pain %K misrepresentation %K Amazon %K Mechanical Turk %K MTurk %K convenience panel %K KnowledgePanel %D 2024 %7 29.11.2024 %9 Original Paper %J J Med Internet Res %G English %X Background: Researchers have many options for web-based survey data collection, ranging from access to curated probability-based panels, where individuals are selectively invited to join based on their membership in a representative population, to convenience panels, which are open for anyone to join. The mix of respondents available also varies greatly regarding representation of a population of interest and in motivation to provide thoughtful and accurate responses. Despite the additional dataset-building labor required of the researcher, convenience panels are much less expensive than probability-based panels. However, it is important to understand what may be given up regarding data quality for those cost savings. Objective: This study examined the relative costs and data quality of fielding equivalent surveys on Amazon’s Mechanical Turk (MTurk), a convenience panel, and KnowledgePanel, a nationally representative probability-based panel. Methods: We administered the same survey measures to MTurk (in 2021) and KnowledgePanel (in 2022) members. We applied several recommended quality assurance steps to enhance the data quality achieved using MTurk. Ipsos, the owner of KnowledgePanel, followed their usual (industry standard) protocols. The survey was designed to support psychometric analyses and included >60 items from the Patient-Reported Outcomes Measurement Information System (PROMIS), demographics, and a list of health conditions. We used 2 fake conditions (“syndomitis” and “chekalism”) to identify those more likely to be honest respondents. We examined the quality of each platform’s data using several recommended metrics (eg, consistency, reliability, representativeness, missing data, and correlations) including and excluding those respondents who had endorsed a fake condition and examined the impact of weighting on representativeness. 
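The filtering criteria described above for the gestational diabetes surveys (identical timestamps and responses, near-duplicate wording submitted within a tight window, and suspicious email addresses) translate naturally into a programmatic first pass. Below is a minimal pandas sketch of that idea; the file name, column names, and 60-second threshold are hypothetical illustrations, not details taken from the study's REDCap export.

```python
import pandas as pd

# Hypothetical survey export: one row per response, with a submission
# timestamp, a free-text answer, and a contact email.
df = pd.read_csv("survey_responses.csv", parse_dates=["submitted_at"])
flags = pd.DataFrame(index=df.index)

# Criterion 1: identical timestamp AND identical answer text.
flags["dup_time_and_text"] = df.duplicated(
    subset=["submitted_at", "free_text"], keep=False
)

# Criterion 2: submissions arriving within 60 s of the previous one, a
# crude proxy for "slight wording variations with similar timestamps."
ts = df["submitted_at"].sort_values()
flags["burst_timestamp"] = ts.diff().dt.total_seconds().le(60).reindex(df.index)

# Criterion 3: suspicious email patterns, eg, a name plus a long digit run.
flags["suspect_email"] = df["email"].str.match(
    r"^[a-z]+\d{3,}@(gmail|outlook|yahoo)\.com$", na=False
)

df["flagged"] = flags.any(axis=1)
print(df["flagged"].value_counts())
```

Flags like these only identify candidates for exclusion; consistent with the viewpoint's consultation with information technology services and the research ethics board, a human review of each flagged cluster would still precede removal.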
Results: We found that prescreening in the MTurk sample (removing those who endorsed a fake health condition) improved data quality but KnowledgePanel data quality generally remained superior. While MTurk’s unweighted point estimates for demographics exhibited the usual mismatch with national averages (younger, better educated, and lower income), weighted MTurk data matched national estimates. KnowledgePanel’s point estimates better matched national benchmarks even before poststratification weighting. Correlations between PROMIS measures and age and income were similar in MTurk and KnowledgePanel; the mean absolute value of the difference between each platform’s 137 correlations was 0.06, and 92% were <0.15. However, correlations between PROMIS measures and educational level were dramatically different; the mean absolute value of the difference across these 17 correlation pairs was 0.15, the largest difference was 0.29, and the direction of more than half of these relationships in the MTurk sample was the opposite from that expected from theory. Therefore, caution is needed if using MTurk for studies where educational level is a key variable. Conclusions: The data quality of our MTurk sample was often inferior to that of the KnowledgePanel sample but possibly not so much as to negate the benefits of its cost savings for some uses. International Registered Report Identifier (IRRID): RR2-10.1186/s12891-020-03696-2 %M 39612505 %R 10.2196/63032 %U https://www.jmir.org/2024/1/e63032 %U https://doi.org/10.2196/63032 %U http://www.ncbi.nlm.nih.gov/pubmed/39612505 %0 Journal Article %@ 1929-073X %I JMIR Publications %V 13 %N %P e58771 %T Dropout in a Longitudinal Survey of Amazon Mechanical Turk Workers With Low Back Pain: Observational Study %A Qureshi,Nabeel %A Hays,Ron D %A Herman,Patricia M %+ RAND Health Care, RAND Corporation, 1776 Main Street, Santa Monica, CA, 90401, United States, 1 3103930411 ext 6054, nqureshi@rand.org %K chronic low back pain %K Mechanical Turk %K MTurk %K survey attrition %K survey weights %K Amazon %K occupational health %K manual labor %D 2024 %7 11.11.2024 %9 Original Paper %J Interact J Med Res %G English %X Background: Surveys of internet panels such as Amazon’s Mechanical Turk (MTurk) are common in health research. Nonresponse in longitudinal studies can limit inferences about change over time. Objective: This study aimed to (1) describe the patterns of survey responses and nonresponse among MTurk members with back pain, (2) identify factors associated with survey response over time, (3) assess the impact of nonresponse on sample characteristics, and (4) assess how well inverse probability weighting can account for differences in sample composition. Methods: We surveyed adult MTurk workers who identified as having back pain. We report participation trends over 3 survey waves and use stepwise logistic regression to identify factors related to survey participation in successive waves. Results: A total of 1678 adults participated in wave 1. Of those, 983 (59%) participated in wave 2 and 703 (42%) in wave 3. Participants who did not drop out took less time to complete previous surveys (30 min vs 35 min in wave 1, P<.001; 24 min vs 26 min in wave 2, P=.02) and reported having fewer health conditions (5.88 vs 6.6, P<.001). 
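Aim 4 of the dropout study above refers to inverse probability weighting. A minimal sketch of that technique, using hypothetical variable names rather than the study's actual ones: model each wave-1 participant's probability of responding at wave 2, then weight the wave-2 respondents by the inverse of that fitted probability so the retained sample better resembles the full baseline sample.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical wave-1 data: 'responded_w2' marks who returned at wave 2.
df = pd.read_csv("wave1.csv")
X = sm.add_constant(df[["age", "minutes_to_complete", "n_conditions"]])
y = df["responded_w2"]

# Logistic model for the probability of responding at wave 2.
p_hat = pd.Series(sm.Logit(y, X).fit(disp=0).predict(X), index=df.index)

# Weight respondents by 1 / P(respond), normalized to mean 1, so that
# under-represented kinds of panelists count for more.
resp = df[y == 1].copy()
resp["ipw"] = 1.0 / p_hat[y == 1]
resp["ipw"] /= resp["ipw"].mean()

# Weighted vs unweighted mean of a baseline outcome among returners.
weighted = (resp["pain_intensity"] * resp["ipw"]).sum() / resp["ipw"].sum()
print(weighted, resp["pain_intensity"].mean())
```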
In multivariate models predicting response at wave 2, lower odds of participation were associated with more time to complete the baseline survey (odds ratio [OR] 0.98, 95% CI 0.97-0.99), being Hispanic (compared with non-Hispanic, OR 0.69, 95% CI 0.49-0.96), having a bachelor’s degree as their terminal degree (compared with all other levels of education, OR 0.58, 95% CI 0.46-0.73), having more pain interference and intensity (OR 0.75, 95% CI 0.64-0.89), and having more health conditions. In contrast, older respondents (older than 45 years of age compared with 18-24 years of age) were more likely to respond to the wave 2 survey (OR 2.63 and 3.79, respectively), and those who were divorced (OR 1.81) or separated (OR 1.77) were also more likely to respond. Weighted analysis showed slight differences in sample demographics and conditions and larger differences in pain assessments, particularly for those who responded to wave 2. Conclusions: Longitudinal studies on MTurk have large, differential dropouts between waves. This study provided information about the individuals more likely to drop out over time, which can help researchers prepare for future surveys. %M 39527103 %R 10.2196/58771 %U https://www.i-jmr.org/2024/1/e58771 %U https://doi.org/10.2196/58771 %U http://www.ncbi.nlm.nih.gov/pubmed/39527103 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 8 %N %P e50131 %T Optimizing the Measurement of Information on the Context of Alcohol Consumption Within the Drink Less App Among People Drinking at Increasing and Higher Risk Levels: Mixed-Methods Usability Study %A Stevely,Abigail K %A Garnett,Claire %A Holmes,John %A Jones,Andrew %A Dinu,Larisa %A Oldham,Melissa %+ Sheffield Addictions Research Group, School of Medicine and Population Health, University of Sheffield, 30 Regent St, Sheffield City Centre, Sheffield, S1 4DA, United Kingdom, 44 114 222 552, a.stevely@sheffield.ac.uk %K alcohol use disorder %K substance use disorder %K alcohol consumption %K mobile app %K mHealth %K mobile health %K diary %K health behavior change %K usability %K user engagement %D 2024 %7 24.10.2024 %9 Original Paper %J JMIR Form Res %G English %X Background: There is a growing public health evidence base focused on understanding the links between drinking contexts and alcohol consumption. However, the potential value of developing context-based interventions to help people drinking at increasing and higher risk levels to cut down remains underexplored. Digital interventions, such as apps, offer significant potential for delivering context-based interventions as they can collect contextual information and flexibly deliver personalized interventions while addressing barriers associated with face-to-face interventions, such as time constraints. Objective: This early phase study aimed to identify the best method for collecting information on the contexts of alcohol consumption among users of an alcohol reduction app by comparing 2 alternative drinking diaries in terms of user engagement, data quality, usability, and acceptability. Methods: Participants were recruited using the online platform Prolific and were randomly assigned to use 1 of the 2 adapted versions of the Drink Less app for 14 days. In the tags version (n=31), participants added tags for location, motivation, and company to their drink records. In the occasion type version (n=31), participants selected from a list of occasion types when adding drink records.
We assessed engagement and data quality with app data, usability with a validated questionnaire, and acceptability with semistructured interviews. Results: Quantitative findings on engagement, data quality, and app usability were good overall, with participants using the app on most days (tags: mean 12.23, SD 2.46 days; occasion type: mean 12.39, SD 2.12 days). However, around 40% of drinking records in the tags version did not include company and motivation tags. Mean usability scores were similar across app versions (tags: mean 72.39, SD 8.10; occasion type: mean 74.23, SD 6.76). Qualitative analysis found that both versions were acceptable to users and were relevant to their drinking occasions, and participants reported increased awareness of their drinking contexts. Several participants reported that the diary helped them to reduce alcohol consumption in some contexts (eg, home or lone drinking) more than others (eg, social drinking) and suggested that they felt less negative affect when recording social drinking contexts outside their home. Participants also suggested the inclusion of “work drinks” in both versions and “habit” as a motivation in the tags version. Conclusions: There was no clearly better method for collecting data on alcohol consumption as both methods had good user engagement, usability, acceptability, and data quality. Participants recorded sufficient data on their drinking contexts to suggest that an adapted version of Drink Less could be used as the basis for context-specific interventions. The occasion type version may be preferable owing to lower participant burden. A more general consideration is to ensure that context-specific interventions are designed to minimize the risk of unintended positive reinforcement of drinking occasions that are seen as sociable by users. %M 39446464 %R 10.2196/50131 %U https://formative.jmir.org/2024/1/e50131 %U https://doi.org/10.2196/50131 %U http://www.ncbi.nlm.nih.gov/pubmed/39446464 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 8 %N %P e59950 %T Evaluating the Psychometric Properties of a Physical Activity and Sedentary Behavior Identity Scale: Survey Study With Two Independent Samples of Adults in the United States %A Wen,Cheng K Fred %A Schneider,Stefan %A Junghaenel,Doerte U %A Toledo,Meynard John L %A Lee,Pey-Jiuan %A Smyth,Joshua M %A Stone,Arthur A %+ Dornsife Center for Self-Report Science, University of Southern California, 635 Downey Way, Los Angeles, CA, 90089-3332, United States, 1 213 821 1850, chengkuw@usc.edu %K physical activity %K sedentary behavior %K geriatrics %K exercise %K lifestyle %K physical health %K mental health %K social-cognitive approach %D 2024 %7 24.10.2024 %9 Original Paper %J JMIR Form Res %G English %X Background: Emerging evidence suggests a positive association between relevant aspects of one’s psychological identity and physical activity engagement, but the current understanding of this relationship is primarily based on scales designed to assess identity as a person who exercises, leaving out essential aspects of physical activities (eg, incidental and occupational physical activity) and sedentary behavior. Objective: The goal of this study is to evaluate the validity of a new physical activity and sedentary behavior (PA/SB) identity scale using 2 independent samples of US adults. Methods: In study 1, participants answered 21 candidate items for the PA/SB identity scale and completed the International Physical Activity Questionnaire-Short Form (IPAQ-SF).
Study 2 participants completed the same PA/SB identity items twice over a 1-week interval and completed the IPAQ-SF at the end. We performed factor analyses to evaluate the structure of the PA/SB identity scale, evaluated convergent validity and test-retest reliability (in study 2) of the final scale scores, and examined their discriminant validity using tests for differences in dependent correlations. Results: The final PA/SB identity measure comprised 3 scales: physical activity role identity (F1), physical activity belief (F2), and sedentary behavior role identity (F3). The scales had high test-retest reliability (Pearson correlation coefficient: F1, r=0.87; F2, r=0.75; F3, r=0.84; intraclass correlation coefficient [ICC]: F1: ICC=0.85; F2: ICC=0.75; F3: ICC=0.84). F1 and F2 were positively correlated with each other (study 1, r=0.76; study 2, r=0.69), while both were negatively correlated with F3 (Pearson correlation coefficient between F1 and F3: r=–0.58 for study 1 and r=–0.73 for study 2; F2 and F3: r=–0.46 for studies 1 and 2). Data from both studies also demonstrated adequate discriminant validity of the scale developed. F1 showed significantly larger correlations than F2 with IPAQ-SF-assessed time in vigorous and moderate activities, time walking, and time sitting. F1 also showed significantly larger correlations than F3 with time in vigorous and moderate activities. Similarly, compared with F3, F2 showed a larger correlation with time in vigorous activities and a smaller correlation with time walking. Conclusions: This study provided initial empirical evidence from 2 independent studies on the reliability and validity of the PA/SB identity scales for adults. %M 39446463 %R 10.2196/59950 %U https://formative.jmir.org/2024/1/e59950 %U https://doi.org/10.2196/59950 %U http://www.ncbi.nlm.nih.gov/pubmed/39446463 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 13 %N %P e58203 %T The Rutgers Omnibus Study: Protocol for Quarterly Web-Based Surveys to Promote Rapid Tobacco Research %A Bover Manderski,Michelle T %A Young,William J %A Ganz,Ollie %A Delnevo,Cristine D %+ Institute for Nicotine and Tobacco Studies, Rutgers Biomedical and Health Sciences (Rutgers Health), Rutgers, The State University of New Jersey, 303 George Street, Suite 405, New Brunswick, NJ, 08901, United States, 1 732 235 9727, bovermi@ints.rutgers.edu %K survey %K tobacco %K nicotine %K young adults %K adults %K protocol %K Rutgers Omnibus Study %K Amazon Mechanical Turk %K MTurk %D 2024 %7 16.10.2024 %9 Protocol %J JMIR Res Protoc %G English %X Background: Rapid and flexible data collection efforts are necessary for effective monitoring and research on tobacco and nicotine product use in a constantly evolving marketplace. The Rutgers Omnibus Survey (1) provides timely data on awareness and use of new and emerging tobacco products among adults in a rapid manner, (2) provides a platform for measurement experiments to help develop and refine measures of tobacco use that reflect the current marketplace, and (3) generates pilot data for grant applications and scientific manuscripts. Objective: This study aims to document the first 2 years of the Rutgers Omnibus Study through the reporting of methodology, fielding summaries, and sample characteristics.
Methods: Launched in February 2022 and fielded quarterly thereafter, the survey draws convenience samples of 2000 to 3000 US adults aged 18-45 years recruited from Amazon Mechanical Turk (MTurk) using the MTurk Toolkit by CloudResearch. The questionnaire includes core and rotating modules and is designed to take approximately 10 minutes to complete through Qualtrics. The fielding duration is approximately 10 days per wave. Each wave includes both unique and repeating participants, and responses can be linked across waves by an anonymous ID. Results: Sample sizes ranged from 2082 (wave 8, December 2023) to 2989 (wave 1, February 2022), and the 8-wave longitudinal dataset included 10,334 participants, of whom 2477 had 3 or more data points. The cost per completed response was low, ranging from US $2.46 to US $3.27 across waves. Key demographics were consistent across waves and similar to those of the general population, while tobacco product trial and past–30-day use were generally higher. Conclusions: The Rutgers Omnibus Study is a quarterly survey that is effective for rapidly assessing the use of emerging tobacco and nicotine products and can also be leveraged to conduct survey experiments, generate pilot data, and address both cross-sectional and longitudinal research questions. International Registered Report Identifier (IRRID): RR1-10.2196/58203 %M 39413372 %R 10.2196/58203 %U https://www.researchprotocols.org/2024/1/e58203 %U https://doi.org/10.2196/58203 %U http://www.ncbi.nlm.nih.gov/pubmed/39413372 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e49704 %T Evaluating Expert-Layperson Agreement in Identifying Jargon Terms in Electronic Health Record Notes: Observational Study %A Lalor,John P %A Levy,David A %A Jordan,Harmon S %A Hu,Wen %A Smirnova,Jenni Kim %A Yu,Hong %+ Center for Biomedical and Health Research in Data Sciences, Miner School of Computer and Information Sciences, University of Massachusetts Lowell, 1 University Ave, Lowell, MA, 01854, United States, 1 978 934 3620, Hong_Yu@uml.edu %K expert-layperson agreement %K medical jargon %K jargon identification %K EHR %K electronic health record notes %K crowdsourcing %K clinical notes %D 2024 %7 15.10.2024 %9 Original Paper %J J Med Internet Res %G English %X Background: Studies have shown that patients have difficulty understanding medical jargon in electronic health record (EHR) notes, particularly patients with low health literacy. In creating the NoteAid dictionary of medical jargon for patients, a panel of medical experts selected terms they perceived as needing definitions for patients. Objective: This study aims to determine whether experts and laypeople agree on what constitutes medical jargon. Methods: Using an observational study design, we compared the ability of medical experts and laypeople to identify medical jargon in EHR notes. The laypeople were recruited from Amazon Mechanical Turk. Participants were shown 20 sentences from EHR notes, which contained 325 potential jargon terms as identified by the medical experts. We collected demographic information about the laypeople’s age, sex, race or ethnicity, education, native language, and health literacy. Health literacy was measured with the Single Item Literacy Screener. Our evaluation metrics were the proportion of terms rated as jargon, sensitivity, specificity, Fleiss κ for agreement among medical experts and among laypeople, and the Kendall rank correlation statistic between the medical experts and laypeople.
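Of the evaluation metrics just listed, Fleiss κ is the least standard to compute by hand. A small sketch using statsmodels, with simulated 0/1 jargon ratings standing in for the study's data:

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Simulated ratings: rows = candidate terms, columns = raters,
# entries = 1 if the rater marked the term as jargon, else 0.
rng = np.random.default_rng(0)
ratings = rng.integers(0, 2, size=(325, 6))  # eg, 325 terms, 6 raters

# fleiss_kappa expects subject-by-category counts, not raw rater codes.
table, _ = aggregate_raters(ratings)
print(f"Fleiss kappa: {fleiss_kappa(table):.3f}")
```

With random ratings this prints a κ near 0; on the real data, the expert panel reached κ=0.781, as reported below.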
We performed subgroup analyses by layperson characteristics. We fit a beta regression model with a logit link to examine the association between layperson characteristics and whether a term was classified as jargon. Results: The average proportion of terms identified as jargon by the medical experts was 59% (1150/1950, 95% CI 56.1%-61.8%), and the average proportion of terms identified as jargon by the laypeople overall was 25.6% (22,480/87,750, 95% CI 25%-26.2%). There was good agreement among medical experts (Fleiss κ=0.781, 95% CI 0.753-0.809) and fair agreement among laypeople (Fleiss κ=0.590, 95% CI 0.589-0.591). The beta regression model had a pseudo-R2 of 0.071, indicating that demographic characteristics explained very little of the variability in the proportion of terms identified as jargon by laypeople. Using laypeople’s identification of jargon as the gold standard, the medical experts had high sensitivity (91.7%, 95% CI 90.1%-93.3%) and specificity (88.2%, 95% CI 86%-90.5%) in identifying jargon terms. Conclusions: To ensure coverage of possible jargon terms, the medical experts were loose in selecting terms for inclusion. Fair agreement among laypersons shows that this is needed, as there is a variety of opinions among laypersons about what is considered jargon. We showed that medical experts could accurately identify jargon terms for annotation that would be useful for laypeople. %M 39405109 %R 10.2196/49704 %U https://www.jmir.org/2024/1/e49704 %U https://doi.org/10.2196/49704 %U http://www.ncbi.nlm.nih.gov/pubmed/39405109 %0 Journal Article %@ 2564-1891 %I JMIR Publications %V 4 %N %P e50125 %T Collective Intelligence–Based Participatory COVID-19 Surveillance in Accra, Ghana: Pilot Mixed Methods Study %A Marley,Gifty %A Dako-Gyeke,Phyllis %A Nepal,Prajwol %A Rajgopal,Rohini %A Koko,Evelyn %A Chen,Elizabeth %A Nuamah,Kwabena %A Osei,Kingsley %A Hofkirchner,Hubertus %A Marks,Michael %A Tucker,Joseph D %A Eggo,Rosalind %A Ampofo,William %A Sylvia,Sean %+ Department of Health Policy and Management, University of North Carolina, 1101D McGavran-Greenberg Hall, CB #7411 Chapel Hill, NC 27599-7411, Chapel Hill, NC, 27599, United States, 1 919 966 6328, sysylvia@email.unc.edu %K information markets %K participatory disease surveillance %K collective intelligence %K community engagement %K the wisdom of the crowds %K Ghana %K mobile phone %D 2024 %7 12.8.2024 %9 Original Paper %J JMIR Infodemiology %G English %X Background: Infectious disease surveillance is difficult in many low- and middle-income countries. Information market (IM)–based participatory surveillance is a crowdsourcing method that encourages individuals to actively report health symptoms and observed trends by trading web-based virtual “stocks” with payoffs tied to a future event. Objective: This study aims to assess the feasibility and acceptability of a tailored IM surveillance system to monitor population-level COVID-19 outcomes in Accra, Ghana. Methods: We designed and evaluated a prediction markets IM system from October to December 2021 using a mixed methods study approach. Health care workers and community volunteers aged ≥18 years living in Accra participated in the pilot trading. Participants received 10,000 virtual credits to trade on 12 questions on COVID-19–related outcomes. Payoffs were tied to the cost estimation of new and cumulative cases in the region (Greater Accra) and nationwide (Ghana) at specified future time points. 
Questions included the number of new COVID-19 cases, the number of people likely to get the COVID-19 vaccination, and the total number of COVID-19 cases in Ghana by the end of the year. Phone credits were awarded based on the tally of virtual credits left and the participant’s percentile ranking. Data collected included age, occupation, and trading frequency. In-depth interviews explored the reasons and factors associated with participants’ user journey experience, barriers to system use, and willingness to use IM systems in the future. Trading frequency was assessed using trend analysis, and ordinary least squares regression analysis was conducted to determine the factors associated with trading at least once. Results: Of the 105 eligible participants invited, 21 (84%) traded at least once on the platform. Questions estimating the national-level number of COVID-19 cases received 13 to 19 trades, and obtaining COVID-19–related information mainly from television and radio was associated with less likelihood of trading (marginal effect: −0.184). Individuals aged <30 years traded 7.5 times more and earned GH ¢134.1 (US $11.7) more in rewards than those aged >30 years (marginal effect: 0.0135). Implementing the IM surveillance was feasible; all 21 participants who traded found using IM for COVID-19 surveillance acceptable. Active trading by friends with communal discussion and a strong onboarding process facilitated participation. The lack of bidirectional communication on social media and technical difficulties were key barriers. Conclusions: Using an IM system for disease surveillance is feasible and acceptable in Ghana. This approach shows promise as a cost-effective source of information on disease trends in low- and middle-income countries where surveillance is underdeveloped, but further studies are needed to optimize its use. %M 39133907 %R 10.2196/50125 %U https://infodemiology.jmir.org/2024/1/e50125 %U https://doi.org/10.2196/50125 %U http://www.ncbi.nlm.nih.gov/pubmed/39133907 %0 Journal Article %@ 1929-073X %I JMIR Publications %V 13 %N %P e58635 %T Perception of Medication Safety–Related Behaviors Among Different Age Groups: Web-Based Cross-Sectional Study %A Lang,Yan %A Chen,Kay-Yut %A Zhou,Yuan %A Kosmari,Ludmila %A Daniel,Kathryn %A Gurses,Ayse %A Young,Richard %A Arbaje,Alicia %A Xiao,Yan %+ Department of Business, State University of New York at Oneonta, 108 Ravine Pkwy, Oneonta, NY, 13820, United States, 1 607 436 3251, yan.lang@oneonta.edu %K medication safety %K patient engagement %K aged adults %K survey %K Amazon Mechanical Turk %K medication %K engagement %K older adults %K elderly %K safety %K United States %K USA %K crowdsourcing %K community %K patient portal %K primary care %K medications %K safety behavior %K younger adults %K age %K correlation %K statistical test %D 2024 %7 12.8.2024 %9 Original Paper %J Interact J Med Res %G English %X Background: Previous research and safety advocacy groups have proposed various behaviors for older adults to actively engage in medication safety. However, little is known about how older adults perceive the importance and reasonableness of these behaviors in ambulatory settings. Objective: This study aimed to assess older adults’ perceptions of the importance and reasonableness of 8 medication safety behaviors in ambulatory settings and compare their responses with those of younger adults. 
Methods: We conducted a survey of 1222 adults in the United States using crowdsourcing to evaluate patient behaviors that may enhance medication safety in community settings. A total of 8 safety behaviors were identified based on the literature, such as bringing medications to office visits, confirming medications at home, managing medication refills, using patient portals, organizing medications, checking medications, getting help, and knowing medications. Respondents were asked about their perception of the importance and reasonableness of these behaviors on a 5-point Likert rating scale in the context of collaboration with primary care providers. We assessed the relative ranking of behaviors in terms of importance and reasonableness and examined the association between these dimensions across age groups using statistical tests. Results: Of 1222 adult participants, 125 (10.2%) were aged 65 years or older. Most participants were White, college-educated, and had chronic conditions. Older adults rated all 8 behaviors significantly higher in both importance and reasonableness than did younger adults (P<.001 for combined behaviors). Confirming medications ranked highest in importance (mean score=3.78) for both age groups while knowing medications ranked highest in reasonableness (mean score=3.68). Using patient portals was ranked lowest in importance (mean score=3.53) and reasonableness (mean score=3.49). There was a significant correlation between the perceived importance and reasonableness of the identified behaviors, with coefficients ranging from 0.436 to 0.543 (all P<.001). Conclusions: Older adults perceived the identified safety behaviors as more important and reasonable than younger adults. However, both age groups considered a behavior highly recommended by professionals as the least important and reasonable. Patient engagement strategies, common and specific to age groups, should be considered to improve medication safety in ambulatory settings. %M 39133905 %R 10.2196/58635 %U https://www.i-jmr.org/2024/1/e58635 %U https://doi.org/10.2196/58635 %U http://www.ncbi.nlm.nih.gov/pubmed/39133905 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e51397 %T Gamified Crowdsourcing as a Novel Approach to Lung Ultrasound Data Set Labeling: Prospective Analysis %A Duggan,Nicole M %A Jin,Mike %A Duran Mendicuti,Maria Alejandra %A Hallisey,Stephen %A Bernier,Denie %A Selame,Lauren A %A Asgari-Targhi,Ameneh %A Fischetti,Chanel E %A Lucassen,Ruben %A Samir,Anthony E %A Duhaime,Erik %A Kapur,Tina %A Goldsmith,Andrew J %+ Department of Emergency Medicine, Brigham and Women's Hospital, Harvard Medical School, 75 Francis Street, NH-2, Boston, MA, 02115, United States, 1 617 732 5640, nmduggan@bwh.harvard.edu %K crowdsource %K crowdsourced %K crowdsourcing %K machine learning %K artificial intelligence %K point-of-care ultrasound %K POCUS %K lung ultrasound %K B-lines %K gamification %K gamify %K gamified %K label %K labels %K labeling %K classification %K lung %K pulmonary %K respiratory %K ultrasound %K imaging %K medical image %K diagnostic %K diagnose %K diagnosis %K data science %D 2024 %7 4.7.2024 %9 Original Paper %J J Med Internet Res %G English %X Background: Machine learning (ML) models can yield faster and more accurate medical diagnoses; however, developing ML models is limited by a lack of high-quality labeled training data. Crowdsourced labeling is a potential solution but can be constrained by concerns about label quality. 
Objective: This study aims to examine whether a gamified crowdsourcing platform with continuous performance assessment, user feedback, and performance-based incentives could produce expert-quality labels on medical imaging data. Methods: In this diagnostic comparison study, 2384 lung ultrasound clips were retrospectively collected from 203 emergency department patients. A total of 6 lung ultrasound experts classified 393 of these clips as having no B-lines, one or more discrete B-lines, or confluent B-lines to create 2 sets of reference standard data sets (195 training clips and 198 test clips). Sets were respectively used to (1) train users on a gamified crowdsourcing platform and (2) compare the concordance of the resulting crowd labels to the concordance of individual experts to reference standards. Crowd opinions were sourced from DiagnosUs (Centaur Labs) iOS app users over 8 days, filtered based on past performance, aggregated using majority rule, and analyzed for label concordance compared with a hold-out test set of expert-labeled clips. The primary outcome was comparing the labeling concordance of collated crowd opinions to trained experts in classifying B-lines on lung ultrasound clips. Results: Our clinical data set included patients with a mean age of 60.0 (SD 19.0) years; 105 (51.7%) patients were female and 114 (56.1%) patients were White. Over the 195 training clips, the expert-consensus label distribution was 114 (58%) no B-lines, 56 (29%) discrete B-lines, and 25 (13%) confluent B-lines. Over the 198 test clips, expert-consensus label distribution was 138 (70%) no B-lines, 36 (18%) discrete B-lines, and 24 (12%) confluent B-lines. In total, 99,238 opinions were collected from 426 unique users. On a test set of 198 clips, the mean labeling concordance of individual experts relative to the reference standard was 85.0% (SE 2.0), compared with 87.9% crowdsourced label concordance (P=.15). When individual experts’ opinions were compared with reference standard labels created by majority vote excluding their own opinion, crowd concordance was higher than the mean concordance of individual experts to reference standards (87.4% vs 80.8%, SE 1.6 for expert concordance; P<.001). Clips with discrete B-lines had the most disagreement from both the crowd consensus and individual experts with the expert consensus. Using randomly sampled subsets of crowd opinions, 7 quality-filtered opinions were sufficient to achieve near the maximum crowd concordance. Conclusions: Crowdsourced labels for B-line classification on lung ultrasound clips via a gamified approach achieved expert-level accuracy. This suggests a strategic role for gamified crowdsourcing in efficiently generating labeled image data sets for training ML systems. 
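The crowd-aggregation step described above (filter opinions on past performance, combine the surviving opinions by majority rule, then score concordance against expert reference labels) is compact enough to sketch in full. The clip IDs and opinions below are invented for illustration.

```python
from collections import Counter

# Invented inputs: per clip, the quality-filtered crowd opinions (each in
# {"none", "discrete", "confluent"}) and the expert reference label.
crowd_opinions = {"clip_001": ["none"] * 5 + ["discrete"] * 2,
                  "clip_002": ["confluent"] * 6 + ["none"]}
expert_labels = {"clip_001": "none", "clip_002": "confluent"}

def majority_label(opinions):
    """Aggregate several crowd opinions into one label by majority rule."""
    return Counter(opinions).most_common(1)[0][0]

crowd_labels = {c: majority_label(ops) for c, ops in crowd_opinions.items()}

# Concordance: share of clips where the crowd label matches the expert's.
agree = sum(crowd_labels[c] == expert_labels[c] for c in expert_labels)
print(f"concordance = {agree / len(expert_labels):.1%}")
```

A production pipeline would also need an explicit tie-breaking rule and the upstream performance filter that the study applied before aggregation.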
%M 38963923 %R 10.2196/51397 %U https://www.jmir.org/2024/1/e51397 %U https://doi.org/10.2196/51397 %U http://www.ncbi.nlm.nih.gov/pubmed/38963923 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 26 %N %P e51138 %T A Perspective on Crowdsourcing and Human-in-the-Loop Workflows in Precision Health %A Washington,Peter %+ Information and Computer Sciences, University of Hawaii at Manoa, 1680 East-West Road, Honolulu, HI, 96822, United States, pyw@hawaii.edu %K crowdsourcing %K digital medicine %K human-in-the-loop %K human in the loop %K human-AI collaboration %K machine learning %K precision health %K artificial intelligence %K AI %D 2024 %7 11.4.2024 %9 Viewpoint %J J Med Internet Res %G English %X Modern machine learning approaches have led to performant diagnostic models for a variety of health conditions. Several machine learning approaches, such as decision trees and deep neural networks, can, in principle, approximate any function. However, this power can be both a gift and a curse, as the propensity toward overfitting is magnified when the input data are heterogeneous and high dimensional and the output class is highly nonlinear. This issue can especially plague diagnostic systems that predict behavioral and psychiatric conditions that are diagnosed with subjective criteria. An emerging solution to this issue is crowdsourcing, where crowd workers annotate complex behavioral features in return for monetary compensation or a gamified experience. These labels can then be used to derive a diagnosis, either directly or by using the labels as inputs to a diagnostic machine learning model. This viewpoint describes existing work in this emerging field and discusses ongoing challenges and opportunities with crowd-powered diagnostic systems. With the correct considerations, the addition of crowdsourcing to human-in-the-loop machine learning workflows for the prediction of complex and nuanced health conditions can accelerate screening, diagnostics, and ultimately access to care. %M 38602750 %R 10.2196/51138 %U https://www.jmir.org/2024/1/e51138 %U https://doi.org/10.2196/51138 %U http://www.ncbi.nlm.nih.gov/pubmed/38602750 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 13 %N %P e52205 %T Digitally Diagnosing Multiple Developmental Delays Using Crowdsourcing Fused With Machine Learning: Protocol for a Human-in-the-Loop Machine Learning Study %A Jaiswal,Aditi %A Kruiper,Ruben %A Rasool,Abdur %A Nandkeolyar,Aayush %A Wall,Dennis P %A Washington,Peter %+ Department of Information and Computer Sciences, University of Hawaii at Manoa, Room 312, Pacific Ocean Science and Technology (POST), 1680 East-West Road, Honolulu, HI, 96822, United States, 1 8088296359, ajaiswal@hawaii.edu %K machine learning %K crowdsourcing %K autism spectrum disorder %K ASD %K attention-deficit/hyperactivity disorder %K ADHD %K precision health %D 2024 %7 8.2.2024 %9 Protocol %J JMIR Res Protoc %G English %X Background: A considerable number of minors in the United States are diagnosed with developmental or psychiatric conditions, and the true prevalence is likely higher given factors that contribute to underdiagnosis, such as cost, distance, and clinician availability.
Despite the potential of digital phenotyping tools with machine learning (ML) approaches to expedite diagnoses and enhance diagnostic services for pediatric psychiatric conditions, existing methods face limitations because they use a limited set of social features for prediction tasks and focus on a single binary prediction, resulting in uncertain accuracies. Objective: This study aims to propose the development of a gamified web system for data collection, followed by a fusion of novel crowdsourcing algorithms with ML behavioral feature extraction approaches to simultaneously predict diagnoses of autism spectrum disorder and attention-deficit/hyperactivity disorder in a precise and specific manner. Methods: The proposed pipeline will consist of (1) gamified web applications to curate videos of social interactions adaptively based on the needs of the diagnostic system, (2) behavioral feature extraction techniques consisting of automated ML methods and novel crowdsourcing algorithms, and (3) the development of ML models that classify several conditions simultaneously and that adaptively request additional information based on uncertainties about the data. Results: A preliminary version of the web interface has been implemented, and a prior feature selection method has highlighted a core set of behavioral features that can be targeted through the proposed gamified approach. Conclusions: The prospect for high reward stems from the possibility of creating the first artificial intelligence–powered tool that can identify complex social behaviors well enough to distinguish conditions with nuanced differentiators such as autism spectrum disorder and attention-deficit/hyperactivity disorder. International Registered Report Identifier (IRRID): PRR1-10.2196/52205 %M 38329783 %R 10.2196/52205 %U https://www.researchprotocols.org/2024/1/e52205 %U https://doi.org/10.2196/52205 %U http://www.ncbi.nlm.nih.gov/pubmed/38329783 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e51089 %T Investigating Racial Disparities in Cancer Crowdfunding: A Comprehensive Study of Medical GoFundMe Campaigns %A Zhang,Xupin %A Wang,Jingjing %A Lane,Jamil M %A Xu,Xin %A Sörensen,Silvia %+ School of Economics and Management, East China Normal University, No. 3663, North Zhongshan Road, Shanghai, 200062, China, 86 21 62235067, xxu@infor.ecnu.edu.cn %K crowdfunding %K racial discrimination %K GoFundMe %D 2023 %7 12.12.2023 %9 Original Paper %J J Med Internet Res %G English %X Background: In recent years, there has been growing concern about prejudice in crowdfunding; however, empirical research remains limited, particularly in the context of medical crowdfunding. This study addresses the pressing issue of racial disparities in medical crowdfunding, with a specific focus on cancer crowdfunding on the GoFundMe platform. Objective: This study aims to investigate racial disparities in cancer crowdfunding using average donation amount, number of donations, and success of the fundraising campaign as outcomes. Methods: Drawing from a substantial data set of 104,809 campaigns in the United States, we used DeepFace facial recognition technology to determine racial identities and used regression models to examine racial factors in crowdfunding performance. We also examined the moderating effect of the proportion of White residents on crowdfunding bias and used 2-tailed t tests to measure the influence of racial anonymity on crowdfunding success. Owing to the large sample size, we set the cutoff for significance at P<.001. 
Results: In the regression and supplementary analyses, the racial identity of the fundraiser significantly predicted average donations (P<.001), indicating that implicit bias may play a role in donor behavior. Gender (P=.04) and campaign description length (P=.62) did not significantly predict the average donation amounts. The race of the fundraiser was not significantly associated with the number of donations (P=.42). The success rate of cancer crowdfunding campaigns, although generally low (11.77%), showed a significant association with the race of the fundraiser (P<.001). After controlling for the covariates of the fundraiser gender, fundraiser age, local White proportion, length of campaign description, and fundraising goal, the average donation amount to White individuals was 17.68% higher than for Black individuals. Moreover, campaigns that did not disclose racial information demonstrated a marginally higher average donation amount (3.92%) than those identified as persons of color. Furthermore, the racial composition of the fundraiser’s county of residence was found to exert influence (P<.001); counties with a higher proportion of White residents exhibited reduced racial disparities in crowdfunding outcomes. Conclusions: This study contributes to a deeper understanding of racial disparities in cancer crowdfunding. It highlights the impact of racial identity, geographic context, and the potential for implicit bias in donor behavior. As web-based platforms evolve, addressing racial inequality and promoting fairness in health care financing remain critical goals. Insights from this research suggest strategies such as maintaining racial anonymity and ensuring that campaigns provide strong evidence of deservingness. Moreover, broader societal changes are necessary to eliminate the financial distress that drives individuals to seek crowdfunding support. %M 38085562 %R 10.2196/51089 %U https://www.jmir.org/2023/1/e51089 %U https://doi.org/10.2196/51089 %U http://www.ncbi.nlm.nih.gov/pubmed/38085562 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e46421 %T Effects of Excluding Those Who Report Having “Syndomitis” or “Chekalism” on Data Quality: Longitudinal Health Survey of a Sample From Amazon’s Mechanical Turk %A Hays,Ron D %A Qureshi,Nabeel %A Herman,Patricia M %A Rodriguez,Anthony %A Kapteyn,Arie %A Edelen,Maria Orlando %+ Division of General Internal Medicine and Health Services Research, Department of Medicine, University of California, 1100 Glendon Avenue Suite 800, Los Angeles, CA, 90024, United States, 1 310 794 2294, drhays@ucla.edu %K misrepresentation %K survey %K data quality %K MTurk %K Amazon Mechanical Turk %D 2023 %7 4.8.2023 %9 Original Paper %J J Med Internet Res %G English %X Background: Researchers have implemented multiple approaches to increase data quality from existing web-based panels such as Amazon’s Mechanical Turk (MTurk). Objective: This study extends prior work by examining improvements in data quality and effects on mean estimates of health status by excluding respondents who endorse 1 or both of 2 fake health conditions (“Syndomitis” and “Chekalism”). 
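The screening rule at the core of this design reduces to a single boolean mask once the two fake conditions are indicator columns. A sketch under an assumed (hypothetical) file and column layout, not the study's actual dataset:

```python
import pandas as pd

# Hypothetical baseline data: one 0/1 indicator per endorsed condition,
# including the 2 fabricated conditions used as honesty checks.
df = pd.read_csv("mturk_baseline.csv")
FAKE_CONDITIONS = ["syndomitis", "chekalism"]

# Exclude anyone endorsing 1 or both fake conditions.
df["endorsed_fake"] = df[FAKE_CONDITIONS].any(axis=1)
clean = df.loc[~df["endorsed_fake"]].copy()

print(f"excluded {df['endorsed_fake'].mean():.1%} of respondents")
# Screening effect on a mean outcome, eg, a PROMIS T-score column:
print(df["promis_t"].mean(), clean["promis_t"].mean())
```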
Methods: Survey data were collected in 2021 at baseline and 3 months later from MTurk study participants, aged 18 years or older, with an internet protocol address in the United States, and who had completed a minimum of 500 previous MTurk “human intelligence tasks.” We included questions about demographic characteristics, health conditions (including the 2 fake conditions), and the Patient Reported Outcomes Measurement Information System (PROMIS)-29+2 (version 2.1) preference–based score survey. The 3-month follow-up survey was only administered to those who reported having back pain and did not endorse a fake condition at baseline. Results: In total, 15% (996/6832) of the sample endorsed at least 1 of the 2 fake conditions at baseline. Those who endorsed a fake condition at baseline were more likely than those who did not to identify as male and non-White, to be younger, to report more health conditions, and to take longer to complete the survey. They also had substantially lower internal consistency reliability on the PROMIS-29+2 scales than those who did not endorse a fake condition: physical function (0.69 vs 0.89), pain interference (0.80 vs 0.94), fatigue (0.80 vs 0.92), depression (0.78 vs 0.92), anxiety (0.78 vs 0.90), sleep disturbance (−0.27 vs 0.84), ability to participate in social roles and activities (0.77 vs 0.92), and cognitive function (0.65 vs 0.77). The lack of reliability of the sleep disturbance scale for those endorsing a fake condition was because it includes both positively and negatively worded items. Those who reported a fake condition reported significantly worse self-reported health scores (except for sleep disturbance) than those who did not endorse a fake condition. Excluding those who endorsed a fake condition improved the overall mean PROMIS-29+2 (version 2.1) T-scores by 1-2 points and the PROMIS preference–based score by 0.04. Although these follow-up participants had not endorsed a fake condition at baseline, 6% (n=59) endorsed at least 1 fake condition on the 3-month survey, and they had lower PROMIS-29+2 score internal consistency reliability and worse mean scores on the 3-month survey than those who did not report having a fake condition. Based on these results, we estimate that 25% (1708/6832) of the MTurk respondents provided careless or dishonest responses. Conclusions: This study provides evidence that asking about fake health conditions can help to screen out respondents who may be dishonest or careless. We recommend this approach be used routinely in samples of MTurk members. %M 37540543 %R 10.2196/46421 %U https://www.jmir.org/2023/1/e46421 %U https://doi.org/10.2196/46421 %U http://www.ncbi.nlm.nih.gov/pubmed/37540543 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e41431 %T Assessing Interventions on Crowdsourcing Platforms to Nudge Patients for Engagement Behaviors in Primary Care Settings: Randomized Controlled Trial %A Chen,Kay-Yut %A Lang,Yan %A Zhou,Yuan %A Kosmari,Ludmila %A Daniel,Kathryn %A Gurses,Ayse %A Xiao,Yan %+ Department of Business, State University of New York at Oneonta, 108 Ravine Pkwy, Oneonta, NY, 13820, United States, 1 607 436 3251, yan.lang@oneonta.edu %K Amazon Mechanical Turk %K behavioral interventions %K crowdsourcing %K medication safety %K Mturk %K patient engagement %K primary care %D 2023 %7 13.7.2023 %9 Original Paper %J J Med Internet Res %G English %X Background: Engaging patients in health behaviors is critical for better outcomes, yet many patient partnership behaviors are not widely adopted.
Behavioral economics–based interventions offer potential solutions, but it is challenging to assess the time and cost needed for different options. Crowdsourcing platforms can efficiently and rapidly assess the efficacy of such interventions, but it is unclear if web-based participants respond to simulated incentives in the same way as they would to actual incentives. Objective: The goals of this study were (1) to assess the feasibility of using crowdsourced surveys to evaluate behavioral economics interventions for patient partnerships by examining whether web-based participants responded to simulated incentives in the same way they would have responded to actual incentives, and (2) to assess the impact of 2 behavioral economics–based intervention designs, psychological rewards and loss of framing, on simulated medication reconciliation behaviors in a simulated primary care setting. Methods: We conducted a randomized controlled trial using a between-subject design on a crowdsourcing platform (Amazon Mechanical Turk) to evaluate the effectiveness of behavioral interventions designed to improve medication adherence in primary care visits. The study included a control group that represented the participants’ baseline behavior and 3 simulated interventions, namely monetary compensation, a status effect as a psychological reward, and a loss frame as a modification of the status effect. Participants’ willingness to bring medicines to a primary care visit was measured on a 5-point Likert scale. A reverse-coding question was included to ensure response intentionality. Results: A total of 569 study participants were recruited. There were 132 in the baseline group, 187 in the monetary compensation group, 149 in the psychological reward group, and 101 in the loss frame group. All 3 nudge interventions increased participants’ willingness to bring medicines significantly when compared to the baseline scenario. The monetary compensation intervention caused an increase of 17.51% (P<.001), psychological rewards on status increased willingness by 11.85% (P<.001), and a loss frame on psychological rewards increased willingness by 24.35% (P<.001). Responses to the reverse-coding question were consistent with the willingness questions. Conclusions: In primary care, bringing medications to office visits is a frequently advocated patient partnership behavior that is nonetheless not widely adopted. Crowdsourcing platforms such as Amazon Mechanical Turk support efforts to efficiently and rapidly reach large groups of individuals to assess the efficacy of behavioral interventions. We found that crowdsourced survey-based experiments with simulated incentives can produce valid simulated behavioral responses. The use of psychological status design, particularly with a loss framing approach, can effectively enhance patient engagement in primary care. These results support the use of crowdsourcing platforms to augment and complement traditional approaches to learning about behavioral economics for patient engagement. 
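For a between-subject design like the one above, the core analysis is a comparison of mean willingness ratings in each nudge arm against the baseline arm. The abstract does not name the exact statistical test, so the Welch t tests, file name, and column names below are assumptions for illustration.

```python
import pandas as pd
from scipy import stats

# Hypothetical trial export: one row per participant, with the assigned
# arm and the 5-point Likert willingness rating.
df = pd.read_csv("nudge_responses.csv")  # columns: arm, willingness
base = df.loc[df["arm"] == "baseline", "willingness"]

for arm in ["monetary", "status", "loss_frame"]:
    treat = df.loc[df["arm"] == arm, "willingness"]
    lift = (treat.mean() - base.mean()) / base.mean() * 100
    t, p = stats.ttest_ind(treat, base, equal_var=False)  # Welch t test
    print(f"{arm}: +{lift:.2f}% vs baseline (t={t:.2f}, P={p:.3f})")
```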
%M 37440308 %R 10.2196/41431 %U https://www.jmir.org/2023/1/e41431 %U https://doi.org/10.2196/41431 %U http://www.ncbi.nlm.nih.gov/pubmed/37440308 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e42723 %T Exploring Novel Innovation Strategies to Close a Technology Gap in Neurosurgery: HORAO Crowdsourcing Campaign %A Schucht,Philippe %A Mathis,Andrea Maria %A Murek,Michael %A Zubak,Irena %A Goldberg,Johannes %A Falk,Stephanie %A Raabe,Andreas %+ Department of Neurosurgery, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16, Bern, 3010, Switzerland, 41 31 632 95 64, andrea.mathis@insel.ch %K collective intelligence %K crowdsourcing %K fiber tracts %K ideation %K Mueller polarimetry %K neuroscience %K neurosurgery %K open innovation %K polarization %D 2023 %7 28.4.2023 %9 Original Paper %J J Med Internet Res %G English %X Background: Scientific research is typically performed by expert individuals or groups who investigate potential solutions in a sequential manner. Given the current worldwide exponential increase in technical innovations, potential solutions for any new problem might already exist, even though they were developed to solve a different problem. Therefore, in crowdsourcing ideation, a research question is explained to a much larger group of individuals beyond the specialist community to obtain a multitude of diverse, outside-the-box solutions. These are then assessed in parallel by a group of experts for their capacity to solve the new problem. The 2 key problems in brain tumor surgery are the difficulty of discerning the exact border between a tumor and the surrounding brain, and the difficulty of identifying the function of a specific area of the brain. Both problems could be solved by a method that visualizes the highly organized fiber tracts within the brain; the absence of fibers would reveal the tumor, whereas the spatial orientation of the tracts would reveal the area’s function. To raise awareness about our challenge of developing a means of intraoperative, real-time, noninvasive identification of fiber tracts and tumor borders to improve neurosurgical oncology, we turned to the crowd with a crowdsourcing ideation challenge. Objective: Our objective was to evaluate the feasibility of a crowdsourcing ideation campaign for finding novel solutions to challenges in neuroscience. The purpose of this paper is to introduce our chosen crowdsourcing method and discuss it in the context of the current literature. Methods: We ran a prize-based crowdsourcing ideation competition called HORAO on the commercial platform HeroX. Prize money previously collected through a crowdfunding campaign was offered as an incentive. Using a multistage approach, an expert jury first selected promising technical solutions based on broad, predefined criteria, coached the respective solvers in the second stage, and finally selected the winners in a conference setting. We performed a postchallenge web-based survey among the solvers crowd to find out about their backgrounds and demographics. Results: Our web-based campaign reached more than 20,000 people (views). We received 45 proposals from 32 individuals and 7 teams, working in 26 countries on 4 continents. The postchallenge survey revealed that most of the submissions came from single solvers or teams working in engineering or the natural sciences, with additional submissions from other nonmedical fields. 
We engaged in further exchanges with 3 out of the 5 finalists and finally initiated a successful scientific collaboration with the winner of the challenge. Conclusions: This open innovation competition is the first of its kind in medical technology research. A prize-based crowdsourcing ideation campaign is a promising strategy for raising awareness about a specific problem, finding innovative solutions, and establishing new scientific collaborations beyond strictly disciplinary domains. %M 37115612 %R 10.2196/42723 %U https://www.jmir.org/2023/1/e42723 %U https://doi.org/10.2196/42723 %U http://www.ncbi.nlm.nih.gov/pubmed/37115612 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 25 %N %P e41233 %T A Virtual Reading Center Model Using Crowdsourcing to Grade Photographs for Trachoma: Validation Study %A Brady,Christopher J %A Cockrell,R Chase %A Aldrich,Lindsay R %A Wolle,Meraf A %A West,Sheila K %+ Division of Ophthalmology, Department of Surgery, Larner College of Medicine at The University of Vermont, 111 Colchester Avenue, Main Campus, West Pavilion, Level 5, Burlington, VT, 05401, United States, 1 802 847 0400, christopher.brady@med.uvm.edu %K trachoma %K crowdsourcing %K telemedicine %K ophthalmic photography %K Amazon Mechanical Turk %K image analysis %K diagnosis %K detection %K cloud-based %K image interpretation %K disease identification %K diagnostics %K image grading %K disease grading %K trachomatous inflammation—follicular %K ophthalmology %D 2023 %7 6.4.2023 %9 Original Paper %J J Med Internet Res %G English %X Background: As trachoma is eliminated, skilled field graders become less adept at correctly identifying active disease (trachomatous inflammation—follicular [TF]). Deciding if trachoma has been eliminated from a district or if treatment strategies need to be continued or reinstated is of critical public health importance. Telemedicine solutions require both connectivity, which can be poor in the resource-limited regions of the world in which trachoma occurs, and accurate grading of the images. Objective: Our purpose was to develop and validate a cloud-based “virtual reading center” (VRC) model using crowdsourcing for image interpretation. Methods: The Amazon Mechanical Turk (AMT) platform was used to recruit lay graders to interpret 2299 gradable images from a prior field trial of a smartphone-based camera system. Each image received 7 grades for US $0.05 per grade in this VRC. The resultant data set was divided into training and test sets to internally validate the VRC. In the training set, crowdsourcing scores were summed, and the optimal raw score cutoff was chosen to optimize kappa agreement and the resulting prevalence of TF. The best method was then applied to the test set, and the sensitivity, specificity, kappa, and TF prevalence were calculated. Results: In this trial, over 16,000 grades were rendered in just over 60 minutes for US $1098 including AMT fees. After choosing an AMT raw score cut point to optimize kappa near the World Health Organization (WHO)–endorsed level of 0.7 (with a simulated 40% prevalence TF), crowdsourcing was 95% sensitive and 87% specific for TF in the training set with a kappa of 0.797. All 196 crowdsourced-positive images received a skilled overread to mimic a tiered reading center and specificity improved to 99%, while sensitivity remained above 78%. Kappa for the entire sample improved from 0.162 to 0.685 with overreads, and the skilled grader burden was reduced by over 80%. 
This tiered VRC model was then applied to the test set and produced a sensitivity of 99% and a specificity of 76% with a kappa of 0.775 in the entire set. The prevalence estimated by the VRC was 2.70% (95% CI 1.84%-3.80%) compared to the ground truth prevalence of 2.87% (95% CI 1.98%-4.01%). Conclusions: A VRC model using crowdsourcing as a first pass with skilled grading of positive images was able to identify TF rapidly and accurately in a low prevalence setting. The findings from this study support further validation of a VRC and crowdsourcing for image grading and estimation of trachoma prevalence from field-acquired images, although further prospective field testing is required to determine if diagnostic characteristics are acceptable in real-world surveys with a low prevalence of the disease. %M 37023420 %R 10.2196/41233 %U https://www.jmir.org/2023/1/e41233 %U https://doi.org/10.2196/41233 %U http://www.ncbi.nlm.nih.gov/pubmed/37023420 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 11 %N %P e38412 %T Agreement Between Experts and an Untrained Crowd for Identifying Dermoscopic Features Using a Gamified App: Reader Feasibility Study %A Kentley,Jonathan %A Weber,Jochen %A Liopyris,Konstantinos %A Braun,Ralph P %A Marghoob,Ashfaq A %A Quigley,Elizabeth A %A Nelson,Kelly %A Prentice,Kira %A Duhaime,Erik %A Halpern,Allan C %A Rotemberg,Veronica %+ Dermatology Section, Memorial Sloan Kettering Cancer Center, 530 E 74th Street, New York, NY, 10021, United States, 1 8336854126, rotembev@mskcc.org %K dermatology %K dermatologist %K diagnosis %K diagnostic %K labeling %K classification %K deep learning %K dermoscopy %K dermatoscopy %K skin %K pigmentation %K microscopy %K dermascopic %K artificial intelligence %K machine learning %K crowdsourcing %K crowdsourced %K melanoma %K cancer %K lesion %K medical image %K imaging %K development %K feasibility %D 2023 %7 18.1.2023 %9 Original Paper %J JMIR Med Inform %G English %X Background: Dermoscopy is commonly used for the evaluation of pigmented lesions, but agreement between experts for identification of dermoscopic structures is known to be relatively poor. Expert labeling of medical data is a bottleneck in the development of machine learning (ML) tools, and crowdsourcing has been demonstrated as a cost- and time-efficient method for the annotation of medical images. Objective: The aim of this study is to demonstrate that crowdsourcing can be used to label basic dermoscopic structures from images of pigmented lesions with similar reliability to a group of experts. Methods: First, we obtained labels of 248 images of melanocytic lesions with 31 dermoscopic “subfeatures” labeled by 20 dermoscopy experts. These were then collapsed into 6 dermoscopic “superfeatures” based on structural similarity, due to low interrater reliability (IRR): dots, globules, lines, network structures, regression structures, and vessels. These images were then used as the gold standard for the crowd study. The commercial platform DiagnosUs was used to obtain annotations from a nonexpert crowd for the presence or absence of the 6 superfeatures in each of the 248 images. We replicated this methodology with a group of 7 dermatologists to allow direct comparison with the nonexpert crowd. The Cohen κ value was used to measure agreement across raters. Results: In total, we obtained 139,731 ratings of the 6 dermoscopic superfeatures from the crowd. 
There was relatively lower agreement for the identification of dots and globules (the median κ values were 0.526 and 0.395, respectively), whereas network structures and vessels showed the highest agreement (the median κ values were 0.581 and 0.798, respectively). This pattern was also seen among the expert raters, who had median κ values of 0.483 and 0.517 for dots and globules, respectively, and 0.758 and 0.790 for network structures and vessels. The median κ values between nonexperts and thresholded average–expert readers were 0.709 for dots, 0.719 for globules, 0.714 for lines, 0.838 for network structures, 0.818 for regression structures, and 0.728 for vessels. Conclusions: This study confirmed that IRR for different dermoscopic features varied among a group of experts; a similar pattern was observed in a nonexpert crowd. There was good or excellent agreement for each of the 6 superfeatures between the crowd and the experts, highlighting the similar reliability of the crowd for labeling dermoscopic images. This confirms the feasibility and dependability of using crowdsourcing as a scalable solution to annotate large sets of dermoscopic images, with several potential clinical and educational applications, including the development of novel, explainable ML tools. %M 36652282 %R 10.2196/38412 %U https://medinform.jmir.org/2023/1/e38412 %U https://doi.org/10.2196/38412 %U http://www.ncbi.nlm.nih.gov/pubmed/36652282 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 6 %N 12 %P e37507 %T Assessing Associations Between COVID-19 Symptomology and Adverse Outcomes After Piloting Crowdsourced Data Collection: Cross-sectional Survey Study %A Flaks-Manov,Natalie %A Bai,Jiawei %A Zhang,Cindy %A Malpani,Anand %A Ray,Stuart C %A Taylor,Casey Overby %+ Johns Hopkins University School of Medicine, 3101 Wyman Park Dr., Baltimore, MD, 21218, United States, 1 443 287 6657, cot@jhu.edu %K COVID-19 %K coronavirus %K symptoms %K symptomology %K crowdsourcing %K adverse outcomes %K data quality %D 2022 %7 6.12.2022 %9 Original Paper %J JMIR Form Res %G English %X Background: Crowdsourcing is a useful way to rapidly collect information on COVID-19 symptoms. However, there are potential biases and data quality issues given the population that chooses to participate in crowdsourcing activities and the common strategies used to screen participants based on their previous experience. Objective: The study aimed to (1) build a pipeline to enable data quality and population representation checks in a pilot setting prior to deploying a final survey to a crowdsourcing platform, (2) assess COVID-19 symptomology among survey respondents who report a previous positive COVID-19 result, and (3) assess associations of symptomology groups and underlying chronic conditions with adverse outcomes due to COVID-19. Methods: We developed a web-based survey and hosted it on the Amazon Mechanical Turk (MTurk) crowdsourcing platform. We conducted a pilot study from August 5, 2020, to August 14, 2020, to refine the filtering criteria according to our needs before finalizing the pipeline. The final survey was posted from late August to December 31, 2020. Hierarchical cluster analyses were performed to identify COVID-19 symptomology groups, and logistic regression analyses were performed for hospitalization and mechanical ventilation outcomes. Finally, we performed a validation of study outcomes by comparing our findings to those reported in previous systematic reviews. 
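A minimal sketch of the hierarchical cluster analysis step named in the Methods above, assuming a fabricated respondent-by-symptom matrix; the study's distance and linkage choices are not given in the abstract, so the settings below are illustrative (Python).

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(42)

# Hypothetical respondent-by-symptom matrix (1 = symptom endorsed).
symptoms = rng.integers(0, 2, size=(200, 12))

# Ward linkage over the binary profiles; an assumption, not the study's setting.
tree = linkage(symptoms, method="ward")

# Cut the dendrogram into 6 clusters, mirroring the 6 reported symptomology groups.
groups = fcluster(tree, t=6, criterion="maxclust")
print("respondents per group:", np.bincount(groups)[1:])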
Results: The crowdsourcing pipeline facilitated piloting our survey study and revising the filtering criteria to target specific MTurk experience levels and to include a second attention check. We collected data from 1254 COVID-19–positive survey participants and identified the following 6 symptomology groups: abdominal and bladder pain (Group 1); flu-like symptoms (loss of smell/taste/appetite; Group 2); hoarseness and sputum production (Group 3); joint aches and stomach cramps (Group 4); eye or skin dryness and vomiting (Group 5); and no symptoms (Group 6). The risk factors for adverse COVID-19 outcomes differed across symptomology groups. The only risk factor that remained significant across 4 symptomology groups was influenza vaccine in the previous year (Group 1: odds ratio [OR] 6.22, 95% CI 2.32-17.92; Group 2: OR 2.35, 95% CI 1.74-3.18; Group 3: OR 3.7, 95% CI 1.32-10.98; Group 4: OR 4.44, 95% CI 1.53-14.49). Our findings regarding the symptoms of abdominal pain, cough, fever, fatigue, shortness of breath, and vomiting as risk factors for COVID-19 adverse outcomes were concordant with the findings of other researchers. Some high-risk symptoms found in our study, including bladder pain, dry eyes or skin, and loss of appetite, were reported less frequently by other researchers and were not considered previously in relation to COVID-19 adverse outcomes. Conclusions: We demonstrated that a crowdsourced approach was effective for collecting data to assess symptomology associated with COVID-19. Such a strategy may facilitate efficient assessments in a dynamic intersection between emerging infectious diseases and societal and environmental changes. %M 36343205 %R 10.2196/37507 %U https://formative.jmir.org/2022/12/e37507 %U https://doi.org/10.2196/37507 %U http://www.ncbi.nlm.nih.gov/pubmed/36343205 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 24 %N 6 %P e30216 %T The Benefits of Crowdsourcing to Seed and Align an Algorithm in an mHealth Intervention for African American and Hispanic Adults: Survey Study %A Sehgal,Neil Jay %A Huang,Shuo %A Johnson,Neil Mason %A Dickerson,John %A Jackson,Devlon %A Baur,Cynthia %+ Department of Health Policy and Management, School of Public Health, University of Maryland, 4200 Valley Drive, College Park, MD, 20742, United States, 1 3014052469, sehgal@umd.edu %K crowdsourcing %K health information %K health promotion %K prevention %K public health informatics %K African American, Black, Latino, and Hispanic populations %K recommender system %K RecSys %K machine learning %K Mechanical Turk %K MTurk %K mobile phone %D 2022 %7 21.6.2022 %9 Original Paper %J J Med Internet Res %G English %X Background: The lack of publicly available and culturally relevant data sets on African American and bilingual/Spanish-speaking Hispanic adults’ disease prevention and health promotion priorities presents a major challenge for researchers and developers who want to create and test personalized tools built on and aligned with those priorities. Personalization depends on prediction and performance data. A recommender system (RecSys) could predict the most culturally and personally relevant preventative health information and serve it to African American and Hispanic users via a novel smartphone app. However, early in a user’s experience, a RecSys can face the “cold start problem” of serving untailored and irrelevant content before it learns user preferences.
For underserved African American and Hispanic populations, who are consistently being served health content targeted toward the White majority, the cold start problem can become an example of algorithmic bias. To avoid this, a RecSys needs population-appropriate seed data aligned with the app’s purposes. Crowdsourcing provides a means to generate population-appropriate seed data. Objective: Our objective was to identify and test a method to address the lack of culturally specific preventative personal health data and sidestep the type of algorithmic bias inherent in a RecSys not trained on data from the population of focus. We did this by collecting a large amount of data quickly and at low cost from members of the population of focus, thereby generating a novel data set based on prevention-focused, population-relevant health goals. We seeded our RecSys with data collected anonymously from self-identified Hispanic and self-identified non-Hispanic African American/Black adult respondents, using Amazon Mechanical Turk (MTurk). Methods: MTurk provided the crowdsourcing platform for a web-based survey in which respondents completed a personal profile and a health information–seeking assessment, and provided data on family health history and personal health history. Respondents then selected their top 3 health goals related to preventable health conditions, and for each goal, reviewed and rated the top 3 information returns by importance, personal utility, whether the item should be added to their personal health library, and their satisfaction with the quality of the information returned. This paper reports the article ratings because our intent was to assess the benefits of crowdsourcing to seed a RecSys. The analysis of the data from health goals will be reported in future papers. Results: The MTurk crowdsourcing approach generated 985 valid responses from 485 (49%) self-identified Hispanic and 500 (51%) self-identified non-Hispanic African American adults over the course of only 64 days at a cost of US $6.74 per respondent. Respondents rated 92 unique articles to inform the RecSys. Conclusions: Researchers have options such as MTurk as a quick, low-cost means to avoid the cold start problem for algorithms and to sidestep bias and low relevance for an intended population of app users. Seeding a RecSys with responses from people like the intended users allows for the development of a digital health tool that can recommend information to users based on similar demography, health goals, and health history. This approach minimizes potential initial gaps in algorithm performance; allows for quicker algorithm refinement in use; and may deliver a better user experience to individuals seeking preventative health information to improve health and achieve health goals.
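A minimal sketch of how crowdsourced seed ratings can blunt the cold start problem described above, assuming a fabricated respondent-by-article matrix and a plain item-similarity recommender; the study's actual RecSys design is not specified in this abstract (Python).

import numpy as np

rng = np.random.default_rng(7)

# Hypothetical seed data: rows = crowdsourced respondents, columns = articles,
# values = ratings on a 1-5 scale (0 = unrated).
seed = rng.integers(0, 6, size=(50, 10)).astype(float)

# Cosine similarity between article (column) rating vectors.
norms = np.linalg.norm(seed, axis=0, keepdims=True)
norms[norms == 0] = 1.0  # guard against articles with no ratings
unit = seed / norms
sim = unit.T @ unit

def recommend(user_ratings, k=3):
    # Score unseen articles for a new user from seed-derived item similarity.
    scores = sim @ user_ratings
    scores[user_ratings > 0] = -np.inf  # do not re-recommend rated articles
    return np.argsort(scores)[::-1][:k]

# A brand-new user with only 2 ratings still receives ranked suggestions,
# because the similarity matrix was learned from the crowdsourced seed data.
new_user = np.zeros(10)
new_user[[2, 5]] = [5, 4]
print("top articles:", recommend(new_user))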
%M 35727616 %R 10.2196/30216 %U https://www.jmir.org/2022/6/e30216 %U https://doi.org/10.2196/30216 %U http://www.ncbi.nlm.nih.gov/pubmed/35727616 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 6 %N 5 %P e30573 %T Identifying Barriers to Enrollment in Patient Pregnancy Registries: Building Evidence Through Crowdsourcing %A Pimenta,Jeanne M %A Painter,Jeffery L %A Gemzoe,Kim %A Levy,Roger Abramino %A Powell,Marcy %A Meizlik,Paige %A Powell,Gregory %+ Safety Innovation and Analytics, GlaxoSmithKline, 5 Moore Dr, Research Triangle, Durham, NC, 27709, United States, 1 919 619 3297, gregory.e.powell@gsk.com %K belimumab %K crowdsourcing %K systemic lupus erythematosus %K pregnancy %K registry %D 2022 %7 25.5.2022 %9 Original Paper %J JMIR Form Res %G English %X Background: Enrollment in pregnancy registries is challenging despite substantial awareness-raising activities, generally resulting in low recruitment owing to limited safety data. Understanding patient and physician awareness of and attitudes toward pregnancy registries is needed to facilitate enrollment. Crowdsourcing, in which services, ideas, or content are obtained by soliciting contributions from a large group of people using web-based platforms, has shown promise for improving patient engagement and obtaining patient insights. Objective: This study aimed to use web-based crowdsourcing platforms to evaluate Belimumab Pregnancy Registry (BPR) awareness among patients and physicians and to identify potential barriers to pregnancy registry enrollment with the BPR as a case study. Methods: We conducted 2 surveys using separate web-based crowdsourcing platforms: Amazon Mechanical Turk (a 14-question patient survey) and Sermo RealTime (an 11-question rheumatologist survey). Eligible patients were women, aged 18-55 years; diagnosed with systemic lupus erythematosus (SLE); and pregnant, recently pregnant (within 2 years), or planning pregnancy. Eligible rheumatologists had prescribed belimumab and treated pregnant women. Responses were descriptively analyzed. Results: Of 151 patient respondents over a 3-month period (n=88, 58.3% aged 26-35 years; n=149, 98.7% with mild or moderate SLE; and n=148, 98% from the United States), 51% (77/151) were currently or recently pregnant. Overall, 169 rheumatologists completed the survey within 48 hours, and 59.2% (100/169) were based in the United States. Belimumab exposure was reported by 41.7% (63/151) of patients, whereas 51.7% (75/145) of rheumatologists had prescribed belimumab to <5 patients, 25.5% (37/145) had prescribed to 5-10 patients, and 22.8% (33/145) had prescribed to >10 patients who were pregnant or trying to conceive. Of the patients exposed to belimumab, 51% (32/63) were BPR-aware, and 45.5% (77/169) of the rheumatologists were BPR-aware. Overall, 60% (38/63) of patients reported belimumab discontinuation because of pregnancy or planned pregnancy. Among the 77 BPR-aware rheumatologists, 70 (91%) referred patients to the registry. Concerns among rheumatologists who did not prescribe belimumab during pregnancy included unknown pregnancy safety profile (119/169, 70.4%), and 61.5% (104/169) reported their patients’ concerns about the unknown pregnancy safety profile. Belimumab exposure during or recently after pregnancy or while trying to conceive was reported in patients with mild (6/64, 9%), moderate (22/85, 26%), or severe (1/2, 50%) SLE.
Rheumatologists more commonly recommended belimumab for moderate (84/169, 49.7%) and severe (123/169, 72.8%) SLE than for mild SLE (36/169, 21.3%) for patients who were trying to conceive or were recently or currently pregnant. Overall, 81.6% (138/169) of the rheumatologists suggested a belimumab washout period before pregnancy of 0-30 days (44/138, 31.9%), 30-60 days (64/138, 46.4%), or >60 days (30/138, 21.7%). Conclusions: In this case, crowdsourcing efficiently obtained patient and rheumatologist input, with some patients with SLE continuing to use belimumab during or while planning a pregnancy. There was moderate awareness of the BPR among patients and physicians. %M 35612888 %R 10.2196/30573 %U https://formative.jmir.org/2022/5/e30573 %U https://doi.org/10.2196/30573 %U http://www.ncbi.nlm.nih.gov/pubmed/35612888 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 6 %N 5 %P e35764 %T A Crowdsourcing Open Contest to Design a Latino-Specific COVID-19 Campaign: Mixed Methods Analysis %A Shah,Harita S %A Dolwick Grieb,Suzanne %A Flores-Miller,Alejandra %A Phillips,Katherine H %A Page,Kathleen R %A Cervantes,Ana %A Yang,Cui %+ Department of Medicine, University of Chicago, 5841 S Maryland Ave, MC 3051, Chicago, IL, 60637, United States, 1 773 702 6840, harita@uchicago.edu %K crowdsourcing %K Latino %K open contest %K community engagement %K social marketing %K COVID-19 %K mixed method %K implementation %K thematic analysis %D 2022 %7 12.5.2022 %9 Original Paper %J JMIR Form Res %G English %X Background: Latino communities are among the populations most heavily impacted by the COVID-19 pandemic in the United States due to intersectional barriers to care. Crowdsourcing open contests can be an effective means of community engagement but have not been well studied in Latino populations nor in addressing the COVID-19 pandemic. Objective: The aims of this study are to (1) implement and evaluate a crowdsourcing open contest to solicit a name for a COVID-19 social marketing campaign for Latino populations in Maryland and (2) conduct a thematic analysis of submitted entries to guide campaign messaging. Methods: To assess the level of community engagement in this crowdsourcing open contest, we used descriptive statistics to analyze data on entries, votes, and demographic characteristics of participants. The submitted text was analyzed through inductive thematic analysis. Results: We received 74 entries within a 2-week period. The top 10 entries were chosen by community judges and the winner was decided by popular vote. We received 383 votes within 1 week. The most common themes were collective efficacy, self-efficacy, and perceived benefits of COVID-19 testing. We used these themes to directly inform our social marketing intervention and found that advertisements based on these themes became the highest performing. Conclusions: Crowdsourcing open contests are an effective means of community engagement and an agile tool for guiding interventions to address COVID-19, including in populations impacted by health care disparities, such as Latino communities. The thematic analysis of contest entries can be a valuable strategy to inform the development of social marketing campaign materials.
%M 35357317 %R 10.2196/35764 %U https://formative.jmir.org/2022/5/e35764 %U https://doi.org/10.2196/35764 %U http://www.ncbi.nlm.nih.gov/pubmed/35357317 %0 Journal Article %@ 2292-9495 %I JMIR Publications %V 9 %N 1 %P e35358 %T The Acceptability of Virtual Characters as Social Skills Trainers: Usability Study %A Tanaka,Hiroki %A Nakamura,Satoshi %+ Division of Information Science, Nara Institute of Science and Technology, Takayamacho 8916-5, Ikoma-shi, 630-0192, Japan, 81 9076493408, hiroki-tan@is.naist.jp %K social skills training %K virtual agent design %K virtual assistant %K virtual trainer %K chatbot %K acceptability %K realism %K virtual agent %K simulation %K social skill %K social interaction %K design %K training %K crowdsourcing %D 2022 %7 29.3.2022 %9 Original Paper %J JMIR Hum Factors %G English %X Background: Social skills training by human trainers is a well-established method to provide appropriate social interaction skills and strengthen social self-efficacy. In our previous work, we attempted to automate social skills training by developing a virtual agent that taught social skills through interaction. Previous research has not investigated the visual design of virtual agents for social skills training. Thus, we investigated the effect of virtual agent visual design on automated social skills training. Objective: The 3 main purposes of this research were to investigate the effect of virtual agent appearance on automated social skills training, the relationship between acceptability and other measures (eg, likeability, realism, and familiarity), and the relationship between likeability and individual user characteristics (eg, gender, age, and autistic traits). Methods: We prepared images and videos of multiple virtual agents, and 1218 crowdsourced workers rated the virtual agents through a questionnaire. In designing personalized virtual agents, we investigated the acceptability, likeability, and other impressions of the virtual agents and their relationship to individual characteristics. Results: We found that there were differences between the virtual agents in all measures (P<.001). A female anime-type virtual agent was rated as the most likeable. We also confirmed that participants’ gender, age, and autistic traits were related to their ratings. Conclusions: We confirmed the effect of virtual agent design on automated social skills training. Our findings are important in designing the appearance of an agent for use in personalized automated social skills training.
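A minimal sketch, on fabricated ratings, of testing whether impressions differ across agent designs as the study reports (P<.001); the abstract does not name its statistical test, so the nonparametric Kruskal-Wallis comparison below is an assumption (Python).

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical 1-7 likeability ratings for 3 virtual agent designs.
agent_a = rng.integers(1, 8, size=100)
agent_b = rng.integers(2, 8, size=100)  # shifted to mimic a preferred design
agent_c = rng.integers(1, 7, size=100)

h_stat, p_value = stats.kruskal(agent_a, agent_b, agent_c)
print(f"Kruskal-Wallis H={h_stat:.2f}, P={p_value:.4f}")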
%M 35348468 %R 10.2196/35358 %U https://humanfactors.jmir.org/2022/1/e35358 %U https://doi.org/10.2196/35358 %U http://www.ncbi.nlm.nih.gov/pubmed/35348468 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 8 %N 2 %P e32355 %T Physical Activity, Sedentary Behavior, and Sleep on Twitter: Multicountry and Fully Labeled Public Data Set for Digital Public Health Surveillance Research %A Shakeri Hossein Abad,Zahra %A Butler,Gregory P %A Thompson,Wendy %A Lee,Joon %+ Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, 3280 Hospital Dr NW, Calgary, AB, T2N 4Z6, Canada, 1 403 220 2968, joonwu.lee@ucalgary.ca %K digital public health surveillance %K social media analysis %K physical activity %K sedentary behavior %K sleep %K machine learning %K online health information %K infodemiology %K public health database %D 2022 %7 14.2.2022 %9 Open Source/Open Data %J JMIR Public Health Surveill %G English %X Background: Advances in automated data processing and machine learning (ML) models, together with the unprecedented growth in the number of social media users who publicly share and discuss health-related information, have made public health surveillance (PHS) one of the long-lasting social media applications. However, the existing PHS systems feeding on social media data have not been widely deployed in national surveillance systems, which appears to stem from the lack of practitioners’ and the public’s trust in social media data. More robust and reliable data sets over which supervised ML models can be trained and tested are a significant step toward overcoming this hurdle. The health implications of daily behaviors (physical activity, sedentary behavior, and sleep [PASS]), as an evergreen topic in PHS, are widely studied through traditional data sources such as surveillance surveys and administrative databases, which are often several months out-of-date by the time they are used, costly to collect, and thus limited in quantity and coverage. Objective: The main objective of this study is to present a large-scale, multicountry, longitudinal, and fully labeled data set to enable and support digital PASS surveillance research in PHS. To support high-quality surveillance research using our data set, we have conducted further analysis on the data set to supplement it with additional PHS-related metadata. Methods: We collected the data of this study from Twitter using the Twitter livestream application programming interface between November 28, 2018, and June 19, 2020. To obtain PASS-related tweets for manual annotation, we iteratively used regular expressions, unsupervised natural language processing, domain-specific ontologies, and linguistic analysis. We used Amazon Mechanical Turk to label the collected data into self-reported PASS categories and implemented a quality control pipeline to monitor and manage the validity of crowd-generated labels. Moreover, we used ML, latent semantic analysis, linguistic analysis, and label inference analysis to validate the different components of the data set. Results: LPHEADA (Labelled Digital Public Health Dataset) contains 366,405 crowd-generated labels (3 labels per tweet) for 122,135 PASS-related tweets that originated in Australia, Canada, the United Kingdom, or the United States, labeled by 708 unique annotators on Amazon Mechanical Turk.
In addition to crowd-generated labels, LPHEADA provides details about the 3 critical components of any PHS system: place, time, and demographics (ie, gender and age range) associated with each tweet. Conclusions: Publicly available data sets for digital PASS surveillance are usually isolated and only provide labels for small subsets of the data. We believe that the novelty and comprehensiveness of the data set provided in this study will help develop, evaluate, and deploy digital PASS surveillance systems. LPHEADA will be an invaluable resource for both public health researchers and practitioners. %M 35156938 %R 10.2196/32355 %U https://publichealth.jmir.org/2022/2/e32355 %U https://doi.org/10.2196/32355 %U http://www.ncbi.nlm.nih.gov/pubmed/35156938 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 24 %N 1 %P e28749 %T Crowdsourcing for Machine Learning in Public Health Surveillance: Lessons Learned From Amazon Mechanical Turk %A Shakeri Hossein Abad,Zahra %A Butler,Gregory P %A Thompson,Wendy %A Lee,Joon %+ Data Intelligence for Health Lab, Cumming School of Medicine, University of Calgary, 3280 Hospital Dr NW, Calgary, AB, T2N 4Z6, Canada, 1 403 220 2968, joonwu.lee@ucalgary.ca %K crowdsourcing %K machine learning %K digital public health surveillance %K public health database %K social media analysis %D 2022 %7 18.1.2022 %9 Original Paper %J J Med Internet Res %G English %X Background: Crowdsourcing services, such as Amazon Mechanical Turk (AMT), allow researchers to use the collective intelligence of a wide range of web users for labor-intensive tasks. As the manual verification of the quality of the collected results is difficult because of the large volume of data and the quick turnaround time of the process, many questions remain to be explored regarding the reliability of these resources for developing digital public health systems. Objective: This study aims to explore and evaluate the application of crowdsourcing, generally, and AMT, specifically, for developing digital public health surveillance systems. Methods: We collected 296,166 crowd-generated labels for 98,722 tweets, labeled by 610 AMT workers, to develop machine learning (ML) models for detecting behaviors related to physical activity, sedentary behavior, and sleep quality among Twitter users. To infer the ground truth labels and explore the quality of these labels, we studied 4 statistical consensus methods that are agnostic of task features and only focus on worker labeling behavior. Moreover, to model the meta-information associated with each labeling task and leverage the potential of context-sensitive data in the truth inference process, we developed 7 ML models, including traditional classifiers (offline and active), a deep learning–based classification model, and a hybrid convolutional neural network model. Results: Although crowdsourcing-based studies in public health have often equated majority vote with quality, the results of our study using a truth set of 9000 manually labeled tweets showed that consensus-based inference models mask underlying uncertainty in data and overlook the importance of task meta-information. Our evaluations across the 3 data sets (physical activity, sedentary behavior, and sleep quality) showed that truth inference is a context-sensitive process, and none of the methods studied in this paper were consistently superior to others in predicting the truth label.
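To make the majority-vote caveat concrete, the sketch below contrasts a simple majority vote with a worker-reliability-weighted vote on fabricated labels; it reproduces none of the 4 consensus methods the study evaluated, and the worker accuracies are invented (Python).

import math
from collections import defaultdict

# (tweet_id, worker_id, label) triples; 1 = "self-reported physical activity".
labels = [
    ("t1", "w1", 1), ("t1", "w2", 1), ("t1", "w3", 0),
    ("t2", "w1", 0), ("t2", "w2", 1), ("t2", "w3", 1),
]

# Hypothetical per-worker accuracy, eg, estimated from gold-standard tweets.
accuracy = {"w1": 0.98, "w2": 0.52, "w3": 0.51}

def majority_vote(triples):
    votes = defaultdict(lambda: defaultdict(int))
    for tweet, _, label in triples:
        votes[tweet][label] += 1
    return {t: max(v, key=v.get) for t, v in votes.items()}

def weighted_vote(triples, acc):
    votes = defaultdict(lambda: defaultdict(float))
    for tweet, worker, label in triples:
        # Log-odds weight: near-chance workers contribute almost nothing.
        votes[tweet][label] += math.log(acc[worker] / (1 - acc[worker]))
    return {t: max(v, key=v.get) for t, v in votes.items()}

print(majority_vote(labels))  # t2 -> 1: two near-chance workers outvote one
print(weighted_vote(labels, accuracy))  # t2 -> 0: the reliable worker prevails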
We also found that the performance of the ML models trained on crowd-labeled data was sensitive to the quality of these labels, and poor-quality labels led to incorrect assessment of these models. Finally, we have provided a set of practical recommendations to improve the quality and reliability of crowdsourced data. Conclusions: Our findings indicate the importance of the quality of crowd-generated labels in developing ML models designed for decision-making purposes, such as public health surveillance decisions. A combination of inference models outlined and analyzed in this study could be used to quantitatively measure and improve the quality of crowd-generated labels for training ML models. %M 35040794 %R 10.2196/28749 %U https://www.jmir.org/2022/1/e28749 %U https://doi.org/10.2196/28749 %U http://www.ncbi.nlm.nih.gov/pubmed/35040794 %0 Journal Article %@ 2561-326X %I JMIR Publications %V 5 %N 12 %P e27512 %T The Use of Food Images and Crowdsourcing to Capture Real-time Eating Behaviors: Acceptability and Usability Study %A Harrington,Katharine %A Zenk,Shannon N %A Van Horn,Linda %A Giurini,Lauren %A Mahakala,Nithya %A Kershaw,Kiarri N %+ Northwestern University Feinberg School of Medicine, 680 N Lake Shore, Suite 1400, Chicago, IL, 60611, United States, 1 312 503 4014, k-kershaw@northwestern.edu %K ecological momentary assessment %K eating behaviors %K crowdsourcing %K food consumption images %K food image processing %K mobile phone %D 2021 %7 2.12.2021 %9 Original Paper %J JMIR Form Res %G English %X Background: As poor diet quality is a significant risk factor for multiple noncommunicable diseases prevalent in the United States, it is important that methods be developed to accurately capture eating behavior data. There is growing interest in the use of ecological momentary assessments to collect data on health behaviors and their predictors on a micro timescale (at different points within or across days); however, documenting eating behaviors remains a challenge. Objective: This pilot study (N=48) aims to examine the feasibility—usability and acceptability—of using smartphone-captured and crowdsource-labeled images to document eating behaviors in real time. Methods: Participants completed the Block Fat/Sugar/Fruit/Vegetable Screener to provide a measure of their typical eating behavior, then took pictures of their meals and snacks and answered brief survey questions for 7 consecutive days using a commercially available smartphone app. Participant acceptability was determined through a questionnaire regarding their experiences administered at the end of the study. The images of meals and snacks were uploaded to Amazon Mechanical Turk (MTurk), a crowdsourcing distributed human intelligence platform, where 2 Workers assigned a count of food categories to the images (fruits, vegetables, salty snacks, and sweet snacks). The agreement among MTurk Workers was assessed, and weekly food counts were calculated and compared with the Screener responses. Results: Participants reported little difficulty in uploading photographs and remembered to take photographs most of the time. Crowdsource-labeled images (n=1014) showed moderate agreement between the MTurk Worker responses for vegetables (688/1014, 67.85%) and high agreement for all other food categories (871/1014, 85.89% for fruits; 847/1014, 83.53% for salty snacks, and 833/1014, 81.15% for sweet snacks). 
There were no significant differences in weekly food consumption between the food images and the Block Screener, suggesting that this approach may measure typical eating behaviors as accurately as traditional methods, with less burden on participants. Conclusions: Our approach offers a potentially time-efficient and cost-effective strategy for capturing eating events in real time. %M 34860666 %R 10.2196/27512 %U https://formative.jmir.org/2021/12/e27512 %U https://doi.org/10.2196/27512 %U http://www.ncbi.nlm.nih.gov/pubmed/34860666 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 9 %N 11 %P e30308 %T The Collaborative Metadata Repository (CoMetaR) Web App: Quantitative and Qualitative Usability Evaluation %A Stöhr,Mark R %A Günther,Andreas %A Majeed,Raphael W %+ Justus-Liebig-University Giessen, Universities of Giessen and Marburg Lung Center (UGMLC), German Center for Lung Research (DZL), Klinikstraße 36, Gießen, 35392, Germany, 49 641 985 42117, mark.stoehr@innere.med.uni-giessen.de %K usability %K metadata %K data visualization %K semantic web %K data management %K data warehousing %K communication barriers %K quality improvement %K biological ontologies %K data curation %D 2021 %7 29.11.2021 %9 Original Paper %J JMIR Med Inform %G English %X Background: In the field of medicine and medical informatics, the importance of comprehensive metadata has long been recognized, and the composition of metadata has become its own field of profession and research. To ensure sustainable and meaningful metadata are maintained, standards and guidelines such as the FAIR (Findability, Accessibility, Interoperability, Reusability) principles have been published. The compilation and maintenance of metadata is performed by field experts supported by metadata management apps. The usability of these apps, for example, in terms of ease of use, efficiency, and error tolerance, crucially determines their benefit to those interested in the data. Objective: This study aims to provide a metadata management app with high usability that assists scientists in compiling and using rich metadata. We aim to evaluate our recently developed interactive web app for our collaborative metadata repository (CoMetaR). This study reflects how real users perceive the app by assessing usability scores and explicit usability issues. Methods: We evaluated the CoMetaR web app by measuring the usability of 3 modules: core module, provenance module, and data integration module. We defined 10 tasks in which users must acquire information specific to their user role. The participants were asked to complete the tasks in a live web meeting. We used the System Usability Scale questionnaire to measure the usability of the app. For qualitative analysis, we applied a modified think aloud method, followed by thematic analysis and categorization into the ISO 9241-110 usability categories. Results: A total of 12 individuals participated in the study. We found that over 96% (85/88) of all the tasks were completed successfully. We measured usability scores of 81, 81, and 72 for the 3 evaluated modules. The qualitative analysis resulted in 24 issues with the app. Conclusions: A usability score of 81 implies very good usability for the first 2 modules, whereas a usability score of 72 still indicates acceptable usability for the third module. We identified 24 issues that serve as starting points for further development. Our method proved to be effective and efficient in terms of effort and outcome.
It can be adapted to evaluate apps within the medical informatics field and potentially beyond. %M 34847059 %R 10.2196/30308 %U https://medinform.jmir.org/2021/11/e30308 %U https://doi.org/10.2196/30308 %U http://www.ncbi.nlm.nih.gov/pubmed/34847059 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 11 %P e23059 %T Fitness Tracker Information and Privacy Management: Empirical Study %A Abdelhamid,Mohamed %+ Department of Information Systems, California State University, Long Beach, 1250 N Bellflower Blvd, Long Beach, CA, 90840, United States, 1 5629852361, mohamed.abdelhamid@csulb.edu %K privacy %K information sharing %K fitness trackers %K wearable devices %D 2021 %7 16.11.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Fitness trackers allow users to collect, manage, track, and monitor fitness-related activities, such as distance walked, calorie intake, sleep quality, and heart rate. Fitness trackers have become increasingly popular in the past decade. One in five Americans use a device or an app to track their fitness-related activities. These devices generate massive and important data that could help physicians make better assessments of their patients’ health if shared with health providers. This ultimately could lead to better health outcomes and perhaps even lower costs for patients. However, sharing personal fitness information with health care providers has drawbacks, mainly related to the risk of privacy loss and information misuse. Objective: This study investigates the influence of granting users granular privacy control on their willingness to share fitness information. Methods: The study used 270 valid responses collected from workers on Amazon Mechanical Turk (MTurk). Participants were randomly assigned to one of two groups. The conceptual model was tested using structural equation modeling (SEM). The dependent variable was the intention to share fitness information. The independent variables were perceived risk, perceived benefits, and trust in the system. Results: SEM explained about 60% of the variance in the dependent variable. Three of the four hypotheses were supported. Perceived risk and perceived benefits had a significant relationship with the dependent variable, while trust in the system was not significant. Conclusions: The findings show that people are willing to share their fitness information if they have granular privacy control. This study has practical and theoretical implications. It integrates communication privacy management (CPM) theory with the privacy calculus model.
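A minimal sketch of the structural relationship tested above, on fabricated survey scores; the study fit a structural equation model, whereas ordinary least squares is used here as a deliberately simpler stand-in, and every variable name is an assumption (Python).

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 270  # matches the study's sample size

df = pd.DataFrame({
    "perceived_risk": rng.normal(3.0, 1.0, n),
    "perceived_benefit": rng.normal(3.5, 1.0, n),
    "trust": rng.normal(3.0, 1.0, n),
})
# Simulated outcome: benefits raise intention to share, risk lowers it.
df["share_intent"] = (2.0 + 0.6 * df["perceived_benefit"]
                      - 0.5 * df["perceived_risk"]
                      + 0.1 * df["trust"] + rng.normal(0, 1, n))

fit = smf.ols("share_intent ~ perceived_risk + perceived_benefit + trust", df).fit()
print(fit.summary().tables[1])  # coefficients, standard errors, and P values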
%M 34783672 %R 10.2196/23059 %U https://www.jmir.org/2021/11/e23059 %U https://doi.org/10.2196/23059 %U http://www.ncbi.nlm.nih.gov/pubmed/34783672 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 10 %P e26280 %T The Development of a Web-Based Tobacco Tracker Tool to Crowdsource Campus Environmental Reports for Smoke and Tobacco–Free College Policies: Mixed Methods Study %A Loureiro,Sabrina F %A Pulvers,Kim %A Gosdin,Melissa M %A Clift,Keavagh %A Rice,Myra %A Tong,Elisa K %+ Department of Internal Medicine, University of California Davis, 4150 V Street, Suite 2400, Sacramento, CA, 95817, United States, 1 (916) 734 7005, ektong@ucdavis.edu %K tobacco cessation %K college smoke and tobacco–free policies %K crowdsourcing %K environmental reporting %K public health %K smoke and tobacco research %D 2021 %7 29.10.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: College campuses in the United States have begun implementing smoke and tobacco–free policies to discourage the use of tobacco. Smoke and tobacco–free policies, however, are contingent upon effective policy enforcement. Objective: This study aimed to develop an empirically derived web-based tracking tool (Tracker) for crowdsourcing campus environmental reports of tobacco use and waste to support smoke and tobacco–free college policies. Methods: An exploratory sequential mixed methods approach was utilized to inform the development and evaluation of Tracker. In October 2018, 3 focus groups across 2 California universities were conducted and themes were analyzed, guiding Tracker development. After 1 year of implementation, users were asked in April 2020 to complete a survey about their experience. Results: In the focus groups, 2 major themes emerged: barriers and facilitators to tool utilization. Further Tracker development was guided by focus group input to address these barriers (eg, information, policing, and logistical concerns) and facilitators (eg, environmental motivators and positive reinforcement). Users submitted 1163 Tracker reports; those who completed the user survey (n=316) reported that the top motivations for using the tool had been having a cleaner environment (212/316, 79%) and health concerns (185/316, 69%). Conclusions: Environmental concerns, a motivator that emerged in focus groups, shaped Tracker’s development and were cited by the majority of users surveyed as a top motivator for utilization. %M 34714248 %R 10.2196/26280 %U https://www.jmir.org/2021/10/e26280 %U https://doi.org/10.2196/26280 %U http://www.ncbi.nlm.nih.gov/pubmed/34714248 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 10 %P e19789 %T Willingness to Share Wearable Device Data for Research Among Mechanical Turk Workers: Web-Based Survey Study %A Taylor,Casey Overby %A Flaks-Manov,Natalie %A Ramesh,Shankar %A Choe,Eun Kyoung %+ Departments of Medicine and Biomedical Engineering, Johns Hopkins University School of Medicine, 217D Hackerman Hall, 3101 Wyman Park Dr, Baltimore, MD, 21218, United States, 1 4432876657, cot@jhu.edu %K wearables %K personal data %K research participation %K crowdsourcing %D 2021 %7 21.10.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Wearable devices that are used for observational research and clinical trials hold promise for collecting data from study participants in a convenient, scalable way that is more likely to reach a broad and diverse population than traditional research approaches.
Amazon Mechanical Turk (MTurk) is a potential resource that researchers can use to recruit individuals into studies that use data from wearable devices. Objective: This study aimed to explore the characteristics of wearable device users on MTurk that are associated with a willingness to share wearable device data for research. We also aimed to determine whether compensation was a factor that influenced the willingness to share such data. Methods: This was a secondary analysis of a cross-sectional survey study of MTurk workers who use wearable devices for health monitoring. A 19-question web-based survey was administered from March 1 to April 5, 2018, to participants aged ≥18 years by using the MTurk platform. In order to identify characteristics that were associated with a willingness to share wearable device data, we performed logistic regression and decision tree analyses. Results: A total of 935 MTurk workers who use wearable devices completed the survey. The majority of respondents indicated a willingness to share their wearable device data (615/935, 65.8%), and the majority of these respondents were willing to share their data if they received compensation (518/615, 84.2%). The findings from our logistic regression analyses indicated that Indian nationality (odds ratio [OR] 2.74, 95% CI 1.48-4.01, P=.007), higher annual income (OR 2.46, 95% CI 1.26-3.67, P=.02), over 6 months of using a wearable device (OR 1.75, 95% CI 1.21-2.29, P=.006), and the use of heartbeat and pulse monitoring devices (OR 1.60, 95% CI 0.14-2.07, P=.01) are significant parameters that influence the willingness to share data. The only factor associated with a willingness to share data if compensation is provided was Indian nationality (OR 0.47, 95% CI 0.24-0.9, P=.02). The findings from our decision tree analyses indicated that the 3 leading parameters associated with a willingness to share data were the duration of wearable device use, nationality, and income. Conclusions: Most wearable device users indicated a willingness to share their data for research use (with or without compensation; 615/935, 65.8%). The probability of being willing to share these data was higher among individuals who had used a wearable for more than 6 months, were of Indian nationality, or were of American (United States of America) nationality and had an annual income of more than US $20,000. Individuals of Indian nationality who were willing to share their data expected compensation significantly less often than individuals of American nationality (P=.02). %M 34673528 %R 10.2196/19789 %U https://www.jmir.org/2021/10/e19789 %U https://doi.org/10.2196/19789 %U http://www.ncbi.nlm.nih.gov/pubmed/34673528 %0 Journal Article %@ 2292-9495 %I JMIR Publications %V 8 %N 3 %P e28501 %T Examining How Internet Users Trust and Access Electronic Health Record Patient Portals: Survey Study %A Yin,Rong %A Law,Katherine %A Neyens,David %+ Department of Industrial Engineering, Clemson University, 100 Freeman Hall, Clemson, SC, United States, 1 8646564719, dneyens@clemson.edu %K internet %K consumer health informatics %K patient portal %K participatory medicine %K electronic health records %K logistic model %K surveys %K questionnaires %D 2021 %7 21.9.2021 %9 Original Paper %J JMIR Hum Factors %G English %X Background: Electronic health record (EHR) patient portals are designed to provide medical health records to patients.
Using an EHR portal is expected to contribute to positive health outcomes and facilitate patient-provider communication. Objective: Our objective was to examine how portal users report using their portals and the factors associated with obtaining health information from the internet. We also examined the desired portal features, factors impacting users’ trust in portals, and barriers to using portals. Methods: An internet-based survey study was conducted using Amazon Mechanical Turk. All the participants were adults in the United States who used patient portals. The survey included questions about how the participants used their portals, what factors acted as barriers to using their portals, and how they used and how much they trusted other web-based health information sources as well as their portals. A logistic regression model was used to examine the factors influencing the participants’ trust in their portals. Additionally, the desired features and design characteristics were identified to support the design of future portals. Results: A total of 394 participants completed the survey. Most of the participants were less than 35 years old (212/394, 53.8%), with 36.3% (143/394) aged between 35 and 55 years, and 9.9% (39/394) aged above 55 years. Women accounted for 48.5% (191/394) of the survey participants. Nearly 78% (307/394) of the participants reported using portals at least monthly. The most common portal features used were viewing lab results, making appointments, and paying bills. Participants reported some barriers to portal use including data security and limited access to the internet. The results of a logistic regression model used to predict trust in portals suggest that participants who were comfortable using their portals (odds ratio [OR] 7.97, 95% CI 1.11-57.32), those who thought that their portals were easy to use (OR 7.4, 95% CI 1.12-48.84), and frequent internet users (OR 43.72, 95% CI 1.83-1046.43) were more likely to trust their portals. Participants reporting that the portals were important in managing their health (OR 28.13, 95% CI 5.31-148.85) and that their portals were a valuable part of their health care (OR 6.75, 95% CI 1.51-30.11) were also more likely to trust their portals. Conclusions: There are several factors that impact the trust of EHR patient portal users in their portals. Designing easily usable portals and considering these factors may be the most effective approach to improving trust in patient portals. The desired features and usability of portals are critical factors that contribute to users’ trust in EHR portals.
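A minimal sketch, on simulated responses, of how odds ratios and 95% CIs of the kind reported above come out of a logistic model: exponentiate the coefficients and the CI bounds. The predictor names are assumptions, and very wide intervals such as 1.83-1046.43 usually signal sparse cells in the survey data rather than a coding error (Python).

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 394  # matches the number of survey respondents

df = pd.DataFrame({
    "easy_to_use": rng.integers(0, 2, n),
    "frequent_internet_user": rng.integers(0, 2, n),
})
# Simulated outcome generated from a known logistic relationship.
logit_true = -1.0 + 1.2 * df["easy_to_use"] + 0.8 * df["frequent_internet_user"]
df["trusts_portal"] = (rng.random(n) < 1 / (1 + np.exp(-logit_true))).astype(int)

exog = sm.add_constant(df[["easy_to_use", "frequent_internet_user"]])
result = sm.Logit(df["trusts_portal"], exog).fit(disp=False)

odds_ratios = np.exp(result.params)  # OR = exp(beta)
conf_int = np.exp(result.conf_int())  # exponentiate the log-odds CI bounds
print(pd.concat([odds_ratios.rename("OR"), conf_int], axis=1))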
%M 34546182 %R 10.2196/28501 %U https://humanfactors.jmir.org/2021/3/e28501 %U https://doi.org/10.2196/28501 %U http://www.ncbi.nlm.nih.gov/pubmed/34546182 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 9 %N 5 %P e21177 %T Developing Messaging Content for a Physical Activity Smartphone App Tailored to Low-Income Patients: User-Centered Design and Crowdsourcing Approach %A Pathak,Laura Elizabeth %A Aguilera,Adrian %A Williams,Joseph Jay %A Lyles,Courtney Rees %A Hernandez-Ramos,Rosa %A Miramontes,Jose %A Cemballi,Anupama Gunshekar %A Figueroa,Caroline Astrid %+ School of Social Welfare, University of California, Berkeley, 120 Haviland Hall, MC 7400, Berkeley, CA, 94720, United States, 1 510 642 8564, aguila@berkeley.edu %K user centered design %K mHealth %K text messaging %K crowdsourcing %K mobile phone %D 2021 %7 19.5.2021 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: Text messaging interventions can be an effective and efficient way to promote health behavior change. However, most texting interventions are neither tested nor designed with diverse end users, which could reduce their impact, and there is limited evidence regarding the optimal design methodology of health text messages tailored to low-income, low–health literacy populations and non-English speakers. Objective: This study aims to combine participant feedback, crowdsourced data, and researcher expertise to develop motivational text messages in English and Spanish that will be used in a smartphone app–based texting intervention that seeks to encourage physical activity in low-income minority patients with diabetes diagnoses and depression symptoms. Methods: The design process consisted of 5 phases and was iterative in nature, given that the findings from each step informed the subsequent steps. First, we designed messages to increase physical activity based on behavior change theory and knowledge from the available evidence. Second, using user-centered design methods, we refined these messages after a card sorting task and semistructured interviews (N=10) and evaluated their likeability during a usability testing phase of the app prototype (N=8). Third, the messages were tested by English- and Spanish-speaking participants on the Amazon Mechanical Turk (MTurk) crowdsourcing platform (N=134). Participants on MTurk were asked to categorize the messages into overarching theoretical categories based on the capability, opportunity, motivation, and behavior framework. Finally, each coauthor rated the messages for their overall quality from 1 to 5. All messages were written at a sixth-grade or lower reading level and culturally adapted and translated into neutral Spanish by bilingual research staff. Results: A total of 200 messages were iteratively refined according to the feedback from target users gathered through user-centered design methods, crowdsourced results of a categorization test, and an expert review. User feedback was leveraged to discard unappealing messages and edit the thematic aspects of messages that did not resonate well with the target users. Overall, 54 messages were sorted into the correct theoretical categories at least 50% of the time in the MTurk categorization tasks and were rated 3.5 or higher by the research team members. These were included in the final text message bank, resulting in 18 messages per motivational category.
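A minimal sketch of the two-filter selection rule described above, applied to fabricated message records: keep a message only if crowd raters sorted it into its intended category at least 50% of the time and the research team's mean rating was at least 3.5; all field names are illustrative (Python).

messages = [
    {"text": "A short walk counts!", "intended": "motivation",
     "votes": {"motivation": 8, "capability": 2}, "ratings": [4, 5, 4]},
    {"text": "Ask a friend to join you.", "intended": "opportunity",
     "votes": {"opportunity": 4, "motivation": 6}, "ratings": [4, 4, 5]},
]

def keep(msg, min_agreement=0.5, min_rating=3.5):
    # Share of crowd votes that matched the intended theoretical category.
    agreement = msg["votes"].get(msg["intended"], 0) / sum(msg["votes"].values())
    mean_rating = sum(msg["ratings"]) / len(msg["ratings"])
    return agreement >= min_agreement and mean_rating >= min_rating

bank = [m["text"] for m in messages if keep(m)]
print(bank)  # only the first message passes both filters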
Conclusions: By using an iterative process of expert opinion, feedback from participants who were reflective of our target study population, crowdsourcing, and feedback from the research team, we were able to gather valuable input for the design of motivational text messages developed in English and Spanish at a low literacy level to increase physical activity. We describe the design considerations and lessons learned for the text messaging development process and provide a novel, integrative framework for future developers of health text messaging interventions. %M 34009130 %R 10.2196/21177 %U https://mhealth.jmir.org/2021/5/e21177 %U https://doi.org/10.2196/21177 %U http://www.ncbi.nlm.nih.gov/pubmed/34009130 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 4 %P e23872 %T The Psychosocial Predictors and Day-Level Correlates of Substance Use Among Participants Recruited via an Online Crowdsourcing Platform in the United States: Daily Diary Study %A Jain,Jennifer Payaal %A Offer,Claudine %A Rowe,Christopher %A Turner,Caitlin %A Dawson-Rose,Carol %A Hoffmann,Thomas %A Santos,Glenn-Milo %+ San Francisco Department of Public Health, 101 Grove Street, San Francisco, CA, 94102, United States, 1 415 640 0674, jennifer.jain@ucsf.edu %K Amazon Mechanical Turk %K stimulant use %K alcohol use %K craving %K depression %K affect %K self-esteem %K men who have sex with men %D 2021 %7 27.4.2021 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Alcohol consumption and stimulant use are major public health problems and contribute to morbidity and mortality in the United States. To inform interventions for substance use, there is a need to identify the day-level correlates of substance use by collecting repeated measures data in one’s natural environment. There is also a need to use crowdsourcing platforms like Amazon Mechanical Turk (MTurk) to efficiently engage larger populations of people who use alcohol and stimulants in research. Objective: We aimed to (1) utilize daily diaries to examine the temporal relationship between day-level cravings for alcohol and stimulants and substance use (ie, heavy drinking or any drug use) in a given day over 14 days and (2) assess whether depression, negative affect, and self-esteem measured at baseline predict substance use in a given day over 14 days among people who use alcohol and/or stimulants in the United States. Methods: Individuals aged ≥18 years in the United States, who reported alcohol or stimulant (ie, cocaine, crack cocaine, and methamphetamine) use in the past year, were recruited using MTurk between March 26 and April 13, 2018. Eligible participants completed a baseline survey and 14 daily surveys online. The baseline survey assessed sociodemographics and psychosocial (ie, depression, affect, self-esteem, and stress) factors. Daily surveys assessed substance use and cravings for alcohol and stimulants. Four multivariable random-intercept logistic regression models were built to examine psychosocial constructs separately along with other significant predictors from bivariate analyses while controlling for age and education. Results: Among a total of 272 participants, 220 were White, 201 were male, and 134 were men who have sex with men (MSM). The mean age was 36.1 years (SD 10.5).
At baseline, 173 participants reported any current or past hazardous alcohol consumption, 31 reported using cocaine, 19 reported using methamphetamine, 8 reported using crack cocaine, and 104 reported any noninjection or injection drug use in the past 6 months. Factors independently associated with substance use were depression (adjusted odds ratio [aOR] 1.11, 95% CI 1.02-1.21; P=.01), negative affect (aOR 1.08, 95% CI 1.01-1.16; P=.01), lower levels of self-esteem (aOR 0.90, 95% CI 0.82-0.98; P=.02), and cravings for alcohol (aOR 1.02, 95% CI 1.01-1.03; P<.001) and stimulants (aOR 1.03, 95% CI 1.01-1.04; P=.01). MSM had higher odds of engaging in substance use in all models (model 1: aOR 4.90, 95% CI 1.28-18.70; P=.02; model 2: aOR 5.47, 95% CI 1.43-20.87; P=.01; model 3: aOR 5.99, 95% CI 1.55-23.13; P=.009; and model 4: aOR 4.94, 95% CI 1.29-18.84; P=.01). Conclusions: Interventions for substance use should utilize evidence-based approaches to reduce depression, negative affect, and cravings; increase self-esteem; and engage MSM. Interventions may also consider leveraging technology-based approaches to reduce substance use among populations who use crowdsourcing platforms. %M 33904828 %R 10.2196/23872 %U https://publichealth.jmir.org/2021/4/e23872 %U https://doi.org/10.2196/23872 %U http://www.ncbi.nlm.nih.gov/pubmed/33904828 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 23 %N 4 %P e25323 %T Factors Associated With Perceived Trust of False Abortion Websites: Cross-sectional Online Survey %A Chaiken,Sarina Rebecca %A Han,Lisa %A Darney,Blair G %A Han,Leo %+ Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Portland, OR, 97239, United States, 1 5034942999, chaiken@ohsu.edu %K abortion %K website trust %K internet use %K reproductive health %K misinformation %D 2021 %7 19.4.2021 %9 Original Paper %J J Med Internet Res %G English %X Background: Most patients use the internet to search for health information. While there is a vast repository of searchable information online, much of the content is unregulated and therefore potentially incorrect, conflicting, or confusing. Abortion information online is particularly prone to being inaccurate, as antichoice websites publish purposefully misleading information in formats that appear as neutral resources. To understand how antichoice websites appear neutral, we need to identify the specific website features of antichoice websites that impart an impression of trustworthiness. Objective: We sought to identify the characteristics of false or misleading abortion websites that make these websites appear trustworthy to the public. Methods: We conducted a cross-sectional study using Amazon’s Mechanical Turk platform. We used validated questionnaires to ask participants to rate 11 antichoice websites and one neutral website identified by experts, focusing on website content, creators, and design. We collected sociodemographic data and participant views on abortion. We used a composite measure of “mean overall trust” as our primary outcome. Using correlation matrices, we determined which website characteristics were most associated with mean overall trust. Finally, we used linear regression to identify participant characteristics associated with overall trust. Results: Our analytic sample included 498 participants aged 22 to 70 years, and 50.1% (247/493) identified as female.
Across the 11 antichoice websites, creator confidence (“I believe that the creators of this website are honest and trustworthy”) had the highest correlation coefficient (strongest relationship) with mean overall trust (coefficient=0.70). Professional appearance (coefficient=0.59), look and feel (coefficient=0.59), perception that the information is created by experts (coefficient=0.59), association with a trustworthy organization (coefficient=0.58), valued features and functionalities (coefficient=0.54), and interactive capabilities (coefficient=0.52) all demonstrated strong relationships with mean overall trust. At the individual level, prochoice leaning was associated with lower mean overall trust of the antichoice websites (B=−0.43, 95% CI −0.87 to 0.01) and higher overall trust of the neutral website (B=0.52, 95% CI 0.05 to 0.99). Conclusions: The mean overall trust of antichoice websites is most associated with design characteristics and the perceived trustworthiness of website creators. Those who believe that access to abortion should be limited are more likely to have higher mean overall trust for antichoice websites. %M 33871378 %R 10.2196/25323 %U https://www.jmir.org/2021/4/e25323 %U https://doi.org/10.2196/25323 %U http://www.ncbi.nlm.nih.gov/pubmed/33871378 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 7 %N 4 %P e25762 %T Assessment of the Effectiveness of Identity-Based Public Health Announcements in Increasing the Likelihood of Complying With COVID-19 Guidelines: Randomized Controlled Cross-sectional Web-Based Study %A Dennis,Alexander S %A Moravec,Patricia L %A Kim,Antino %A Dennis,Alan R %+ Kelley School of Business, Indiana University, 1309 E 10th St, Bloomington, IN, 47405, United States, 1 8128552691, ardennis@indiana.edu %K Amazon Mechanical Turk %K compliance %K COVID-19 %K custom %K effectiveness %K guideline %K identity %K public health %K public health announcement %K public service announcement %K social media %K web-based health information %D 2021 %7 13.4.2021 %9 Short Paper %J JMIR Public Health Surveill %G English %X Background: Public health campaigns aimed at curbing the spread of COVID-19 are important in reducing disease transmission, but traditional information-based campaigns have received unexpectedly extreme backlash. Objective: This study aimed to investigate whether customizing public service announcements (PSAs) that provide health guidelines to match individuals’ identities increases compliance. Methods: We conducted a within- and between-subjects, randomized controlled cross-sectional, web-based study in July 2020. Participants viewed two PSAs: one advocating wearing a mask in public settings and one advocating staying at home. The control PSA only provided information, and the treatment PSAs were designed to appeal to the identities held by individuals; that is, either a Christian identity or an economically motivated identity. Participants were asked about their identity and then shown a control PSA and a treatment PSA matching their identity, in random order. Each PSA was approximately 100 words long. Results: We recruited 300 social media users from Amazon Mechanical Turk in accordance with usual protocols to ensure data quality. In total, 8 failed the data quality checks, and the remaining 292 were included in the analysis. In the identity-based PSAs, the source of the PSA was changed, and a phrase of approximately 12 words relevant to the individual’s identity was inserted.
A PSA tailored for Christians, when matched with a Christian identity, increased the likelihood of compliance by 12 percentage points. A PSA that focused on economic values, when shown to individuals who identified as economically motivated, increased the likelihood of compliance by 6 percentage points. Conclusions: Using social media to deliver COVID-19 public health announcements customized to individuals’ identities is a promising measure to increase compliance with public health guidelines. Trial Registration: ISRCTN Registry 22331899; https://www.isrctn.com/ISRCTN22331899. %M 33819910 %R 10.2196/25762 %U https://publichealth.jmir.org/2021/4/e25762 %U https://doi.org/10.2196/25762 %U http://www.ncbi.nlm.nih.gov/pubmed/33819910 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 11 %P e19768 %T The Public’s Perception of the Severity and Global Impact at the Start of the SARS-CoV-2 Pandemic: A Crowdsourcing-Based Cross-Sectional Analysis %A Shauly,Orr %A Stone,Gregory %A Gould,Daniel %+ Keck School of Medicine, University of Southern California, 1450 San Pablo Street, Suite 415, Los Angeles, CA, United States, 1 323 442 7920, dr.danjgould@gmail.com %K Amazon Mechanical Turk %K crowdsourcing %K COVID-19 %K SARS-CoV-2 %K pandemic %K perception %K public opinion %K survey %K severity %K impact %K behavior %K education %D 2020 %7 26.11.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: COVID-19 is a rapidly developing threat to most people in the United States and abroad. The behaviors of the public are important to understand, as they may have a tremendous impact on the course of this novel coronavirus pandemic. Objective: This study aimed to assess the US population’s perception and knowledge of the virus as a threat and the behaviors of the general population in response. Methods: A prospective cross-sectional study was conducted with random volunteers recruited through Amazon Mechanical Turk, an internet crowdsourcing service, on March 24, 2020. Results: A total of 969 participants met the inclusion criteria. It was found that the perceived severity of the COVID-19 pandemic significantly differed between age groups (P<.001) and between men and women (P<.001). A majority of study participants were actively adhering to the Centers for Disease Control and Prevention guidelines. Conclusions: Though many participants identified COVID-19 as a threat, many failed to place themselves in the correct risk categories. This may indicate a need for additional public education on appropriately defining the risk of this novel pandemic.
%M 33108314 %R 10.2196/19768 %U http://www.jmir.org/2020/11/e19768/ %U https://doi.org/10.2196/19768 %U http://www.ncbi.nlm.nih.gov/pubmed/33108314 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 8 %P e17830 %T Assessing Public Opinion on CRISPR-Cas9: Combining Crowdsourcing and Deep Learning %A Müller,Martin %A Schneider,Manuel %A Salathé,Marcel %A Vayena,Effy %+ Health Ethics and Policy Lab, Department of Health Sciences and Technology, ETH Zurich, Hottingerstrasse 10, Zurich, 8092, Switzerland, 41 44 632 26 16, manuel.schneider@hest.ethz.ch %K CRISPR %K natural language processing %K sentiment analysis %K digital methods %K infodemiology %K infoveillance %K empirical bioethics %K social media %D 2020 %7 31.8.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The discovery of the CRISPR-Cas9–based gene editing method has opened unprecedented new potential for biological and medical engineering, sparking a growing public debate on both the potential and dangers of CRISPR applications. Given the speed of technology development and the almost instantaneous global spread of news, it is important to follow evolving debates without much delay and in sufficient detail, as certain events may have a major long-term impact on public opinion and later influence policy decisions. Objective: Social media networks such as Twitter have been shown to be major drivers of news dissemination and public discourse. They provide a vast amount of semistructured data in almost real time and give direct access to the content of the conversations. We can now mine and analyze such data quickly because of recent developments in machine learning and natural language processing. Methods: Here, we used Bidirectional Encoder Representations from Transformers (BERT), an attention-based transformer model, in combination with statistical methods to analyze all tweets published on CRISPR since the publication of the first gene editing application in 2013. Results: We show that the mean sentiment of tweets was initially very positive but began to decrease over time, and that this decline was driven by rare peaks of strong negative sentiment. Due to the high temporal resolution of the data, we were able to associate these peaks with specific events and to observe how trending topics changed over time. Conclusions: Overall, this type of analysis can provide valuable and complementary insights into ongoing public debates, extending the traditional empirical bioethics toolset.
%M 32865499 %R 10.2196/17830 %U http://www.jmir.org/2020/8/e17830/ %U https://doi.org/10.2196/17830 %U http://www.ncbi.nlm.nih.gov/pubmed/32865499 %0 Journal Article %@ 2291-9694 %I JMIR Publications %V 8 %N 6 %P e16704 %T Factors Influencing Doctors’ Participation in the Provision of Medical Services Through Crowdsourced Health Care Information Websites: Elaboration-Likelihood Perspective Study %A Si,Yan %A Wu,Hong %A Liu,Qing %+ School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, 13 Hangkong road, Qiaokou District, Wuhan, China, 86 13277942186, hongwu@hust.edu.cn %K crowdsourcing %K crowdsourced medical services %K online health communities %K doctors’ participation %K elaboration-likelihood model %D 2020 %7 29.6.2020 %9 Original Paper %J JMIR Med Inform %G English %X Background: Web-based crowdsourcing enables goals to be achieved effectively by obtaining solutions from public groups via the internet, and it has gained extensive attention in both business and academia. As a new mode of sourcing, crowdsourcing has been proven to improve the efficiency, quality, and diversity of tasks. However, little attention has been given to crowdsourcing in the health sector. Objective: Crowdsourced health care information websites enable patients to post their questions in a question pool, which is accessible to all doctors, and the patients then wait for doctors to respond to their questions. Since the sustainable development of crowdsourced health care information websites depends on the participation of doctors, we aimed to investigate the factors influencing doctors’ participation in providing health care information on these websites from the perspective of the elaboration-likelihood model. Methods: We collected 1524 questions with complete patient-doctor interaction processes from an online health community in China to test all the hypotheses. We divided the doctors into 2 groups based on the sequence of the answers: (1) the doctor who answered the patient’s question first and (2) the doctors who answered that question after the first doctor. All analyses were conducted using the ordinary least squares method. Results: First, the ability of the doctor who answered the health-related question first was found to positively influence the participation of the doctors who answered subsequently (β_offline1=.177, P<.001; β_offline2=.063, P=.048; β_online=.418, P<.001). Second, the reward that the patient offered for the best answer showed a positive effect on doctors’ participation (β=.019, P<.001). Third, the question’s complexity was found to positively moderate the relationship between the ability of the first doctor who answered and the participation of the following doctors (β=.186, P=.05) and to attenuate the effect of the reward on the participation of the following doctors (β=–.003, P=.10). Conclusions: This study has both theoretical and practical contributions. Online health community managers can build effective incentive mechanisms to encourage highly competent doctors to participate in the provision of medical services on crowdsourced health care information websites, and they can increase the reward incentives for each question to increase the participation of the doctors.
%M 32597787 %R 10.2196/16704 %U http://medinform.jmir.org/2020/6/e16704/ %U https://doi.org/10.2196/16704 %U http://www.ncbi.nlm.nih.gov/pubmed/32597787 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 6 %P e19782 %T Internet Use, Risk Awareness, and Demographic Characteristics Associated With Engagement in Preventive Behaviors and Testing: Cross-Sectional Survey on COVID-19 in the United States %A Li,Siyue %A Feng,Bo %A Liao,Wang %A Pan,Wenjing %+ School of Journalism and Communication, Renmin University of China, 507, School of Journalism and Communication, 59 Zhongguancun St, Haidian District, Beijing, 100872, China, 86 010 82500855, wenjingpan@ruc.edu.cn %K COVID-19 %K coronavirus %K preventive behaviors %K testing %K online health information %K risk awareness %D 2020 %7 16.6.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: During the coronavirus disease (COVID-19) pandemic, engagement in preventive behaviors and getting tested for the virus play a crucial role in protecting people from contracting the new coronavirus. Objective: This study aims to examine how internet use, risk awareness, and demographic characteristics are associated with engagement in preventive behaviors and testing during the COVID-19 pandemic in the United States. Methods: A cross-sectional survey was conducted on Amazon Mechanical Turk from April 10, 2020, to April 14, 2020. Participants’ internet use (in terms of the extent of receiving information pertaining to COVID-19), risk awareness (whether any immediate family members, close friends or relatives, or people in local communities tested positive for COVID-19), demographics (sex, age, ethnicity, income, education level, marital status, and employment status), as well as their engagement in preventive behaviors and testing were assessed. Results: Our data included 979 valid responses from the United States. Participants who received more COVID-19–related health information online reported engaging more frequently in all types of preventive behaviors: wearing a facemask in public (odds ratio [OR] 1.55, 95% CI 1.34-1.79, P<.001), washing hands (OR 1.58, 95% CI 1.35-1.85, P<.001), covering nose and mouth when sneezing and coughing (OR 1.78, 95% CI 1.52-2.10, P<.001), keeping social distance from others (OR 1.41, 95% CI 1.21-1.65, P<.001), staying home (OR 1.40, 95% CI 1.20-1.62, P<.001), avoiding public transportation (OR 1.57, 95% CI 1.32-1.88, P<.001), and cleaning frequently used surfaces (OR 1.55, 95% CI 1.34-1.79, P<.001). Compared with participants who did not have positive cases in their social circles, those who had immediate family members (OR 14.80, 95% CI 8.28-26.44, P<.001) or close friends and relatives (OR 2.52, 95% CI 1.58-4.03, P<.001) who tested positive were more likely to get tested. Participants’ sex, age, ethnicity, marital status, and employment status were also associated with preventive behaviors and testing. Conclusions: Our findings revealed that the extent of receiving COVID-19–related information online, risk awareness, and demographic characteristics including sex, ethnicity, age, marital status, and employment status are key factors associated with US residents’ engagement in various preventive behaviors and testing for COVID-19.
%M 32501801 %R 10.2196/19782 %U http://www.jmir.org/2020/6/e19782/ %U https://doi.org/10.2196/19782 %U http://www.ncbi.nlm.nih.gov/pubmed/32501801 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 2 %P e16303 %T Ambiguity in Communicating Intensity of Physical Activity: Survey Study %A Kim,Hyeoneui %A Kim,Jaemin %A Taira,Ricky %+ School of Nursing, Duke University, 307 Trent Drive, Durham, NC, 27710, United States, 1 919 684 7534, hyeoneui.kim@duke.edu %K exercise %K health communication %K exercise intensity %D 2020 %7 28.5.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Communicating physical activity information with sufficient detail, such as activity type, frequency, duration, and intensity, is vital to accurately delineate the attributes of physical activity that bring positive health impacts. Unlike frequency and duration, intensity is a subjective concept that can be interpreted differently by people depending on demographics, health status, physical fitness, and exercise habits. However, activity intensity is often communicated using general degree modifiers, degrees of physical exertion, and physical activity examples, which are expressions that people may interpret differently. Lack of clarity in communicating the intensity level of physical activity is a potential barrier to an accurate assessment of exercise effects and effective imparting of exercise recommendations. Objective: This study aimed to assess the variations in people’s perceptions and interpretations of commonly used intensity descriptions of physical activities and to identify factors that may contribute to these variations. Methods: A Web-based survey with a 25-item questionnaire was conducted using Amazon Mechanical Turk, targeting adults residing in the United States. The questionnaire included questions on participants’ demographics, exercise habits, overall perceived health status, and perceived intensity of 10 physical activity examples. The survey responses were analyzed using the R statistical package. Results: The analyses included 498 responses. The majority of respondents were females (276/498, 55.4%) and whites (399/498, 79.9%). Numeric ratings of physical exertion after exercise were relatively well associated with the 3 general degree descriptors of exercise intensity: light, moderate, and vigorous. However, there was no clear association between the intensity expressed with those degree descriptors and the degree of physical exertion the participants reported experiencing after exercise. Intensity ratings of various examples of physical activity differed significantly according to respondents’ characteristics. Regression analyses showed that those who reported good health or considered regular exercise important for their health tended to rate the intensity levels of the activity examples significantly higher than their counterparts. The respondents’ age and race (white vs nonwhite) were not significant predictors of the intensity rating. Conclusions: This survey showed significant variations in how people perceive and interpret the intensity levels of physical activities described with general degree modifiers, degrees of physical exertion, and physical activity examples. Considering that these are among the most widely used methods of communicating physical activity intensity in current practice, possible miscommunication in assessing and promoting physical activity is a real concern.
We need to adopt a method that represents activity intensity in a quantifiable manner to avoid unintended miscommunication. %M 32348256 %R 10.2196/16303 %U http://publichealth.jmir.org/2020/2/e16303/ %U https://doi.org/10.2196/16303 %U http://www.ncbi.nlm.nih.gov/pubmed/32348256 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 8 %N 5 %P e18400 %T Adaptive Mobile Health Intervention for Adolescents with Asthma: Iterative User-Centered Development %A Fedele,David A %A Cushing,Christopher C %A Koskela-Staples,Natalie %A Patton,Susana R %A McQuaid,Elizabeth L %A Smyth,Joshua M %A Prabhakaran,Sreekala %A Gierer,Selina %A Nezu,Arthur M %+ Department of Clinical & Health Psychology, University of Florida, 101 S Newell Dr, Rm 3151, PO Box 100165, Gainesville, FL, 32610, United States, 1 3522945765, dfedele@phhp.ufl.edu %K asthma %K mobile health %K adherence %K adolescence %K self-regulation %K problem-solving %K adolescent %K youth %D 2020 %7 6.5.2020 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: Adolescents diagnosed with persistent asthma commonly take less than 50% of their prescribed inhaled corticosteroids (ICS), placing them at risk for asthma-related morbidity. Adolescents’ difficulties with adherence occur in the context of normative developmental changes (eg, increased responsibility for disease management) and rely upon still-developing self-regulation and problem-solving skills that are integral to asthma self-management. We developed an adaptive mobile health system, Responsive Asthma Care for Teens (ReACT), that facilitates self-regulation and problem-solving skills during times when adolescents’ objectively measured ICS adherence data indicate suboptimal rates of medication use. Objective: The current paper describes our user-centered and evidence-based design process in developing ReACT. We explain how we leveraged a combination of individual interviews, national crowdsourced feedback, and an advisory board composed of target users to develop the intervention content. Methods: We developed ReACT over a 15-month period using one-on-one interviews with target ReACT users (n=20), national crowdsourcing (n=257), and an advisory board (n=4) to refine content. Participants included 13-17–year-olds with asthma and their caregivers. A total of 280 adolescents and their caregivers participated in at least one stage of ReACT development. Results: Consistent with self-regulation theory, adolescents identified a variety of salient intrapersonal (eg, forgetfulness, mood) and external (eg, changes in routine) barriers to ICS use during individual interviews. Adolescents viewed the majority of ReACT intervention content (514/555 messages, 93%) favorably during the crowdsourcing phase, and the advisory board helped to refine the content that did not receive favorable feedback during crowdsourcing. Additionally, the advisory board provided suggestions for improving additional components of ReACT (eg, videos, message flow). Conclusions: ReACT involved stakeholders via qualitative approaches and crowdsourcing throughout the creation and refinement of intervention content. The feedback we received from participants largely supported ReACT’s emphasis on providing adaptive and personalized intervention content to facilitate self-regulation and problem-solving skills, and the research team successfully completed the recommended refinements to the intervention content during the iterative development process.
%M 32374273 %R 10.2196/18400 %U https://mhealth.jmir.org/2020/5/e18400 %U https://doi.org/10.2196/18400 %U http://www.ncbi.nlm.nih.gov/pubmed/32374273 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 5 %P e18394 %T The Role of Health Concerns in Phishing Susceptibility: Survey Design Study %A Abdelhamid,Mohamed %+ Department of Information Systems, College of Business, California State University Long Beach, 1250 Bellflower Boulevard, Long Beach, CA, 90840, United States, 1 562 985 2361, mohamed.abdelhamid@csulb.edu %K phishing %K health concerns %K disposition to trust %K risk-taking propensity %K cybercrime %K security, internet %K trust %K risk-taking %K crime victims %D 2020 %7 4.5.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Phishing is a cybercrime in which the attackers usually impersonate a trusted source. The attackers usually send an email that contains a link that allows them to steal the recipient’s personal information. In the United States, phishing is the number one cybercrime by victim count according to the Federal Bureau of Investigation’s 2019 internet crime report. Several studies have investigated ways to increase awareness and improve employees’ resistance to phishing attacks. However, in 2019, successful phishing attacks continued to rise at a high rate. Objective: The objective of this study was to investigate the influence of personality-based antecedents on phishing susceptibility in a health care context. Methods: Survey data were collected from participants through Amazon Mechanical Turk to test a proposed conceptual model using structural equation modeling. Results: A total of 200 participants took part. Health concerns, disposition to trust, and risk-taking propensity were associated with higher phishing susceptibility. This highlights the importance of personality-based factors in phishing attacks. In addition, female participants had higher phishing susceptibility than male participants. Conclusions: While previous studies used health concerns as a motivator for contexts such as sharing personal health records with providers, this study sheds light on the danger that heightened health concerns pose in enabling the number one cybercrime. %M 32364511 %R 10.2196/18394 %U https://www.jmir.org/2020/5/e18394 %U https://doi.org/10.2196/18394 %U http://www.ncbi.nlm.nih.gov/pubmed/32364511 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 4 %P e15075 %T Development and Validation of a Comprehensive Well-Being Scale for People in the University Environment (Pitt Wellness Scale) Using a Crowdsourcing Approach: Cross-Sectional Study %A Zhou,Leming %A Parmanto,Bambang %+ Department of Health Information Management, University of Pittsburgh, 6021 Forbes Tower, 3600 Forbes Ave at Meyran Ave, Pittsburgh, PA, 15260, United States, 1 412 383 6653, Leming.Zhou@pitt.edu %K crowdsourcing %K questionnaire design %K university %D 2020 %7 29.4.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: Well-being has multiple domains, and these domains are unique to the population being examined. Therefore, to precisely assess the well-being of a population, a scale specifically designed for that population is needed. Objective: The goal of this study was to design and validate a comprehensive well-being scale for people in a university environment, including students, faculty, and staff.
Methods: A crowdsourcing approach was used to determine relevant domains for the comprehensive well-being scale in this population and identify specific questions to include in each domain. A web-based questionnaire (Q1) was used to collect opinions from a group of university students, faculty, and staff about the domains and subdomains of the scale. A draft of a new well-being scale (Q2) was created in response to the information collected via Q1, and a second group of study participants was invited to evaluate the relevance and clarity of each statement. A newly created well-being scale (Q3) was then used by a third group of university students, faculty, and staff. A psychometric analysis was performed on the data collected via Q3 to determine the validity and reliability of the well-being scale. Results: In the first step, a group of 518 university community members (students, faculty, and staff) indicated the domains and subdomains that they desired to have in a comprehensive well-being scale. In the second step, a second group of 167 students, faculty, and staff evaluated the relevance and clarity of the proposed statements in each domain. In the third step, a third group of 546 students, faculty, and staff provided their responses to the new well-being scale (Pitt Wellness Scale). The psychometric analysis indicated that the reliability of the well-being scale was high. Conclusions: Using a crowdsourcing approach, we successfully created a comprehensive and highly reliable well-being scale for people in the university environment. Our new Pitt Wellness Scale may be used to measure the well-being of people in the university environment. %M 32347801 %R 10.2196/15075 %U http://www.jmir.org/2020/4/e15075/ %U https://doi.org/10.2196/15075 %U http://www.ncbi.nlm.nih.gov/pubmed/32347801 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 6 %N 2 %P e16119 %T Participatory Surveillance Based on Crowdsourcing During the Rio 2016 Olympic Games Using the Guardians of Health Platform: Descriptive Study %A Leal Neto,Onicio %A Cruz,Oswaldo %A Albuquerque,Jones %A Nacarato de Sousa,Mariana %A Smolinski,Mark %A Pessoa Cesse,Eduarda Ângela %A Libel,Marlo %A Vieira de Souza,Wayner %+ University of Zurich, Schönberggasse 1, Room SOF-H10, Zurich, , Switzerland, 41 44 634 55 50, onicio@gmail.com %K participatory surveillance %K epidemiology %K infectious diseases %K pandemics %K health innovation %K digital disease detection %K disease surveillance %K mobile phone %D 2020 %7 7.4.2020 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: With the evolution of digital media, areas such as public health are adding new platforms to complement traditional systems of epidemiological surveillance. Participatory surveillance and digital epidemiology have become innovative tools for the construction of epidemiological landscapes with citizens’ participation, improving traditional sources of information. Strategies such as these promote the timely detection of warning signs for outbreaks and epidemics in the region. Objective: This study aims to describe the participatory surveillance platform Guardians of Health, which was used in a project conducted during the 2016 Olympic and Paralympic Games in Rio de Janeiro, Brazil, and officially used by the Brazilian Ministry of Health for the monitoring of outbreaks and epidemics. Methods: This is a descriptive study carried out using secondary data from Guardians of Health available in a public digital repository. 
Based on syndromic signals, the information available to support decision making by policy makers and health managers becomes more dynamic and accurate. This type of information source can serve as an early route to understanding the epidemiological scenario. Results: The main result of this research was demonstrating the use of the participatory surveillance platform as an additional source of information for the epidemiological surveillance performed in Brazil during a mass gathering. The platform Guardians of Health had 7848 users who generated 12,746 reports about their health status. Among these reports, the following were identified: 161 users with diarrheal syndrome, 68 users with respiratory syndrome, and 145 users with rash syndrome. Conclusions: It is hoped that epidemiological surveillance professionals, researchers, managers, and workers become aware of, and allow themselves to use, new tools that improve information management for decision making and knowledge production. This way, we may follow the path toward a more intelligent, efficient, and pragmatic disease control system. %M 32254042 %R 10.2196/16119 %U https://publichealth.jmir.org/2020/2/e16119 %U https://doi.org/10.2196/16119 %U http://www.ncbi.nlm.nih.gov/pubmed/32254042 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 22 %N 1 %P e15597 %T Analysis of Collective Human Intelligence for Diagnosis of Pigmented Skin Lesions Harnessed by Gamification Via a Web-Based Training Platform: Simulation Reader Study %A Rinner,Christoph %A Kittler,Harald %A Rosendahl,Cliff %A Tschandl,Philipp %+ Department of Dermatology, Medical University of Vienna, Währinger Gürtel 18-20, Vienna, 1090, Austria, 43 40400 ext 77000, philipp.tschandl@meduniwien.ac.at %K skin cancer %K crowdsourcing %K games, experimental %K diagnosis %K melanoma %K nevi %K skin pigmentation %K basal cell carcinoma %K dermatoscopy %D 2020 %7 24.1.2020 %9 Original Paper %J J Med Internet Res %G English %X Background: The diagnosis of pigmented skin lesions is error prone and requires domain-specific expertise, which is not readily available in many parts of the world. Collective intelligence could potentially decrease the error rates of nonexperts. Objective: The aim of this study was to evaluate the feasibility and impact of collective intelligence for the detection of skin cancer. Methods: We created a gamified study platform on a stack of established Web technologies and presented 4216 dermatoscopic images of the most common benign and malignant pigmented skin lesions to 1245 human raters with different levels of experience. Raters were recruited via scientific meetings, mailing lists, and social media posts. Education was self-declared, and domain-specific experience was tested by screening tests. In the target test, the readers had to assign 30 dermatoscopic images to 1 of the 7 disease categories. The readers could repeat the test with different lesions at their own discretion. Collective human intelligence was achieved by sampling answers from multiple readers. The disease category with the most votes was regarded as the collective vote per image. Results: We collected 111,019 single ratings, with a mean of 25.2 (SD 18.5) ratings per image. As single raters, nonexperts achieved a lower mean accuracy (58.6%) than experts (68.4%; mean difference=−9.4%; 95% CI −10.74% to −8.1%; P<.001). Collectives of nonexperts achieved higher accuracies than single raters, and the improvement increased with the size of the collective.
A collective of 4 nonexperts surpassed single nonexperts in accuracy by 6.3% (95% CI 6.1% to 6.6%; P<.001). The accuracy of a collective of 8 nonexperts was 9.7% higher (95% CI 9.5% to 10.3%; P<.001) than that of single nonexperts, an improvement similar to that of single experts (P=.73). The sensitivity for malignant images increased for nonexperts (66.3% to 77.6%) and experts (64.6% to 79.4%) for answers given faster than the intrarater mean. Conclusions: A high number of raters can be attracted by elements of gamification and Web-based marketing via mailing lists and social media. Nonexperts increase their accuracy to expert level when acting as a collective, and faster answers correspond to higher accuracy. This information could be useful in a teledermatology setting. %M 32012058 %R 10.2196/15597 %U http://www.jmir.org/2020/1/e15597/ %U https://doi.org/10.2196/15597 %U http://www.ncbi.nlm.nih.gov/pubmed/32012058 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 5 %N 4 %P e13027 %T A Crowdsourced Physician Finder Prototype Platform for Men Who Have Sex with Men in China: Qualitative Study of Acceptability and Feasibility %A Wu,Dan %A Huang,Wenting %A Zhao,Peipei %A Li,Chunyan %A Cao,Bolin %A Wang,Yifan %A Stoneking,Shelby %A Tang,Weiming %A Luo,Zhenzhou %A Wei,Chongyi %A Tucker,Joseph %+ Social Entrepreneurship to Spur Health Global, No 2 Lujing Road Yuexiu District, Guangzhou, China, 86 13560294997, jdtucker@med.unc.edu %K gay-friendly doctors %K social media %K crowdsourcing %K prototype evaluation %K men who have sex with men %K China %D 2019 %7 8.10.2019 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Men who have sex with men (MSM), including both gay and bisexual men, have a high prevalence of HIV and sexually transmitted infections (STIs) in China. However, healthcare seeking behaviors and engagement in clinical services among MSM are often suboptimal. Global evidence shows that embedding online HIV or sexual health services into gay social networking applications holds promise for facilitating higher rates of healthcare utilization among MSM. We developed a prototype of a gay-friendly health services platform, designed for integration within a popular gay social networking app (Blued) in China. Objective: The purpose of this study was to evaluate the acceptability of the platform and gather user feedback through focus group interviews with young MSM in Guangzhou and Shenzhen, cities in Southern China. Methods: The prototype was developed through an open, national crowdsourcing contest. Open crowdsourcing contests solicit community input on a topic in order to identify potential improvements and implement creative solutions. The prototype included a local, gay-friendly, STI physician finder tool and online psychological consulting services. Semistructured focus group discussions were conducted with MSM to gather their feedback on the platform, and a short survey was administered following the discussions. Thematic analysis was used to analyze the data in NVivo, and we developed a codebook based on the first interview. Double coding was conducted, and discrepancies were discussed with a third individual until consensus was reached. We then carried out a descriptive analysis of the survey data. Results: A total of 34 participants attended four focus group discussions. The mean age was 27.3 years (SD 4.6). A total of 32 (94%) participants had obtained at least a university education, and 29 (85%) men had seen a doctor at least once before.
Our survey results showed that 24 (71%) participants had interest in using the online health services platform and 25 (74%) thought that the system was easy to use. Qualitative data also revealed a high demand for gay-friendly healthcare services, which could facilitate care seeking. Men felt that the platform could bridge gaps in the existing HIV or STI service delivery system, specifically by identifying local gay-friendly physicians and counselors, providing access to online physician consultation and psychological counseling services, creating space for peer support, and providing access to pre-exposure prophylaxis and sexual health education. Conclusions: Crowdsourcing can help develop a community-centered online platform linking MSM to local gay-friendly HIV or STI services. Further research on developing social media–based platforms for MSM and evaluating the effectiveness of such platforms may be useful for improving sexual health outcomes. %M 31596245 %R 10.2196/13027 %U https://publichealth.jmir.org/2019/4/e13027 %U https://doi.org/10.2196/13027 %U http://www.ncbi.nlm.nih.gov/pubmed/31596245 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 7 %P e13792 %T Overcoming Barriers to Mobilizing Collective Intelligence in Research: Qualitative Study of Researchers With Experience of Collective Intelligence %A Nguyen,Van Thu %A Young,Bridget %A Ravaud,Philippe %A Naidoo,Nivantha %A Benchoufi,Mehdi %A Boutron,Isabelle %+ INSERM, U1153 Epidemiology and Biostatistics Sorbonne Paris Cité Research Center (CRESS), Methods of Therapeutic Evaluation of Chronic Diseases Team (METHODS), , Paris,, France, 33 142347825, van.nguyen@clinicalepidemio.fr %K collective intelligence %K crowdsourcing %K open innovation %K health %K research %K survey %K interview %D 2019 %7 02.07.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Innovative ways of planning and conducting research have emerged recently, based on the concept of collective intelligence. Collective intelligence is defined as shared intelligence emerging when people are mobilized within or outside an organization to work on a specific task, which could result in more innovative outcomes than when individuals work alone. Crowdsourcing is defined as “the act of taking a job traditionally performed by a designated agent and outsourcing it to an undefined, generally large group of people in the form of an open call.” Objective: This qualitative study aimed to identify the barriers to mobilizing collective intelligence, identify ways to overcome these barriers, and provide good practice advice for planning and conducting collective intelligence projects across different research disciplines. Methods: We conducted a multinational online open-ended question survey and semistructured audio-recorded interviews with a purposive sample of researchers who had experience in running collective intelligence projects. The questionnaires had an interactive component, enabling respondents to rate and comment on the advice of their fellow respondents. Data were analyzed thematically, drawing on the framework method. Results: A total of 82 respondents from various research fields participated in the survey (n=65) or interview (n=17). The main barriers identified were the lack of evidence-based guidelines for implementing collective intelligence, complexity in recruiting and engaging the community, and difficulties in disseminating the results of collective intelligence projects.
We drew on respondents’ experience to provide tips and good practice advice for governance, planning, and conducting collective intelligence projects. Respondents particularly suggested establishing a diverse coordination team to plan and manage collective intelligence projects and setting up common rules of governance for participants in projects. In project planning, respondents provided advice on identifying research problems that could be answered by collective intelligence and identifying communities of participants. They shared tips on preparing the task and interface and organizing communication activities to recruit and engage participants. Conclusions: Mobilizing collective intelligence through crowdsourcing is an innovative method to increase research efficiency, although there are several barriers to its implementation. We present good practice advice from researchers with experience of collective intelligence across different disciplines to overcome barriers to mobilizing collective intelligence. %M 31267977 %R 10.2196/13792 %U https://www.jmir.org/2019/7/e13792/ %U https://doi.org/10.2196/13792 %U http://www.ncbi.nlm.nih.gov/pubmed/31267977 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 5 %P e13668 %T Validity of Online Screening for Autism: Crowdsourcing Study Comparing Paid and Unpaid Diagnostic Tasks %A Washington,Peter %A Kalantarian,Haik %A Tariq,Qandeel %A Schwartz,Jessey %A Dunlap,Kaitlyn %A Chrisman,Brianna %A Varma,Maya %A Ning,Michael %A Kline,Aaron %A Stockham,Nathaniel %A Paskov,Kelley %A Voss,Catalin %A Haber,Nick %A Wall,Dennis Paul %+ Division of Systems Medicine, Department of Biomedical Data Science, Stanford University, 1265 Welch Rd, Palo Alto, CA,, United States, 1 617 304 6031, dpwall@stanford.edu %K crowdsourcing %K autism %K mechanical turk %K pediatrics %K diagnostics %K diagnosis %K neuropsychiatric conditions %K human-computer interaction %K citizen healthcare %K biomedical data science %K mobile health %K digital health %D 2019 %7 23.05.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Obtaining a diagnosis of neuropsychiatric disorders such as autism requires long waiting times that can exceed a year and can be prohibitively expensive. Crowdsourcing approaches may provide a scalable alternative that can accelerate general access to care and permit underserved populations to obtain an accurate diagnosis. Objective: We aimed to perform a series of studies to explore whether paid crowd workers on Amazon Mechanical Turk (AMT) and citizen crowd workers on a public website shared on social media can provide accurate online detection of autism, conducted via crowdsourced ratings of short home video clips. Methods: Three online studies were performed: (1) a paid crowdsourcing task on AMT (N=54) where crowd workers were asked to classify 10 short video clips of children as “Autism” or “Not autism,” (2) a more complex paid crowdsourcing task (N=27) with only those raters who correctly rated ≥8 of the 10 videos during the first study, and (3) a public unpaid study (N=115) identical to the first study. Results: For Study 1, the mean score of the participants who completed all questions was 7.50/10 (SD 1.46). When only analyzing the workers who scored ≥8/10 (n=27/54), there was a weak negative correlation between the time spent rating the videos and the sensitivity (ρ=–0.44, P=.02). For Study 2, the mean score of the participants rating new videos was 6.76/10 (SD 0.59). 
The average deviation between the crowdsourced answers and the gold standard ratings provided by two expert clinical research coordinators was 0.56, with an SD of 0.51 (maximum possible deviation 3). All paid crowd workers who scored ≥8/10 in Study 1 either expressed enjoyment in performing the task in Study 2 or provided no negative comments. For Study 3, the mean score of the participants who completed all questions was 6.67/10 (SD 1.61). There were weak correlations between age and score (r=0.22, P=.014), age and sensitivity (r=–0.19, P=.04), number of family members with autism and sensitivity (r=–0.195, P=.04), and number of family members with autism and precision (r=–0.203, P=.03). A two-tailed t test between the scores of the paid workers in Study 1 and the unpaid workers in Study 3 showed a significant difference (P<.001). Conclusions: Many paid crowd workers on AMT enjoyed answering screening questions from videos, suggesting higher intrinsic motivation to make quality assessments. Paid crowdsourcing provides promising screening assessments of pediatric autism with an average deviation <20% from professional gold standard raters, which is potentially a clinically informative estimate for parents. Parents of children with autism likely overfit their intuition to their own affected child. This work provides preliminary demographic data on raters who may have higher ability to recognize and measure features of autism across its wide range of phenotypic manifestations. %M 31124463 %R 10.2196/13668 %U http://www.jmir.org/2019/5/e13668/ %U https://doi.org/10.2196/13668 %U http://www.ncbi.nlm.nih.gov/pubmed/31124463 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 4 %P e12953 %T Crowdsourcing the Citation Screening Process for Systematic Reviews: Validation Study %A Nama,Nassr %A Sampson,Margaret %A Barrowman,Nicholas %A Sandarage,Ryan %A Menon,Kusum %A Macartney,Gail %A Murto,Kimmo %A Vaccani,Jean-Philippe %A Katz,Sherri %A Zemek,Roger %A Nasr,Ahmed %A McNally,James Dayre %+ Department of Pediatrics, Children’s Hospital of Eastern Ontario, 401 Smyth Road, Ottawa, ON,, Canada, 1 6137377600 ext 3553, dmcnally@cheo.on.ca %K crowdsourcing %K systematic reviews as topic %K meta-analysis as topic %K research design %D 2019 %7 29.04.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Systematic reviews (SRs) are often cited as the highest level of evidence available as they involve the identification and synthesis of published studies on a topic. Unfortunately, it is increasingly challenging for small teams to complete SR procedures in a reasonable time period, given the exponential rise in the volume of primary literature. Crowdsourcing has been postulated as a potential solution. Objective: The feasibility objective of this study was to determine whether a crowd would be willing to perform and complete abstract and full text screening. The validation objective was to assess the quality of the crowd’s work, including retention of eligible citations (sensitivity) and work performed for the investigative team, defined as the percentage of citations excluded by the crowd. Methods: We performed a prospective study evaluating crowdsourcing essential components of an SR, including abstract screening, document retrieval, and full text assessment. Using CrowdScreenSR citation screening software, 2323 citations from 6 SRs were made available to an online crowd. Citations excluded by less than or equal to 75% of the crowd were moved forward for full text assessment.
For the validation component, the crowd’s performance was compared with the accepted gold standard approach of citation review by trained experts. Results: Of 312 potential crowd members, 117 (37.5%) commenced abstract screening and 71 (22.8%) completed the minimum requirement of 50 citation assessments. The majority of participants were undergraduate or medical students (192/312, 61.5%). The crowd screened 16,988 abstracts (median: 8 per citation; interquartile range [IQR] 7-8), and all citations achieved the minimum of 4 assessments after a median of 42 days (IQR 26-67). Crowd members retrieved 83.5% (774/927) of the articles that progressed to the full text phase. A total of 7604 full text assessments were completed (median: 7 per citation; IQR 3-11). Citations from all but 1 review achieved the minimum of 4 assessments after a median of 36 days (IQR 24-70), with the remaining review still incomplete after 3 months. When complete crowd member agreement at both levels was required for exclusion, sensitivity was 100% (95% CI 97.9-100) and work performed was calculated at 68.3% (95% CI 66.4-70.1). Using the predefined alternative 75% exclusion threshold, sensitivity remained 100% and work performed increased to 72.9% (95% CI 71.0-74.6; P<.001). Finally, when a simple majority threshold was considered, sensitivity decreased marginally to 98.9% (95% CI 96.0-99.7; P=.25) and work performed increased substantially to 80.4% (95% CI 78.7-82.0; P<.001). Conclusions: Crowdsourcing of citation screening for SRs is feasible and has reasonable sensitivity and specificity. By expediting the screening process, crowdsourcing could permit the investigative team to focus on more complex SR tasks. Future directions should focus on developing a user-friendly online platform that allows research teams to crowdsource their reviews. %M 31033444 %R 10.2196/12953 %U http://www.jmir.org/2019/4/e12953/ %U https://doi.org/10.2196/12953 %U http://www.ncbi.nlm.nih.gov/pubmed/31033444 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 4 %P e12047 %T Crowdsourcing for Food Purchase Receipt Annotation via Amazon Mechanical Turk: A Feasibility Study %A Lu,Wenhua %A Guttentag,Alexandra %A Elbel,Brian %A Kiszko,Kamila %A Abrams,Courtney %A Kirchner,Thomas R %+ Department of Childhood Studies, Rutgers, The State University of New Jersey, 329, Cooper Street, Camden, NJ, 08102, United States, 1 856 225 6083, w.lu@rutgers.edu %K Amazon Mechanical Turk %K food purchase receipt %K crowdsourcing %K feasibility %K reliability %K validity %D 2019 %7 05.04.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: The decisions that individuals make about the food and beverage products they purchase and consume directly influence their energy intake and dietary quality and may lead to excess weight gain and obesity. However, gathering and interpreting data on food and beverage purchase patterns can be difficult. Leveraging novel sources of data on food and beverage purchase behavior can provide us with a more objective understanding of food consumption behaviors. Objective: Food and beverage purchase receipts often include time-stamped location information, which, when associated with product purchase details, can provide a useful behavioral measurement tool. The purpose of this study was to assess the feasibility, reliability, and validity of processing data from fast-food restaurant receipts using crowdsourcing via Amazon Mechanical Turk (MTurk).
Methods: Between 2013 and 2014, receipts (N=12,165) from consumer purchases were collected at 60 different locations of five fast-food restaurant chains in New Jersey and New York City, USA (ie, Burger King, KFC, McDonald’s, Subway, and Wendy’s). Data containing the restaurant name, location, receipt ID, food items purchased, price, and other information were manually entered into an MS Access database and checked for accuracy by a second reviewer; this was considered the gold standard. To assess the feasibility of coding receipt data via MTurk, a prototype set of receipts (N=196) was selected. For each receipt, 5 turkers were asked to (1) identify the receipt identifier and the name of the restaurant and (2) indicate whether a beverage was listed in the receipt; if yes, they were to categorize the beverage as cold (eg, soda or energy drink) or hot (eg, coffee or tea). Interturker agreement for specific questions (eg, restaurant name and beverage inclusion) and agreement between turker consensus responses and the gold standard values in the manually entered dataset were calculated. Results: Among the 196 receipts completed by turkers, the interturker agreement was 100% (196/196) for restaurant names (eg, Burger King, McDonald’s, and Subway), 98.5% (193/196) for beverage inclusion (ie, hot, cold, or none), 92.3% (181/196) for types of hot beverage (eg, hot coffee or hot tea), and 87.2% (171/196) for types of cold beverage (eg, Coke or bottled water). When compared with the gold standard data, the agreement level was 100% (196/196) for restaurant name, 99.5% (195/196) for beverage inclusion, and 99.5% (195/196) for beverage types. Conclusions: Our findings indicated high interrater agreement for questions across difficulty levels (eg, single- vs binary- vs multiple-choice items). Compared with traditional methods for coding receipt data, MTurk can produce excellent-quality data in a lower-cost, more time-efficient manner. %M 30950801 %R 10.2196/12047 %U http://www.jmir.org/2019/4/e12047/ %U https://doi.org/10.2196/12047 %U http://www.ncbi.nlm.nih.gov/pubmed/30950801 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 2 %P e12525 %T QuikLitE, a Framework for Quick Literacy Evaluation in Medicine: Development and Validation %A Zheng,Jiaping %A Yu,Hong %+ Department of Computer Science, University of Massachusetts Lowell, One University Avenue, Lowell, MA, 01854, United States, 1 978 934 3620, yu_hong@uml.edu %K health literacy %K psychometrics %K crowdsourcing %D 2019 %7 22.02.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: A plethora of health literacy instruments was developed over the decades. They usually start with experts curating passages of text or word lists, followed by psychometric validation and revision based on test results obtained from a sample population. This process is costly and it is difficult to customize for new usage scenarios. Objective: This study aimed to develop and evaluate a framework for dynamically creating test instruments that can provide a focused assessment of patients’ health literacy. Methods: A health literacy framework and scoring method were extended from the vocabulary knowledge test to accommodate a wide range of item difficulties and various degrees of uncertainty in the participant’s answer. Web-based tests from Amazon Mechanical Turk users were used to assess reliability and validity. Results: Parallel forms of our tests showed high reliability (correlation=.78; 95% CI 0.69-0.85). 
Validity measured as correlation with an electronic health record comprehension instrument was higher (.47-.61 among 3 groups) than with 2 existing tools (Short Assessment of Health Literacy-English, .38-.43; Short Test of Functional Health Literacy in Adults, .34-.46). Our framework is able to distinguish higher literacy levels that are often not measured by other instruments. It is also flexible, allowing customization to fit the test designer’s focus on a particular subject matter or domain. The framework is among the fastest health literacy instruments to administer. Conclusions: We proposed a valid and highly reliable framework to dynamically create health literacy instruments, alleviating the need to repeat a time-consuming process when a new use scenario arises. This framework can be customized to a specific need on demand and can measure skills beyond the basic level. %M 30794206 %R 10.2196/12525 %U http://www.jmir.org/2019/2/e12525/ %U https://doi.org/10.2196/12525 %U http://www.ncbi.nlm.nih.gov/pubmed/30794206 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 21 %N 1 %P e10793 %T Improving Electronic Health Record Note Comprehension With NoteAid: Randomized Trial of Electronic Health Record Note Comprehension Interventions With Crowdsourced Workers %A Lalor,John P %A Woolf,Beverly %A Yu,Hong %+ Department of Computer Science, University of Massachusetts Lowell, 1 University Avenue, Lowell, MA,, United States, 1 508 612 7292, hong.yu@umassmed.edu %K health literacy %K crowdsourcing %K natural language processing %K information storage and retrieval %K psychometrics %K MedlinePlus %D 2019 %7 16.01.2019 %9 Original Paper %J J Med Internet Res %G English %X Background: Patient portals are becoming more common, and with them, the ability of patients to access their personal electronic health records (EHRs). EHRs, in particular the free-text EHR notes, often contain medical jargon and terms that are difficult for laypersons to understand. There are many Web-based resources for learning more about particular diseases or conditions, including systems that directly link to lay definitions or educational materials for medical concepts. Objective: Our goal is to determine whether use of one such tool, NoteAid, leads to higher EHR note comprehension ability. We use a new EHR note comprehension assessment tool instead of patient self-reported scores. Methods: In this work, we compare a passive, self-service educational resource (MedlinePlus) with an active resource (NoteAid) where definitions are provided to the user for medical concepts that the system identifies. We use Amazon Mechanical Turk (AMT) to recruit individuals to complete ComprehENotes, a new test of EHR note comprehension. Results: Mean scores for individuals with access to NoteAid are significantly higher than the mean baseline scores, both for raw scores (P=.008) and estimated ability (P=.02). Conclusions: In our experiments, we show that the active intervention leads to significantly higher scores on the comprehension test as compared with a baseline group with no resources provided. In contrast, there is no significant difference between the group that was provided with the passive intervention and the baseline group. Finally, we analyze the demographics of the individuals who participated in our AMT task and show differences between groups that align with the current understanding of health literacy between populations.
This is the first work to show improvements in comprehension using tools such as NoteAid as measured by an EHR note comprehension assessment tool as opposed to patient self-reported scores. %M 30664453 %R 10.2196/10793 %U http://www.jmir.org/2019/1/e10793/ %U https://doi.org/10.2196/10793 %U http://www.ncbi.nlm.nih.gov/pubmed/30664453 %0 Journal Article %@ 2369-2529 %I JMIR Publications %V 6 %N 1 %P e11127 %T Global Consensus From Clinicians Regarding Low Back Pain Outcome Indicators for Older Adults: Pairwise Wiki Survey Using Crowdsourcing %A Wong,Arnold YL %A Lauridsen,Henrik H %A Samartzis,Dino %A Macedo,Luciana %A Ferreira,Paulo H %A Ferreira,Manuela L %+ Department of Rehabilitation Sciences, Hong Kong Polytechnic University, Room ST512, 5/F, Ng Wing Hong Building, 11 Yuk Choi Road, Hong Kong,, China (Hong Kong), 852 2766 6741, arnold.wong@polyu.edu.hk %K crowdsourcing %K wiki survey %K low back pain %K older people %K outcome indicators %D 2019 %7 15.01.2019 %9 Original Paper %J JMIR Rehabil Assist Technol %G English %X Background: Low back pain (LBP) is one of the most debilitating conditions among older adults. Unfortunately, existing LBP outcome questionnaires are not adapted for specific circumstances related to old age, which may make these measures less than ideal for evaluating LBP in older adults. Objective: To explore the necessity of developing age-specific outcome measures, crowdsourcing was conducted to solicit opinions from clinicians globally. Methods: Clinicians around the world voted and/or prioritized various LBP outcome indicators for older adults on a pairwise wiki survey website. Seven seed outcome indicators were posted for voting while respondents were encouraged to suggest new indicators for others to vote/prioritize. The website was promoted on the social media of various health care professional organizations. An established algorithm calculated the mean scores of all ideas. A score >50 points means that the idea has >50% probability of beating another randomly presented indicator. Results: Within 42 days, 128 respondents from 6 continents cast 2466 votes and proposed 14 ideas. Indicators pertinent to improvements of physical functioning and age-related social functioning scored >50 while self-perceived reduction of LBP scored 32. Conclusions: This is the first crowdsourcing study to address LBP outcome indicators for older adults. The study noted that age-specific outcome indicators should be integrated into future LBP outcome measures for older adults. Future research should solicit opinions from older patients with LBP to develop age-specific back pain outcome measures that suit clinicians and patients alike. 
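The pairwise wiki survey scoring described above (a score >50 means an indicator has a >50% probability of beating a randomly presented rival) can be sketched in Python. This is a minimal illustration using a plain empirical win rate; the platform's established algorithm fits a Bayesian model instead, and the vote pairs and indicator names below are hypothetical.

from collections import defaultdict

def wiki_survey_scores(votes):
    """votes: iterable of (winner, loser) pairs from pairwise comparisons."""
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in votes:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    # Score 0-100: estimated chance of beating a randomly chosen rival.
    return {idea: 100.0 * wins[idea] / appearances[idea] for idea in appearances}

# Hypothetical votes among three candidate outcome indicators.
votes = [("physical functioning", "pain reduction"),
         ("physical functioning", "social functioning"),
         ("pain reduction", "social functioning")]
print(wiki_survey_scores(votes))
# {'physical functioning': 100.0, 'pain reduction': 50.0, 'social functioning': 0.0}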
%M 30664493 %R 10.2196/11127 %U http://rehab.jmir.org/2019/1/e11127/ %U https://doi.org/10.2196/11127 %U http://www.ncbi.nlm.nih.gov/pubmed/30664493 %0 Journal Article %@ 1929-073X %I JMIR Publications %V 7 %N 2 %P e17 %T Calorie Estimation From Pictures of Food: Crowdsourcing Study %A Zhou,Jun %A Bell,Dane %A Nusrat,Sabrina %A Hingle,Melanie %A Surdeanu,Mihai %A Kobourov,Stephen %+ Department of Computer Science, University of Arizona, Gould-Simpson 917, 1040 East 4th Street, Tucson, AZ, 85721, United States, 1 520 621 4632, kobourov@email.arizona.edu %K calorie estimation %K image annotation %K crowdsourcing %K obesity %K public health %D 2018 %7 05.11.2018 %9 Original Paper %J Interact J Med Res %G English %X Background: Software designed to accurately estimate food calories from still images could help users and health professionals identify dietary patterns and food choices associated with health and health risks more effectively. However, calorie estimation from images is difficult, and no publicly available software can do so accurately while minimizing the burden associated with data collection and analysis. Objective: The aim of this study was to determine the accuracy of crowdsourced annotations of calorie content in food images and to identify and quantify sources of bias and noise as a function of respondent characteristics and food qualities (eg, energy density). Methods: We invited adult social media users to provide calorie estimates for 20 food images (for which ground truth calorie data were known) using a custom-built webpage that administers an online quiz. The images were selected to provide a range of food types and energy density. Participants optionally provided age range, gender, and their height and weight. In addition, 5 nutrition experts provided annotations for the same data to form a basis of comparison. We examined estimated accuracy on the basis of expertise, demographic data, and food qualities using linear mixed-effects models with participant and image index as random variables. We also analyzed the advantage of aggregating nonexpert estimates. Results: A total of 2028 respondents agreed to participate in the study (males: 770/2028, 37.97%, mean body mass index: 27.5 kg/m2). Average accuracy was 5 out of 20 correct guesses, where “correct” was defined as a number within 20% of the ground truth. Even a small crowd of 10 individuals achieved an accuracy of 7, exceeding the average individual and expert annotator’s accuracy of 5. Women were more accurate than men (P<.001), and younger people were more accurate than older people (P<.001). The calorie content of energy-dense foods was overestimated (P=.02). Participants performed worse when images contained reference objects, such as credit cards, for scale (P=.01). Conclusions: Our findings provide new information about how calories are estimated from food images, which can inform the design of related software and analyses. 
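The accuracy rule and crowd aggregation reported above lend themselves to a short sketch: a guess counts as correct when it falls within 20% of the ground truth, and a small crowd's estimate is taken here as the median of individual guesses. The calorie values below are invented for illustration and are not study data.

import statistics

def is_correct(guess, truth, tol=0.20):
    """A guess counts as correct when within 20% of the ground truth."""
    return abs(guess - truth) <= tol * truth

def crowd_hits(guesses_per_image, truths):
    """Count images whose crowd-median guess is correct."""
    return sum(
        is_correct(statistics.median(guesses), truth)
        for guesses, truth in zip(guesses_per_image, truths)
    )

# Invented ground truths and individual guesses for two food images.
truths = [250, 600]
guesses = [[220, 260, 300, 180, 240], [450, 700, 580, 900, 610]]
print(crowd_hits(guesses, truths))  # 2: both crowd medians fall within 20%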
%M 30401671 %R 10.2196/ijmr.9359 %U http://www.i-jmr.org/2018/2/e17/ %U https://doi.org/10.2196/ijmr.9359 %U http://www.ncbi.nlm.nih.gov/pubmed/30401671 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 20 %N 6 %P e10148 %T Towards an Artificially Empathic Conversational Agent for Mental Health Applications: System Design and User Perceptions %A Morris,Robert R %A Kouddous,Kareem %A Kshirsagar,Rohan %A Schueller,Stephen M %+ Koko, 155 Rivington St, New York, NY, 10002, United States, 1 617 851 4967, rob@koko.ai %K conversational agents %K mental health %K empathy %K crowdsourcing %K peer support %D 2018 %7 26.06.2018 %9 Original Paper %J J Med Internet Res %G English %X Background: Conversational agents cannot yet express empathy in nuanced ways that account for the unique circumstances of the user. Agents that possess this faculty could be used to enhance digital mental health interventions. Objective: We sought to design a conversational agent that could express empathic support in ways that might approach, or even match, human capabilities. Another aim was to assess how users might appraise such a system. Methods: Our system used a corpus-based approach to simulate expressed empathy. Responses from an existing pool of online peer support data were repurposed by the agent and presented to the user. Information retrieval techniques and word embeddings were used to select historical responses that best matched a user’s concerns. We collected ratings from 37,169 users to evaluate the system. Additionally, we conducted a controlled experiment (N=1284) to test whether the alleged source of a response (human or machine) might change user perceptions. Results: The majority of responses created by the agent (2986/3770, 79.20%) were deemed acceptable by users. However, users significantly preferred the efforts of their peers (P<.001). This effect was maintained in a controlled study (P=.02), even when the only difference in responses was whether they were framed as coming from a human or a machine. Conclusions: Our system illustrates a novel way for machines to construct nuanced and personalized empathic utterances. However, the design had significant limitations and further research is needed to make this approach viable. Our controlled study suggests that even in ideal conditions, nonhuman agents may struggle to express empathy as well as humans. The ethical implications of empathic agents, as well as their potential iatrogenic effects, are also discussed. %M 29945856 %R 10.2196/10148 %U http://www.jmir.org/2018/6/e10148/ %U https://doi.org/10.2196/10148 %U http://www.ncbi.nlm.nih.gov/pubmed/29945856 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 20 %N 5 %P e187 %T Mapping of Crowdsourcing in Health: Systematic Review %A Créquit,Perrine %A Mansouri,Ghizlène %A Benchoufi,Mehdi %A Vivot,Alexandre %A Ravaud,Philippe %+ INSERM UMR1153, Methods Team, Epidemiology and Statistics Sorbonne Paris Cité Research Center, Paris Descartes University, 1 place du Parvis Notre Dame, Paris, 75004, France, 33 142348932, perrine.crequit@aphp.fr %K review [publication type] %K crowdsourcing %K health %D 2018 %7 15.05.2018 %9 Review %J J Med Internet Res %G English %X Background: Crowdsourcing involves obtaining ideas, needed services, or content by soliciting Web-based contributions from a crowd. The 4 types of crowdsourced tasks (problem solving, data processing, surveillance or monitoring, and surveying) can be applied in the 3 categories of health (promotion, research, and care). 
Objective: This study aimed to map the different applications of crowdsourcing in health to assess the fields of health that are using crowdsourcing and the crowdsourced tasks used. We also describe the logistics of crowdsourcing and the characteristics of crowd workers. Methods: MEDLINE, EMBASE, and ClinicalTrials.gov were searched for available reports from inception to March 30, 2016, with no restriction on language or publication status. Results: We identified 202 relevant studies that used crowdsourcing, including 9 randomized controlled trials, of which only one had posted results at ClinicalTrials.gov. Crowdsourcing was used in health promotion (91/202, 45.0%), research (73/202, 36.1%), and care (38/202, 18.8%). The 4 most frequent areas of application were public health (67/202, 33.2%), psychiatry (32/202, 15.8%), surgery (22/202, 10.9%), and oncology (14/202, 6.9%). Half of the reports (99/202, 49.0%) referred to data processing, 34.6% (70/202) referred to surveying, 10.4% (21/202) referred to surveillance or monitoring, and 5.9% (12/202) referred to problem-solving. Labor market platforms (eg, Amazon Mechanical Turk) were used in most studies (190/202, 94%). The crowd workers’ characteristics were poorly reported, and crowdsourcing logistics were missing from two-thirds of the reports. When reported, the median size of the crowd was 424 (first and third quartiles: 167-802); crowd workers’ median age was 34 years (32-36). Crowd workers were mainly recruited nationally, particularly in the United States. For many studies (58.9%, 119/202), previous experience in crowdsourcing was required, and passing a qualification test or training was seldom needed (11.9% of studies; 24/202). For half of the studies, monetary incentives were mentioned, with mainly less than US $1 to perform the task. The time needed to perform the task was mostly less than 10 min (58.9% of studies; 119/202). Data quality validation was used in 54/202 studies (26.7%), mainly by attention check questions or by replicating the task with several crowd workers. Conclusions: The use of crowdsourcing, which allows access to a large pool of participants as well as saving time in data collection, lowering costs, and speeding up innovations, is increasing in health promotion, research, and care. However, the description of crowdsourcing logistics and crowd workers’ characteristics is frequently missing in study reports and needs to be precisely reported to better interpret the study findings and replicate them. %M 29764795 %R 10.2196/jmir.9330 %U http://www.jmir.org/2018/5/e187/ %U https://doi.org/10.2196/jmir.9330 %U http://www.ncbi.nlm.nih.gov/pubmed/29764795 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 20 %N 4 %P e139 %T ComprehENotes, an Instrument to Assess Patient Reading Comprehension of Electronic Health Record Notes: Development and Validation %A Lalor,John P %A Wu,Hao %A Chen,Li %A Mazor,Kathleen M %A Yu,Hong %+ Department of Medicine, University of Massachusetts Medical School, 55 Lake Avenue North, AC7-059, Worcester, MA, 01605, United States, 1 508 856 3474, hong.yu@umassmed.edu %K electronic health records %K health literacy %K psychometrics %K crowdsourcing %D 2018 %7 25.04.2018 %9 Original Paper %J J Med Internet Res %G English %X Background: Patient portals are widely adopted in the United States and allow millions of patients access to their electronic health records (EHRs), including their EHR clinical notes. 
A patient’s ability to understand the information in the EHR is dependent on their overall health literacy. Although many tests of health literacy exist, none specifically focuses on EHR note comprehension. Objective: The aim of this paper was to develop an instrument to assess patients’ EHR note comprehension. Methods: We identified 6 common diseases or conditions (heart failure, diabetes, cancer, hypertension, chronic obstructive pulmonary disease, and liver failure) and selected 5 representative EHR notes for each disease or condition. One note that did not contain natural language text was removed. Questions were generated from these notes using Sentence Verification Technique and were analyzed using item response theory (IRT) to identify a set of questions that represent a good test of ability for EHR note comprehension. Results: Using Sentence Verification Technique, 154 questions were generated from the 29 EHR notes initially obtained. Of these, 83 were manually selected for inclusion in the Amazon Mechanical Turk crowdsourcing tasks and 55 were ultimately retained following IRT analysis. A follow-up validation with a second Amazon Mechanical Turk task and IRT analysis confirmed that the 55 questions test a latent ability dimension for EHR note comprehension. A short test of 14 items was created along with the 55-item test. Conclusions: We developed ComprehENotes, an instrument for assessing EHR note comprehension from existing EHR notes, gathered responses using crowdsourcing, and used IRT to analyze those responses, thus resulting in a set of questions to measure EHR note comprehension. Crowdsourced responses from Amazon Mechanical Turk can be used to estimate item parameters and select a subset of items for inclusion in the test set using IRT. The final set of questions is the first test of EHR note comprehension. %M 29695372 %R 10.2196/jmir.9380 %U http://www.jmir.org/2018/4/e139/ %U https://doi.org/10.2196/jmir.9380 %U http://www.ncbi.nlm.nih.gov/pubmed/29695372 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 20 %N 3 %P e75 %T Ethical Concerns of and Risk Mitigation Strategies for Crowdsourcing Contests and Innovation Challenges: Scoping Review %A Tucker,Joseph D %A Pan,Stephen W %A Mathews,Allison %A Stein,Gabriella %A Bayus,Barry %A Rennie,Stuart %+ University of North Carolina Project-China, 2 Lujing Road, Guangzhou,, China, 86 13560294997, jdtucker@med.unc.edu %K crowdsourcing %K health communication %K ethical analysis %D 2018 %7 09.03.2018 %9 Review %J J Med Internet Res %G English %X Background: Crowdsourcing contests (also called innovation challenges, innovation contests, and inducement prize contests) can be used to solicit multisectoral feedback on health programs and design public health campaigns. They consist of organizing a steering committee, soliciting contributions, engaging the community, judging contributions, recognizing a subset of contributors, and sharing with the community. Objective: This scoping review describes crowdsourcing contests by stage, examines ethical problems at each stage, and proposes potential ways of mitigating risk. Methods: Our analysis was anchored in the specific example of a crowdsourcing contest that our team organized to solicit videos promoting condom use in China. The purpose of this contest was to create compelling 1-min videos to promote condom use. We used a scoping review to examine the existing ethical literature on crowdsourcing to help identify and frame ethical concerns at each stage. 
Results: Crowdsourcing involves having a group of individuals solve a problem and then sharing the solution with the public. Crowdsourcing contests provide an opportunity for community engagement at each stage: organizing, soliciting, promoting, judging, recognizing, and sharing. Crowdsourcing poses several ethical concerns: organizing—potential for excluding community voices; soliciting—potential for overly narrow participation; promoting—potential for divulging confidential information; judging—potential for biased evaluation; recognizing—potential for insufficient recognition of the finalist; and sharing—potential for the solution to not be implemented or widely disseminated. Conclusions: Crowdsourcing contests can be effective and engaging public health tools but also introduce potential ethical problems. We present methods for the responsible conduct of crowdsourcing contests. %M 29523500 %R 10.2196/jmir.8226 %U http://www.jmir.org/2018/3/e75/ %U https://doi.org/10.2196/jmir.8226 %U http://www.ncbi.nlm.nih.gov/pubmed/29523500 %0 Journal Article %@ 1929-073X %I JMIR Publications %V 7 %N 1 %P e4 %T The Patient Perspective on the Impact of Tenosynovial Giant Cell Tumors on Daily Living: Crowdsourcing Study on Physical Function and Quality of Life %A Mastboom,Monique Josephine %A Planje,Rosa %A van de Sande,Michiel Adreanus %+ Department of Orthopedics, Leiden University Medical Center, University of Leiden, Albinusdreef 2, Leiden, 2333 ZA, Netherlands, 31 715264088, mjlmastboom@lumc.nl %K synovitis %K pigmented villonodular %K giant cell tumor of tendon sheath %K rare diseases %K crowdsourcing %K social media %K patient-reported outcome measures %K quality of life %K health-related quality of life %K social participation %K surveys and questionnaires %D 2018 %7 23.02.2018 %9 Original Paper %J Interact J Med Res %G English %X Background: Tenosynovial giant cell tumor (TGCT) is a rare, benign lesion affecting the synovial lining of joints, bursae, and tendon sheaths. It is generally characterized as a locally aggressive and often recurring tumor. A distinction is made between localized and diffuse types. The impact of TGCT on daily living is currently ill-described. Objective: The aim of this crowdsourcing study was to evaluate the impact of TGCT on physical function, daily activities, societal participation (work, sports, and hobbies), and overall quality of life from a patient perspective. The secondary aim was to define risk factors for deteriorated outcome in TGCT. Methods: Members of the largest known TGCT Facebook community, PVNS is Pants!!, were invited to an e-survey, partially consisting of validated questionnaires, for 6 months. To confirm disease presence and TGCT type, patients were requested to share histological or radiological proof of TGCT. Unpaired t tests and chi-square tests were used to compare groups with and without proof and to define risk factors for deteriorated outcome. Results: Three hundred thirty-seven questionnaires, originating from 30 countries, were completed. Median age at diagnosis was 33 (interquartile range [IQR]=25-42) years; the majority were female (79.8% [269/337]), had diffuse TGCT (70.3% [237/337]), and had lower-extremity involvement (knee 70.9% [239/337] and hip 9.5% [32/337]). Among the 299 lower-extremity TGCT patients with disease confirmation (32.4% [97/299]), recurrence rates were 36% in localized and 69.5% in diffuse type. For both types, pain and swelling decreased after treatment; in contrast, stiffness and range of motion worsened.
Patients were limited in their employment (localized 13% [8/61]; diffuse 11.0% [21/191]) and sports activities (localized 58% [40/69]; diffuse 63.9% [147/230]). Compared with the general US population, all patients showed lower Patient-Reported Outcomes Measurements Information System-Physical Function (PROMIS-PF), Short Form-12 (SF-12), and EuroQoL 5 Dimensions 5 Levels (EQ-5D-5L) scores, considered clinically relevant according to the estimated minimal important difference (MID). Patients with diffuse type scored almost 0.5 standard deviation lower on PROMIS-PF than those with localized type (P<.001) and had a 5% lower EQ-5D-5L utility score (P=.03). In localized TGCT, recurrent disease and ≥2 surgeries negatively influenced scores of Visual Analog Scale (VAS)-pain/stiffness, SF-12, and EQ-5D-5L (P<.05). In diffuse type, recurrence resulted in lower scores for VAS, PROMIS-PF, SF-12, and EQ-5D-5L (P<.05). In both types, patients with treatment ≤1 year had significantly lower SF-12 scores. Conclusions: TGCT has a major impact on daily living in a relatively young and working population. Patients with diffuse type, recurrent disease, and ≥2 surgeries had the lowest functional and quality-of-life outcomes. Physicians should be aware that TGCT patients frequently continue to experience reduced health-related quality of life and physical function and often remain limited in daily life, even after treatment(s). %M 29475829 %R 10.2196/ijmr.9325 %U http://www.i-jmr.org/2018/1/e4/ %U https://doi.org/10.2196/ijmr.9325 %U http://www.ncbi.nlm.nih.gov/pubmed/29475829 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 3 %N 4 %P e87 %T Will Participatory Syndromic Surveillance Work in Latin America? Piloting a Mobile Approach to Crowdsource Influenza-Like Illness Data in Guatemala %A Prieto,José Tomás %A Jara,Jorge H %A Alvis,Juan Pablo %A Furlan,Luis R %A Murray,Christian Travis %A Garcia,Judith %A Benghozi,Pierre-Jean %A Kaydos-Daniels,Susan Cornelia %+ Center for Health Studies, Universidad del Valle de Guatemala, 18 Av. 11-95, Zona 15, Vista Hermosa III, Guatemala City, 01015, Guatemala, +1 4044216455, josetomasprieto@gmail.com %K crowdsourcing %K human flu %K influenza %K grippe %K mHealth %K texting %K mobile apps %K short message service %K text message %K developing countries %D 2017 %7 14.11.2017 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: In many Latin American countries, official influenza reports are neither timely nor complete, and surveillance of influenza-like illness (ILI) remains thin in consistency and precision. Public participation with mobile technology may offer new ways of identifying nonmedically attended cases and reduce reporting delays, but no published studies to date have assessed the viability of ILI surveillance with mobile tools in Latin America. We implemented and assessed an ILI-tailored mobile health (mHealth) participatory reporting system. Objective: The objectives of this study were to evaluate the quality and characteristics of electronically collected data, the user acceptability of the symptom reporting platform, and the costs of running the system and of identifying ILI cases, and to use the collected data to characterize cases of reported ILI. Methods: We recruited the heads of 189 households comprising 584 persons during randomly selected home visits in Guatemala.
From August 2016 to March 2017, participants used text messages or an app to report symptoms of ILI at home, the ages of the ILI cases, whether medical attention was sought, and whether medicines were bought in pharmacies. We sent weekly reminders to participants and compensated those who sent reports with phone credit. We assessed the simplicity, flexibility, acceptability, stability, timeliness, and data quality of the system. Results: Nearly half of the participants (47.1%, 89/189) sent one or more reports. We received 468 reports, 83.5% (391/468) via text message and 16.4% (77/468) via app. More than nine-tenths of the reports (93.6%, 438/468) were received within 48 hours of the transmission of reminders. Over a quarter of the reports (26.5%, 124/468) indicated that at least someone at home had ILI symptoms. We identified 202 ILI cases and collected age information from almost three-fifths (58.4%, 118/202): 20 were aged between 0 and 5 years, 95 were aged between 6 and 64 years, and 3 were aged 65 years or older. Medications were purchased from pharmacies, without medical consultation, in 33.1% (41/124) of reported cases. Medical attention was sought in 27.4% (34/124) of reported cases. The cost of identifying an ILI case was US $6.00. We found a positive correlation (Pearson correlation coefficient=.8) between reported ILI and official surveillance data for noninfluenza viruses from weeks 41 (2016) to 13 (2017). Conclusions: Our system has the potential to serve as a practical complement to respiratory virus surveillance in Guatemala. Its strongest attributes are simplicity, flexibility, and timeliness. The biggest challenge was low enrollment caused by people’s fear of victimization and lack of phone credit. Authorities in Central America could test similar methods to improve the timeliness, and extend the breadth, of disease surveillance. It may allow them to rapidly detect localized or unusual circulation of acute respiratory illness and trigger appropriate public health actions. %M 29138128 %R 10.2196/publichealth.8610 %U http://publichealth.jmir.org/2017/4/e87/ %U https://doi.org/10.2196/publichealth.8610 %U http://www.ncbi.nlm.nih.gov/pubmed/29138128 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 19 %N 10 %P e341 %T The Promise and Pitfalls of Using Crowdsourcing in Research Prioritization for Back Pain: Cross-Sectional Surveys %A Bartek,Matthew A %A Truitt,Anjali R %A Widmer-Rodriguez,Sierra %A Tuia,Jordan %A Bauer,Zoya A %A Comstock,Bryan A %A Edwards,Todd C %A Lawrence,Sarah O %A Monsell,Sarah E %A Patrick,Donald L %A Jarvik,Jeffrey G %A Lavallee,Danielle C %+ Surgical Outcomes Research Center, Department of Surgery, University of Washington, 1107 NE 45th St, Suite 502, Seattle, WA, 98105, United States, 1 206 685 9524, bartek@uw.edu %K research prioritization %K crowdsourcing %K MTurk %K Amazon Mechanical Turk %K patient engagement %K stakeholder engagement %K back pain %K comparative effectiveness research %K patient participation %K low back pain %D 2017 %7 06.10.2017 %9 Original Paper %J J Med Internet Res %G English %X Background: The involvement of patients in research better aligns evidence generation to the gaps that patients themselves face when making decisions about health care. However, obtaining patients’ perspectives is challenging. Amazon’s Mechanical Turk (MTurk) has gained popularity over the past decade as a crowdsourcing platform to reach large numbers of individuals to perform tasks for a small reward for the respondent, at small cost to the investigator.
The appropriateness of such crowdsourcing methods in medical research has yet to be clarified. Objective: The goals of this study were to (1) understand how those on MTurk who screen positive for back pain prioritize research topics compared with those who screen negative for back pain, and (2) determine the qualitative differences in open-ended comments between groups. Methods: We conducted cross-sectional surveys on MTurk to assess participants’ back pain and allow them to prioritize research topics. We paid respondents US $0.10 to complete the 24-point Roland Morris Disability Questionnaire (RMDQ) to categorize participants as those “with back pain” and those “without back pain,” then offered both those with (RMDQ score ≥7) and those without back pain (RMDQ <7) an opportunity to rank their top 5 (of 18) research topics for an additional US $0.75. We compared demographic information and research priorities between the 2 groups and performed qualitative analyses on free-text commentary that participants provided. Results: We conducted 2 screening waves. We first screened 2189 individuals for back pain over 33 days and invited 480 (21.93%) who screened positive to complete the prioritization, of whom 350 (72.9% of eligible) did. We later screened 664 individuals over 7 days and invited 474 (71.4%) without back pain to complete the prioritization, of whom 397 (83.7% of eligible) did. Those with back pain who prioritized were comparable with those without in terms of age, education, marital status, and employment. The group with back pain had a higher proportion of women (234, 67.2% vs 229, 57.8%, P=.02). The groups’ rank lists of research priorities were highly correlated: Spearman correlation coefficient was .88 when considering topics ranked in the top 5. The 2 groups agreed on 4 of the top 5 and 9 of the top 10 research priorities. Conclusions: Crowdsourcing platforms such as MTurk support efforts to efficiently reach large groups of individuals to obtain input on research activities. In the context of back pain, a prevalent and easily understood condition, the rank list of those with back pain was highly correlated with that of those without back pain. However, subtle differences in the content and quality of free-text comments suggest supplemental efforts may be needed to augment the reach of crowdsourcing in obtaining perspectives from patients, especially from specific populations. %M 28986339 %R 10.2196/jmir.8821 %U http://www.jmir.org/2017/10/e341/ %U https://doi.org/10.2196/jmir.8821 %U http://www.ncbi.nlm.nih.gov/pubmed/28986339 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 19 %N 9 %P e292 %T The User Knows What to Call It: Incorporating Patient Voice Through User-Contributed Tags on a Participatory Platform About Health Management %A Chen,Annie T %A Carriere,Rachel M %A Kaplan,Samantha Jan %+ Department of Biomedical Informatics and Medical Education, University of Washington School of Medicine, 850 Republican Street, Box 358047, C238, Seattle, WA, 98109, United States, 1 206 221 9218, atchen@uw.edu %K collaborative tagging %K folksonomy %K knowledge organization %K self-management %K body listening %K body awareness %D 2017 %7 07.09.2017 %9 Original Paper %J J Med Internet Res %G English %X Background: Body listening, described as the act of paying attention to the body’s signals and cues, can be an important component of long-term health management. 
Objective: The aim of this study was to introduce and evaluate the Body Listening Project, an innovative effort to engage the public in the creation of a public resource—to leverage collective wisdom in the health domain. This project involved a website where people could contribute their experiences of and dialogue with others concerning body listening and self-management. This article presents an analysis of the tags contributed, with a focus on the value of these tags for knowledge organization and incorporation into consumer-friendly health information retrieval systems. Methods: First, we performed content analysis of the tags contributed, identifying a set of categories and refining the relational structure of the categories to develop a preliminary classification scheme, the Body Listening and Self-Management Taxonomy. Second, we compared the concepts in the Body Listening and Self-Management Taxonomy with concepts that were automatically identified from an extant health knowledge resource, the Unified Medical Language System (UMLS), to better characterize the information that participants contributed. Third, we employed visualization techniques to explore the concept space of the tags. A correlation matrix, based on the extent to which categories tended to be assigned to the same tags, was used to study the interrelatedness of the taxonomy categories. Then a network visualization was used to investigate structural relationships among the categories in the taxonomy. Results: First, we proposed a taxonomy called the Body Listening and Self-Management Taxonomy, with four meta-level categories: (1) health management strategies, (2) concepts and states, (3) influencers, and (4) health-related information behavior. This taxonomy could inform future efforts to organize knowledge and content of this subject matter. Second, we compared the categories from this taxonomy with the UMLS concepts that were identified. Though the UMLS offers benefits such as speed and breadth of coverage, the Body Listening and Self-Management Taxonomy is more consumer-centric. Third, the correlation matrix and network visualization demonstrated that there are natural areas of ambiguity and semantic relatedness in the meanings of the concepts in the Body Listening and Self-Management Taxonomy. Use of these visualizations can be helpful in practice settings, to help library and information science practitioners understand and resolve potential challenges in classification; in research, to characterize the structure of the conceptual space of health management; and in the development of consumer-centric health information retrieval systems. Conclusions: A participatory platform can be employed to collect data concerning patient experiences of health management, which can in turn be used to develop new health knowledge resources or augment existing ones, as well as be incorporated into consumer-centric health information systems. 
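A rough sketch of the correlation matrix step described above, in which categories are correlated by the extent to which they are assigned to the same tags. The tag-by-category assignments below are hypothetical, and the category labels abbreviate the four meta-level categories of the proposed taxonomy.

import numpy as np

categories = ["strategies", "concepts/states", "influencers", "info behavior"]
# 6 tags x 4 categories, hypothetical assignments (1 = category applied to tag).
assignments = np.array([
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
])
corr = np.corrcoef(assignments, rowvar=False)  # 4x4 category-by-category matrix
for name, row in zip(categories, corr):
    print(name, np.round(row, 2))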
%M 28882809 %R 10.2196/jmir.7673 %U http://www.jmir.org/2017/9/e292/ %U https://doi.org/10.2196/jmir.7673 %U http://www.ncbi.nlm.nih.gov/pubmed/28882809 %0 Journal Article %@ 2371-4379 %I JMIR Publications %V 2 %N 2 %P e11 %T Assessing Diabetes-Relevant Data Provided by Undergraduate and Crowdsourced Web-Based Survey Participants for Honesty and Accuracy %A DePalma,Mary Turner %A Rizzotti,Michael C %A Branneman,Matthew %+ Ithaca College, Department of Psychology, 119F Williams Hall, 953 Danby Road, Ithaca, NY, 14850, United States, 1 607 274 1323, depalma@ithaca.edu %K crowdsourcing %K diabetes mellitus %K survey design %K survey methodology %K survey quality %K mechanical turks %K MTurk %K data accuracy %D 2017 %7 12.07.2017 %9 Original Paper %J JMIR Diabetes %G English %X Background: To eliminate health disparities, research will depend on our ability to reach select groups of people (eg, samples of a particular racial or ethnic group with a particular disease); unfortunately, researchers often experience difficulty obtaining high-quality data from samples of sufficient size. Objective: Past studies utilizing MTurk applaud its diversity, so our initial objective was to capitalize on MTurk’s diversity to investigate psychosocial factors related to diabetes self-care. Methods: In Study 1, a “Health Survey” was posted on MTurk to examine diabetes-relevant psychosocial factors. The survey was restricted to individuals who were 18 years of age or older with diabetes. Detection of irregularities in the data, however, prompted an evaluation of the quality of MTurk health-relevant data. This ultimately led to Study 2, which utilized an alert statement to improve conscientious behavior, or the likelihood that participants would be thorough and diligent in their responses. Trap questions were also embedded to assess conscientious behavior. Results: In Study 1, of 4165 responses, 1246 were generated from 533 unique IP addresses completing the survey multiple times within close temporal proximity. Ultimately, only 252 responses were found to be acceptable. Further analyses indicated additional quality concerns with this subsample. In Study 2, as compared with the MTurk sample (N=316), the undergraduate sample (N=300) included more females, and fewer individuals who were married. The samples did not differ with respect to race. Although the presence of an alert resulted in fewer trap failures (mean=0.07) than when no alert was present (mean=0.11), this difference failed to reach significance: F(1,604)=2.5, P=.11, η²=.004, power=.35. The modal trap failure response was zero, while the mean was 0.092 (SD=0.32). There were a total of 60 trap failures in a context where the potential could have exceeded 16,000. Conclusions: Published studies that utilize MTurk participants are rapidly appearing in the health domain. While MTurk may have the potential to be more diverse than an undergraduate sample, our efforts did not meet the criteria for what would constitute a diverse sample in and of itself. Because some researchers have experienced successful data collection on MTurk, while others report disastrous results, Kees et al recently identified that one essential area of research is the types and magnitude of cheating behavior occurring on Web-based platforms. The present studies can contribute to this dialogue, and alternately provide evidence of disaster and success.
Moving forward, it is recommended that researchers employ best practices in survey design and deliberately embed trap questions to assess participant behavior. We would strongly suggest that standards be in place for publishing the results of Web-based surveys—standards that protect against publication unless there are suitable quality assurance tests built into the survey design, distribution, and analysis. %M 30291072 %R 10.2196/diabetes.7473 %U http://diabetes.jmir.org/2017/2/e11/ %U https://doi.org/10.2196/diabetes.7473 %U http://www.ncbi.nlm.nih.gov/pubmed/30291072 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 3 %N 3 %P e42 %T A Platform for Crowdsourced Foodborne Illness Surveillance: Description of Users and Reports %A Quade,Patrick %A Nsoesie,Elaine Okanyene %+ Iwaspoisoned.com, 322 W 52nd St #633, New York, NY, 10101, United States, 1 9179034815, patrick@dinesafe.org %K foodborne illness surveillance %K crowdsourced surveillance %K foodborne diseases %K infectious diseases %K outbreaks %K food poisoning %K Internet %K mobile %K participatory surveillance %K participatory epidemiology %D 2017 %7 05.07.2017 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Underreporting of foodborne illness makes foodborne disease burden estimation, timely outbreak detection, and evaluation of policies toward improving food safety challenging. Objective: The objective of this study was to present and evaluate Iwaspoisoned.com, an openly accessible Internet-based crowdsourcing platform that was launched in 2009 for the surveillance of foodborne illness. The goal of this system is to collect data that can be used to augment traditional approaches to foodborne disease surveillance. Methods: Individuals affected by a foodborne illness can use this system to report their symptoms and the suspected location (eg, restaurant, hotel, hospital) of infection. We present descriptive statistics of users and businesses and highlight three instances where reports of foodborne illness were submitted before the outbreaks were officially confirmed by the local departments of health. Results: More than 49,000 reports of suspected foodborne illness have been submitted on Iwaspoisoned.com since its inception by individuals from 89 countries and every state in the United States. Approximately 95.51% (42,139/44,119) of complaints implicated restaurants as the source of illness. Furthermore, an estimated 67.55% (3118/4616) of users who responded to a demographic survey were between the ages of 18 and 34, and 60.14% (2776/4616) of the respondents were female. The platform is also currently used by health departments in 90% (45/50) of states in the US to supplement existing programs on foodborne illness reporting. Conclusions: Crowdsourced disease surveillance through systems such as Iwaspoisoned.com uses the influence and familiarity of social media to create an infrastructure for easy reporting and surveillance of suspected foodborne illness events. If combined with traditional surveillance approaches, these systems have the potential to lessen the problem of foodborne illness underreporting and aid in early detection and monitoring of foodborne disease outbreaks. 
%M 28679492 %R 10.2196/publichealth.7076 %U http://publichealth.jmir.org/2017/3/e42/ %U https://doi.org/10.2196/publichealth.7076 %U http://www.ncbi.nlm.nih.gov/pubmed/28679492 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 19 %N 6 %P e222 %T Improving Consensus Scoring of Crowdsourced Data Using the Rasch Model: Development and Refinement of a Diagnostic Instrument %A Brady,Christopher John %A Mudie,Lucy Iluka %A Wang,Xueyang %A Guallar,Eliseo %A Friedman,David Steven %+ Dana Center for Preventive Ophthalmology, Wilmer Eye Institute, Johns Hopkins University School of Medicine, 600 N Wolfe St, Baltimore, MD,, United States, 1 410 502 2789, brady@jhmi.edu %K crowdsourcing %K diabetic retinopathy %K Rasch analysis %K Amazon Mechanical Turk %D 2017 %7 20.06.2017 %9 Original Paper %J J Med Internet Res %G English %X Background: Diabetic retinopathy (DR) is a leading cause of vision loss in working age individuals worldwide. While screening is effective and cost effective, it remains underutilized, and novel methods are needed to increase detection of DR. This clinical validation study compared diagnostic gradings of retinal fundus photographs provided by volunteers on the Amazon Mechanical Turk (AMT) crowdsourcing marketplace with expert-provided gold-standard grading and explored whether determination of the consensus of crowdsourced classifications could be improved beyond a simple majority vote (MV) using regression methods. Objective: The aim of our study was to determine whether regression methods could be used to improve the consensus grading of data collected by crowdsourcing. Methods: A total of 1200 retinal images of individuals with diabetes mellitus from the Messidor public dataset were posted to AMT. Eligible crowdsourcing workers had at least 500 previously approved tasks with an approval rating of 99% across their prior submitted work. A total of 10 workers were recruited to classify each image as normal or abnormal. If half or more workers judged the image to be abnormal, the MV consensus grade was recorded as abnormal. Rasch analysis was then used to calculate worker ability scores in a random 50% training set, which were then used as weights in a regression model in the remaining 50% test set to determine if a more accurate consensus could be devised. Outcomes of interest were the percent correctly classified images, sensitivity, specificity, and area under the receiver operating characteristic (AUROC) for the consensus grade as compared with the expert grading provided with the dataset. Results: Using MV grading, the consensus was correct in 75.5% of images (906/1200), with 75.5% sensitivity, 75.5% specificity, and an AUROC of 0.75 (95% CI 0.73-0.78). A logistic regression model using Rasch-weighted individual scores generated an AUROC of 0.91 (95% CI 0.88-0.93) compared with 0.89 (95% CI 0.86-0.92) for a model using unweighted scores (chi-square P value<.001). Setting a diagnostic cut-point to optimize sensitivity at 90%, 77.5% (465/600) were graded correctly, with 90.3% sensitivity, 68.5% specificity, and an AUROC of 0.79 (95% CI 0.76-0.83). Conclusions: Crowdsourced interpretations of retinal images provide rapid and accurate results as compared with a gold-standard grading. Creating a logistic regression model using Rasch analysis to weight crowdsourced classifications by worker ability improves accuracy of aggregated grades as compared with simple majority vote.
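The two consensus rules compared above can be sketched as follows: a simple majority vote versus a logistic regression over ability-weighted votes. As a simplification, worker abilities here are drawn at random instead of being estimated with a Rasch model, so the sketch illustrates only the aggregation step, not the psychometric modeling; all data are simulated.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_images, n_workers = 200, 10
ability = rng.uniform(0.55, 0.95, n_workers)   # each worker's P(correct), simulated
truth = rng.integers(0, 2, n_images)           # 1 = abnormal image (simulated labels)
# Each worker answers correctly with probability equal to their ability.
correct = rng.random((n_images, n_workers)) < ability
votes = np.where(correct, truth[:, None], 1 - truth[:, None])

# Rule 1: simple majority vote (half or more workers voting abnormal).
majority = (votes.mean(axis=1) >= 0.5).astype(int)

# Rule 2: ability-weighted vote sum fed to a logistic regression,
# fit on a 50% training split and applied to the held-out half.
weighted = (votes * ability).sum(axis=1, keepdims=True)
half = n_images // 2
model = LogisticRegression().fit(weighted[:half], truth[:half])
predicted = model.predict(weighted[half:])

print("majority vote accuracy:", (majority == truth).mean())
print("weighted regression accuracy:", (predicted == truth[half:]).mean())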
%M 28634154 %R 10.2196/jmir.7984 %U http://www.jmir.org/2017/6/e222/ %U https://doi.org/10.2196/jmir.7984 %U http://www.ncbi.nlm.nih.gov/pubmed/28634154 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 19 %N 5 %P e168 %T Harnessing Facebook for Smoking Reduction and Cessation Interventions: Facebook User Engagement and Social Support Predict Smoking Reduction %A Kim,Sunny Jung %A Marsch,Lisa A %A Brunette,Mary F %A Dallery,Jesse %+ Center for Technology and Behavioral Health, Department of Biomedical Data Science, Department of Psychiatry, Geisel School of Medicine at Dartmouth, Dartmouth College, 46 Centerra Parkway Suite 301, Lebanon, NH, 03766, United States, 1 603 646 7041, sunny.j.kim@dartmouth.edu %K social media %K social support %K behavior and behavior mechanisms %K smoking cessation %K persuasive communication %K social networking %K technology %K health promotion %D 2017 %7 23.05.2017 %9 Original Paper %J J Med Internet Res %G English %X Background: Social media technologies offer a novel opportunity for scalable health interventions that can facilitate user engagement and social support, which in turn may reinforce positive processes for behavior change. Objective: By using principles from health communication and social support literature, we implemented a Facebook group–based intervention that targeted smoking reduction and cessation. This study hypothesized that participants’ engagement with and perceived social support from our Facebook group intervention would predict smoking reduction. Methods: We recruited 16 regular smokers who live in the United States and who were motivated to quit smoking at screening. We promoted message exposure as well as engagement and social support systems throughout the intervention. For message exposure, we posted prevalidated, antismoking messages (such as national antismoking campaigns) on our smoking reduction and cessation Facebook group. For engagement and social support systems, we delivered a high degree of engagement and social support systems during the second and third week of the intervention and a low degree of engagement and social support systems during the first and fourth week. A total of six surveys were conducted via Amazon Mechanical Turk (MTurk) at baseline, weekly, and at a 2-week follow-up. Results: Of the 16 participants, most were female (n=13, 81%), white (n=15, 94%), and between 25 and 50 years of age (mean 34.75, SD 8.15). There was no study attrition throughout the 6-time-point baseline, weekly, and follow-up surveys. We generated Facebook engagement and social support composite scores (mean 19.19, SD 24.35) by combining the number of likes each participant received and the number of comments or wall posts each participant posted on our smoking reduction and cessation Facebook group during the intervention period. The primary outcome was smoking reduction in the past 7 days measured at baseline and at the 2-week follow-up. Compared with the baseline, participants reported smoking an average of 60.56 fewer cigarettes per week (SD 38.83) at the follow-up, and 4 participants out of 16 (25%) reported 7-day point prevalence smoking abstinence at the follow-up.
Adjusted linear regression models revealed that a one-unit increase in the Facebook engagement and social support composite scores predicted a 0.56-unit decrease in cigarettes smoked per week (standard error=.24, P=.04, 95% CI 0.024-1.09) when baseline readiness to quit, gender, and baseline smoking status were controlled (F(4,11)=8.85, P=.002). Conclusions: This study is the first Facebook group–based intervention that systematically implemented health communication strategies and engagement and social support systems to promote smoking reduction and cessation. Our findings imply that receiving one like or posting on the Facebook-based intervention platform predicted smoking approximately one fewer cigarette in the past 7 days, and that interventions should facilitate user interactions to foster user engagement and social support. %M 28536096 %R 10.2196/jmir.6681 %U http://www.jmir.org/2017/5/e168/ %U https://doi.org/10.2196/jmir.6681 %U http://www.ncbi.nlm.nih.gov/pubmed/28536096 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 5 %N 5 %P e63 %T North American Public Opinion Survey on the Acceptability of Crowdsourcing Basic Life Support for Out-of-Hospital Cardiac Arrest With the PulsePoint Mobile Phone App %A Dainty,Katie N %A Vaid,Haris %A Brooks,Steven C %+ Rescu, Li Ka Shing Knowledge Institute, St Michael's Hospital, 30 Bond Street, Toronto, ON, M5B 1W8, Canada, 1 6474482485, daintyk@smh.ca %K sudden cardiac death %K surveys and questionnaires %K cardiopulmonary resuscitation %K PulsePoint %K North America %D 2017 %7 17.05.2017 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: The PulsePoint Respond app is a novel system that can be implemented in emergency dispatch centers to crowdsource basic life support (BLS) for patients with cardiac arrest and facilitate bystander cardiopulmonary resuscitation (CPR) and automated external defibrillator use while first responders are en route. Objective: The aim of this study was to conduct a North American survey to evaluate the public perception of the above-mentioned strategy, including acceptability and willingness to respond to alerts. Methods: We designed a Web-based survey administered by IPSOS Reid, an established external polling vendor. Sampling was designed to ensure broad representation using recent census statistics. Results: A total of 2415 survey responses were analyzed (1106 from Canada and 1309 from the United States). It was found that 98.37% (1088/1106) of Canadians and 96% (1259/1309) of Americans had no objections to PulsePoint being implemented in their community; 84.27% (932/1106) of Canadians and 55.61% (728/1309) of Americans said they would download the app to become a potential responder to cardiac arrest. Among Canadians, those who said they were likely to download PulsePoint were also more likely to have ever had CPR training (OR 1.7, 95% CI 1.2-2.4; P=.002); however, this was not true of American respondents (OR 1.0, 95% CI 0.79-1.3; P=.88). When asked to imagine themselves as a cardiac arrest victim, 95.39% (1055/1106) of Canadians and 92.44% (1210/1309) of Americans had no objections to receiving crowdsourced help in a public setting; 88.79% (982/1106) of Canadians and 84.87% (1111/1309) of Americans also had no objections to receiving help in a private setting. The most common concern identified with respect to PulsePoint implementation was a responder’s lack of ability, training, or access to proper equipment in a public setting.
Conclusions: The North American public finds the concept of crowdsourcing BLS for out-of-hospital cardiac arrest to be acceptable. It demonstrates willingness to respond to PulsePoint CPR notifications and to accept help from others alerted by the app if they themselves suffered a cardiac arrest. %M 28526668 %R 10.2196/mhealth.6926 %U http://mhealth.jmir.org/2017/5/e63/ %U https://doi.org/10.2196/mhealth.6926 %U http://www.ncbi.nlm.nih.gov/pubmed/28526668 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 6 %N 5 %P e83 %T Crowdsourced Identification of Possible Allergy-Associated Factors: Automated Hypothesis Generation and Validation Using Crowdsourcing Services %A Aramaki,Eiji %A Shikata,Shuko %A Ayaya,Satsuki %A Kumagaya,Shin-Ichiro %+ Social Computing Lab, Graduate School of Information Science, Nara Institute of Science and Technology, Takayama-cho, BLD.405, Ikoma, 630-0192, Japan, 81 0743 72 6065, aramaki@is.naist.jp %K allergy %K crowdsourcing %K disease risk %K automatic abduction %K Tohjisha-Kenkyu %K self-support study %D 2017 %7 16.05.2017 %9 Original Paper %J JMIR Res Protoc %G English %X Background: Hypothesis generation is an essential task for clinical research, and it can require years of research experience to formulate a meaningful hypothesis. Recent studies have endeavored to apply crowdsourcing to generate novel hypotheses for research. In this study, we apply crowdsourcing to explore previously unknown allergy-associated factors. Objective: In this study, we aimed to collect and test hypotheses of unknown allergy-associated factors using a crowdsourcing service. Methods: Using a series of questionnaires, we asked crowdsourcing participants to provide hypotheses on associated factors for seven different allergies, and validated the candidate hypotheses with odds ratios calculated for each associated factor. We repeated this abductive validation process to identify a set of reliable hypotheses. Results: We obtained two primary findings: (1) crowdsourcing showed that 8 of the 13 known hypothesized allergy risks were statistically significant; and (2) among the total of 157 hypotheses generated by the crowdsourcing service, 75 hypotheses were statistically significant allergy-associated factors, comprising the 8 known risks and 53 previously unknown allergy-associated factors. These findings suggest that there are still many topics to be examined in future allergy studies. Conclusions: Crowdsourcing generated new hypotheses on allergy-associated factors. In the near future, clinical trials should be conducted to validate the hypotheses generated in this study.
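The validation step described above, calculating an odds ratio for each crowd-suggested factor, reduces to a 2x2 table computation; a minimal sketch with a Wald 95% CI and invented counts follows.

import math

def odds_ratio_ci(a, b, c, d):
    """a, b: exposed with/without allergy; c, d: unexposed with/without allergy."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)  # standard error of log(OR)
    lo, hi = (math.exp(math.log(or_) + z * se) for z in (-1.96, 1.96))
    return or_, lo, hi

# Invented counts: 30/70 exposed with/without allergy, 15/85 unexposed.
print(odds_ratio_ci(30, 70, 15, 85))  # factor looks associated if the CI excludes 1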
%M 28512079 %R 10.2196/resprot.5851 %U http://www.researchprotocols.org/2017/5/e83/ %U https://doi.org/10.2196/resprot.5851 %U http://www.ncbi.nlm.nih.gov/pubmed/28512079 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 6 %N 4 %P e56 %T Comparing Crowdsourcing and Friendsourcing: A Social Media-Based Feasibility Study to Support Alzheimer Disease Caregivers %A Bateman,Daniel Robert %A Brady,Erin %A Wilkerson,David %A Yi,Eun-Hye %A Karanam,Yamini %A Callahan,Christopher M %+ Indiana University Center for Aging Research, 1101 West Tenth Street, Indianapolis, IN, 46202, United States, 1 317 274 9334, darbate@iupui.edu %K Alzheimer disease %K Alzheimer disease and related dementias %K caregivers %K mobile health %K social media %K crowdsourcing %K friendsourcing %K emotional support %K informational support %K online support %D 2017 %7 10.04.2017 %9 Original Paper %J JMIR Res Protoc %G English %X Background: In the United States, over 15 million informal caregivers provide unpaid care to people with Alzheimer disease (AD). Compared with others in their age group, AD caregivers have higher rates of stress and of medical and psychiatric illnesses. Psychosocial interventions improve the health of caregivers. However, constraints of time, distance, and availability inhibit the use of these services. Newer online technologies, such as social media, online groups, friendsourcing, and crowdsourcing, present alternative methods of delivering support. However, limited work has been done in this area with caregivers. Objective: The primary aims of this study were to determine (1) the feasibility of innovating peer support group work delivered through social media with friendsourcing, (2) whether the intervention provides an acceptable method for AD caregivers to obtain support, and (3) whether caregiver outcomes were affected by the intervention. A Facebook app provided support to AD caregivers through collecting friendsourced answers to caregiver questions from participants’ social networks. The study’s secondary aim was to descriptively compare friendsourced answers versus crowdsourced answers. Methods: We recruited AD caregivers online to participate in a 6-week-long asynchronous, online, closed group on Facebook, where caregivers received support through moderator prompts, group member interactions, and friendsourced answers to caregiver questions. We surveyed and interviewed participants before and after the online group to assess their needs, views on technology, and experience with the intervention. Caregiver questions were pushed automatically to the participants’ Facebook News Feed, allowing participants’ Facebook friends to see and post answers to the caregiver questions (friendsourced answers). Of these caregiver questions, 2 were pushed to crowdsourcing workers through the Amazon Mechanical Turk platform. We descriptively compared characteristics of these crowdsourced answers with the friendsourced answers. Results: In total, 6 AD caregivers completed the initial online survey and semistructured telephone interview. Of these, 4 AD caregivers agreed to participate in the online Facebook closed group activity portion of the study. Friendsourced and crowdsourced answers to caregiver questions had similar rates of acceptability as rated by content experts: 90% (27/30) and 100% (45/45), respectively.
Rates of emotional support and informational support for both groups of answers appeared to trend with the type of support emphasized in the caregiver question (emotional vs informational support question). Friendsourced answers included more shared experiences (20/30, 67%) than did crowdsourced answers (4/45, 9%). Conclusions: We found an asynchronous, online, closed group on Facebook to be generally acceptable as a means to deliver support to caregivers of people with AD. This pilot is too small to make judgments on effectiveness; results trended toward improvements in caregivers’ self-efficacy, sense of support, and perceived stress, but these trends were not statistically significant. Both friendsourced and crowdsourced answers may be an acceptable way to provide informational and emotional support to caregivers of people with AD. %M 28396304 %R 10.2196/resprot.6904 %U http://www.researchprotocols.org/2017/4/e56/ %U https://doi.org/10.2196/resprot.6904 %U http://www.ncbi.nlm.nih.gov/pubmed/28396304 %0 Journal Article %@ 2291-5222 %I JMIR Publications %V 4 %N 4 %P e134 %T The Mobile Phone Affinity Scale: Enhancement and Refinement %A Bock,Beth C %A Lantini,Ryan %A Thind,Herpreet %A Walaska,Kristen %A Rosen,Rochelle K %A Fava,Joseph L %A Barnett,Nancy P %A Scott-Sheldon,Lori AJ %+ Centers for Behavioral and Preventive Medicine, The Miriam Hospital, CORO Building, Suite 309, 164 Summit Avenue, Providence, RI, 02906, United States, 1 401 793 8020, Bbock@lifespan.org %K mobile phone %K psychometrics %K assessment %K measure %D 2016 %7 15.12.2016 %9 Original Paper %J JMIR Mhealth Uhealth %G English %X Background: Existing instruments that assess individuals’ relationships with mobile phones tend to focus on negative constructs such as addiction or dependence, and appear to assume that high mobile phone use reflects pathology. Mobile phones can be beneficial for health behavior change, disease management, work productivity, and social connections, so there is a need for an instrument that provides a more balanced assessment of the various aspects of individuals’ relationships with mobile phones. Objective: The purpose of this research was to develop, revise, and validate the Mobile Phone Affinity Scale, a multi-scale instrument designed to assess key factors associated with mobile phone use. Methods: Participants (N=1058, mean age 33) were recruited from Amazon Mechanical Turk between March and April of 2016 to complete a survey that assessed their mobile phone attitudes and use, anxious and depressive symptoms, and resilience. Results: Confirmatory factor analysis supported a 6-factor model. The final measure consisted of 24 items, with 4 items on each of 6 factors: Connectedness, Productivity, Empowerment, Anxious Attachment, Addiction, and Continuous Use. The subscales demonstrated strong internal consistency (Cronbach alpha range=0.76-0.88, mean 0.83), and high item factor loadings (range=0.57-0.87, mean 0.75). Tests for validity further demonstrated support for the individual subscales. Conclusions: Mobile phone affinity may have an important impact on the development and effectiveness of mobile health interventions, and continued research is needed to assess its predictive ability in health behavior change interventions delivered via mobile phones.
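The Mobile Phone Affinity Scale record above reports internal consistency as a Cronbach alpha per subscale. A minimal sketch of the standard alpha computation on a respondents-by-items matrix (simulated data, purely for illustration):

```python
# Minimal sketch: Cronbach's alpha for one k-item subscale, computed from a
# respondents x items matrix (simulated data for illustration).
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)   # variance of summed scores
    return k / (k - 1) * (1 - item_var / total_var)

items = np.random.default_rng(0).integers(1, 6, size=(100, 4)).astype(float)
print(round(cronbach_alpha(items), 2))
```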
%M 27979792 %R 10.2196/mhealth.6705 %U http://mhealth.jmir.org/2016/4/e134/ %U https://doi.org/10.2196/mhealth.6705 %U http://www.ncbi.nlm.nih.gov/pubmed/27979792 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 5 %N 3 %P e166 %T Cloud Based Surveys to Assess Patient Perceptions of Health Care: 1000 Respondents in 3 days for US $300 %A Bardos,Jonah %A Friedenthal,Jenna %A Spiegelman,Jessica %A Williams,Zev %+ Program for Early and Recurrent Pregnancy Loss (PEARL), Department of Obstetrics and Gynecology and Women’s Health, Albert Einstein College of Medicine, 1301 Morris Park Ave, Room 474, Bronx, NY, 10461, United States, 1 718 405 8590, zev.williams@einstein.yu.edu %K Mechanical Turk %K MTurk %K crowdsourcing %K medical survey %K cloud-based survey %K health care perceptions %D 2016 %7 23.08.2016 %9 Original Paper %J JMIR Res Protoc %G English %X Background: There are many challenges in conducting surveys of study participants, including cost, time, and ability to obtain quality and reproducible work. Cloudsourcing (an arrangement where a cloud provider is paid to carry out services that could be provided in-house) has the potential to provide vastly larger, less expensive, and more generalizable survey pools. Objective: The objective of this study is to evaluate Amazon's Mechanical Turk (MTurk), a cloud-based workforce, as a means of assessing patients’ perspectives of health care. Methods: A national online survey posted to Amazon's MTurk consisted of 33 multiple choice and open-ended questions. Continuous attributes were compared using t tests. Results: We obtained 1084 responses for a total cost of US $298.10 in less than 3 days, with 300 responses in under 6 hours. Of those, 44.74% (485/1084) were male and 54.80% (594/1084) female, representing 49 out of 50 states and aged 18 to 69 years. Conclusions: Amazon’s MTurk is a potentially useful survey method for attaining information regarding public opinions and/or knowledge with the distinct advantages of cost, speed, and a wide and relatively good representation of the general population, in a confidential setting for respondents. %M 27554915 %R 10.2196/resprot.5772 %U http://www.researchprotocols.org/2016/3/e166/ %U https://doi.org/10.2196/resprot.5772 %U http://www.ncbi.nlm.nih.gov/pubmed/27554915 %0 Journal Article %@ 1929-0748 %I JMIR Publications %V 5 %N 2 %P e104 %T Consumers’ Patient Portal Preferences and Health Literacy: A Survey Using Crowdsourcing %A Zide,Mary %A Caswell,Kaitlyn %A Peterson,Ellen %A Aberle,Denise R %A Bui,Alex AT %A Arnold,Corey W %+ Department of Bioengineering, University of California, Los Angeles, 924 Westwood Blvd, Suite 420, Los Angeles, CA, 90024, United States, 1 5102075012, mmcnamara@ucla.edu %K consumer health information %K health literacy %K eHealth %K patient portal %D 2016 %7 08.06.2016 %9 Original Paper %J JMIR Res Protoc %G English %X Background: eHealth apps have the potential to meet the information needs of patient populations and improve health literacy rates. However, little work has been done to document perceived usability of portals and health literacy of specific topics. Objective: Our aim was to establish a baseline of lung cancer health literacy and perceived portal usability. Methods: A survey based on previously validated instruments was used to assess a baseline of patient portal usability and health literacy within the domain of lung cancer. The survey was distributed via Amazon’s Mechanical Turk to 500 participants.
Results: Our results show differences in preferences and literacy by demographic cohorts, with chronically ill patients tending to have a more positive reception of patient portals and higher lung cancer health literacy (P<.05). Conclusions: This article provides a baseline of usability needs and health literacy that suggests that chronically ill patients have a greater preference for patient portals and a higher level of health literacy within the domain of lung cancer. %M 27278634 %R 10.2196/resprot.5122 %U http://www.researchprotocols.org/2016/2/e104/ %U https://doi.org/10.2196/resprot.5122 %U http://www.ncbi.nlm.nih.gov/pubmed/27278634 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 18 %N 6 %P e137 %T Manipulating Google’s Knowledge Graph Box to Counter Biased Information Processing During an Online Search on Vaccination: Application of a Technological Debiasing Strategy %A Ludolph,Ramona %A Allam,Ahmed %A Schulz,Peter J %+ Institute of Communication and Health, Faculty of Communication Sciences, University of Lugano (Università della Svizzera italiana), Via G. Buffi 13, Lugano, 6904, Switzerland, 41 58666 ext 4821, ramona.alexandra.ludolph@usi.ch %K search engine %K online health information search %K vaccination %K debiasing %K search behavior %K health communication %K information processing %K information seeking %D 2016 %7 02.06.2016 %9 Original Paper %J J Med Internet Res %G English %X Background: One of people’s major motives for going online is the search for health-related information. Most consumers start their search with a general search engine but are unaware of the fact that its sorting and ranking criteria do not mirror information quality. This misconception can lead to distorted search outcomes, especially when the information processing is characterized by heuristic principles and resulting cognitive biases instead of a systematic elaboration. As vaccination opponents are vocal on the Web, the chance of encountering their non‒evidence-based views on immunization is high. Therefore, biased information processing in this context can cause subsequent impaired judgment and decision making. A technological debiasing strategy could counter this by changing people’s search environment. Objective: This study aims at testing a technological debiasing strategy to reduce the negative effects of biased information processing when using a general search engine on people’s vaccination-related knowledge and attitudes. This strategy is to manipulate the content of Google’s knowledge graph box, which is integrated in the search interface and provides basic information about the search topic. Methods: A full 3x2 factorial, posttest-only design was employed with availability of basic factual information (comprehensible vs hardly comprehensible vs not present) as the first factor and a warning message as the second factor of experimental manipulation. Outcome variables were the evaluation of the knowledge graph box, vaccination-related knowledge, as well as beliefs and attitudes toward vaccination, as represented by three latent variables that emerged from an exploratory factor analysis. Results: Two-way analysis of variance revealed a significant main effect of availability of basic information in the knowledge graph box on participants’ vaccination knowledge scores (F2,273=4.86, P=.01), skepticism/fear of vaccination side effects (F2,273=3.5, P=.03), and perceived information quality (F2,273=3.73, P=.02).
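The knowledge graph study above reports two-way analysis of variance F tests for a 3x2 factorial posttest-only design. A minimal sketch of such an analysis with statsmodels; the data file and column names are hypothetical:

```python
# Minimal sketch (hypothetical file and column names): two-way ANOVA for a
# 3x2 factorial posttest-only design, as in the knowledge graph experiment.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("kg_experiment.csv")  # hypothetical dataset
model = smf.ols("knowledge_score ~ C(info_level) * C(warning)", data=df).fit()
print(anova_lm(model, typ=2))  # F and P for both main effects and the interaction
```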
More specifically, respondents receiving comprehensible information appeared to be more knowledgeable, less skeptical of vaccination, and more critical of information quality compared to participants exposed to hardly comprehensible information. Although there was no significant interaction effect between the availability of information and the presence of the warning, there was a dominant pattern in which the presence of the warning appeared to have a positive influence on the group receiving comprehensible information while the opposite was true for the groups exposed to hardly comprehensible information and no information at all. Participants evaluated the knowledge graph box as moderately to highly useful, with no significant differences among the experimental groups. Conclusion: Overall, the results suggest that comprehensible information in the knowledge graph box positively affects participants’ vaccination-related knowledge and attitudes. A small change in the content retrieval procedure currently used by Google could already make a valuable difference in the pursuit of an unbiased online information search. Further research is needed to gain insights into the knowledge graph box’s entire potential. %M 27255736 %R 10.2196/jmir.5430 %U http://www.jmir.org/2016/6/e137/ %U https://doi.org/10.2196/jmir.5430 %U http://www.ncbi.nlm.nih.gov/pubmed/27255736 %0 Journal Article %@ 1438-8871 %I JMIR Publications %V 18 %N 6 %P e127 %T The Impact of an Online Crowdsourcing Diagnostic Tool on Health Care Utilization: A Case Study Using a Novel Approach to Retrospective Claims Analysis %A Juusola,Jessie L %A Quisel,Thomas R %A Foschini,Luca %A Ladapo,Joseph A %+ Evidation Health, Inc, 15 N. Ellsworth Ave, San Mateo, CA, 94401, United States, 1 650 279 8855, jjuusola@evidation.com %K crowdsourcing %K diagnosis %K mHealth %K online systems %D 2016 %7 01.06.2016 %9 Original Paper %J J Med Internet Res %G English %X Background: Patients with difficult medical cases often remain undiagnosed despite visiting multiple physicians. A new online platform, CrowdMed, uses crowdsourcing to quickly and efficiently reach an accurate diagnosis for these patients. Objective: This study sought to evaluate whether CrowdMed decreased health care utilization for patients who had used the service. Methods: Novel, electronic methods of patient recruitment and data collection were utilized. Patients who completed cases on CrowdMed’s platform between July 2014 and April 2015 were recruited for the study via email and screened via an online survey. After providing eConsent, participants provided identifying information used to access their medical claims data, which was retrieved through a third-party web application programming interface (API). Utilization metrics, including frequency of provider visits and medical charges, were compared pre- and post-case resolution to assess the impact of resolving a case on CrowdMed. Results: Of 45 CrowdMed users who completed the study survey, comprehensive claims data were available via API for 13 participants, who made up the final enrolled sample. There were a total of 221 health care provider visits collected for the study participants, with service dates ranging from September 2013 to July 2015. Frequency of provider visits was significantly lower after resolution of a case on CrowdMed (mean of 1.07 visits per month pre-resolution vs. 0.65 visits per month post-resolution, P=.01).
Medical charges were also significantly lower after case resolution (mean of US $719.70 per month pre-resolution vs. US $516.79 per month post-resolution, P=.03). There was no significant relationship between study results and disease onset date, and there was no evidence of regression to the mean influencing results. Conclusions: This study employed technology-enabled methods to demonstrate that patients who used CrowdMed had lower health care utilization after case resolution. However, since the final sample size was limited, results should be interpreted as a case study. Despite this limitation, the statistically significant results suggest that online crowdsourcing shows promise as an efficient method of solving difficult medical cases. %M 27251384 %R 10.2196/jmir.5644 %U http://www.jmir.org/2016/6/e127/ %U https://doi.org/10.2196/jmir.5644 %U http://www.ncbi.nlm.nih.gov/pubmed/27251384 %0 Journal Article %@ 1929-0748 %I JMIR Publications Inc. %V 5 %N 2 %P e65 %T Comparing Brief Internet-Based Compassionate Mind Training and Cognitive Behavioral Therapy for Perinatal Women: Study Protocol for a Randomized Controlled Trial %A Kelman,Alex R %A Stanley,Meagan L %A Barrera,Alinne Z %A Cree,Michelle %A Heineberg,Yotam %A Gilbert,Paul %+ Palo Alto University, 1791 Arastradero Road, Palo Alto, CA, 94304, United States, 1 650 396 9349, akelman@paloaltou.edu %K perinatal depression %K comparative trial %K Internet intervention %K Amazon Mechanical Turk %D 2016 %7 15.04.2016 %9 Protocol %J JMIR Res Protoc %G English %X Background: Depression that occurs during the perinatal period has substantial costs for both the mother and her baby. Since in-person care often falls short of meeting the global needs of perinatal women, Internet interventions may function as an alternative to help women who currently lack adequate access to face-to-face psychological resources. However, at present there are insufficient empirically supported Internet-based resources for perinatal women. Objective: The aim of this study is to compare the relative efficacy of Internet-based cognitive behavioral therapy (CBT) to a novel Internet-based compassionate mind training approach (CMT) across measures of affect, self-reassurance, self-criticizing, self-attacking, self-compassion, depression, and anxiety. While CBT has been tested and has some support as an Internet tool for perinatal women, this is the first trial to look at CMT for perinatal women over the Internet. Methods: Participants were recruited through Amazon Mechanical Turk (MTurk) and professional networks. Following completion of demographic items, participants were randomly assigned to either the CBT or CMT condition. Each condition consisted of a 45-minute interactive didactic component and follow-up exercises to be completed over the course of two weeks. Results: Post-course data were gathered at two weeks. A 2x2 repeated measures analysis of variance will be conducted to analyze post-course differences between conditions. Conclusions: The implications of the trial will be discussed as well as the strengths and limitations of MTurk as a tool for recruitment. We will also briefly introduce the future directions along this same line of research.
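Pre/post contrasts like the CrowdMed utilization results above (visits and charges per month before vs after case resolution) are commonly analyzed as paired comparisons. A minimal sketch with hypothetical per-participant values; the abstract does not state the authors' exact test:

```python
# Minimal sketch: paired pre/post comparison of per-month provider visits
# (hypothetical values for 13 participants; the abstract does not state the
# exact test the authors used).
from scipy.stats import ttest_rel

pre  = [1.2, 0.9, 1.5, 1.1, 0.8, 1.3, 1.0, 0.9, 1.4, 1.2, 0.7, 1.1, 0.8]
post = [0.6, 0.7, 0.9, 0.5, 0.6, 0.8, 0.4, 0.7, 0.9, 0.6, 0.5, 0.7, 0.5]
t_stat, p_value = ttest_rel(pre, post)
print(f"t={t_stat:.2f}, P={p_value:.3f}")
```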
Trial Registration: ClinicalTrials.gov NCT02469324; https://clinicaltrials.gov/ct2/show/NCT02469324 (Archived by WebCite at http://www.webcitation.org/6fkSG3yuW) %M 27084301 %R 10.2196/resprot.5332 %U http://www.researchprotocols.org/2016/2/e65/ %U https://doi.org/10.2196/resprot.5332 %U http://www.ncbi.nlm.nih.gov/pubmed/27084301 %0 Journal Article %@ 1438-8871 %I JMIR Publications Inc. %V 18 %N 4 %P e81 %T Crowdsourcing and the Accuracy of Online Information Regarding Weight Gain in Pregnancy: A Descriptive Study %A Chang,Tammy %A Verma,Bianca A %A Shull,Trevor %A Moniz,Michelle H %A Kohatsu,Lauren %A Plegue,Melissa A %A Collins-Thompson,Kevyn %+ Department of Family Medicine, University of Michigan, 2800 Plymouth Rd, Building 14- Room G107, Ann Arbor, MI, , United States, 1 734 647 3305, tachang@med.umich.edu %K Internet %K crowdsourcing %K weight gain %K pregnancy %D 2016 %7 07.04.2016 %9 Original Paper %J J Med Internet Res %G English %X Background: Excess weight gain affects nearly half of all pregnancies in the United States and is a strong risk factor for adverse maternal and fetal outcomes, including long-term obesity. The Internet is a prominent source of information during pregnancy; however, the accuracy of this online information is unknown. Objective: To identify, characterize, and assess the accuracy of frequently accessed webpages containing information about weight gain during pregnancy. Methods: A descriptive study was used to identify and search frequently used phrases related to weight gain during pregnancy on the Google search engine. The first 10 webpages of each query were characterized by type and then assessed for accuracy and completeness, as compared to Institute of Medicine guidelines, using crowdsourcing. Results: A total of 114 queries were searched, yielding 305 unique webpages. Of these webpages, 181 (59.3%) included information regarding weight gain during pregnancy. Out of 181 webpages, 62 (34.3%) contained no specific recommendations, 48 (26.5%) contained accurate but incomplete recommendations, 41 (22.7%) contained complete and accurate recommendations, and 22 (12.2%) were inaccurate. Webpages were most commonly from for-profit websites (112/181, 61.9%), followed by government (19/181, 10.5%), medical organizations or associations (13/181, 7.2%), and news sites (12/181, 6.6%). The largest proportion of for-profit sites contained no specific recommendations (44/112, 39.3%). Among pages that provided inaccurate information (22/181, 12.2%), 68% (15/22) were from for-profit sites. Conclusions: For-profit websites dominate the online space with regard to weight gain during pregnancy and largely contain incomplete, inaccurate, or no specific recommendations. This represents a significant information gap regarding an important risk factor for obesity among mothers and infants. Our findings suggest that greater clinical and public health efforts to disseminate accurate information regarding healthy weight gain during pregnancy may help prevent significant morbidity and may support healthier pregnancies among at-risk women and children. %M 27056465 %R 10.2196/jmir.5138 %U http://www.jmir.org/2016/4/e81/ %U https://doi.org/10.2196/jmir.5138 %U http://www.ncbi.nlm.nih.gov/pubmed/27056465 %0 Journal Article %@ 1438-8871 %I JMIR Publications Inc. 
%V 18 %N 1 %P e12 %T Crowdsourcing Diagnosis for Patients With Undiagnosed Illnesses: An Evaluation of CrowdMed %A Meyer,Ashley N.D %A Longhurst,Christopher A %A Singh,Hardeep %+ Houston Veterans Affairs Center for Innovations in Quality, Effectiveness and Safety, Michael E. DeBakey Veterans Affairs Medical Center, VA HSR&D Center of Innovation (152), 2002 Holcombe Boulevard, Houston, TX, 77030, United States, 1 713 794 8601, hardeeps@bcm.edu %K crowdsourcing %K diagnosis %K diagnostic errors %K patient safety %K World Wide Web %D 2016 %7 14.01.2016 %9 Original Paper %J J Med Internet Res %G English %X Background: Despite visits to multiple physicians, many patients remain undiagnosed. A new online program, CrowdMed, aims to leverage the “wisdom of the crowd” by giving patients an opportunity to submit their cases and interact with case solvers to obtain diagnostic possibilities. Objective: To describe CrowdMed and provide an independent assessment of its impact. Methods: Patients submit their cases online to CrowdMed and case solvers sign up to help diagnose patients. Case solvers attempt to solve patients’ diagnostic dilemmas and often have an interactive online discussion with patients, including an exchange of additional diagnostic details. At the end, patients receive detailed reports containing diagnostic suggestions to discuss with their physicians and fill out surveys about their outcomes. We independently analyzed data collected from cases between May 2013 and April 2015 to determine patient and case solver characteristics and case outcomes. Results: During the study period, 397 cases were completed. These patients previously visited a median of 5 physicians, incurred a median of US $10,000 in medical expenses, spent a median of 50 hours researching their illnesses online, and had symptoms for a median of 2.6 years. During this period, 357 active case solvers participated, of which 37.9% (132/348) were male and 58.3% (208/357) worked or studied in the medical industry. About half (50.9%, 202/397) of patients were likely to recommend CrowdMed to a friend, 59.6% (233/391) reported that the process gave insights that led them closer to the correct diagnoses, 57% (52/92) reported estimated decreases in medical expenses, and 38% (29/77) reported estimated improvement in school or work productivity. Conclusions: Some patients with undiagnosed illnesses reported receiving helpful guidance from crowdsourcing their diagnoses during their difficult diagnostic journeys. However, further development and use of crowdsourcing methods to facilitate diagnosis requires long-term evaluation as well as validation to account for patients’ ultimate correct diagnoses. %M 26769236 %R 10.2196/jmir.4887 %U http://www.jmir.org/2016/1/e12/ %U https://doi.org/10.2196/jmir.4887 %U http://www.ncbi.nlm.nih.gov/pubmed/26769236 %0 Journal Article %@ 1438-8871 %I JMIR Publications Inc. 
%V 17 %N 12 %P e281 %T Assessing Pictograph Recognition: A Comparison of Crowdsourcing and Traditional Survey Approaches %A Kuang,Jinqiu %A Argo,Lauren %A Stoddard,Greg %A Bray,Bruce E %A Zeng-Treitler,Qing %+ Department of Biomedical Informatics, University of Utah, 421 Wakara Way, Suite 140, Salt Lake City, UT, 84108, United States, 1 801 581 4080, Jinqiu.kuang@utah.edu %K crowdsourcing %K patient discharge summaries %K Amazon Mechanical Turk %K pictograph recognition %K cardiovascular %D 2015 %7 17.12.2015 %9 Original Paper %J J Med Internet Res %G English %X Background: Compared to traditional methods of participant recruitment, online crowdsourcing platforms provide a fast and low-cost alternative. Amazon Mechanical Turk (MTurk) is a large and well-known crowdsourcing service. It has developed into the leading platform for crowdsourcing recruitment. Objective: To explore the application of online crowdsourcing for health informatics research, specifically the testing of medical pictographs. Methods: A set of pictographs created for cardiovascular hospital discharge instructions was tested for recognition. This set of illustrations (n=486) was first tested through an in-person survey in a hospital setting (n=150) and then using online MTurk participants (n=150). We analyzed these survey results to determine their comparability. Results: Both the demographics and the pictograph recognition rates of online participants were different from those of the in-person participants. In the multivariable linear regression model comparing the 2 groups, the MTurk group scored significantly higher than the hospital sample after adjusting for potential demographic characteristics (adjusted mean difference 0.18, 95% CI 0.08-0.28, P<.001). The adjusted mean ratings were 2.95 (95% CI 2.89-3.02) for the in-person hospital sample and 3.14 (95% CI 3.07-3.20) for the online MTurk sample on a 4-point Likert scale (1=totally incorrect, 4=totally correct). Conclusions: The findings suggest that crowdsourcing is a viable complement to traditional in-person surveys, but it cannot replace them. %M 26678085 %R 10.2196/jmir.4582 %U http://www.jmir.org/2015/12/e281/ %U https://doi.org/10.2196/jmir.4582 %U http://www.ncbi.nlm.nih.gov/pubmed/26678085 %0 Journal Article %@ 2369-2960 %I JMIR Publications %V 1 %N 2 %P e7 %T Identifying Adverse Effects of HIV Drug Treatment and Associated Sentiments Using Twitter %A Adrover,Cosme %A Bodnar,Todd %A Huang,Zhuojie %A Telenti,Amalio %A Salathé,Marcel %+ Center for Infectious Disease Dynamics, Department of Biology, Penn State University, MSC W-251, University Park, PA, 16803, United States, 1 4083868916, salathe.marcel@gmail.com %K Twitter %K HIV %K AIDS %K pharmacovigilance %K mTurk %K mechanical Turk %D 2015 %7 27.07.2015 %9 Original Paper %J JMIR Public Health Surveill %G English %X Background: Social media platforms are increasingly seen as a source of data on a wide range of health issues. Twitter is of particular interest for public health surveillance because of its public nature. However, the very public nature of social media platforms such as Twitter may act as a barrier to public health surveillance, as people may be reluctant to publicly disclose information about their health. This is of particular concern in the context of diseases that are associated with a certain degree of stigma, such as HIV/AIDS. 
Objective: The objective of the study is to assess whether adverse effects of HIV drug treatment and associated sentiments can be determined using publicly available data from social media. Methods: We describe a combined approach of machine learning and crowdsourced human assessment to identify adverse effects of HIV drug treatment based solely on individual reports posted publicly on Twitter. Starting from a large dataset of 40 million tweets collected over three years, we identify a very small subset (1642; 0.004%) of individual reports describing personal experiences with HIV drug treatment. Results: Despite the small size of the extracted final dataset, the summary representation of adverse effects attributed to specific drugs, or drug combinations, accurately captures well-recognized toxicities. In addition, the data allowed us to discriminate across specific drug compounds, to identify preferred drugs over time, and to capture novel events such as the availability of preexposure prophylaxis. Conclusions: The effect of limited data sharing due to the public nature of the data can be partially offset by the large number of people sharing data in the first place, an observation that may play a key role in digital epidemiology in general. %M 27227141 %R 10.2196/publichealth.4488 %U http://publichealth.jmir.org/2015/2/e7/ %U https://doi.org/10.2196/publichealth.4488 %U http://www.ncbi.nlm.nih.gov/pubmed/27227141 %0 Journal Article %@ 1438-8871 %I JMIR Publications Inc. %V 17 %N 6 %P e138 %T A Scalable Framework to Detect Personal Health Mentions on Twitter %A Yin,Zhijun %A Fabbri,Daniel %A Rosenbloom,S Trent %A Malin,Bradley %+ Dept. of Electrical Engineering & Computer Science, Vanderbilt University, Department of Biomedical Informatics, Vanderbilt University, 2525 West End Avenue, Suite 1030, Nashville, TN, 37203, United States, 1 615 343 9096, b.malin@vanderbilt.edu %K consumer health %K information retrieval %K machine learning %K social media %K twitter %K infodemiology %D 2015 %7 05.06.2015 %9 Original Paper %J J Med Internet Res %G English %X Background: Biomedical research has traditionally been conducted via surveys and the analysis of medical records. However, these resources are limited in their content, such that non-traditional domains (eg, online forums and social media) have an opportunity to supplement the view of an individual’s health. Objective: The objective of this study was to develop a scalable framework to detect personal health status mentions on Twitter and assess the extent to which such information is disclosed. Methods: We collected more than 250 million tweets via the Twitter streaming API over a 2-month period in 2014. The corpus was filtered down to approximately 250,000 tweets, stratified across 34 high-impact health issues, based on guidance from the Medical Expenditure Panel Survey. We created a labeled corpus of several thousand tweets via a survey, administered over Amazon Mechanical Turk, that documents when terms correspond to mentions of personal health issues or an alternative (eg, a metaphor). We engineered a scalable classifier for personal health mentions via feature selection and assessed its potential over the health issues. We further investigated the utility of the tweets by determining the extent to which Twitter users disclose personal health status. Results: Our investigation yielded several notable findings. First, we find that tweets from a small subset of the health issues can train a scalable classifier to detect health mentions.
Specifically, training on 2000 tweets from four health issues (cancer, depression, hypertension, and leukemia) yielded a classifier with precision of 0.77 on all 34 health issues. Second, Twitter users disclosed personal health status for all health issues. Notably, personal health status was disclosed over 50% of the time for 11 out of 34 (32%) investigated health issues. Third, the disclosure rate was dependent on the health issue in a statistically significant manner (P<.001). For instance, more than 80% of the tweets about migraines (83/100) and allergies (85/100) communicated personal health status, while only around 10% of the tweets about obesity (13/100) and heart attack (12/100) did so. Fourth, the likelihood that people disclose their own versus other people’s health status was dependent on the health issue in a statistically significant manner as well (P<.001). For example, 69% (69/100) of the insomnia tweets disclosed the author’s status, while only 1% (1/100) disclosed another person’s status. By contrast, 1% (1/100) of the Down syndrome tweets disclosed the author’s status, while 21% (21/100) disclosed another person’s status. Conclusions: It is possible to automatically detect personal health status mentions on Twitter in a scalable manner. These mentions correspond to the health issues of the Twitter users themselves, but also other individuals. Though this study did not investigate the veracity of such statements, we anticipate such information may be useful in supplementing traditional health-related sources for research purposes. %M 26048075 %R 10.2196/jmir.4305 %U http://www.jmir.org/2015/6/e138/ %U https://doi.org/10.2196/jmir.4305 %U http://www.ncbi.nlm.nih.gov/pubmed/26048075 %0 Journal Article %@ 1438-8871 %I JMIR Publications Inc. %V 17 %N 3 %P e72 %T Efficacy of a Web-Based, Crowdsourced Peer-To-Peer Cognitive Reappraisal Platform for Depression: Randomized Controlled Trial %A Morris,Robert R %A Schueller,Stephen M %A Picard,Rosalind W %+ MIT Media Lab, Massachusetts Institute of Technology, E14-348A, 75 Amherst St, Cambridge, MA, 02139, United States, 1 6172530611, rmorris@media.mit.edu %K Web-based intervention %K crowdsourcing %K randomized controlled trial %K depression %K cognitive behavioral therapy %K mental health %K social networks %D 2015 %7 30.03.2015 %9 Original Paper %J J Med Internet Res %G English %X Background: Self-guided, Web-based interventions for depression show promising results but suffer from high attrition and low user engagement. Online peer support networks can be highly engaging, but they show mixed results and lack evidence-based content. Objective: Our aim was to introduce and evaluate a novel Web-based, peer-to-peer cognitive reappraisal platform designed to promote evidence-based techniques, with the hypotheses that (1) repeated use of the platform increases reappraisal and reduces depression and (2) the social, crowdsourced interactions enhance engagement. Methods: Participants aged 18-35 were recruited online and were randomly assigned to the treatment group, “Panoply” (n=84), or an active control group, online expressive writing (n=82). Both are fully automated Web-based platforms. Participants were asked to use their assigned platform for a minimum of 25 minutes per week for 3 weeks. Both platforms involved posting descriptions of stressful thoughts and situations. Participants on the Panoply platform additionally received crowdsourced reappraisal support immediately after submitting a post (median response time=9 minutes).
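The scalable-framework record above trains a classifier on labeled tweets and evaluates it by precision. A minimal sketch of one conventional pipeline (TF-IDF features plus logistic regression); `load_labeled_tweets` is a hypothetical helper, and the authors' actual features and model may differ:

```python
# Minimal sketch (not the authors' pipeline): TF-IDF features plus logistic
# regression for detecting personal health mentions in labeled tweets, scored
# by precision. load_labeled_tweets is a hypothetical helper returning a list
# of tweet strings and 0/1 labels.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split

texts, labels = load_labeled_tweets()  # hypothetical data loader
X_tr, X_te, y_tr, y_te = train_test_split(texts, labels, test_size=0.2,
                                          random_state=0)
vec = TfidfVectorizer(min_df=2)
clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(X_tr), y_tr)
print(precision_score(y_te, clf.predict(vec.transform(X_te))))
```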
Panoply participants could also practice reappraising stressful situations submitted by other users. Online questionnaires administered at baseline and 3 weeks assessed depression symptoms, reappraisal, and perseverative thinking. Engagement was assessed through self-report measures, session data, and activity levels. Results: The Panoply platform produced significant improvements from pre to post for depression (P=.001), reappraisal (P<.001), and perseverative thinking (P<.001). The expressive writing platform yielded significant pre to post improvements for depression (P=.02) and perseverative thinking (P<.001), but not reappraisal (P=.45). The two groups did not diverge significantly at post-test on measures of depression or perseverative thinking, though Panoply users had significantly higher reappraisal scores (P=.02) than expressive writing users. We also found significant group by treatment interactions. Individuals with elevated depression symptoms showed greater comparative benefit from Panoply for depression (P=.02) and perseverative thinking (P=.008). Individuals with baseline reappraisal deficits showed greater comparative benefit from Panoply for depression (P=.002) and perseverative thinking (P=.002). Changes in reappraisal mediated the effects of Panoply, but not the expressive writing platform, for both outcomes of depression (ab=-1.04, SE 0.58, 95% CI -2.67 to -.12) and perseverative thinking (ab=-1.02, SE 0.61, 95% CI -2.88 to -.20). Dropout rates were similar for the two platforms; however, Panoply yielded significantly more usage activity (P<.001) and significantly greater user experience scores (P<.001). Conclusions: Panoply engaged its users and was especially helpful for depressed individuals and for those who might ordinarily underutilize reappraisal techniques. Further investigation is needed to examine the long-term effects of such a platform and whether the benefits generalize to a more diverse population of users. Trial Registration: ClinicalTrials.gov NCT02302248; https://clinicaltrials.gov/ct2/show/NCT02302248 (Archived by WebCite at http://www.webcitation.org/6Wtkj6CXU). %M 25835472 %R 10.2196/jmir.4167 %U http://www.jmir.org/2015/3/e72/ %U https://doi.org/10.2196/jmir.4167 %U http://www.ncbi.nlm.nih.gov/pubmed/25835472 %0 Journal Article %@ 1929-0748 %I JMIR Publications Inc. %V 4 %N 1 %P e34 %T Focus Groups Move Online: Feasibility of Tumblr Use for eHealth Curriculum Development %A Elliot,Diane %A Rohlman,Diane %A Parish,Megan %+ Oregon Health & Science University, Department of Medicine, CR110, 3181 SW Sam Jackson Park Road, Portland, OR, 97239, United States, 1 503 494 6554, elliotd@ohsu.edu %K Tumblr %K focus group %K crowdsourcing %K curriculum development %K Internet %D 2015 %7 27.03.2015 %9 Short Paper %J JMIR Res Protoc %G English %X Background: Constructing successful online programs requires engaging potential users in development. However, assembling focus groups can be costly and time consuming. Objective: The aim of this study is to assess whether Tumblr can be used to prioritize activities for an online younger worker risk reduction and health promotion program. Methods: Younger summer parks and recreation employees were encouraged to visit Tumblr using weekly announcements and competitions. Each week, new activities were posted on Tumblr with linked survey questions. Responses were downloaded and analyzed. Results: An average of 36 young workers rated each activity on its likeability and perceived educational value.
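The Panoply trial above reports mediation effects as products of coefficients (ab) with confidence intervals. A minimal sketch of a product-of-coefficients estimate with a bootstrap CI; the column names are hypothetical and the authors' exact procedure may differ:

```python
# Minimal sketch: product-of-coefficients (ab) mediation with a bootstrap CI,
# the general form behind the reported ab estimates. Column names are
# hypothetical; condition is assumed coded 0/1.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def ab(d: pd.DataFrame) -> float:
    a = smf.ols("reappraisal_change ~ condition", data=d).fit().params["condition"]
    b = smf.ols("depression_change ~ reappraisal_change + condition",
                data=d).fit().params["reappraisal_change"]
    return a * b

df = pd.read_csv("trial_data.csv")  # hypothetical dataset
boots = [ab(df.sample(len(df), replace=True, random_state=i)) for i in range(2000)]
print(ab(df), np.percentile(boots, [2.5, 97.5]))  # point estimate and 95% CI
```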
The method was feasible, efficient, and sustainable across the summer weeks. Ratings indicated significant differences in likeability among activities (P<.005). Conclusions: Tumblr is a means to crowdsource formative feedback on potential curricular components when assembling an online intervention. This paper describes its initial use as well as suggestions for future refinements. %M 25831197 %R 10.2196/resprot.3432 %U http://www.researchprotocols.org/2015/1/e34/ %U https://doi.org/10.2196/resprot.3432 %U http://www.ncbi.nlm.nih.gov/pubmed/25831197 %0 Journal Article %@ 1438-8871 %I JMIR Publications Inc. %V 17 %N 3 %P e80 %T Ranking Adverse Drug Reactions With Crowdsourcing %A Gottlieb,Assaf %A Hoehndorf,Robert %A Dumontier,Michel %A Altman,Russ B %+ Departments of Genetics and Bioengineering, Stanford University, Shriram Center Room 209 443 Via Ortega MC 4245, Stanford, CA, 94305, United States, 1 (650) 725 3394, russ.altman@stanford.edu %K pharmacovigilance %K adverse drug reactions %K drug side effects %K crowdsourcing %K patient-centered care %K alert fatigue %D 2015 %7 23.03.2015 %9 Original Paper %J J Med Internet Res %G English %X Background: There is no publicly available resource that provides the relative severity of adverse drug reactions (ADRs). Such a resource would be useful for several applications, including assessment of the risks and benefits of drugs and improvement of patient-centered care. It could also be used to triage predictions of drug adverse events. Objective: The intent of the study was to rank ADRs according to severity. Methods: We used Internet-based crowdsourcing to rank ADRs according to severity. We assigned 126,512 pairwise comparisons of ADRs to 2589 Amazon Mechanical Turk workers and used these comparisons to rank order 2929 ADRs. Results: There is good correlation (rho=.53) between the mortality rates associated with ADRs and their rank. Our ranking highlights severe drug-ADR predictions, such as cardiovascular ADRs for raloxifene and celecoxib. It also triages genes associated with severe ADRs such as epidermal growth-factor receptor (EGFR), associated with glioblastoma multiforme, and SCN1A, associated with epilepsy. Conclusions: ADR ranking lays a first stepping stone in personalized drug risk assessment. Ranking of ADRs using crowdsourcing may have useful clinical and financial implications, and should be further investigated in the context of health care decision making. %M 25800813 %R 10.2196/jmir.3962 %U http://www.jmir.org/2015/3/e80/ %U https://doi.org/10.2196/jmir.3962 %U http://www.ncbi.nlm.nih.gov/pubmed/25800813 %0 Journal Article %@ 1438-8871 %I JMIR Publications Inc. %V 16 %N 10 %P e233 %T Rapid Grading of Fundus Photographs for Diabetic Retinopathy Using Crowdsourcing %A Brady,Christopher J %A Villanti,Andrea C %A Pearson,Jennifer L %A Kirchner,Thomas R %A Gupta,Omesh P %A Shah,Chirag P %+ Wilmer Eye Institute, Johns Hopkins University School of Medicine, 600 N Wolfe St., Maumenee 711, Baltimore, MD, 21287, United States, 1 (410) 502 2789, brady@jhmi.edu %K diabetic retinopathy %K telemedicine %K fundus photography %K crowdsourcing %K Amazon Mechanical Turk %D 2014 %7 30.10.2014 %9 Original Paper %J J Med Internet Res %G English %X Background: Screening for diabetic retinopathy is both effective and cost-effective, but rates of screening compliance remain suboptimal. As screening improves, new methods to deal with screening data may help reduce the human resource needs. 
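The adverse drug reaction record above derives a severity ranking from 126,512 pairwise comparisons. One simple aggregation, sketched below, ranks items by the fraction of comparisons they "win"; the example pairs are hypothetical, and the authors' aggregation may be more sophisticated:

```python
# Minimal sketch: ranking adverse drug reactions by the fraction of pairwise
# comparisons each one "wins" (is judged more severe). The pairs are
# hypothetical; the authors' aggregation may differ.
from collections import Counter

pairs = [("stroke", "nausea"), ("stroke", "headache"),
         ("nausea", "headache")]  # (judged more severe, judged less severe)
wins, appearances = Counter(), Counter()
for more_severe, less_severe in pairs:
    wins[more_severe] += 1
    appearances[more_severe] += 1
    appearances[less_severe] += 1
ranked = sorted(appearances, key=lambda adr: wins[adr] / appearances[adr],
                reverse=True)
print(ranked)  # most to least severe by win fraction
```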
Crowdsourcing has been used in many contexts to harness distributed human intelligence for the completion of small tasks including image categorization. Objective: Our goal was to develop and validate a novel method for fundus photograph grading. Methods: An interface for fundus photo classification was developed for the Amazon Mechanical Turk crowdsourcing platform. We posted 19 expert-graded images for grading by Turkers, with 10 repetitions per photo for an initial proof-of-concept (Phase I). Turkers were paid US $0.10 per image. In Phase II, one prototypical image from each of the four grading categories received 500 unique Turker interpretations. Fifty draws of 1-50 Turkers were then used to estimate the variance in accuracy derived from randomly drawn samples of increasing crowd size to determine the minimum number of Turkers needed to produce valid results. In Phase III, the interface was modified to attempt to improve Turker grading. Results: Across 230 grading instances in the normal versus abnormal arm of Phase I, 187 (81.3%) were classified correctly by Turkers. Average time to grade each image was 25 seconds, including time to review training images. With the addition of grading categories, time to grade each image increased and percentage of images graded correctly decreased. In Phase II, area under the curve (AUC) of the receiver operating characteristic (ROC) indicated that sensitivity and specificity were maximized after 7 graders for ratings of normal versus abnormal (AUC=0.98) but that performance was significantly reduced (AUC=0.63) when Turkers were asked to specify the level of severity. With improvements to the interface in Phase III, the proportion of images correctly classified by the mean Turker grade in four-category grading increased to a maximum of 52.6% (10/19 images) from 26.3% (5/19 images). Throughout all trials, 100% sensitivity for normal versus abnormal was maintained. Conclusions: With minimal training, the Amazon Mechanical Turk workforce can rapidly and correctly categorize fundus photos of diabetic patients as normal or abnormal, though further refinement of the methodology is needed to improve Turker ratings of the degree of retinopathy. Images were interpreted for a total cost of US $1.10 per eye. Crowdsourcing may offer a novel and inexpensive means to reduce the skilled grader burden and increase screening for diabetic retinopathy. %M 25356929 %R 10.2196/jmir.3807 %U http://www.jmir.org/2014/10/e233/ %U https://doi.org/10.2196/jmir.3807 %U http://www.ncbi.nlm.nih.gov/pubmed/25356929 %0 Journal Article %@ 1438-8871 %I JMIR Publications Inc. %V 16 %N 9 %P e216 %T Crowdsourcing Knowledge Discovery and Innovations in Medicine %A Celi,Leo Anthony %A Ippolito,Andrea %A Montgomery,Robert A %A Moses,Christopher %A Stone,David J %+ Institute for Medical Engineering and Science, Laboratory of Computational Physiology, Massachusetts Institute of Technology, 77 Massachusetts Avenue, E25-505, Cambridge, MA, 02139, United States, 1 617 253 7937, lceli@mit.edu %K knowledge discovery %K crowdsourcing %K innovation %K hackathon %D 2014 %7 19.09.2014 %9 Viewpoint %J J Med Internet Res %G English %X Clinicians face difficult treatment decisions in contexts that are not well addressed by available evidence as formulated based on research. The digitization of medicine provides an opportunity for clinicians to collaborate with researchers and data scientists on solutions to previously ambiguous and seemingly insolvable questions.
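The fundus-grading record above estimates how accuracy varies with crowd size by repeatedly drawing 1-50 Turkers. A minimal simulation in the same spirit, assuming each rater is independently correct with a fixed probability (an illustrative assumption, not a figure from the paper):

```python
# Minimal sketch: simulating majority-vote accuracy as crowd size grows,
# assuming each rater is independently correct with probability p_correct
# (an illustrative assumption, not a parameter from the paper).
import numpy as np

rng = np.random.default_rng(0)
p_correct, draws = 0.8, 50
for n in (1, 3, 7, 15):  # odd crowd sizes avoid ties
    votes = rng.random((draws, n)) < p_correct  # True = correct vote
    share = (votes.sum(axis=1) > n / 2).mean()  # rate of correct majorities
    print(f"{n:>2} raters: majority correct in {share:.0%} of draws")
```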
However, these groups tend to work in isolated environments and do not communicate or interact effectively. Clinicians are typically buried in the weeds and exigencies of daily practice such that they do not recognize or act on ways to improve knowledge discovery. Researchers may not be able to identify the gaps in clinical knowledge. For data scientists, the main challenge is discerning what is relevant in a domain that is both unfamiliar and complex. Each type of domain expert can contribute skills unavailable to the other groups. “Health hackathons” and “data marathons”, in which diverse participants work together, can leverage the ready availability of digital data to discover new knowledge. By drawing on the complementary skills and expertise of these talented but functionally divided groups, innovations can be formulated at the systems level. As a result, the knowledge discovery process is simultaneously democratized and improved, real problems are solved, cross-disciplinary collaboration is supported, and innovations are enabled. %M 25239002 %R 10.2196/jmir.3761 %U http://www.jmir.org/2014/9/e216/ %U https://doi.org/10.2196/jmir.3761 %U http://www.ncbi.nlm.nih.gov/pubmed/25239002 %0 Journal Article %@ 2291-5222 %I JMIR Publications Inc. %V 2 %N 3 %P e37 %T FoodSwitch: A Mobile Phone App to Enable Consumers to Make Healthier Food Choices and Crowdsourcing of National Food Composition Data %A Dunford,Elizabeth %A Trevena,Helen %A Goodsell,Chester %A Ng,Ka Hung %A Webster,Jacqui %A Millis,Audra %A Goldstein,Stan %A Hugueniot,Orla %A Neal,Bruce %+ The George Institute for Global Health, Food Policy Division, PO Box M201 Missenden Rd, Camperdown, 2050, Australia, 61 285072529, edunford@georgeinstitute.org.au %K smartphone technology %K traffic light labeling %K food choices %K public health nutrition %K processed food %D 2014 %7 21.08.2014 %9 Original Paper %J JMIR mHealth uHealth %G English %X Background: Front-of-pack nutrition labeling (FoPL) schemes can help consumers understand the nutritional content of foods and may aid healthier food choices. However, most packaged foods in Australia carry no easily interpretable FoPL, and no standard FoPL system has yet been mandated. About two thirds of Australians now own a smartphone. Objective: We sought to develop a mobile phone app that would provide consumers with easy-to-understand nutrition information and support the selection of healthier choices when shopping for food. Methods: An existing branded food database including 17,000 Australian packaged foods underpinned the project. An iterative process of development, review, and testing was undertaken to define a user interface that could deliver nutritional information. A parallel process identified the best approach to rank foods based on nutritional content, so that healthier alternative products could be recommended. Results: Barcode scanning technology was identified as the optimal mechanism for interaction of the mobile phone with the food database. Traffic light labels were chosen as the preferred format for presenting nutritional information, and the Food Standards Australia New Zealand nutrient profiling method as the best strategy for identifying healthier products. The resulting FoodSwitch mobile phone app was launched in Australia in January 2012 and was downloaded by about 400,000 users in the first 18 months. FoodSwitch has maintained a 4-plus star rating, and more than 2000 users have provided feedback about the functionality.
Nutritional information for more than 30,000 additional products has been obtained from users through a crowdsourcing function integrated within the app. Conclusions: FoodSwitch has empowered Australian consumers seeking to make better food choices. In parallel, the huge volume of crowdsourced data has provided a novel means for low-cost, real-time tracking of the nutritional composition of Australian foods. There appears to be significant opportunity for this approach in many other countries. %M 25147135 %R 10.2196/mhealth.3230 %U http://mhealth.jmir.org/2014/3/e37/ %U https://doi.org/10.2196/mhealth.3230 %U http://www.ncbi.nlm.nih.gov/pubmed/25147135 %0 Journal Article %@ 2291-9279 %I JMIR Publications Inc. %V 2 %N 2 %P e7 %T The Cure: Design and Evaluation of a Crowdsourcing Game for Gene Selection for Breast Cancer Survival Prediction %A Good,Benjamin M %A Loguercio,Salvatore %A Griffith,Obi L %A Nanis,Max %A Wu,Chunlei %A Su,Andrew I %+ The Scripps Research Institute, Department of Molecular and Experimental Medicine, MEM-216, 10550 North Torrey Pines Road, La Jolla, CA, 92037, United States, 1 619 261 2046, bgood@scripps.edu %K breast neoplasms %K gene expression %K artificial intelligence %K survival analysis %K crowdsourcing %K Web applications %K computer games %K collaborative and social computing systems and tools %K supervised learning %K feature selection %D 2014 %7 29.07.2014 %9 Original Paper %J JMIR Serious Games %G English %X Background: Molecular signatures for predicting breast cancer prognosis could greatly improve care through personalization of treatment. Computational analyses of genome-wide expression datasets have identified such signatures, but these signatures leave much to be desired in terms of accuracy, reproducibility, and biological interpretability. Methods that take advantage of structured prior knowledge (eg, protein interaction networks) show promise in helping to define better signatures, but most knowledge remains unstructured. Crowdsourcing via scientific discovery games is an emerging methodology that has the potential to tap into human intelligence at scales and in modes unheard of before. Objective: The main objective of this study was to test the hypothesis that knowledge linking expression patterns of specific genes to breast cancer outcomes could be captured from players of an open, Web-based game. We envisioned capturing knowledge both from the player’s prior experience and from their ability to interpret text related to candidate genes presented to them in the context of the game. Methods: We developed and evaluated an online game called The Cure that captured information from players regarding genes for use as predictors of breast cancer survival. Information gathered from game play was aggregated using a voting approach, and used to create rankings of genes. The top genes from these rankings were evaluated using annotation enrichment analysis, comparison to prior predictor gene sets, and by using them to train and test machine learning systems for predicting 10 year survival. Results: Between its launch in September 2012 and September 2013, The Cure attracted more than 1000 registered players, who collectively played nearly 10,000 games. Gene sets assembled through aggregation of the collected data showed significant enrichment for genes known to be related to key concepts such as cancer, disease progression, and recurrence. 
In terms of the predictive accuracy of models trained using this information, these gene sets provided comparable performance to gene sets generated using other methods, including those used in commercial tests. The Cure is available on the Internet. Conclusions: The principal contribution of this work is to show that crowdsourcing games can be developed as a means to address problems involving domain knowledge. While most prior work on scientific discovery games and crowdsourcing in general takes as a premise that contributors have little or no expertise, here we demonstrated a crowdsourcing system that succeeded in capturing expert knowledge. %M 25654473 %R 10.2196/games.3350 %U http://games.jmir.org/2014/2/e7/ %U https://doi.org/10.2196/games.3350 %U http://www.ncbi.nlm.nih.gov/pubmed/25654473 %0 Journal Article %@ 1929-0748 %I JMIR Publications Inc. %V 3 %N 2 %P e22 %T Cameras for Public Health Surveillance: A Methods Protocol for Crowdsourced Annotation of Point-of-Sale Photographs %A Ilakkuvan,Vinu %A Tacelosky,Michael %A Ivey,Keith C %A Pearson,Jennifer L %A Cantrell,Jennifer %A Vallone,Donna M %A Abrams,David B %A Kirchner,Thomas R %+ Department of Research and Evaluation, Legacy, 1724 Massachusetts Avenue NW, Washington, DC, 20036, United States, 1 2024545791, vilakkuvan@legacyforhealth.org %K image processing %K crowdsourcing %K annotation %K public health %K surveillance %D 2014 %7 09.04.2014 %9 Protocol %J JMIR Res Protoc %G English %X Background: Photographs are an effective way to collect detailed and objective information about the environment, particularly for public health surveillance. However, accurately and reliably annotating (ie, extracting information from) photographs remains difficult, a critical bottleneck inhibiting the use of photographs for systematic surveillance. The advent of distributed human computation (ie, crowdsourcing) platforms represents a veritable breakthrough, making it possible for the first time to accurately, quickly, and repeatedly annotate photos at relatively low cost. Objective: This paper describes a methods protocol, using photographs from point-of-sale surveillance studies in the field of tobacco control to demonstrate the development and testing of custom-built tools that can greatly enhance the quality of crowdsourced annotation. Methods: Enhancing the quality of crowdsourced photo annotation requires a number of approaches and tools. The crowdsourced photo annotation process is greatly simplified by decomposing the overall process into smaller tasks, which improves accuracy and speed and enables adaptive processing, in which irrelevant data is filtered out and more difficult targets receive increased scrutiny. Additionally, zoom tools enable users to see details within photographs and crop tools highlight where within an image a specific object of interest is found, generating a set of photographs that answer specific questions. Beyond such tools, optimizing the number of raters (ie, crowd size) for accuracy and reliability is an important facet of crowdsourced photo annotation. This can be determined in a systematic manner based on the difficulty of the task and the desired level of accuracy, using receiver operating characteristic (ROC) analyses. Usability tests of the zoom and crop tools suggest that these tools significantly improve annotation accuracy. The tests asked raters to extract data from photographs, not to assess the quality of that data, but rather to assess the usefulness of the tools.
The proportion of individuals accurately identifying the presence of a specific advertisement was higher when provided with pictures of the product’s logo and an example of the ad, and even higher when also provided the zoom tool (χ²₂=155.7, P<.001). Similarly, when provided cropped images, a significantly greater proportion of respondents accurately identified the presence of cigarette product ads (χ²₁=75.14, P<.001), as well as reported being able to read prices (χ²₂=227.6, P<.001). Comparing the results of crowdsourced photo-only assessments to traditional field survey data, an excellent level of correspondence was found, with area under the ROC curves produced by sensitivity analyses averaging over 0.95, requiring on average 10 to 15 crowdsourced raters to achieve values of over 0.90. Results: Further testing and improvement of these tools and processes is currently underway. This includes conducting systematic evaluations that crowdsource photograph annotation and methodically assess the quality of raters’ work. Conclusions: Overall, the combination of crowdsourcing technologies with tiered data flow and tools that enhance annotation quality represents a breakthrough solution to the problem of photograph annotation, vastly expanding opportunities for the use of photographs rich in public health and other data on a scale previously unimaginable. %M 24717168 %R 10.2196/resprot.3277 %U http://www.researchprotocols.org/2014/2/e22/ %U https://doi.org/10.2196/resprot.3277 %U http://www.ncbi.nlm.nih.gov/pubmed/24717168 %0 Journal Article %@ 14388871 %I JMIR Publications Inc. %V 16 %N 4 %P e100 %T The Impact of Search Engine Selection and Sorting Criteria on Vaccination Beliefs and Attitudes: Two Experiments Manipulating Google Output %A Allam,Ahmed %A Schulz,Peter Johannes %A Nakamoto,Kent %+ Institute of Communication and Health, Faculty of Communication Sciences, University of Lugano (Università della Svizzera italiana), Blue Building, 1st floor, 13 G Buffi street, Lugano, 6900, Switzerland, 41 58 666 4821, ahmed.allam@usi.ch %K consumer health information %K search engine %K searching behavior %K Internet %K information storage and retrieval %K online systems %K public health informatics %K vaccination %K health communication %D 2014 %7 02.04.2014 %9 Original Paper %J J Med Internet Res %G English %X Background: During the past 2 decades, the Internet has evolved to become a necessity in our daily lives. The selection and sorting algorithms of search engines exert tremendous influence over the global spread of information and other communication processes. Objective: This study demonstrates the influence of the selection and sorting/ranking criteria operating in search engines on users’ knowledge, beliefs, and attitudes concerning websites about vaccination. In particular, it compares the effects of search engines that deliver websites emphasizing the pro side of vaccination with those focusing on the con side, and with normal Google as a control group. Methods: We conducted 2 online experiments using manipulated search engines. A pilot study was designed to verify the existence of dangerous health literacy in connection with searching for and using health information on the Internet by exploring the effect of 2 manipulated search engines that yielded either pro- or con-vaccination sites only, with a group receiving normal Google as control. A pre-post test design was used; participants were American marketing students enrolled in a study-abroad program in Lugano, Switzerland.
The second experiment manipulated the search engine by applying different ratios of con versus pro vaccination webpages displayed in the search results. Participants were recruited from Amazon’s Mechanical Turk platform, where the study was published as a human intelligence task (HIT). Results: Both experiments showed that knowledge was highest in the group offered only pro-vaccination sites (Z=–2.088, P=.03; Kruskal-Wallis H₅=11.30, P=.04). They acknowledged the importance/benefits (Z=–2.326, P=.02; H₅=11.34, P=.04) and effectiveness (Z=–2.230, P=.03) of vaccination more, whereas groups offered antivaccination sites only showed increased concern about effects (Z=–2.582, P=.01; H₅=16.88, P=.005) and harmful health outcomes (Z=–2.200, P=.02) of vaccination. Normal Google users perceived information quality to be positive despite a small effect on knowledge and a negative effect on their beliefs and attitudes toward vaccination and willingness to recommend the information (χ²₅=14.1, P=.01). More exposure to antivaccination websites lowered participants’ knowledge (J=4783.5, z=–2.142, P=.03), increased their fear of side effects (J=6496, z=2.724, P=.006), and lowered their acknowledgment of benefits (J=4805, z=–2.067, P=.03). Conclusions: The selection and sorting/ranking criteria of search engines play a vital role in online health information seeking. Search engines delivering websites containing credible and evidence-based medical information positively impact Internet users seeking health information, whereas sites retrieved by biased search engines create some opinion change in users. These effects are apparently independent of users’ site credibility and evaluation judgments. Users are affected beneficially or detrimentally but are unaware, suggesting they are not consciously perceptive of indicators that steer them toward the credible sources or away from the dangerous ones. In this sense, the online health information seeker is flying blind. %M 24694866 %R 10.2196/jmir.2642 %U http://www.jmir.org/2014/4/e100/ %U https://doi.org/10.2196/jmir.2642 %U http://www.ncbi.nlm.nih.gov/pubmed/24694866 %0 Journal Article %@ 14388871 %I JMIR Publications Inc. %V 16 %N 2 %P e45 %T Evaluation of a Novel Conjunctive Exploratory Navigation Interface for Consumer Health Information: A Crowdsourced Comparative Study %A Cui,Licong %A Carter,Rebecca %A Zhang,Guo-Qiang %+ Department of Electrical Engineering and Computer Science, Division of Medical Informatics, Case Western Reserve University, 2103 Cornell Road, Cleveland, OH, 44106, United States, 1 216 368 3286, gq@case.edu %K crowdsourcing %K consumer health information %K human computer interaction %K information retrieval %K search interfaces %K comparative user evaluation %D 2014 %7 10.02.2014 %9 Original Paper %J J Med Internet Res %G English %X Background: Numerous consumer health information websites have been developed to provide consumers access to health information. However, lookup search is insufficient for consumers to take full advantage of these rich public information resources. Exploratory search is considered a promising complementary mechanism, but its efficacy has never before been rigorously evaluated for consumer health information retrieval interfaces.
Objective: This study aims to (1) introduce a novel Conjunctive Exploratory Navigation Interface (CENI) for supporting effective consumer health information retrieval and navigation, and (2) evaluate the effectiveness of CENI through a search-interface comparative evaluation using crowdsourcing with Amazon Mechanical Turk (AMT). Methods: We collected over 60,000 consumer health questions from NetWellness, one of the first consumer health websites to provide high-quality health information. We designed and developed a novel conjunctive exploratory navigation interface to explore NetWellness health questions with health topics as dynamic and searchable menus. To investigate the effectiveness of CENI, we developed a second interface with keyword-based search only. A crowdsourcing comparative study was carefully designed to compare three search modes of interest: (A) the topic-navigation-based CENI, (B) the keyword-based lookup interface, and (C) either the most commonly available lookup search interface with Google, or the resident advanced search offered by NetWellness. To compare the effectiveness of the three search modes, 9 search tasks were designed with relevant health questions from NetWellness. Each task included a rating of difficulty level and questions for validating the quality of answers. Ninety anonymous and unique AMT workers were recruited as participants. Results: Repeated-measures ANOVA showed that the search modes A, B, and C had statistically significant differences among their levels of difficulty (P<.001). A Wilcoxon signed-rank test (one-tailed) between A and B showed that A was significantly easier than B (P<.001). Paired t tests (one-tailed) between A and C showed A was significantly easier than C (P<.001). Participant responses on the preferred search modes showed that 47.8% (43/90) of participants preferred A, 25.6% (23/90) preferred B, and 24.4% (22/90) preferred C. Participant comments on the preferred search modes indicated that CENI was easy to use, provided better organization of health questions by topics, allowed users to narrow down to the most relevant contents quickly, and supported exploratory navigation by nonexperts or those unsure how to initiate their search. Conclusions: We presented a novel conjunctive exploratory navigation interface for consumer health information retrieval and navigation. Crowdsourcing permitted a carefully designed comparative search-interface evaluation to be completed in a timely and cost-effective manner with a relatively large number of participants recruited anonymously. Accounting for possible biases, our study has shown for the first time with crowdsourcing that the combination of exploratory navigation and lookup search is more effective than lookup search alone. %M 24513593 %R 10.2196/jmir.3111 %U http://www.jmir.org/2014/2/e45/ %U https://doi.org/10.2196/jmir.3111 %U http://www.ncbi.nlm.nih.gov/pubmed/24513593 %0 Journal Article %@ 14388871 %I JMIR Publications Inc.
%V 16 %N 2 %P e14 %T Understanding Messaging Preferences to Inform Development of Mobile Goal-Directed Behavioral Interventions %A Muench,Frederick %A van Stolk-Cooke,Katherine %A Morgenstern,Jon %A Kuerbis,Alexis N %A Markle,Kendra %+ CASPIR, Department of Psychiatry, Columbia University College of Physicians & Surgeons, 3 Columbus Circle 1404, New York, NY, 10017, United States, 1 212 974 0547, fm2148@columbia.edu %K mHealth %K text messaging %K behavioral health %K preferences %K linguistics %K tailoring %K participatory design %K agile design %D 2014 %7 05.02.2014 %9 Original Paper %J J Med Internet Res %G English %X Background: Mobile messaging interventions have been shown to improve outcomes across a number of mental health and health-related conditions, but there are still significant gaps in our knowledge of how to construct and deliver the most effective brief messaging interventions. Little is known about the ways in which subtle linguistic variations in message content can affect user receptivity and preferences. Objective: The aim of this study was to determine whether any global messaging preferences existed for different types of language content, and how certain characteristics moderated those preferences, in an effort to inform the development of mobile messaging interventions. Methods: This study examined user preferences for messages within 22 content groupings. Groupings were presented online in dyads of short messages that were identical in their subject matter, but structurally or linguistically varied. Participants were 277 individuals residing in the United States who were recruited and compensated through Amazon’s Mechanical Turk (MTurk) system. Participants were instructed to select the message in each dyad that they would prefer to receive to help them achieve a personal goal of their choosing. Results: More than 75% of subjects showed global preferences for certain types of messages, such as those that were grammatically correct, free of textese, benefit-oriented, polite, nonaggressive, and directive as opposed to passive, among others. For several classes of messages, few or no clear global preferences were found. There were few personality- and trait-based moderators of message preferences, but subtle manipulations of message structure, such as changing “Try to…” to “You might want to try to…,” affected message choice. Conclusions: The results indicate that individuals are sensitive to variations in the linguistic content of text messages designed to help them achieve a personal goal and, in some cases, have clear preferences for one type of message over another. Global preferences were indicated for messages that contained accurate spelling and grammar, as well as messages that emphasized the positive over the negative. Research implications and a guide for developing short messages for goal-directed behaviors are presented in this paper. %M 24500775 %R 10.2196/jmir.2945 %U http://www.jmir.org/2014/2/e14/ %U https://doi.org/10.2196/jmir.2945 %U http://www.ncbi.nlm.nih.gov/pubmed/24500775 %0 Journal Article %@ 14388871 %I JMIR Publications Inc.
%V 16 %N 2 %P e29 %T Investigating the Congruence of Crowdsourced Information With Official Government Data: The Case of Pediatric Clinics %A Kim,Minki %A Jung,Yuchul %A Jung,Dain %A Hur,Cinyoung %+ Department of Business and Technology Management, Korea Advanced Institute of Science and Technology, KAIST N5-2109, 291 Daehak-ro, Yuseong-gu, Daejeon, 305-701, Korea, Republic Of, 82 423506315, minki.kim@kaist.ac.kr %K online health community %K crowdsourcing %K risk of misinformation %K public health %D 2014 %7 03.02.2014 %9 Original Paper %J J Med Internet Res %G English %X Background: Health 2.0 benefits society by helping patients acquire knowledge about health care through harnessing collective intelligence. However, any misleading information can directly affect patients’ choices of hospitals and drugs, and potentially exacerbate their health condition. Objective: This study investigates the congruence between crowdsourced information and official government data in the health care domain and identifies the determinants of low congruence where it exists. In line with infodemiology, we suggest measures to help patients in regions vulnerable to inaccurate health information. Methods: We text-mined multiple online health communities in South Korea to construct the data for crowdsourced information on public health services (173,748 messages). Kendall tau and Spearman rank order correlation coefficients were used to compute the differences in 2 ranking systems of health care quality: actual government evaluations of 779 hospitals and mining results of geospecific online health communities. Then we estimated the effect of sociodemographic characteristics on the level of congruence by using an ordinary least squares regression. Results: The regression results indicated that the standard deviation of married women’s education (P=.046), population density (P=.01), number of doctors per pediatric clinic (P=.048), and birthrate (P=.002) had a significant effect on the congruence of crowdsourced data (adjusted R²=.33). Specifically, (1) the higher the birthrate in a given region, (2) the larger the variance in educational attainment, (3) the higher the population density, and (4) the greater the number of doctors per clinic, the more likely that crowdsourced information from online communities is congruent with official government data. Conclusions: To investigate the cause of the spread of misleading health information in the online world, we adopted a unique approach by associating mining results on hospitals from geospecific online health communities with the sociodemographic characteristics of corresponding regions. We found that the congruence of crowdsourced information on health care services varied across regions and that these variations could be explained by geospecific demographic factors. This finding can be helpful to governments in reducing the potential risk of misleading online information and the accompanying safety issues. %M 24496094 %R 10.2196/jmir.3078 %U http://www.jmir.org/2014/2/e29/ %U https://doi.org/10.2196/jmir.3078 %U http://www.ncbi.nlm.nih.gov/pubmed/24496094 %0 Journal Article %@ 14388871 %I JMIR Publications Inc.
%V 15 %N 8 %P e178 %T Crowdsourcing Black Market Prices For Prescription Opioids %A Dasgupta,Nabarun %A Freifeld,Clark %A Brownstein,John S %A Menone,Christopher Mark %A Surratt,Hilary L %A Poppish,Luke %A Green,Jody L %A Lavonas,Eric J %A Dart,Richard C %+ Epidemico, 268 Newbury Street, 2nd Floor, Boston, MA, United States, 1 9192603808, nabarund@gmail.com %K opioids %K black market %K economics %K drug abuse %K surveillance %K crowdsourcing %K Internet %K Silk Road %K StreetRx %K RADARS System %K police %K law enforcement %D 2013 %7 16.08.2013 %9 Original Paper %J J Med Internet Res %G English %X Background: Prescription opioid diversion and abuse are major public health issues in the United States and internationally. Street prices of diverted prescription opioids can provide an indicator of drug availability, demand, and abuse potential, but these data can be difficult to collect. Crowdsourcing is a rapid and cost-effective way to gather information about sales transactions. We sought to determine whether crowdsourcing can provide accurate measurements of the street price of diverted prescription opioid medications. Objective: To assess the possibility of crowdsourcing black market drug price data by cross-validation with law enforcement officer reports. Methods: Using a crowdsourcing research website (StreetRx), we solicited data about the price that site visitors paid for diverted prescription opioid analgesics during the first half of 2012. These results were compared with a survey of law enforcement officers in the Researched Abuse, Diversion, and Addiction-Related Surveillance (RADARS) System, and actual transaction prices on a “dark Internet” marketplace (Silk Road). Geometric means and 95% confidence intervals were calculated for comparing prices per milligram of drug in US dollars. In a secondary analysis, we compared prices per milligram of morphine equivalent using standard equianalgesic dosing conversions. Results: A total of 954 price reports were obtained from crowdsourcing, 737 from law enforcement, and 147 from the online marketplace. Correlations between the 3 data sources were highly linear, with Spearman rho of 0.93 (P<.001) between crowdsourced and law enforcement, and 0.98 (P<.001) between crowdsourced and online marketplace. On StreetRx, the mean prices per milligram were US$3.29 hydromorphone, US$2.13 buprenorphine, US$1.57 oxymorphone, US$0.97 oxycodone, US$0.96 methadone, US$0.81 hydrocodone, US$0.52 morphine, and US$0.05 tramadol. The only significant difference between data sources was morphine, with a Drug Diversion price of US$0.67/mg (95% CI 0.59-0.75) and a Silk Road price of US$0.42/mg (95% CI 0.37-0.48). Street prices generally followed clinical equianalgesic potency. Conclusions: Crowdsourced data provide a valid estimate of the street price of diverted prescription opioids. The (ostensibly free) black market was able to accurately predict the relative pharmacologic potency of opioid molecules. %M 23956042 %R 10.2196/jmir.2810 %U http://www.jmir.org/2013/8/e178/ %U https://doi.org/10.2196/jmir.2810 %U http://www.ncbi.nlm.nih.gov/pubmed/23956042 %0 Journal Article %@ 14388871 %I JMIR Publications Inc.
%V 15 %N 7 %P e144 %T User Evaluation of the Effects of a Text Simplification Algorithm Using Term Familiarity on Perception, Understanding, Learning, and Information Retention %A Leroy,Gondy %A Endicott,James E %A Kauchak,David %A Mouradi,Obay %A Just,Melissa %+ Information Systems and Technology, Claremont Graduate University, ACB 225, 130 E Ninth Street, Claremont, CA, 91711, United States, 1 909 607 3270, gondy.leroy@cgu.edu %K text simplification %K health literacy %K consumer health information %K natural language processing %K evaluation study %D 2013 %7 31.07.2013 %9 Original Paper %J J Med Internet Res %G English %X Background: Adequate health literacy is important for people to maintain good health and manage diseases and injuries. Educational text, either retrieved from the Internet or provided by a doctor’s office, is a popular method to communicate health-related information. Unfortunately, it is difficult to write text that is easy to understand, and existing approaches, mostly the application of readability formulas, have not convincingly been shown to reduce the difficulty of text. Objective: To develop an evidence-based writer support tool to reduce perceived and actual text difficulty. To this end, we are developing and testing algorithms that automatically identify difficult sections in text and provide appropriate, easier alternatives; algorithms that effectively reduce text difficulty will be included in the support tool. This work describes a user evaluation, conducted with an independent writer, of an automated simplification algorithm based on term familiarity. Methods: Term familiarity indicates how easy words are for readers and is estimated using term frequencies in the Google Web Corpus. Unfamiliar words are algorithmically identified and tagged for potential replacement. Easier alternatives consisting of synonyms, hypernyms, definitions, and semantic types are extracted from WordNet, the Unified Medical Language System (UMLS), and Wiktionary and ranked for a writer to choose from to simplify the text. We conducted a controlled user study with a representative writer who used our simplification algorithm to simplify texts. We tested the impact with representative consumers. The key independent variable of our study is lexical simplification, and we measured its effect on both perceived and actual text difficulty. Participants were recruited from Amazon’s Mechanical Turk website. Perceived difficulty was measured with 1 metric, a 5-point Likert scale. Actual difficulty was measured with 3 metrics: 5 multiple-choice questions alongside each text to measure understanding, 7 multiple-choice questions without the text for learning, and 2 free recall questions for information retention. Results: Ninety-nine participants completed the study. We found strong beneficial effects on both perceived and actual difficulty. After simplification, the text was perceived as simpler (P<.001), with simplified text scoring 2.3 and original text 3.2 on the 5-point Likert scale (score 1: easiest). It also led to better understanding of the text (P<.001), with 11% more correct answers with simplified text (63% correct) compared to the original (52% correct). There was more learning, with 18% more correct answers after reading simplified text compared to 9% more correct answers after reading the original text (P=.003). There was no significant effect on free recall. Conclusions: Term familiarity is a valuable feature in simplifying text.
Although the topic of the text influences the effect size, the results were convincing and consistent. %M 23903235 %R 10.2196/jmir.2569 %U http://www.jmir.org/2013/7/e144/ %U https://doi.org/10.2196/jmir.2569 %U http://www.ncbi.nlm.nih.gov/pubmed/23903235 %0 Journal Article %@ 14388871 %I JMIR Publications Inc. %V 15 %N 6 %P e108 %T Crowdsourcing Participatory Evaluation of Medical Pictograms Using Amazon Mechanical Turk %A Yu,Bei %A Willis,Matt %A Sun,Peiyuan %A Wang,Jun %+ School of Information Studies, Syracuse University, Hinds Hall, Syracuse University, Syracuse, NY, 13244, United States, 1 3154433614, byu@syr.edu %K crowdsourcing %K Amazon Mechanical Turk %K participatory design %K medical instruction %K pictogram %K patient communication %K readability %K health literacy %D 2013 %7 03.06.2013 %9 Original Paper %J J Med Internet Res %G English %X Background: Consumer and patient participation proved to be an effective approach for medical pictogram design, but it can be costly and time-consuming. We proposed and evaluated an inexpensive approach that crowdsourced the pictogram evaluation task to Amazon Mechanical Turk (MTurk) workers, who are usually referred to as the “turkers”. Objective: To answer two research questions: (1) Is the turkers’ collective effort effective for identifying design problems in medical pictograms? and (2) Do the turkers’ demographic characteristics affect their performance in medical pictogram comprehension? Methods: We designed a Web-based survey (open-ended tests) to ask 100 US turkers to type in their guesses of the meaning of 20 US pharmacopeial pictograms. Two judges independently coded the turkers’ guesses into four categories: correct, partially correct, wrong, and completely wrong. The comprehensibility of a pictogram was measured by the percentage of correct guesses, with each partially correct guess counted as 0.5 correct. We then conducted a content analysis on the turkers’ interpretations to identify misunderstandings and assess whether the misunderstandings were common. We also conducted a statistical analysis to examine the relationship between turkers’ demographic characteristics and their pictogram comprehension performance. Results: The survey was completed within 3 days of our posting the task to the MTurk, and the collected data are publicly available in the multimedia appendix for download. The comprehensibility for the 20 tested pictograms ranged from 45% to 98%, with an average of 72.5%. The comprehensibility scores of 10 pictograms were strongly correlated to the scores of the same pictograms reported in another study that used oral response–based open-ended testing with local people. The turkers’ misinterpretations shared common errors that exposed design problems in the pictograms. Participant performance was positively correlated with their educational level. Conclusions: The results confirmed that crowdsourcing can be used as an effective and inexpensive approach for participatory evaluation of medical pictograms. Through Web-based open-ended testing, the crowd can effectively identify problems in pictogram designs. The results also confirmed that education has a significant effect on the comprehension of medical pictograms. Since low-literate people are underrepresented in the turker population, further investigation is needed to examine to what extent turkers’ misunderstandings overlap with those elicited from low-literate people. 
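The comprehensibility measure used in the pictogram study above, the percentage of correct guesses with a partially correct guess counted as 0.5, reduces to a few lines of Python; the category labels and counts here are illustrative, not the study's records:

    def comprehensibility(codes):
        # codes: one judge-assigned category per guess for a single pictogram,
        # eg, "correct", "partial", "wrong" (labels are assumed here).
        credit = sum(1.0 if c == "correct" else 0.5 if c == "partial" else 0.0
                     for c in codes)
        return 100 * credit / len(codes)

    # 13 correct, 4 partially correct, and 3 wrong guesses -> 75.0%
    print(comprehensibility(["correct"] * 13 + ["partial"] * 4 + ["wrong"] * 3))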
%M 23732572 %R 10.2196/jmir.2513 %U http://www.jmir.org/2013/6/e108/ %U https://doi.org/10.2196/jmir.2513 %U http://www.ncbi.nlm.nih.gov/pubmed/23732572 %0 Journal Article %@ 14388871 %I JMIR Publications Inc. %V 15 %N 5 %P e100 %T Crowdsourcing a Normative Natural Language Dataset: A Comparison of Amazon Mechanical Turk and In-Lab Data Collection %A Saunders,Daniel R %A Bex,Peter J %A Woods,Russell L %+ Schepens Eye Research Institute, 20 Staniford Street, Boston, MA, 02114, United States, 1 617 912 2590, daniel_saunders@meei.harvard.edu %K Internet %K web %K crowdsourcing %K free recall %D 2013 %7 20.05.2013 %9 Original Paper %J J Med Internet Res %G English %X Background: Crowdsourcing has become a valuable method for collecting medical research data. This approach, recruiting through open calls on the Web, is particularly useful for assembling large normative datasets. However, it is not known how natural language datasets collected over the Web differ from those collected under controlled laboratory conditions. Objective: To compare the natural language responses obtained from a crowdsourced sample of participants with responses collected in a conventional laboratory setting from participants recruited according to specific age and gender criteria. Methods: We collected natural language descriptions of 200 half-minute movie clips, from Amazon Mechanical Turk workers (crowdsourced) and 60 participants recruited from the community (lab-sourced). Crowdsourced participants responded to as many clips as they wanted and typed their responses, whereas lab-sourced participants gave spoken responses to 40 clips, and their responses were transcribed. The content of the responses was evaluated using a take-one-out procedure, which compared responses to other responses to the same clip and to other clips, with a comparison of the average number of shared words. Results: In contrast to the 13 months of recruiting that was required to collect normative data from 60 lab-sourced participants (with specific demographic characteristics), only 34 days were needed to collect normative data from 99 crowdsourced participants (contributing a median of 22 responses). The majority of crowdsourced workers were female, and the median age was 35 years, lower than the lab-sourced median of 62 years but similar to the median age of the US population. The responses contributed by the crowdsourced participants were longer on average, that is, 33 words compared to 28 words (P<.001), and they used a less varied vocabulary. However, there was strong similarity in the words used to describe a particular clip between the two datasets, as a cross-dataset count of shared words showed (P<.001). Within both datasets, responses contained substantial relevant content, with more words in common with responses to the same clip than to other clips (P<.001). There was evidence that responses from female and older crowdsourced participants had more shared words (P=.004 and .01 respectively), whereas younger participants had higher numbers of shared words in the lab-sourced population (P=.01). Conclusions: Crowdsourcing is an effective approach to quickly and economically collect a large reliable dataset of normative natural language responses. %M 23689038 %R 10.2196/jmir.2620 %U http://www.jmir.org/2013/5/e100/ %U https://doi.org/10.2196/jmir.2620 %U http://www.ncbi.nlm.nih.gov/pubmed/23689038 %0 Journal Article %@ 14388871 %I JMIR Publications Inc. 
%V 15 %N 4 %P e73 %T Web 2.0-Based Crowdsourcing for High-Quality Gold Standard Development in Clinical Natural Language Processing %A Zhai,Haijun %A Lingren,Todd %A Deleger,Louise %A Li,Qi %A Kaiser,Megan %A Stoutenborough,Laura %A Solti,Imre %+ Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH, 45229, United States, 1 513 636 1020, imre.solti@cchmc.org %K clinical informatics %K natural language processing %K named entity %K reference standards %K crowdsourcing %K user computer interface %K quality control %D 2013 %7 02.04.2013 %9 Original Paper %J J Med Internet Res %G English %X Background: A high-quality gold standard is vital for supervised, machine learning-based, clinical natural language processing (NLP) systems. In clinical NLP projects, expert annotators traditionally create the gold standard. However, traditional annotation is expensive and time-consuming. To reduce the cost of annotation, general NLP projects have turned to crowdsourcing based on Web 2.0 technology, which involves submitting smaller subtasks to a coordinated marketplace of workers on the Internet. Many studies have been conducted in the area of crowdsourcing, but only a few have focused on tasks in the general NLP field and only a handful in the biomedical domain, usually based upon very small pilot sample sizes. In addition, the quality of the crowdsourced biomedical NLP corpora was never exceptional when compared to traditionally-developed gold standards. Previously reported results on a medical named entity annotation task showed a 0.68 F-measure agreement between crowdsourced and traditionally-developed corpora. Objective: Building upon previous work from general crowdsourcing research, this study investigated the usability of crowdsourcing in the clinical NLP domain with special emphasis on achieving high agreement between crowdsourced and traditionally-developed corpora. Methods: To build the gold standard for evaluating the crowdsourcing workers’ performance, 1042 clinical trial announcements (CTAs) from the ClinicalTrials.gov website were randomly selected and double annotated for medication names, medication types, and linked attributes. For the experiments, we used CrowdFlower, an Amazon Mechanical Turk-based crowdsourcing platform. We calculated sensitivity, precision, and F-measure to evaluate the quality of the crowd’s work and tested the statistical significance (P<.001, chi-square test) to detect differences between the crowdsourced and traditionally-developed annotations. Results: The agreement between the crowd’s annotations and the traditionally-generated corpora was high for: (1) annotations (0.87, F-measure for medication names; 0.73, medication types), (2) correction of previous annotations (0.90, medication names; 0.76, medication types), and excellent for (3) linking medications with their attributes (0.96). Simple voting provided the best judgment aggregation approach. There was no statistically significant difference between the crowd and traditionally-generated corpora. Our results showed a 27.9% improvement over previously reported results on the medication named entity annotation task. Conclusions: This study offers three contributions. First, we proved that crowdsourcing is a feasible, inexpensive, fast, and practical approach to collect high-quality annotations for clinical text (when protected health information was excluded).
We believe that well-designed user interfaces and a rigorous quality control strategy for entity annotation and linking were critical to the success of this work. Second, as a further contribution to the Internet-based crowdsourcing field, we will publicly release the JavaScript and CrowdFlower Markup Language infrastructure code that is necessary to utilize CrowdFlower’s quality control and crowdsourcing interfaces for named entity annotations. Finally, to spur future research, we will release the CTA annotations that were generated by traditional and crowdsourced approaches. %M 23548263 %R 10.2196/jmir.2426 %U http://www.jmir.org/2013/4/e73/ %U https://doi.org/10.2196/jmir.2426 %U http://www.ncbi.nlm.nih.gov/pubmed/23548263 %0 Journal Article %@ 1438-8871 %I Gunther Eysenbach %V 14 %N 6 %P e167 %T Crowdsourcing Malaria Parasite Quantification: An Online Game for Analyzing Images of Infected Thick Blood Smears %A Luengo-Oroz,Miguel Angel %A Arranz,Asier %A Frean,John %+ Biomedical Image Technologies group, DIE, ETSI Telecomunicación, Universidad Politécnica de Madrid, CEI Moncloa UPM-UCM, ETSIT, Av. Complutense 30, Madrid, 28040, Spain, 34 913366827, maluengo@die.upm.es %K Crowdsourcing %K Malaria %K Image Analysis %K Games for Health %K Telepathology %D 2012 %7 29.11.2012 %9 Original Paper %J J Med Internet Res %G English %X Background: There are 600,000 new malaria cases daily worldwide. The gold standard for estimating the parasite burden and the corresponding severity of the disease consists of manually counting the number of parasites in blood smears through a microscope, a process that can take more than 20 minutes of an expert microscopist’s time. Objective: This research tests the feasibility of a crowdsourced approach to malaria image analysis. In particular, we investigated whether anonymous volunteers with no prior experience would be able to count malaria parasites in digitized images of thick blood smears by playing a Web-based game. Methods: The experimental system consisted of a Web-based game where online volunteers were tasked with detecting parasites in digitized blood sample images, coupled with a decision algorithm that combined the analyses from several players to produce an improved collective detection outcome. Data were collected through the MalariaSpot website. Random images of thick blood films containing Plasmodium falciparum at medium to low parasitemias, acquired by conventional optical microscopy, were presented to players. In the game, players had to find and tag as many parasites as possible in 1 minute. In the event that players found all the parasites present in the image, they were presented with a new image. In order to combine the choices of different players into a single crowd decision, we implemented an image processing pipeline and a quorum algorithm that judged a parasite as tagged when a group of players agreed on its position. Results: Over 1 month, anonymous players from 95 countries played more than 12,000 games and generated a database of more than 270,000 clicks on the test images. Results revealed that combining 22 games from nonexpert players achieved a parasite counting accuracy higher than 99%. This performance could also be obtained by combining 13 games from players trained for 1 minute. Exhaustive computations measured the parasite counting accuracy for all players as a function of the number of games considered and the experience of the players.
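The quorum rule sketched in the Methods above, where a parasite counts as tagged once enough players click near the same position, can be illustrated with a greedy grouping. The radius, quorum, and data layout are assumptions for illustration, not details from the paper, whose image processing pipeline is not described in the abstract:

    def quorum_detections(clicks, radius=10, quorum=5):
        # clicks: list of (player_id, x, y) tags pooled across games (assumed format).
        # Greedily group clicks within `radius` pixels of a seed click and
        # accept the group as a parasite if at least `quorum` distinct
        # players contributed to it.
        remaining = list(clicks)
        accepted = []
        while remaining:
            seed = remaining[0]
            group = [c for c in remaining
                     if (c[1] - seed[1]) ** 2 + (c[2] - seed[2]) ** 2 <= radius ** 2]
            if len({pid for pid, _, _ in group}) >= quorum:
                accepted.append((sum(c[1] for c in group) / len(group),
                                 sum(c[2] for c in group) / len(group)))
            remaining = [c for c in remaining if c not in group]
        return accepted

    # Five players agree near (101, 100); a lone stray click is rejected.
    demo = [(1, 100, 100), (2, 103, 98), (3, 99, 101), (4, 101, 99),
            (5, 102, 102), (6, 300, 40)]
    print(quorum_detections(demo))  # [(101.0, 100.0)]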
In addition, we propose a mathematical equation that accurately models the collective parasite counting performance. Conclusions: This research validates the online gaming approach for crowdsourced counting of malaria parasites in images of thick blood films. The findings support the conclusion that nonexperts are able to rapidly learn how to identify the typical features of malaria parasites in digitized thick blood samples and that combining the analyses of several users provides parasite counting accuracy rates similar to those of expert microscopists. This experiment illustrates the potential of the crowdsourced gaming approach for performing routine malaria parasite quantification, and more generally for solving biomedical image analysis problems, with future potential for telediagnosis related to global health challenges. %M 23196001 %R 10.2196/jmir.2338 %U http://www.jmir.org/2012/6/e167/ %U https://doi.org/10.2196/jmir.2338 %U http://www.ncbi.nlm.nih.gov/pubmed/23196001 %0 Journal Article %@ 1438-8871 %I Gunther Eysenbach %V 14 %N 3 %P e79 %T Using Crowdsourcing Technology for Testing Multilingual Public Health Promotion Materials %A Turner,Anne M %A Kirchhoff,Katrin %A Capurro,Daniel %+ Department of Health Services, University of Washington, 1107 NE 45th Street, Suite 400, Seattle, WA, 98105, United States, 1 2066851130, amturner@uw.edu %K Crowdsourcing %K health promotion %K public health informatics %K limited English proficiency populations %D 2012 %7 04.06.2012 %9 Tutorial %J J Med Internet Res %G English %X Background: Effective communication of public health messages is a key strategy for health promotion by public health agencies. Creating effective health promotion materials requires careful message design and feedback from representatives of target populations. This is particularly true when the target audiences are hard-to-reach groups, such as populations with limited English proficiency. Traditional methods of soliciting feedback—such as focus groups and convenience sample interviews—are expensive and time-consuming. As a result, feedback from target populations is often insufficient due to the time and resource constraints characteristic of public health. Objective: To describe a pilot study investigating the use of crowdsourcing technology as a method to gather rapid and relevant feedback on the design of health promotion messages for oral health. Our goal was to better describe the demographics of participants responding to a crowdsourcing survey and to test whether crowdsourcing could be used to gather feedback from English-speaking and Spanish-speaking participants in a short period of time and at relatively low costs. Methods: We developed health promotion materials on pediatric dental health issues in four different formats and in two languages (English and Spanish). We then designed an online survey to elicit feedback on format preferences and made it available in both languages via the Amazon Mechanical Turk crowdsourcing platform. Results: We surveyed 236 native English-speaking and 163 native Spanish-speaking participants in less than 12 days, at a cost of US $374. Overall, Spanish-speaking participants originated from a wider distribution of countries than the overall Latino population in the United States. Most participants were in the 18- to 29-year age range and had some college or graduate education. Participants provided valuable input for the health promotion material design.
Conclusions: Our results indicate that crowdsourcing can be an effective method for recruiting and gaining feedback from English-speaking and Spanish-speaking people. Compared with traditional methods, crowdsourcing has the potential to reach more diverse populations than convenience sampling, while substantially reducing the time and cost of gathering participant feedback. More widespread adoption of this method could streamline the development of effective health promotion materials in multiple languages. %M 22664384 %R 10.2196/jmir.2063 %U http://www.jmir.org/2012/3/e79/ %U https://doi.org/10.2196/jmir.2063 %U http://www.ncbi.nlm.nih.gov/pubmed/22664384 %0 Journal Article %@ 1438-8871 %I Gunther Eysenbach %V 14 %N 2 %P e46 %T Crowdsourced Health Research Studies: An Important Emerging Complement to Clinical Trials in the Public Health Research Ecosystem %A Swan,Melanie %+ MS Futures Group, PO Box 61258, Palo Alto, CA, 94306, United States, 1 6506819482, m@melanieswan.com %K Community-Based Participatory Research %K Preventive Medicine %K Personalized Medicine %K Individualized Medicine %K Consumer Participation %K Health Services Research %K Health Care Research %K Public Health %K Genomics %K Medicine %D 2012 %7 07.03.2012 %9 Viewpoint %J J Med Internet Res %G English %X Background: Crowdsourced health research studies are the nexus of three contemporary trends: 1) citizen science (non-professionally trained individuals conducting science-related activities); 2) crowdsourcing (use of web-based technologies to recruit project participants); and 3) medicine 2.0 / health 2.0 (active participation of individuals in their health care particularly using web 2.0 technologies). Crowdsourced health research studies have arisen as a natural extension of the activities of health social networks (online health interest communities), and can be researcher-organized or participant-organized. In the last few years, professional researchers have been crowdsourcing cohorts from health social networks for the conduct of traditional studies. Participants have also begun to organize their own research studies through health social networks and health collaboration communities created especially for the purpose of self-experimentation and the investigation of health-related concerns. Objective: The objective of this analysis is to undertake a comprehensive narrative review of crowdsourced health research studies. This review will assess the status, impact, and prospects of crowdsourced health research studies. Methods: Crowdsourced health research studies were identified through a search of literature published from 2000 to 2011 and informal interviews conducted 2008-2011. Keyword terms related to crowdsourcing were sought in Medline/PubMed. Papers that presented results from human health studies that included crowdsourced populations were selected for inclusion. Crowdsourced health research studies not published in the scientific literature were identified by attending industry conferences and events, interviewing attendees, and reviewing related websites. Results: Participatory health is a growing area with individuals using health social networks, crowdsourced studies, smartphone health applications, and personal health records to achieve positive outcomes for a variety of health conditions. PatientsLikeMe and 23andMe are the leading operators of researcher-organized, crowdsourced health research studies. 
These operators have published findings in the areas of disease research, drug response, user experience in crowdsourced studies, and genetic association. Quantified Self, Genomera, and DIYgenomics are communities of participant-organized health research studies where individuals conduct self-experimentation and group studies. Crowdsourced health research studies have a diversity of intended outcomes and levels of scientific rigor. Conclusions: Participatory health initiatives are becoming part of the public health ecosystem, and their rapid growth is facilitated by Internet and social networking influences. Large-scale parameter-stratified cohorts have the potential to facilitate a next-generation understanding of disease and drug response. Not only is the large size of crowdsourced cohorts an asset to medical discovery; so too is the near-immediate speed at which medical findings might be tested and applied. Participatory health initiatives are expanding the scope of medicine from a traditional focus on disease cure to a personalized preventive approach. Crowdsourced health research studies are a promising complement and extension to traditional clinical trials as a model for the conduct of health research. %M 22397809 %R 10.2196/jmir.1988 %U http://www.jmir.org/2012/2/e46/ %U https://doi.org/10.2196/jmir.1988 %U http://www.ncbi.nlm.nih.gov/pubmed/22397809