TY  - JOUR
AU  - Xue, Jia
AU  - Chen, Junxiang
AU  - Hu, Ran
AU  - Chen, Chen
AU  - Zheng, Chengda
AU  - Su, Yue
AU  - Zhu, Tingshao
PY  - 2020
DA  - 2020/11/25
TI  - Twitter Discussions and Emotions About the COVID-19 Pandemic: Machine Learning Approach
JO  - J Med Internet Res
SP  - e20550
VL  - 22
IS  - 11
KW  - machine learning
KW  - Twitter data
KW  - COVID-19
KW  - infodemic
KW  - infodemiology
KW  - infoveillance
KW  - public discussion
KW  - public sentiment
KW  - Twitter
KW  - social media
KW  - virus
AB  - Background: It is important to measure the public response to the COVID-19 pandemic. Twitter is an important data source for infodemiology studies involving public response monitoring. Objective: The objective of this study is to examine COVID-19–related discussions, concerns, and sentiments using tweets posted by Twitter users. Methods: We analyzed 4 million Twitter messages related to the COVID-19 pandemic using a list of 20 hashtags (eg, “coronavirus,” “COVID-19,” “quarantine”) from March 7 to April 21, 2020. We used a machine learning approach, Latent Dirichlet Allocation (LDA), to identify popular unigrams and bigrams, salient topics and themes, and sentiments in the collected tweets. Results: Popular unigrams included “virus,” “lockdown,” and “quarantine.” Popular bigrams included “COVID-19,” “stay home,” “corona virus,” “social distancing,” and “new cases.” We identified 13 discussion topics and categorized them into 5 different themes: (1) public health measures to slow the spread of COVID-19, (2) social stigma associated with COVID-19, (3) COVID-19 news, cases, and deaths, (4) COVID-19 in the United States, and (5) COVID-19 in the rest of the world. Across all identified topics, the dominant sentiments for the spread of COVID-19 were anticipation that measures can be taken, followed by mixed feelings of trust, anger, and fear related to different topics. The public tweets revealed a significant feeling of fear when people discussed new COVID-19 cases and deaths compared to other topics. Conclusions: This study showed that Twitter data and machine learning approaches can be leveraged for an infodemiology study, enabling research into evolving public discussions and sentiments during the COVID-19 pandemic. As the situation rapidly evolves, several topics are consistently dominant on Twitter, such as confirmed cases and death rates, preventive measures, health authorities and government policies, COVID-19 stigma, and negative psychological reactions (eg, fear). Real-time monitoring and assessment of Twitter discussions and concerns could provide useful data for public health emergency responses and planning. Pandemic-related fear, stigma, and mental health concerns are already evident and may continue to influence public trust when a second wave of COVID-19 occurs or there is a new surge of the current pandemic.
SN  - 1438-8871
UR  - http://www.jmir.org/2020/11/e20550/
UR  - https://doi.org/10.2196/20550
UR  - http://www.ncbi.nlm.nih.gov/pubmed/33119535
DO  - 10.2196/20550
ID  - info:doi/10.2196/20550
ER  - 