%0 Journal Article
%@ 1438-8871
%I JMIR Publications
%V 27
%N
%P e63755
%T Characterizing Public Sentiments and Drug Interactions in the COVID-19 Pandemic Using Social Media: Natural Language Processing and Network Analysis
%A Li,Wanxin
%A Hua,Yining
%A Zhou,Peilin
%A Zhou,Li
%A Xu,Xin
%A Yang,Jie
%+ School of Public Health, the Second Affiliated Hospital, Zhejiang University School of Medicine, No. 866, Yuhangtang Road, Hangzhou, 310058, China, 86 13575760802, xuxinsummer@zju.edu.cn
%K COVID-19
%K natural language processing
%K drugs
%K social media
%K pharmacovigilance
%K public health
%D 2025
%7 5.3.2025
%9 Original Paper
%J J Med Internet Res
%G English
%X Background: While the COVID-19 pandemic has induced massive discussion of available medications on social media, traditional studies have focused on only limited aspects, such as public opinion, and have suffered from reporting biases, inefficiency, and long collection times. Objective: Harnessing drug-related data posted on social media in real time can offer insights into how the pandemic affects drug use and can help monitor misinformation. This study aimed to develop a natural language processing (NLP) pipeline tailored for the analysis of social media discourse on COVID-19–related drugs. Methods: This study constructed a full pipeline for COVID-19–related drug tweet analysis, using pretrained language model–based NLP techniques as the backbone. The pipeline comprises 4 core modules: named entity recognition and normalization to identify medical entities in relevant tweets and standardize them to uniform medication names for time trend analysis, target sentiment analysis to reveal the sentiment polarities associated with the entities, topic modeling to understand the underlying themes discussed by the population, and drug network analysis to uncover potential adverse drug reactions (ADR) and drug-drug interactions (DDI).
The pipeline was deployed to analyze tweets related to the COVID-19 pandemic and drug therapies posted between February 1, 2020, and April 30, 2022. Results: From a dataset comprising 169,659,956 COVID-19–related tweets from 103,682,686 users, our named entity recognition model identified 2,124,757 relevant tweets from 1,800,372 unique users, as well as the 5 most-discussed drugs: ivermectin, hydroxychloroquine, remdesivir, zinc, and vitamin D. Time trend analysis revealed that the public focused mostly on repurposed drugs (ie, hydroxychloroquine and ivermectin) and least on remdesivir, the only officially approved drug among the 5. Sentiment analysis of the 5 most-discussed drugs revealed that public perception was predominantly shaped by celebrity endorsements, media hot spots, and governmental directives rather than by empirical evidence of drug efficacy. Topic analysis yielded 15 general topics across the drug-related tweets, with “clinical treatment effects of drugs” and “physical symptoms” emerging as the most frequently discussed topics. Co-occurrence matrices and complex network analysis further identified emerging patterns of DDI and ADR that could be critical for public health surveillance, for example, in better safeguarding public safety in medication use. Conclusions: This study shows that an NLP-based pipeline can be a robust tool for large-scale public health monitoring and can offer valuable supplementary data for traditional epidemiological studies concerning DDI and ADR. The framework presented here aims to serve as a cornerstone for future social media–based public health analytics.
%M 40053730
%R 10.2196/63755
%U https://www.jmir.org/2025/1/e63755
%U https://doi.org/10.2196/63755
%U http://www.ncbi.nlm.nih.gov/pubmed/40053730