Published in Vol 25 (2023)

Deep Learning Analysis of COVID-19 Vaccine Hesitancy and Confidence Expressed on Twitter in 6 High-Income Countries: Longitudinal Observational Study


Original Paper

1School of Public Health, Fudan University, Shanghai, China

2Global Health Institute, Fudan University, Shanghai, China

3Department of Biostatistics, Yale School of Public Health, New Haven, CT, United States

4Department of Health Policy and Management, College of Public Health, University of Georgia, Athens, GA, United States

*these authors contributed equally

Corresponding Author:

Zhiyuan Hou, PhD

School of Public Health

Fudan University

130 Dong’an Road

Shanghai, 200032


Phone: 86 2133563935


Background: Ongoing monitoring of the national and subnational trajectories of COVID-19 vaccine hesitancy could support the design of tailored policies to improve vaccine uptake.

Objective: We aim to track the temporal and spatial distribution of COVID-19 vaccine hesitancy and confidence expressed on Twitter during the entire pandemic period in major English-speaking countries.

Methods: We collected 5,257,385 English-language tweets regarding COVID-19 vaccination between January 1, 2020, and June 30, 2022, in 6 countries—the United States, the United Kingdom, Australia, New Zealand, Canada, and Ireland. Transformer-based deep learning models were developed to classify each tweet as intent to accept or reject COVID-19 vaccination and the belief that COVID-19 vaccine is effective or unsafe. Sociodemographic factors associated with COVID-19 vaccine hesitancy and confidence in the United States were analyzed using bivariate and multivariable linear regressions.

Results: The 6 countries experienced similar evolving trends of COVID-19 vaccine hesitancy and confidence. On average, the prevalence of intent to accept COVID-19 vaccination decreased, with fluctuations, from 71.38% of 44,944 tweets in March 2020 to 34.85% of 48,167 tweets in June 2022. The prevalence of believing COVID-19 vaccines to be unsafe rose 7.49-fold from March 2020 (2.84% of 44,944 tweets) to June 2022 (21.27% of 48,167 tweets). COVID-19 vaccine hesitancy and confidence varied by country, vaccine manufacturer, and states within a country. Democratic party affiliation and higher vaccine confidence were significantly associated with lower vaccine hesitancy across US states.

Conclusions: COVID-19 vaccine hesitancy and confidence evolved and were influenced by the development of vaccines and viruses during the pandemic. Large-scale self-generated discourses on social media and deep learning models provide a cost-efficient approach to monitoring routine vaccine hesitancy.

J Med Internet Res 2023;25:e49753



The COVID-19 pandemic has persisted for years, with impacts on all sectors of society [1]. Sustaining a high level of immunity through vaccination and booster shots among the public, especially the at-risk population, is a key strategy for controlling the impact of the pandemic. COVID-19 vaccine hesitancy has gradually become a critical barrier to the uptake of the vaccine [2]. Although high vaccination rates were observed in many parts of the world, the unequal global distribution of vaccination and the formation of antivaccine groups on platforms such as Twitter continue to contribute to localized transmission of COVID-19 and to the public health burden [3-5]. Therefore, understanding and maintaining the public’s adherence to vaccination still pose a challenge for policy makers worldwide [6-11]. Current evidence has identified a number of factors associated with COVID-19 vaccine hesitancy, including sociodemographic factors [4,6] and vaccine confidence, and has informed strategies to combat vaccine hesitancy in local contexts [7,12,13]. Beyond these findings, ongoing monitoring of the national and subnational trajectories of COVID-19 vaccine hesitancy in multiple countries could further support the design of tailored policies to improve vaccine uptake.

As a conventional approach, previous studies conducted surveys to investigate COVID-19 vaccine hesitancy, confidence, and the associated barriers [8,9]. However, the high cost of surveys restricts the ability to draw conclusions about the longitudinal changes in vaccine hesitancy and confidence over time. In recent years, social media has become a popular platform for individuals to express their experiences and viewpoints on various topics, including vaccination. Thus, social media mining has gained recognition as a supplement for understanding and responding to public attitudes and behaviors, particularly during public health emergencies like the COVID-19 pandemic [14-16]. The vast number of posts on social media platforms can provide extensive and up-to-date longitudinal data, starting from the onset of the pandemic to the present. Those social media data can assist in monitoring the trajectory of public hesitancy and confidence toward vaccines throughout the progress of COVID-19 vaccine development, authorization, and deployment, which may support policy making, health communication strategies, and the prediction of the responses to new vaccines [14]. Besides, recent advances in deep learning models reached state-of-the-art performance in various natural language processing (NLP) tasks, making it possible to predict vaccine hesitancy and confidence using massive social media data close to real time [17-19]. Some previous studies monitored the sentiments and discussion topics around COVID-19 vaccines using social media data [20-23]. However, most of them were about a specific country, covering a relatively short time period, or did not fine-tune state-of-the-art deep learning models using task-specific manually annotated data sets.

Thus, our study aimed to (1) monitor the national and subnational spatiotemporal trends in COVID-19 vaccine hesitancy and confidence in 6 high-income countries during the entire pandemic period using deep learning analysis of Twitter data; (2) examine the disparities in vaccine hesitancy and confidence by country, manufacturer, and states within a country; and (3) identify the potential sociodemographic factors associated with the hesitancy and confidence in COVID-19 vaccination.

Ethical Considerations

This study is exempt from ethics approval because it used deidentified Twitter data, which is available to the public, does not include identifiable health information, and ensures anonymity.

Data Collection

We collected COVID-19 vaccine–related tweets and identified the tweets in 6 English-language high-income countries, namely the United States, the United Kingdom, Canada, Australia, New Zealand, and Ireland. Similar to previous studies [24,25], data were collected using TweetScraper, a Python tool for collecting Twitter search data [26]. Geo-locations were identified based on the user’s profile location and Carmen [27], a tool for geolocating tweets. Duplicated tweets and tweets without standardized locations or outside the 6 countries were excluded. In total, we collected 5,257,385 English-language tweets containing keywords “(covid OR coronavirus OR covid19 OR covid-19) AND (vaccine OR vaccination)” which were posted on Twitter in the 6 countries from January 1, 2020, through June 30, 2022.
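The keyword criterion and duplicate removal described above can be illustrated with a minimal sketch. The two term groups come from the quoted search query; everything else (case-insensitive substring matching, whitespace-normalized duplicate detection, and the function names `matches_query` and `deduplicate`) is an assumption made for illustration and is not the behavior of TweetScraper or of the Twitter search engine.

```python
import re

# Term groups taken from the search query quoted in the text. The matching
# rule below (case-insensitive substring search) is an illustrative assumption.
DISEASE_TERMS = ("covid", "coronavirus", "covid19", "covid-19")
VACCINE_TERMS = ("vaccine", "vaccination")


def matches_query(text: str) -> bool:
    """Return True if the text contains at least one term from each group."""
    lowered = text.lower()
    has_disease = any(term in lowered for term in DISEASE_TERMS)
    has_vaccine = any(term in lowered for term in VACCINE_TERMS)
    return has_disease and has_vaccine


def deduplicate(tweets: list[str]) -> list[str]:
    """Drop duplicates (ignoring case and whitespace runs), keeping first occurrences."""
    seen = set()
    unique = []
    for tweet in tweets:
        key = re.sub(r"\s+", " ", tweet.strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(tweet)
    return unique


sample = ["Got my COVID-19 vaccine today", "Flu vaccine clinic open", "got my covid-19  vaccine today"]
kept = [t for t in deduplicate(sample) if matches_query(t)]
print(kept)
```

A real pipeline would also apply the location filter described above; that step depends on Carmen and is not sketched here.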

Fine-Tuning Deep Learning Model for Annotating COVID-19 Vaccine–Related Tweets

Proposed in 2018, BERT (Bidirectional Encoder Representations from Transformers) is a groundbreaking deep learning model for NLP. It can perform a wide array of NLP tasks, such as question answering, language generation, and text classification [18]. BERT was pretrained with the Toronto Book Corpus and Wikipedia data set, which contains billions of pieces of text and provides the model with a comprehensive understanding of language [18].

While the BERT model is pretrained on text data containing more than a billion words, the corpus mainly consists of mixed-domain content, with no emphasis on any subdomain [18]. To obtain better language interpretation results on COVID-19–related subdomains on Twitter, we adopted the COVID-Twitter-BERT (CT-BERT) model, which is further pretrained on top of BERT for COVID-19–related tweets [19]. The CT-BERT model uses the same model structure as BERT-LARGE, but it was further pretrained on a corpus of 160 million tweets about COVID-19 and evaluated with 4 Twitter data sets: COVID-19 category, vaccine sentiment, maternal vaccine stance, and sentiment evaluation (SemEval-2016 task 4 subtask A), along with a stand-alone data set specialized for sentiment analysis [19]. It showed a marginal performance improvement of 9%-26% over BERT-LARGE in classifying COVID-19–related tweets [19].

To use CT-BERT for evaluating COVID-19 vaccine hesitancy and confidence on Twitter, the model needs to be further finetuned using a manually annotated data set on this topic. Our research team manually labeled 8073 tweets on COVID-19 vaccines. Each tweet was annotated by 2 annotators according to the framework of vaccine hesitancy proposed by the World Health Organization [28], and a third annotator resolved their disagreements. We finetuned CT-BERT models using these 8073 tweets to generalize our analysis to all COVID-19 vaccine–related tweets we collected [16,29]. The manually labeled tweets were divided into a training set (80%), a development set (10%), and a test set (10%). Hyperparameters were chosen and the deep learning models were finetuned with the training and development sets. The performance of the finetuned deep learning models was then evaluated with the test set.
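The 80%/10%/10% partition described above can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say how tweets were assigned to each subset, so the random shuffle, the fixed seed, and the function name `split_dataset` are hypothetical choices, and integer tweet indices stand in for the labeled tweets.

```python
import random


def split_dataset(items, train_frac=0.8, dev_frac=0.1, seed=0):
    """Shuffle items and split them into training, development, and test subsets.

    The test subset receives whatever remains after the first two subsets,
    so the three subsets always cover every item exactly once.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test


# Integer indices stand in for the 8073 manually labeled tweets.
train, dev, test = split_dataset(range(8073))
print(len(train), len(dev), len(test))
```

Keeping the test subset untouched until hyperparameters are fixed is what allows the reported precision and F1-scores to be read as out-of-sample estimates.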

Deep Learning Prediction of COVID-19 Vaccine Hesitancy and Confidence

With the deep learning models, we first identified the tweets most likely sent by humans; for this task, the models attained a precision of 0.89 and an F1-score of 0.86. A total of 3,348,746 tweets most likely sent by humans were identified and contained country-level geo-locations. Among these tweets, 2,601,672 tweets contained state-level geo-locations. The deep learning models further labeled them with the 4 predefined categories that served as the outcome variables of interest in our study concerning hesitancy and confidence in COVID-19 vaccination. Although our manually annotated data set was labeled into more categories, to ensure optimal model performance, we analyzed the following four categories in this study using deep learning: (1) intent to accept COVID-19 vaccination (precision=0.88, F1-score=0.86), (2) intent to reject COVID-19 vaccination (precision=0.78, F1-score=0.75), (3) belief that COVID-19 vaccines are effective (precision=0.81, F1-score=0.73), and (4) belief that COVID-19 vaccines are unsafe (precision=0.86, F1-score=0.75). Each tweet could contain 1 label, multiple labels, or no label at all. (1) Intent to accept and (2) intent to reject are mutually exclusive, whereas (3) belief in vaccine effectiveness and (4) belief that vaccines are unsafe are not mutually exclusive with any other category. Table S1 in Multimedia Appendix 1 presents the annotation categories and prediction performance of deep learning models for each category, and Table S2 in Multimedia Appendix 1 displays the number of tweets in each category and hyperparameters for model fine-tuning.
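Because only precision and F1-score are reported, the recall they imply can be approximated by inverting the F1 definition, F1 = 2PR/(P + R), which gives R = F1 × P/(2P − F1). The sketch below applies this to the four reported pairs. It assumes that both figures refer to the positive class of each binary label; if they are averaged across classes, or heavily rounded, the implied values are only rough indications. The function name `implied_recall` is introduced here for illustration.

```python
def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for recall R, given precision P and F1."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("F1 must be smaller than twice the precision")
    return f1 * precision / denominator


# (precision, F1-score) pairs as reported in the text for the 4 categories.
REPORTED = {
    "intent to accept": (0.88, 0.86),
    "intent to reject": (0.78, 0.75),
    "vaccines are effective": (0.81, 0.73),
    "vaccines are unsafe": (0.86, 0.75),
}

for label, (precision, f1) in REPORTED.items():
    print(f"{label}: implied recall is about {implied_recall(precision, f1):.2f}")
```

Under these assumptions, the implied recall is lower than the precision for every category, which would mean the classifiers miss some relevant tweets more often than they mislabel irrelevant ones.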

Statistical Analysis

We first measured the individual-level vaccine hesitancy and confidence using the average deep learning prediction of all his or her tweets. For example, if an individual sent 2 tweets on COVID-19 vaccines during January 2022, among which 1 tweet indicates acceptance of vaccine, then his or her acceptance in January 2022 would be 50%. If he or she sent another 2 tweets indicating vaccine acceptance in February 2022, his or her vaccine acceptance would be 100% during February 2022, and his or her overall vaccine acceptance would be 75% during January-February 2022.

The spatiotemporal trends were then calculated as the average of all individuals (in a specific time period, place, or mentioning a specific vaccine manufacturer). For example, 100 people in the United Kingdom sent tweets on COVID-19 vaccines during January 2022; 60% of them expressed 100% acceptance toward COVID-19 vaccines, and 40% of them expressed 0% acceptance toward COVID-19 vaccines, then vaccine acceptance in the United Kingdom in January 2022 would be 60%×100% + 40%×0% = 60%. Due to data insufficiency during the early pandemic, January and February 2020 were not included in the temporal trend analysis. Vaccine manufacturers mentioned in tweets were detected with a keyword matching strategy using the keywords in Table S3 in Multimedia Appendix 1. The “overall” temporal trends are the country-level average of the 6 high-income countries.
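The two-stage averaging described above (tweets are averaged within each individual, then individuals are averaged within a period or place) can be reproduced with a short sketch. The data layout, a list of `(user_id, label)` pairs with label 1 for a tweet classified as expressing acceptance, and the function names are assumptions made for illustration; the numbers are the worked examples from the text.

```python
from collections import defaultdict


def user_level_rates(tweets):
    """Average the binary tweet labels within each user.

    tweets: iterable of (user_id, label) pairs, label being 1 or 0.
    Returns a dict mapping each user_id to that user's mean label.
    """
    totals = defaultdict(lambda: [0, 0])  # user_id -> [sum of labels, tweet count]
    for user_id, label in tweets:
        totals[user_id][0] += label
        totals[user_id][1] += 1
    return {user: total / count for user, (total, count) in totals.items()}


def group_rate(tweets):
    """Average the user-level rates, so every user carries equal weight."""
    rates = user_level_rates(tweets)
    if not rates:
        raise ValueError("no tweets in this group")
    return sum(rates.values()) / len(rates)


# Individual-level example: 1 of 2 tweets in January, 2 of 2 in February.
january = [("u1", 1), ("u1", 0)]
february = [("u1", 1), ("u1", 1)]
print(user_level_rates(january)["u1"])             # 0.5
print(user_level_rates(february)["u1"])            # 1.0
print(user_level_rates(january + february)["u1"])  # 0.75

# Group-level example: 60 users at 100% acceptance and 40 users at 0%.
uk_january = [(f"accept{i}", 1) for i in range(60)] + [(f"reject{i}", 0) for i in range(40)]
print(group_rate(uk_january))                      # 0.6
```

Averaging over users rather than over tweets keeps a small number of highly active accounts from dominating the estimate for a country or month.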

To explore the variation within a country and the factors associated with hesitancy and confidence in COVID-19 vaccination, we conducted a state-level analysis in the United States, where sufficient data (1,812,398 tweets sent by 700,773 users) were available. Bivariate and multivariable linear regressions were used to examine the sociodemographic factors associated with hesitancy and confidence in COVID-19 vaccination across the states in the United States. The sociodemographic variables included political affiliation (Democrat party, with Republican as the reference) [30], population density (number of people per square mile), percentage of people aged ≥65 years, and log-transformed gross domestic product per capita [31]. Current evidence shows that Democrats were estimated to have a higher vaccination rate, lower hesitancy toward COVID-19 vaccination, and fewer COVID-19 cases and deaths [6,32-35]. Population density was estimated to be associated with risks of infection, with a higher density catalyzing the spread of COVID-19 [36], which may lead to a higher willingness to receive COVID-19 vaccination. Older adults are more likely to be at higher risk of severe COVID-19 [37], and policies encouraged older people to receive COVID-19 vaccination; thus, a larger share of older adults may be associated with a higher acceptance rate and a lower rejection rate. Higher socioeconomic status (ie, greater gross domestic product per capita) has been reported to be associated with higher vaccination coverage [38]. Table S4 in Multimedia Appendix 1 describes state-level sociodemographic characteristics. For multivariable linear regressions, Model 1 included all sociodemographic variables, and Model 2 additionally adjusted for belief in the effectiveness and unsafety of COVID-19 vaccines. Variance inflation factors were estimated in each model, and every variance inflation factor was less than 10, indicating no severe multicollinearity.

All analyses were performed with Python (version 3; Python Software Foundation), except for the analysis of associated factors, which was carried out with STATA/SE (version 17; Stata Corporation).
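The variance inflation factor (VIF) check mentioned above can be sketched in plain Python. The VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, which is the same quantity as 1/(1 − R²) from regressing predictor j on the others. This is an illustrative sketch, not the Stata routine used in the study; the function names and the toy columns are hypothetical, and the data are synthetic.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix of equally long, non-constant numeric columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = sum(d * d for d in dev) ** 0.5  # zero for a constant column
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled] for ci in scaled]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """Return one VIF per predictor column."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Synthetic predictors: x1 and x2 are uncorrelated; x3 is nearly a copy of x1.
x1 = [1, -1, 1, -1, 1, -1, 1, -1]
x2 = [1, 1, -1, -1, 1, 1, -1, -1]
x4 = [1, 1, 1, 1, -1, -1, -1, -1]
x3 = [a + 0.1 * b for a, b in zip(x1, x4)]

print(variance_inflation_factors([x1, x2]))      # both close to 1
print(variance_inflation_factors([x1, x2, x3]))  # x1 and x3 far above 10
```

In the toy data, adding the near-duplicate column pushes the VIFs of `x1` and `x3` far beyond the threshold of 10 used in the text, while `x2` stays near 1.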

Figure 1 shows the temporal trends in COVID-19 vaccine hesitancy and confidence according to the prediction of deep learning models. Similar trends in the 6 countries were observed from March 2020 to June 2022. The average vaccine acceptance among the 6 countries decreased from 71.38% of 44,944 tweets in March 2020 to 47.46% of 47,327 tweets in August 2020. Subsequently, with some fluctuations, the rate slowly rose to 59.03% of 153,419 tweets in May 2021. This uptick coincided with the period when the results from clinical trials of COVID-19 vaccines were being published, showcasing the high efficacy and safety of the vaccines (Figure 1A). Notably, a slight dip was observed in April 2021 (54.93% of 216,667 tweets), coinciding with an adverse event reported on April 13 about a rare and severe type of blood clot experienced by 6 US women after receiving the Johnson & Johnson vaccine [39]. After May 2021, as new variants of SARS-CoV-2 (eg, Delta in May 2021 and Omicron in November 2021) emerged and spread across the globe [40,41], the vaccine acceptance rate continuously decreased to 34.85% of 48,167 tweets in June 2022. Meanwhile, the rejection rate moved roughly inversely to the acceptance rate (Figure 1B). The rejection rate increased from 1.02% of 44,944 tweets in March 2020 to 6.30% of 47,327 tweets in August 2020, then fluctuated downward to 3.50% of 153,419 tweets in May 2021, before ascending again to 6.14% of 48,167 tweets in June 2022.

Figure 1. Temporal trends in COVID-19 vaccination hesitancy and confidence, March 2020-June 2022.

The trend in the belief that COVID-19 vaccines are effective broadly paralleled the trajectory of the vaccine acceptance rate, but with a time lag (Figure 1C). After reaching the peak in July 2020 (17.11% of 49,176 tweets), the belief in vaccine effectiveness started to wane, reaching its lowest level of 7.96% of 239,882 tweets in January 2021, which was approximately 5 months after the nadir in the vaccine acceptance rate observed in August 2020. The belief in vaccine effectiveness continuously rose from January 2021 to October 2021, when the country-level average reached 12.89% of 150,128 tweets. As the Omicron variant emerged and spread in November 2021, the belief in vaccine effectiveness started to decline, dropping to 7.48% of 37,842 tweets in May 2022. In contrast, throughout this study’s period, the belief that COVID-19 vaccines are not safe rose 7.49-fold, from 2.84% of 44,944 tweets in March 2020 to 21.27% of 48,167 tweets in June 2022.

Figure 2 compares the prevalence of hesitancy and confidence in COVID-19 vaccination across the 6 countries. The overall vaccine acceptance was slightly different across the 6 countries, with the highest acceptance rate in Ireland (60.66% of 46,732 tweets) and the lowest rate at 50.43% of 2,228,907 tweets in the United States (Figure 2A). In contrast, the highest rejection rate was observed in the United States (5.75% of 2,228,907 tweets), while the other 5 countries exhibited similar rejection rates, hovering around 3% (Figure 2B). Meanwhile, the 6 countries had similar rates of belief in vaccine effectiveness (ranging from 9.83% of 2,228,907 tweets in the United States to 11.06% of 46,732 tweets in Ireland) and vaccine unsafety (ranging from 8.10% of 46,732 tweets in Ireland to 10.33% of 2,228,907 tweets in the United States; Figures 2C and D).

Figure 2. The prevalence of hesitancy and confidence in COVID-19 vaccination by country.

Variations in hesitancy and confidence toward COVID-19 vaccines by different manufacturers were observed across countries (Table 1). In Ireland, the United States, and Canada, Moderna and Pfizer vaccines reached higher acceptance rates (>50%), whereas Australia and New Zealand recorded lower acceptance rates (about 40%). The acceptance rate of the AstraZeneca vaccine was the highest in the United Kingdom (50.48% of 9185 tweets), surpassing the other 5 countries by 6%-19%. The Johnson & Johnson vaccine was most accepted in Ireland (60.04% of 92 tweets), marking it 15%-29% higher than the other 5 countries. Patterns in rejection rates and confidence toward vaccine safety and effectiveness were aligned with those in the acceptance rates.

Table 1. The prevalence of hesitancy and confidence in COVID-19 vaccination by country and vaccine manufacturer. Some statistics were calculated based on an insufficient amount of data.

Country           Pfizer    Moderna   AstraZeneca   Johnson & Johnson

Intent to accept COVID-19 vaccination, %
United States     49.58     53.13     33.40         37.10
United Kingdom    49.23     45.52     50.48         44.74
New Zealand       42.73     43.01a    31.11a        44.44a

Intent to reject COVID-19 vaccination, %
United States     3.06      2.29      4.63          4.46
United Kingdom    2.26      2.31      1.56          3.07
New Zealand       1.68      0.22a     2.67a         11.11a

Belief that COVID-19 vaccines are effective, %
United States     12.30     13.47     9.56          9.45
United Kingdom    10.72     12.18     12.32         10.04
New Zealand       9.61      15.23a    5.33a         0a

Belief that COVID-19 vaccines are not safe, %
United States     13.38     13.70     23.95         22.70
United Kingdom    15.04     15.64     14.56         15.98
New Zealand       17.12     21.04a    22.53a        33.33a

aBased on an insufficient amount of data (less than 100 Twitter users).

In the United States, the prevalence of hesitancy and confidence in COVID-19 vaccination varied across the states (Figure 3). The acceptance rate of COVID-19 vaccination ranged from 44.50% to 59.90% across the states. Generally, northern states reported higher acceptance rates, whereas southern states showed lower acceptance. Specifically, Florida (44.50% of 121,996 tweets), Nevada (45.41% of 22,045 tweets), and Wyoming (45.53% of 1495 tweets) recorded the lowest acceptance rates. In contrast, Vermont (59.90% of 2903 tweets), Massachusetts (58.88% of 47,894 tweets), and the District of Columbia (57.81% of 53,627 tweets) had the highest acceptance rates (Figure 3A). States with higher acceptance rates typically have lower rejection rates (Figure 3B). Furthermore, confidence in COVID-19 vaccination mirrored the distribution of vaccine hesitancy (Figures 3C and D).

Figure 3. The prevalence of hesitancy and confidence in COVID-19 vaccination across the states in the United States.

Table 2 presents the associations between state-level sociodemographic indicators and attitudes about COVID-19 vaccination among tweets in the United States. In multivariable analyses, compared to the Republican party, the Democrat party was associated with a rejection rate of COVID-19 vaccination that was 0.939 percentage points lower (95% CI –1.673 to –0.206, P=.01), after adjusting for the other sociodemographic factors (model 1). Notably, in model 2, after additionally adjusting for vaccine confidence, the association of the Democrat party with the rejection rate attenuated to 0.416 percentage points lower (95% CI –0.822 to –0.011, P=.04). Indicators of vaccine hesitancy and confidence were highly correlated with each other. Consistent associations were also observed in bivariate analyses.

Table 2. Factors associated with hesitancy and confidence in COVID-19 vaccination across the states in the United States from bivariate and multivariable linear regressions (using state-level statistics, based on 1,812,398 tweets from all 50 US states and Washington DC).

Intent to accept COVID-19 vaccination, 95% CI
Characteristics                                 Bivariate           Model 1a            Model 2b
Aged ≥65 years                                  –0.522 to 0.502     –0.492 to 0.588     –0.355 to 0.152
Density (people per square mile/100)            –0.004 to 0.123     –0.047 to 0.146     –0.053 to 0.041
Gross domestic product (GDP) per capita (ln)    –0.470 to 7.320     –6.610 to 6.436     –2.779 to 3.141
Democrat party (Republican as the reference)    0.267 to 4.168      –0.269 to 4.057     –0.716 to 1.313
Belief that COVID-19 vaccines are effective     2.459 to 3.762      —d                  0.770 to 2.173
  P valuec                                      <.001               —                   <.001
Belief that COVID-19 vaccines are unsafe        –2.818 to –2.013    —                   –2.224 to –1.085
  P valuec                                      <.001               —                   <.001

Intent to reject COVID-19 vaccination, 95% CI
Characteristics                                 Bivariate           Model 1             Model 2
Aged ≥65 years                                  –0.188 to 0.180     –0.212 to 0.154     –0.078 to 0.124
Density (people per square mile/100)            –0.044 to 0.002     –0.039 to 0.027     –0.006 to 0.032
Gross domestic product (GDP) per capita (ln)    –2.992 to –0.261    –2.980 to 1.446     –2.047 to 0.318
Democrat party (Republican as the reference)    –1.794 to –0.471    –1.673 to –0.206    –0.822 to –0.011
Belief that COVID-19 vaccines are effective     –1.298 to –0.771    —                   –0.711 to –0.150
  P valuec                                      <.001               —                   .003
Belief that COVID-19 vaccines are unsafe        0.672 to 0.994      —                   0.340 to 0.795
  P valuec                                      <.001               —                   <.001

Belief that COVID-19 vaccines are effective, 95% CI
Characteristics                                 Bivariate           Model 1             Model 2
Aged ≥65 years                                  –0.145 to 0.121     –0.161 to 0.134     –0.177 to 0.029
Density (people per square mile/100)            –0.011 to 0.023     –0.025 to 0.028     –0.036 to 0.002
Gross domestic product (GDP) per capita (ln)    –0.597 to 1.475     –1.707 to 1.866     –1.013 to 1.448
Democrat party (Republican as the reference)    –0.172 to 0.874     –0.262 to 0.923     –0.489 to 0.355
Belief that COVID-19 vaccines are effective     —                   —                   —
Belief that COVID-19 vaccines are unsafe        –0.659 to –0.363    —                   –0.755 to –0.432
  P valuec                                      <.001               —                   <.001

Belief that COVID-19 vaccines are unsafe, 95% CI
Characteristics                                 Bivariate           Model 1             Model 2
Aged ≥65 years                                  –0.241 to 0.125     –0.285 to 0.080     –0.240 to 0.011
Density (people per square mile/100)            –0.052 to –0.008    –0.065 to 0.0005    –0.053 to –0.008
Gross domestic product (GDP) per capita (ln)    –2.813 to –0.057    –1.969 to 2.434     –1.213 to 1.821
Democrat party (Republican as the reference)    –1.568 to –0.188    –1.401 to 0.060     –0.881 to 0.137
Belief that COVID-19 vaccines are effective     –1.252 to –0.690    —                   –1.148 to –0.657
  P valuec                                      <.001               —                   <.001
Belief that COVID-19 vaccines are unsafe        —                   —                   —

aMultivariable linear regressions with the following covariates: political party (Democrat vs Republican as the reference) [30], population density, percentage of people aged ≥65 years, and log-transformed gross domestic product per capita.

bSimilar to model 1 but additionally adjusted for belief in effectiveness or unsafety of COVID-19 vaccines.

cDetermined using bivariate and multivariable linear regression analysis.

d—: not available.

Principal Findings

This study, using social media data, monitored the trajectories of COVID-19 vaccine hesitancy and confidence in 6 high-income countries throughout the pandemic from January 2020 to June 2022. The 6 countries experienced similar evolving trends of COVID-19 vaccine hesitancy and confidence. Hesitancy toward the COVID-19 vaccine grew after the pandemic began. However, this hesitancy lessened from late 2020 to May 2021, when safety and effectiveness data for the vaccines were released. After May 2021, vaccine hesitancy started to rise again. On average, the prevalence of intent to accept COVID-19 vaccination decreased, with fluctuations, from 71% of 44,944 tweets in March 2020 to 35% of 48,167 tweets in June 2022. COVID-19 vaccine hesitancy and confidence varied by country, vaccine manufacturer, and subregion within a country. In the United States, Democratic-leaning states and higher vaccine confidence were significantly associated with lower vaccine hesitancy.

By finetuning the CT-BERT model, we conducted a cross-country social media listening study, which complements traditional public health surveillance approaches such as surveys in tackling global health challenges. During the COVID-19 pandemic, a rapidly growing body of literature has used social media listening methods to assess public attitudes toward COVID-19 vaccines. They mainly used data from Twitter to analyze public sentiment, acceptance, and topics in antivax and provax discourse toward COVID-19 vaccines [42-45]. These studies provide a solid foundation for social media listening studies on vaccines. Future studies should explore the vast potential of social media listening and how it can be integrated into existing public health surveillance systems to inform near–real-time intervention and address a wide array of global health issues.

Overall, increased COVID-19 vaccine hesitancy and decreased confidence in vaccine effectiveness and safety on Twitter were observed between 2020 and 2022. Such trends were aligned with a number of previous cross-country studies on COVID-19 vaccines [6,29,44,46], which supports the reliability of our findings. Prior to the COVID-19 vaccine rollout, concerns and conspiracy theories surrounding COVID-19 vaccines proliferated, in part due to the speed of vaccine development and the scarcity of clinical trials [47]. This might account for the rise in vaccine hesitancy during the early pandemic. Vaccine hesitancy then decreased with the release of clinical trial results in late 2020, which demonstrated vaccine effectiveness of up to 95% [48-50]. However, with the ongoing mutations of the coronavirus, vaccination has not been able to completely shield individuals from COVID-19 infections or to effectively halt the virus's spread within communities. Such limitations of COVID-19 vaccines might foster distrust and cultivate conspiracy theories. When waning immunity was observed as the Delta and Omicron variants started to spread in May and November 2021, respectively, vaccine hesitancy increased again [41]. In addition, with large-scale COVID-19 vaccination campaigns in 2021, adverse events following immunization (AEFIs) were widely experienced, vocalized, and reported [39,51]. Google searches on vaccine adverse events skyrocketed, indicating rising public concern about vaccine safety [52]. Misinformation and rumors were also widespread, especially on social media platforms, which may have exacerbated concerns about vaccine safety and effectiveness [7,47,53,54]. Preparedness for AEFIs during mass vaccination rollout and rapid responses to misinformation could be essential to reducing the public’s vaccine hesitancy and boosting confidence, not only during the COVID-19 pandemic but also in future ones.

Despite a noticeable decline in vaccine acceptance on Twitter in 2021, real-life daily COVID-19 vaccine uptake stayed consistently high and seemed largely unaffected. This discrepancy might be partly attributed to mandatory COVID-19 vaccination policies. Under such policies, some vaccinated individuals might still harbor and express negative sentiments on social media. It may also stem from gaps between the general population and Twitter users. Vocal antivaccine groups on Twitter might have formed a tight-knit circle, continuously congregating and reinforcing their messages, which could be a significant factor in the perceived decline in vaccine acceptance observed in our data set [55].

The differential prevalence of vaccine hesitancy and confidence by country and manufacturer was also aligned with previous studies [2,4,7,56,57]. Since AstraZeneca and Johnson & Johnson vaccination was paused after related AEFI reports [39,58,59], vaccine acceptance rates for these 2 vaccines were lower in most, but not all, of the 6 high-income countries. UK Twitter users showed a higher acceptance rate of the AstraZeneca vaccine, whose manufacturer is based in the United Kingdom, suggesting that the location of the manufacturer might play an important role in vaccine hesitancy. A survey also showed that the French population reported the lowest hesitancy for vaccines manufactured in the European Union, but higher hesitancy for vaccines manufactured in the United States or China [56].

We found that in the United States, Democrat-leaning states had significantly lower vaccine hesitancy, which was consistent with a previous survey indicating that Republicans exhibited more negative sentiments toward COVID-19 vaccination than Democrats [6]. The party gap in vaccine hesitancy has further been linked to the excess mortality gap between Republicans and Democrats following the deployment of COVID-19 vaccination [60]. Differential exposure to media channels and public figures may explain the observed gap in vaccine hesitancy between self-identified Democrats and Republicans. Democrat-leaning states had a higher percentage of COVID-19 cases in the early pandemic [61], and, partly because of different sources of information, Democrats perceived the COVID-19 threat to be greater than Republicans did [6]. Trust in the media decreased significantly more among Republicans than among Democrats during the pandemic [6], and misinformation on COVID-19 vaccination may be more likely to spread among people who do not trust the information source (ie, Republicans). Our study highlights the need to bridge the vaccine confidence gap and to prevent deaths associated with political affiliation, particularly by addressing the rising trends of vaccine hesitancy among Republicans.

Our study is subject to several limitations. Twitter is more commonly used by the younger generations [62], literate people, and those with access to the internet. Therefore, discourse on Twitter might not reflect the broader population, and our ecological analysis might be biased. However, by analyzing all English-language tweets containing COVID-19 vaccine–related keywords with a state-of-the-art deep learning model, our results are valid and robust enough to represent the opinion of Twitter users in the 6 high-income countries, and are aligned with previous survey studies [8,46]. Furthermore, our correlation analysis was also restricted to state-level statistics in the United States. Examining other factors associated with COVID-19 vaccination using a larger sample size should be considered in further studies. Finally, future analyses should be conducted to explain the trajectories in COVID-19 vaccine hesitancy and confidence in the 6 countries.

Our study also has several notable strengths. First, we continuously monitored vaccine hesitancy and confidence throughout the entire pandemic in 6 countries, which is often infeasible for survey studies. This longitudinal analysis enhances our understanding of evolving vaccine attitudes and strengthens preparedness for future pandemics. Second, compared with survey studies, social media data provide a cost-efficient approach to tracking the trajectory of vaccine hesitancy and confidence, facilitating near–real-time public health interventions. Third, this study provides a pathway to monitor the subsequent “infodemics” of misinformation, rumor, and distrust using advanced machine learning approaches. Although social media is a major breeding ground for the “infodemic” that challenges the public health response during the pandemic, studies that target the infodemic and its linkages to factors such as sociodemographic indicators are limited. Future studies may leverage more in-depth analyses of social media data for detecting, tracking, and addressing the “infodemic.” Overall, this study fine-tuned state-of-the-art deep learning models on a task-specific data set as a rapid and effective approach to monitoring vaccine hesitancy and confidence, which provides a more reliable estimate of both.
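As a rough illustration of how tweet-level classifications could be turned into the kind of trajectory described above, the sketch below aggregates already-labeled tweets into a weekly prevalence per country. It is not the study's code: the record layout, the label names (for example, `intent_to_reject`), and the toy records are all hypothetical placeholders, and the classifier itself is assumed to have been run beforehand.

```python
from collections import defaultdict
from datetime import date


def weekly_prevalence(tweets, label):
    """Return {(country, iso_year, iso_week): share of tweets carrying `label`}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for t in tweets:
        iso = t["date"].isocalendar()  # (ISO year, ISO week, ISO weekday)
        key = (t["country"], iso[0], iso[1])
        totals[key] += 1
        if label in t["labels"]:
            hits[key] += 1
    return {k: hits[k] / totals[k] for k in totals}


# Toy records invented for illustration only; not data from the study.
toy = [
    {"country": "US", "date": date(2021, 3, 1), "labels": {"intent_to_reject"}},
    {"country": "US", "date": date(2021, 3, 2), "labels": {"intent_to_accept"}},
    {"country": "US", "date": date(2021, 3, 3), "labels": set()},
    {"country": "US", "date": date(2021, 3, 4),
     "labels": {"intent_to_reject", "belief_unsafe"}},
    {"country": "UK", "date": date(2021, 3, 1), "labels": {"belief_effective"}},
]

result = weekly_prevalence(toy, "intent_to_reject")
for key in sorted(result):
    print(key, round(result[key], 2))
```

Grouping by ISO week keeps the series comparable across countries and years; a subnational series would only require replacing the `country` field with a finer-grained location key.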


Conclusions

With an advanced deep learning model and large-scale social media data, this study tracked the trajectory of COVID-19 vaccine acceptance and confidence on Twitter in 6 high-income countries. We highlighted the similarity in the temporal trends in hesitancy and confidence across countries, which were influenced by the development of vaccines and the evolution of the virus throughout the pandemic. This study also revealed discrepancies across regions and vaccine manufacturers and suggested that the spatial variation may be associated with political ideology. This surveillance study highlights the importance of deep learning–based social media monitoring for detecting emerging trends, informing timely interventions, and providing insight not yet covered in previous surveys. Future studies should leverage deep learning models as a rapid and effective approach to monitoring public hesitancy toward various kinds of vaccines in real time, with data from multiple social media platforms.


Acknowledgments

ZH acknowledges financial support from the Soft Science Research Project of Shanghai Science and Technology Innovation Action Plan (22692107600). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of this paper. We thank Yixing Tong, Fanxing Du, Linyao Lu, Sihong Zhao, and Kexin Yu from the School of Public Health, Fudan University for their help with data annotation, and Chen Wang from the Software School of Fudan University for his help with machine learning.

Data Availability

Results and Python code for data analysis are available on request from the corresponding author (ZH). However, the original tweet data cannot be shared under Twitter's Terms of Service.

Authors' Contributions

ZH supervised this study. ZH and XZ conceived and designed this study. XZ acquired Twitter data, and YZ collected sociodemographic data in the United States. XZ implemented data preprocessing, deep learning analysis, and data visualization. SS conducted statistical analyses. SS, ZH, and XZ drafted this paper. ZH provided administrative, technical, and material support. All authors contributed to data interpretation and approved this paper.

Conflicts of Interest

None declared.

Multimedia Appendix 1

Details of methods, Tables S1-S5, and Figures S1-S4.

DOCX File, 3025 KB

  1. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20(5):533-534.
  2. Murphy J, Vallières F, Bentall RP, Shevlin M, McBride O, Hartman TK, et al. Psychological characteristics associated with COVID-19 vaccine hesitancy and resistance in Ireland and the United Kingdom. Nat Commun. 2021;12(1):29.
  3. Understanding vaccination progress. Johns Hopkins University and Medicine Coronavirus Resource Center. 2022. URL: [accessed 2023-10-11]
  4. Aw J, Seng JJB, Seah SSY, Low LL. COVID-19 vaccine Hesitancy-A scoping review of literature in high-income countries. Vaccines (Basel). 2021;9(8):900.
  5. Muqattash R, Niankara I, Traoret RI. Survey data for COVID-19 vaccine preference analysis in the United Arab Emirates. Data Brief. 2020;33:106446.
  6. Fridman A, Gershon R, Gneezy A. COVID-19 and vaccine hesitancy: a longitudinal study. PLoS One. 2021;16(4):e0250123.
  7. Loomba S, de Figueiredo A, Piatek SJ, de Graaf K, Larson HJ. Measuring the impact of COVID-19 vaccine misinformation on vaccination intent in the UK and USA. Nat Hum Behav. 2021;5(3):337-348.
  8. Szilagyi PG, Thomas K, Shah MD, Vizueta N, Cui Y, Vangala S, et al. National trends in the US public's likelihood of getting a COVID-19 vaccine-April 1 to December 8, 2020. JAMA. 2020;325(4):396-398.
  9. Malik AA, McFadden SM, Elharake J, Omer SB. Determinants of COVID-19 vaccine acceptance in the US. EClinicalMedicine. 2020;26:100495.
  10. Song S, Zang S, Gong L, Xu C, Lin L, Francis MR, et al. Willingness and uptake of the COVID-19 testing and vaccination in urban China during the low-risk period: a cross-sectional study. BMC Public Health. 2022;22(1):556.
  11. Hou Z, Song S, Du F, Shi L, Zhang D, Lin L, et al. The influence of the COVID-19 epidemic on prevention and vaccination behaviors among Chinese children and adolescents: cross-sectional online survey study. JMIR Public Health Surveill. 2021;7(5):e26372.
  12. Karafillakis E, Van Damme P, Hendrickx G, Larson HJ. COVID-19 in Europe: new challenges for addressing vaccine hesitancy. Lancet. 2022;399(10326):699-701.
  13. Soares P, Rocha JV, Moniz M, Gama A, Laires PA, Pedro AR, et al. Factors associated with COVID-19 vaccine hesitancy. Vaccines. 2021;9(3):300.
  14. Karafillakis E, Martin S, Simas C, Olsson K, Takacs J, Dada S, et al. Methods for social media monitoring related to vaccination: systematic scoping review. JMIR Public Health Surveill. 2021;7(2):e17149.
  15. Du J, Cunningham RM, Xiang Y, Li F, Jia Y, Boom JA, et al. Leveraging deep learning to understand health beliefs about the human papillomavirus vaccine from social media. NPJ Digit Med. 2019;2:27.
  16. Hou Z, Tong Y, Du F, Lu L, Zhao S, Yu K, et al. Assessing COVID-19 vaccine hesitancy, confidence, and public engagement: a global social listening study. J Med Internet Res. 2021;23(6):e27632.
  17. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A, et al. Attention is all you need. ArXiv Preprint posted online on 02 Aug 2023.
  18. Devlin J, Chang MW, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. ArXiv Preprint posted online on 24 May 2019.
  19. Müller M, Salathé M, Kummervold PE. COVID-Twitter-BERT: a natural language processing model to analyse COVID-19 content on Twitter. ArXiv Preprint posted online on 15 May 2020. ;6
  20. Hussain A, Tahir A, Hussain Z, Sheikh Z, Gogate M, Dashtipour K, et al. Artificial intelligence-enabled analysis of public attitudes on Facebook and Twitter toward COVID-19 vaccines in the United Kingdom and the United States: observational study. J Med Internet Res. 2021;23(4):e26627.
  21. Lyu H, Wu W, Wang J, Duong V, Zhang X, Luo J, et al. Social media study of public opinions on potential COVID-19 vaccines: informing dissent, disparities, and dissemination. ArXiv Preprint posted online on 09 Aug 2021.
  22. Criss S, Nguyen TT, Norton S, Virani I, Titherington E, Tillmanns EL, et al. Advocacy, hesitancy, and equity: exploring U.S. race-related discussions of the COVID-19 vaccine on Twitter. Int J Environ Res Public Health. 2021;18(11):5693.
  23. Lyu JC, Han EL, Luli GK. COVID-19 vaccine-related discussion on Twitter: topic modeling and sentiment analysis. J Med Internet Res. 2021;23(6):e24435.
  24. Bacsu JD, O'Connell ME, Cammer A, Azizi M, Grewal K, Poole L, et al. Using Twitter to understand the COVID-19 experiences of people with dementia: infodemiology study. J Med Internet Res. 2021;23(2):e26254.
  25. Swanson K, Ravi A, Saleh S, Weia B, Pleasants E, Arvisais-Anhalt S. Effect of recent abortion legislation on Twitter user engagement, sentiment, and expressions of trust in clinicians and privacy of health information: content analysis. J Med Internet Res. 2023;25:e46655.
  26. jonbakerfish/TweetScraper. GitHub. 2021. URL: [accessed 2023-10-11]
  27. Dredze M, Paul MJ, Bergsma S, Tran H. Carmen: a Twitter geolocation system with applications to public health. AAAI Workshop—Technical Report. 2013:20-24.
  28. MacDonald NE, SAGE Working Group on Vaccine Hesitancy. Vaccine hesitancy: definition, scope and determinants. Vaccine. 2015;33(34):4161-4164.
  29. Zhou X, Zhang X, Larson HJ, de Figueiredo A, Jit M, Fodeh S, et al. Global spatiotemporal trends and determinants of COVID-19 vaccine acceptance on Twitter: a multilingual deep learning study in 135 countries and territories. medRxiv. 2022
  30. Map of the presidential election of 2020 between Joe Biden and Donald Trump. Wikimedia Commons. 2022. URL: [accessed 2023-10-11]
  31. GDP by State. The US Bureau of Economic Analysis. Bureau of Economic Analysis. 2023. URL: [accessed 2023-10-11]
  32. Abbas KM, Kang GJ, Chen D, Werre SR, Marathe A. Demographics, perceptions, and socioeconomic factors affecting influenza vaccination among adults in the United States. PeerJ. 2018;6:e5171.
  33. Ye X. Exploring the relationship between political partisanship and COVID-19 vaccination rate. J Public Health (Oxf). 2023;45(1):91-98.
  34. Eden J, Salas J, Rutschman AS, Prener CG, Niemotka SL, Wiemken TL. Associations of presidential voting preference and gubernatorial control with county-level COVID-19 case and death rates in the continental United States. Public Health. 2021;198:161-163.
  35. Albrecht D. Vaccination, politics and COVID-19 impacts. BMC Public Health. 2022;22(1):96.
  36. Rocklöv J, Sjödin H. High population densities catalyse the spread of COVID-19. J Travel Med. 2020;27(3):taaa038.
  37. Chen Y, Klein SL, Garibaldi BT, Li H, Wu C, Osevala NM, et al. Aging in COVID-19: vulnerability, immunity and intervention. Ageing Res Rev. 2021;65:101205.
  38. Hughes MM, Wang A, Grossman MK, Pun E, Whiteman A, Deng L, et al. County-level COVID-19 vaccination coverage and social vulnerability—United States, December 14, 2020-March 1, 2021. MMWR Morb Mortal Wkly Rep. 2021;70(12):431-436.
  39. Joint CDC and FDA Statement on Johnson & Johnson COVID-19 Vaccine. Centers for Disease Control and Prevention. 2021. URL: [accessed 2023-10-11]
  40. Variants of the virus. Centers for Disease Control and Prevention. 2021. URL: [accessed 2023-10-11]
  41. Tracking SARS-CoV-2 variants. World Health Organization. 2022. URL: [accessed 2023-10-11]
  42. Zaidi Z, Ye M, Samon F, Jama A, Gopalakrishnan B, Gu C, et al. Topics in antivax and provax discourse: yearlong synoptic study of COVID-19 vaccine tweets. J Med Internet Res. 2023;25:e45069.
  43. Ye J, Hai J, Wang Z, Wei C, Song J. Leveraging natural language processing and geospatial time series model to analyze COVID-19 vaccination sentiment dynamics on tweets. JAMIA Open. 2023;6(2):ooad023.
  44. To QG, To KG, Huynh VAN, Nguyen NT, Ngo DT, Alley S, et al. Anti-vaccination attitude trends during the COVID-19 pandemic: a machine learning-based analysis of tweets. Digit Health. 2023;9:20552076231158033.
  45. Cheng T, Han B, Liu Y. Exploring public sentiment and vaccination uptake of COVID-19 vaccines in England: a spatiotemporal and sociodemographic analysis of Twitter data. Front Public Health. 2023;11:1193750.
  46. KAP COVID trend analysis for 23 countries. Johns Hopkins Center for Communication Programs. URL: [accessed 2021-05-31]
  47. Hotez PJ, Cooney RE, Benjamin RM, Brewer NT, Buttenheim AM, Callaghan T, et al. Announcing the lancet commission on vaccine refusal, acceptance, and demand in the USA. Lancet. 2021;397(10280):1165-1167.
  48. Mulligan MJ, Lyke KE, Kitchin N, Absalon J, Gurtman A, Lockhart S, et al. Phase I/II study of COVID-19 RNA vaccine BNT162b1 in adults. Nature. 2020;586(7830):589-593.
  49. Mahase E. Covid-19: Oxford vaccine is up to 90% effective, interim analysis indicates. BMJ. 2020;371:m4564.
  50. Pfizer and BioNTech conclude phase 3 study of COVID-19 vaccine candidate, meeting all primary efficacy endpoints. Pfizer. 2020. URL: https:/​/www.​​news/​press-release/​press-release-detail/​pfizer-and-biontech-conclude-phase-3-study-covid-19-vaccine [accessed 2023-10-11]
  51. Hacisuleyman E, Hale C, Saito Y, Blachere NE, Bergh M, Conlon EG, et al. Vaccine breakthrough infections with SARS-CoV-2 variants. N Engl J Med. 2021;384(23):2212-2218.
  52. Vaccine adverse event. Google Trends. 2023. URL: [accessed 2023-10-11]
  53. Hou Z, Du F, Zhou X, Jiang H, Martin S, Larson H, et al. Cross-country comparison of public awareness, rumors, and behavioral responses to the COVID-19 epidemic: infodemiology study. J Med Internet Res. 2020;22(8):e21143.
  54. Depoux A, Martin S, Karafillakis E, Preet R, Wilder-Smith A, Larson H. The pandemic of social media panic travels faster than the COVID-19 outbreak. J Travel Med. 2020;27(3):taaa031.
  55. Zang S, Zhang X, Xing Y, Chen J, Lin L, Hou Z. Applications of social media and digital technologies in COVID-19 vaccination: scoping review. J Med Internet Res. 2023;25:e40057.
  56. Schwarzinger M, Watson V, Arwidson P, Alla F, Luchini S. COVID-19 vaccine hesitancy in a representative working-age population in France: a survey experiment based on vaccine characteristics. Lancet Public Health. 2021;6(4):e210-e221.
  57. Self WH, Tenforde MW, Rhoads JP, Gaglani M, Ginde AA, Douin DJ, et al. MMWR Morb Mortal Wkly Rep. 2021;70(38):1337-1343.
  58. Vogel G, Kupferschmidt K. 'It's a very special picture.' Why vaccine safety experts put the brakes on AstraZeneca's COVID-19 vaccine. Science. 2021. URL: https:/​/www.​​content/​article/​it-s-very-special-picture-why-vaccine-safety-experts-put-brakes-astrazeneca-s-covid-19 [accessed 2023-10-11]
  59. Johnson & Johnson Vaccinations Paused After Rare Clotting Cases Emerge. The New York Times. 2021. URL: [accessed 2023-10-11]
  60. She hunts viral rumors about real viruses. The New York Times. 2020. URL: [accessed 2023-10-11]
  61. Coronavirus has come to Trump country. The Washington Post. 2021. URL: [accessed 2023-10-11]
  62. Sloan L, Morgan J, Burnap P, Williams M. Who tweets? Deriving the demographic characteristics of age, occupation and social class from Twitter user meta-data. PLoS One. 2015;10(3):e0115545.

Abbreviations

AEFI: adverse events following immunization
BERT: Bidirectional Encoder Representations from Transformers
CT-BERT: COVID-Twitter-Bidirectional Encoder Representations from Transformers
NLP: natural language processing

Edited by T de Azevedo Cardoso; submitted 07.06.23; peer-reviewed by A Jarynowski, S Lin; comments to author 24.07.23; revised version received 17.09.23; accepted 03.10.23; published 06.11.23.


©Xinyu Zhou, Suhang Song, Ying Zhang, Zhiyuan Hou. Originally published in the Journal of Medical Internet Research, 06.11.2023.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.