Spatiotemporal Trends in Self-Reported Mask-Wearing Behavior in the United States: Analysis of a Large Cross-sectional Survey

Background
Face mask wearing has been identified as an effective strategy to prevent the transmission of SARS-CoV-2, yet mask mandates were never imposed nationally in the United States. This decision resulted in a patchwork of local policies and varying compliance, potentially generating heterogeneities in the local trajectories of COVID-19 in the United States. Although numerous studies have investigated the patterns and predictors of masking behavior nationally, most suffer from survey biases and none have been able to characterize mask wearing at fine spatial scales across the United States through different phases of the pandemic.

Objective
Urgently needed is a debiased spatiotemporal characterization of mask-wearing behavior in the United States. This information is critical to further assess the effectiveness of masking, evaluate the drivers of transmission at different time points during the pandemic, and guide future public health decisions through, for example, forecasting disease surges.

Methods
We analyzed spatiotemporal masking patterns in over 8 million behavioral survey responses from across the United States, starting in September 2020 through May 2021. We adjusted for sample size and representation using binomial regression models and survey raking, respectively, to produce county-level monthly estimates of masking behavior. We additionally debiased self-reported masking estimates using bias measures derived by comparing vaccination data from the same survey to official records at the county level. Lastly, we evaluated whether individuals’ perceptions of their social environment can serve as a less biased form of behavioral surveillance than self-reported data.

Results
We found that county-level masking behavior was spatially heterogeneous along an urban-rural gradient, with mask wearing peaking in winter 2021 and declining sharply through May 2021. Our results identified regions where targeted public health efforts could have been most effective and suggest that individuals’ frequency of mask wearing may be influenced by national guidance and disease prevalence. We validated our bias correction approach by comparing debiased self-reported mask-wearing estimates with community-reported estimates, after addressing issues of a small sample size and representation. Self-reported behavior estimates were especially prone to social desirability and nonresponse biases, and our findings demonstrated that these biases can be reduced if individuals are asked to report on community rather than self behaviors.

Conclusions
Our work highlights the importance of characterizing public health behaviors at fine spatiotemporal scales to capture heterogeneities that may drive outbreak trajectories. Our findings also emphasize the need for a standardized approach to incorporating behavioral big data into public health response efforts. Even large surveys are prone to bias; thus, we advocate for a social sensing approach to behavioral surveillance to enable more accurate estimates of health behaviors. Finally, we invite the public health and behavioral research communities to use our publicly available estimates to consider how bias-corrected behavioral estimates may improve our understanding of protective behaviors during crises and their impact on disease dynamics.

In addition, when dichotomizing CTIS responses, we assume that people who "sometimes" wore a face mask in public were not masking, which may lead us to underestimate self-reported masking levels. We believe the effects of this assumption are minimal because the proportion of "sometimes" responses was small compared to other options (Fig. S14).

Estimation of CTIS bias
We compare CTIS responses about vaccination to true vaccination estimates for the period from April 1, 2021 through May 31, 2021. We chose this period, when nearly all adults in the U.S. were eligible for vaccination, so that the sample populations responding to the masking and vaccination questions would be most similar. In this time period, the differences between the survey and ground-truth vaccination data had also stabilized (Fig. S18). We use a (frequentist) binomial generalized linear mixed-effects model to estimate $p_i$, the proportion of respondents who were vaccinated at the county level each week. If $V_i$ is the number of (partially) vaccinated respondents in each county $i$ out of $N_i$ respondents, then the model is as follows:

$$V_i \sim \mathrm{Binomial}(N_i, p_i), \qquad \mathrm{logit}(p_i) = \beta_0 + \beta_1 t + \beta_2 t^2 + u_i, \qquad u_i \sim \mathcal{N}(0, \sigma_u^2),$$

where $t$ and $t^2$ are orthogonal polynomials of degree 1 and 2, respectively, generated by the poly function in R from the rank of the weeks in which the vaccination data were observed. Therefore, $\beta_1$ and $\beta_2$ are coefficients that describe the time trend in expected reported vaccination across counties, while $u_i$ describes the systematic difference in vaccination in county $i$ relative to the mean trend. This model was implemented using glmer in the lme4 package [36].
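To illustrate what orthogonal polynomial time covariates look like, the following is a minimal Python sketch. It is not the authors' code: they used R's poly function, and this sketch only builds columns with the same defining properties (unit length, orthogonal to each other and to a constant) by Gram-Schmidt orthogonalization. The function name `orthogonal_poly` and the nine week ranks are assumptions made for illustration, and the signs of the columns may differ from R's output.

```python
import math

def orthogonal_poly(x, degree=2):
    """Return `degree` columns that are orthonormal and orthogonal to a constant."""
    n = len(x)
    mean = sum(x) / n
    centered = [xi - mean for xi in x]
    basis = [[1.0] * n]  # constant column, used only for orthogonalization
    columns = []
    for d in range(1, degree + 1):
        col = [c ** d for c in centered]
        # Gram-Schmidt: remove the projection onto every earlier column
        for b in basis:
            coef = sum(ci * bi for ci, bi in zip(col, b)) / sum(bi * bi for bi in b)
            col = [ci - coef * bi for ci, bi in zip(col, b)]
        norm = math.sqrt(sum(ci * ci for ci in col))
        col = [ci / norm for ci in col]
        basis.append(col)
        columns.append(col)
    return columns

# Hypothetical ranks of the observed weeks (illustrative only)
weeks = [1, 2, 3, 4, 5, 6, 7, 8, 9]
t1, t2 = orthogonal_poly(weeks, degree=2)
```

Because the linear and quadratic columns are uncorrelated, their coefficients can be estimated without the collinearity that raw week numbers and their squares would introduce.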
We compared these modeled CTIS county-level vaccination proportions with the true vaccination data to calculate the expected bias in reported survey responses relative to ground-truth data in county i. Bias estimates were missing for 45 counties, either because no weekly CTIS vaccination proportions fell strictly between 0 and 1 (the logit is undefined at p = 0 and p = 1) or because true vaccination estimates were missing for the weeks with survey proportions between 0 and 1 (so the difference between true and CTIS vaccination estimates could not be calculated).
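The reason some counties lack a bias estimate can be made concrete with a short sketch. It assumes the bias is a difference on the logit scale, taken as survey minus ground truth. The text implies the logit scale but does not state the exact formula or sign convention, so both are assumptions, as are the function names.

```python
import math

def logit(p):
    """Log-odds of a proportion; undefined at the boundaries 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is undefined for p <= 0 or p >= 1")
    return math.log(p / (1.0 - p))

def logit_bias(p_survey, p_truth):
    """Assumed bias measure: logit(survey) - logit(truth).

    Returns None when either value is missing or lies on the boundary,
    mirroring the two reasons given for missing county bias estimates.
    """
    if p_survey is None or p_truth is None:
        return None
    if not (0.0 < p_survey < 1.0 and 0.0 < p_truth < 1.0):
        return None
    return logit(p_survey) - logit(p_truth)
```

For example, `logit_bias(1.0, 0.5)` returns `None` because a survey proportion of exactly 1 has no finite logit, while `logit_bias(0.6, 0.5)` returns a positive value, indicating over-reporting relative to the official record under this sign convention.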

CTIS model coefficients
Model coefficients for z-score($\log_{10}$(population density)) ranged from 0.45 to 0.55 (Fig. S22), meaning a one unit change in z-score($\log_{10}$(population density)) is correlated with the expected odds of masking multiplying by $e^{0.5} \approx 1.65$. The coefficient of the population density covariate in our binomial regression model is consistent over time, indicating that the relationship between population density and masking behavior is stable across months.
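The arithmetic behind the odds interpretation can be checked with a few lines of Python. This is a worked illustration only: the helper names are hypothetical, and the baseline probability of 0.5 is an arbitrary example value, not an estimate from the study.

```python
import math

def odds_multiplier(coef, delta=1.0):
    """Factor by which the odds change for a `delta`-unit change in the covariate."""
    return math.exp(coef * delta)

def shift_probability(p, coef, delta=1.0):
    """Apply the odds multiplier to a baseline probability `p`."""
    odds = p / (1.0 - p)
    new_odds = odds * odds_multiplier(coef, delta)
    return new_odds / (1.0 + new_odds)

# Lower bound, midpoint, and upper bound of the reported coefficient range
for coef in (0.45, 0.50, 0.55):
    print(coef, round(odds_multiplier(coef), 2))
```

Across the reported range, the multiplier runs from about 1.57 to about 1.73. On the probability scale, a county with a hypothetical baseline masking probability of 0.5 would move to roughly 0.62 for a one-unit increase at a coefficient of 0.5.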

CTIS mixed effects model specifications
In addition to the models presented in the main text, we ran two models to estimate mask-wearing at the county-month level using state- or county-level random effects. For both models, we define $M_i$ as the number of respondents masking in county $i$ (e.g., respondents who masked most or all of the time in the past 5-7 days), $N_i$ as the total number of respondents in county $i$ ($M_i \le N_i$), and $p_i$ as the county-level probability of a response consistent with masking. We ran both models using brms [29] with the cmdstanR [30] backend, running the sampler with 4 chains for 3000 iterations per chain.

We use the following model to estimate $\hat{p}_i$ and $\hat{M}_i$ with state-level effects:

$$M_i \sim \mathrm{Binomial}(N_i, p_i), \qquad \mathrm{logit}(p_i) = \beta_0 + \beta_1 D_i + u_{s[i]},$$

where $D_i = \log_{10}(\text{population density}_i)$ for county $i$ and $u_{s[i]}$ is the random effect for the state containing county $i$. The $\hat{R}$ values were frequently 1.02 and $n_{\text{eff}} < 500$ for the intercept population-level effect and group-level effects, indicating lack of convergence. The population density coefficient did show convergence ($\hat{R} = 1$ and $n_{\text{eff}} > 1800$) and remained consistent over time and close to the values observed in the original model. Only approximately 1% of observations had Pareto k values > 0.7, indicating that the model was robust to the influence of individual observations.

For county-level effects, we used the following model to estimate $\hat{p}_i$ and $\hat{M}_i$:

$$M_i \sim \mathrm{Binomial}(N_i, p_i), \qquad \mathrm{logit}(p_i) = \beta_0 + \beta_1 D_i + u_i,$$

where $D_i = \log_{10}(\text{population density}_i)$ for county $i$ and $u_i$ is the random effect for county $i$. In this case, sampler diagnostics indicated reasonable model convergence, with $\hat{R} \le 1.01$ and $n_{\text{eff}} > 600$ for all months except April and May 2021, when $n_{\text{eff}} > 250$. However, over 50% of observations in each month had Pareto k values > 0.7, indicating that individual data points were highly influential and that the model may be overfitting. As observed with the state-level random effects model, the population density coefficient remained consistent over time, with values near those from the original model.
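The diagnostics discussed above can be summarized per model with a small helper. This is a sketch under stated assumptions, not part of the authors' workflow: the function name, the default thresholds for R-hat (1.01) and effective sample size (400), and every diagnostic value in the example are hypothetical. Only the Pareto k cutoff of 0.7 is taken from the text.

```python
def summarize_diagnostics(rhat, n_eff, pareto_k,
                          rhat_max=1.01, n_eff_min=400, k_max=0.7):
    """Summarize sampler and Pareto k diagnostics for one fitted model.

    rhat, n_eff: dicts mapping parameter name -> diagnostic value.
    pareto_k: list of per-observation Pareto k values.
    """
    # A parameter is flagged if either convergence diagnostic looks poor
    flagged = sorted(
        name for name in rhat
        if rhat[name] > rhat_max or n_eff[name] < n_eff_min
    )
    # Share of observations that strongly influence the fit
    share_high_k = sum(k > k_max for k in pareto_k) / len(pareto_k)
    return {"flagged_parameters": flagged, "share_high_k": share_high_k}

# Hypothetical values loosely patterned on the state-level description
example = summarize_diagnostics(
    rhat={"Intercept": 1.02, "density": 1.00, "sd_state": 1.02},
    n_eff={"Intercept": 450, "density": 1900, "sd_state": 480},
    pareto_k=[0.1] * 99 + [0.8],
)
```

In this made-up example, the intercept and the group-level standard deviation are flagged while the density coefficient is not, and 1% of observations exceed the Pareto k cutoff.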

Outbreaks Near Me survey details & results
The Outbreaks Near Me (ONM) survey was created by scientists at Harvard and Boston Children's Hospital and distributed through a partnership with SurveyMonkey. Following the completion of another survey on SurveyMonkey, a random representative sample of users across the United States was invited to take the Outbreaks Near Me survey. The survey was released in June 2020 and asked respondents how likely they were to wear a mask in several different environments: while grocery shopping, visiting with friends and family in their homes, exercising outside, and in the workplace. Answer options were (1) Very, (2) Somewhat, (3) Not so, or (4) Not likely at all (Fig. S23). We focus on the responses to the grocery shopping scenario, as this setting is most comparable to the "in public" scenario described in the CTIS question. To dichotomize these responses for an analysis of the proportion of respondents wearing masks, we consider "very likely" responses as masking and all other responses as not masking. This choice allows for comparison with CTIS data and makes sense given the small percentage of "somewhat likely" responses (Fig. S24). We aggregate responses at the zip code-month scale and crosswalk these estimates to the county-month level using HUD files [61]. The Outbreaks Near Me survey dropped responses from individuals who reported their age as less than 13 or more than 100 years old. We additionally drop respondents who did not respond to the grocery store portion of the masking question or who have an invalid zip code that cannot be crosswalked to a FIPS code; this process leaves us with 1,042,685 valid responses. Survey weights were provided for individual responses at the weekly-state and daily-national scales, though we do not use them because our analysis focuses on the county-month scale.
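The dichotomization and zip-to-county step described above could be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: it assumes each zip code's counts are split across counties in proportion to an allocation ratio of the kind found in HUD crosswalk files, and the text does not say which ratio or allocation rule was used. All function names, zip codes, and county codes here are hypothetical.

```python
from collections import defaultdict

def is_masking(response):
    """Only a 'Very' (likely) answer counts as masking; all others do not."""
    return response == "Very"

def tally(responses):
    """Count masking and total responses per (zip_code, month)."""
    counts = defaultdict(lambda: [0, 0])
    for zip_code, month, answer in responses:
        counts[(zip_code, month)][0] += int(is_masking(answer))
        counts[(zip_code, month)][1] += 1
    return {key: tuple(val) for key, val in counts.items()}

def zip_to_county(zip_counts, crosswalk):
    """Allocate zip-month counts to county-months using crosswalk ratios.

    zip_counts: {(zip_code, month): (n_masking, n_total)}
    crosswalk:  {zip_code: [(county_fips, ratio), ...]}, ratios sum to 1 per zip
    """
    county = defaultdict(lambda: [0.0, 0.0])
    for (zip_code, month), (n_mask, n_total) in zip_counts.items():
        # Zip codes absent from the crosswalk are dropped
        for fips, ratio in crosswalk.get(zip_code, []):
            county[(fips, month)][0] += n_mask * ratio
            county[(fips, month)][1] += n_total * ratio
    return {key: tuple(val) for key, val in county.items()}

# Hypothetical responses and crosswalk entries
responses = [("00001", "2020-09", "Very")] * 3 + [("00001", "2020-09", "Somewhat")]
crosswalk = {"00001": [("01001", 0.75), ("01003", 0.25)]}
county_counts = zip_to_county(tally(responses), crosswalk)
```

The resulting county-month counts are fractional, which is acceptable for computing proportions but would need rounding or another treatment before being passed to a binomial model.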
Due to small sample sizes, we use binomial regression models to estimate masking proportions for each county-month, as described in 'Bayesian binomial regression model' in the Methods (Figs. S25, S26). We compare estimates from these models to CTIS values calculated using only the binomial regression model, i.e., no raking/resampling or debiasing. We calculate the ratio of ONM to CTIS masking proportions for each county each month and take the average across all months in the survey, excluding counties that have estimates for fewer than five of nine months. Additionally, we visualize the time series of individual counties from the two surveys side by side.

Figure S10. Community-reported masking gives a good estimate of bias-corrected self-reported masking even when influential FIPS codes are removed from the model. FIPS codes with Pareto k values >= 0.7 were excluded from the Bayesian binomial regression model with bias offsets (specifically FIPS 4019, 6037, 6071, 12071, 12103, 36005, 40143, 41039, 45045, 48201, 48439, 53033). Community-reported masking refers to the CTIS question where individuals report how many people in their community are masking, which may decrease non-response and social desirability bias compared to asking individuals to self-report their masking behavior. Point color denotes urban-rural classes.

Figure S11. Self-reported masking estimates generally exceed community-reported estimates without bias correction, but the data are noisy without the binomial regression model. Both masking estimates are calculated from raked and resampled observations but are not run through a binomial regression model to correct for small sample size. Additionally, self-reported masking estimates are not debiased. Recall that community-reported masking refers to the CTIS question where individuals report how many people in their community are masking, which may decrease non-response and social desirability bias compared to asking individuals to self-report their masking behavior. Point color denotes urban-rural classes.

Figure S12. Non-bias-corrected self-reported masking estimates exceed community-reported estimates to a greater degree in rural counties. Both masking estimates are calculated from raked and resampled observations run through a binomial regression model to correct for small sample size. Self-reported masking estimates are not debiased. Recall that community-reported masking refers to the CTIS question where individuals report how many people in their community are masking, which may decrease non-response and social desirability bias compared to asking individuals to self-report their masking behavior. Point color denotes urban-rural classes.
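The ONM-to-CTIS comparison described in this section (a monthly ratio per county, averaged over months, with sparsely covered counties excluded) can be sketched as below. It assumes that a month counts toward the five-month minimum only when both surveys have an estimate for it. The text does not spell out this detail, and the function name and example values are hypothetical.

```python
from collections import defaultdict

def mean_ratio_by_county(onm, ctis, min_months=5):
    """Average the monthly ONM/CTIS ratio per county.

    onm, ctis: {(county_fips, month): masking proportion}
    Counties with fewer than `min_months` usable months are excluded.
    """
    ratios = defaultdict(list)
    for (fips, month), onm_p in onm.items():
        ctis_p = ctis.get((fips, month))
        if ctis_p is None or ctis_p == 0:
            continue  # no usable CTIS estimate for this county-month
        ratios[fips].append(onm_p / ctis_p)
    return {
        fips: sum(vals) / len(vals)
        for fips, vals in ratios.items()
        if len(vals) >= min_months
    }

# Hypothetical proportions: county "A" has five months, county "B" only four
months = ["2020-09", "2020-10", "2020-11", "2020-12", "2021-01"]
onm = {("A", m): 0.9 for m in months}
onm.update({("B", m): 0.8 for m in months[:4]})
ctis = {("A", m): 0.6 for m in months}
ctis.update({("B", m): 0.8 for m in months[:4]})
result = mean_ratio_by_county(onm, ctis)
```

A mean ratio above 1 indicates that ONM estimates of masking exceed CTIS estimates for that county on average, and a ratio below 1 indicates the reverse.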