Published on in Vol 27 (2025)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/66097, first published .
Association Between COVID-19 During Pregnancy and Preterm Birth by Trimester of Infection: Retrospective Cohort Study Using Large-Scale Social Media Data

Association Between COVID-19 During Pregnancy and Preterm Birth by Trimester of Infection: Retrospective Cohort Study Using Large-Scale Social Media Data

Association Between COVID-19 During Pregnancy and Preterm Birth by Trimester of Infection: Retrospective Cohort Study Using Large-Scale Social Media Data

1Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States

2Department of Genetics, University of Pennsylvania, Philadelphia, PA, United States

3Department of Health Sciences, University of York, York, United Kingdom

4Department of Obstetrics and Gynecology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States

5Department of Medicine, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, United States

6Department of Computational Biomedicine, Cedars-Sinai Medical Center, Pacific Design Center, Ste. G549F, 700 N. San Vicente Blvd., West Hollywood, CA, United States

Corresponding Author:

Graciela Gonzalez-Hernandez, PhD


Background: Preterm birth, defined as birth at <37 weeks of gestation, is the leading cause of neonatal death globally and the second leading cause of infant mortality in the United States. There is mounting evidence that COVID-19 infection during pregnancy is associated with an increased risk of preterm birth; however, data remain limited by trimester of infection. The ability to study COVID-19 infection during the earlier stages of pregnancy has been limited by available sources of data.

Objective: The objective of this study was to use self-reports in large-scale social media data to assess the association between the trimester of COVID-19 infection and preterm birth.

Methods: In this retrospective cohort study, we used natural language processing and machine learning, followed by manual validation, to identify self-reports of pregnancy on Twitter and to search these users’ collection of publicly available tweets for self-reports of COVID-19 infection during pregnancy and, subsequently, a preterm birth or term birth outcome. Among the users who reported their pregnancy on Twitter, we also identified a 1:1 age-matched control group, consisting of users with a due date before January 1, 2020—that is, without COVID-19 infection during pregnancy. We calculated the odds ratios (ORs) with 95% CIs to compare the frequency of preterm birth for pregnancies with and without COVID-19 infection and by the timing of infection: first trimester (1‐13 weeks), second trimester (14‐27 weeks), or third trimester (28‐36 weeks).

Results: Through August 2022, we identified 298 Twitter users who reported COVID-19 infection during pregnancy, a preterm birth or term birth outcome, and maternal age: 94 (31.5%) with first-trimester infection, 110 (36.9%) with second-trimester infection, and 95 (31.9%) with third-trimester infection. In total, 26 (8.8%) of these 298 users reported preterm birth: 8 (8.5%) with first-trimester infection, 7 (6.4%) with second-trimester infection, and 12 (12.6%) with third-trimester infection. In the 1:1 age-matched control group, 13 (4.4%) of the 298 users reported preterm birth. Overall, the odds of preterm birth were significantly higher for pregnancies with COVID-19 infection compared to those without (OR 2.08, 95% CI 1.06‐4.28; P=.046). In particular, the odds of preterm birth were significantly higher for pregnancies with COVID-19 infection during the third trimester (OR 3.16, 95% CI 1.36‐7.29; P=.007). The odds of preterm birth were not significantly higher for pregnancies with COVID-19 infection during the first trimester (OR 2.05, 95% CI 0.78‐5.08; P=.12) or second trimester (OR 1.50, 95% CI 0.54‐3.82; P=.44) compared to those without infection.

Conclusions: Based on self-reports in large-scale social media data, the results of our study suggest that COVID-19 infection particularly during the third trimester is associated with higher odds of preterm birth.

J Med Internet Res 2025;27:e66097

doi:10.2196/66097

Keywords



Preterm birth, defined as birth at <37 weeks of gestation, is the leading cause of neonatal death globally [Liu L, Oza S, Hogan D, et al. Global, regional, and national causes of child mortality in 2000-13, with projections to inform post-2015 priorities: an updated systematic analysis. Lancet. Jan 31, 2015;385(9966):430-440. [CrossRef] [Medline]1] and the second leading cause of infant mortality in the United States [Murphy SL, Kochanek KD, Xu J, et al. Mortality in the United States, 2020. NCHS Data Brief. Dec 2021;427(427):1-8. [Medline]2]. According to recent systematic reviews and meta-analyses [Pathirathna ML, Samarasekara BPP, Dasanayake TS, et al. Adverse perinatal outcomes in COVID-19 infected pregnant women: a systematic review and meta-analysis. Healthcare (Basel). Jan 20, 2022;10(2):203. [CrossRef] [Medline]3-Jeong Y, Kim MA. The coronavirus disease 2019 infection in pregnancy and adverse pregnancy outcomes: a systematic review and meta-analysis. Obstet Gynecol Sci. Jul 2023;66(4):270-289. [CrossRef] [Medline]13], there is mounting evidence that COVID-19 infection during pregnancy is associated with an increased risk of preterm birth; however, data remain limited on the risk of preterm birth by trimester of infection. As a limitation of their review, Allotey et al [Allotey J, Stallings E, Bonet M, et al. Clinical manifestations, risk factors, and maternal and perinatal outcomes of coronavirus disease 2019 in pregnancy: living systematic review and meta-analysis. BMJ. Sep 1, 2020;370:m3320. [CrossRef] [Medline]5] wrote, “Not many studies reported outcomes by trimester for symptom onset.” Jeong and Kim [Jeong Y, Kim MA. The coronavirus disease 2019 infection in pregnancy and adverse pregnancy outcomes: a systematic review and meta-analysis. Obstet Gynecol Sci. Jul 2023;66(4):270-289. [CrossRef] [Medline]13] also noted that they “did not analyze infections according to pregnancy trimesters.” Among studies that did report the trimester of infection, most of the infections were limited to the third trimester. Sturrock et al [Sturrock S, Ali S, Gale C, et al. Neonatal outcomes and indirect consequences following maternal SARS-CoV-2 infection in pregnancy: a systematic review. BMJ Open. Mar 15, 2023;13(3):e063052. [CrossRef] [Medline]12] “were unable to examine the impact by trimester of maternal SARS-CoV-2 infection due to a paucity of studies examining offspring of first or second trimester infection.” Likewise, Smith et al [Smith ER, Oakley E, Grandner GW, et al. Adverse maternal, fetal, and newborn outcomes among pregnant women with SARS-CoV-2 infection: an individual participant data meta-analysis. BMJ Glob Health. Jan 2023;8(1):e009495. [CrossRef] [Medline]10] wrote, “There were relatively few instances of SARS-CoV-2 infection identified during the first trimester; the majority of cases were identified during the third trimester.” As Marchand et al [Marchand G, Patil AS, Masoud AT, et al. Systematic review and meta-analysis of COVID-19 maternal and neonatal clinical features and pregnancy outcomes up to June 3, 2021. AJOG Glob Rep. Feb 2022;2(1):100049. [CrossRef] [Medline]4] further pointed out, this lack of data affects the generalizability of the available evidence on the association between COVID-19 infection during pregnancy and preterm birth, arguing that “most of the included pregnant women were in the third trimester, so the results of this meta-analysis cannot be generalized to pregnant women in the first and second trimesters.” The ability to study COVID-19 infection during the first and second trimesters has been limited by available sources of data.

Most studies of COVID-19 infection during pregnancy have been limited to the third trimester because their data were collected in the hospital around the time of delivery. Allotey et al [Allotey J, Stallings E, Bonet M, et al. Clinical manifestations, risk factors, and maternal and perinatal outcomes of coronavirus disease 2019 in pregnancy: living systematic review and meta-analysis. BMJ. Sep 1, 2020;370:m3320. [CrossRef] [Medline]5] noted that studies “primarily reported on pregnant women who required visits to hospital, including for childbirth.” Hospital-based data not only limits the timing of COVID-19 infection primarily to the third trimester but also fails to account for potential exposure during the earlier stages of pregnancy in the comparator group of those admitted to the hospital without infection. As Allotey et al [Allotey J, Stallings E, Bonet M, et al. Clinical manifestations, risk factors, and maternal and perinatal outcomes of coronavirus disease 2019 in pregnancy: living systematic review and meta-analysis. BMJ. Sep 1, 2020;370:m3320. [CrossRef] [Medline]5] have suggested, “collection of maternal data by trimester of exposure, including the periconception period, is required.” To complement pregnancy registries [Hernández-Díaz S, Smith LH, Dollinger C, et al. International Registry of Coronavirus Exposure in Pregnancy (IRCEP): cohort description and methodological considerations. Am J Epidemiol. May 20, 2022;191(6):967-979. [CrossRef] [Medline]14], our previous work [Golder S, Chiuve S, Weissenbacher D, et al. Pharmacoepidemiologic evaluation of birth defects from health-related postings in social media during pregnancy. Drug Saf. Mar 2019;42(3):389-400. [CrossRef] [Medline]15] has demonstrated that Twitter can also be used as a source of self-report data to assess the association between medication exposure during pregnancy and birth defects, including by trimester of exposure. Our ability to retrospectively collect users’ tweets enabled us to observe reports of exposures in the early stages of pregnancy—before prenatal care [Nettleman MD, Brewer J, Stafford M. Scheduling the first prenatal visit: office-based delays. Am J Obstet Gynecol. Sep 2010;203(3):207. [CrossRef] [Medline]16] and even knowledge of being pregnant—while reducing recall bias. While Twitter data has been used for a wide range of studies related to COVID-19 [Tsao SF, Chen H, Tisseverasinghe T, et al. What social media told us in the time of COVID-19: a scoping review. Lancet Digit Health. Mar 2021;3(3):e175-e194. [CrossRef] [Medline]17], it has not been used to study COVID-19 infection during pregnancy. The objective of this study was to use large-scale social media data to assess the association between COVID-19 infection during pregnancy—specifically, by trimester of infection—and preterm birth.


Ethical Considerations

The institutional review boards of the University of Pennsylvania and Cedars-Sinai Medical Center reviewed this study and deemed it exempt. The publicly available data used in this study were collected and analyzed in accordance with the Twitter Terms of Service. The sample tweets in Table 1 have been lightly edited to deidentify the Twitter user.

Table 1. Sample tweets posted by a single Twitter user, illustrating self-reports of pregnancy (tweet 1), COVID-19 infection (tweet 2), preterm birth (tweet 3), and maternal age (tweet 4).
TweetTimestamp
1@username @username Genuine question. I’m 20 weeks pregnant. Should I get the booster? Am I high risk or should I wait until I get my invite from the government?12-31-2021
2@username I mean, I got the vaccines, use the passport, wear a mask and sanitize my hands. I haven’t been in a large group in two weeks, and I work in an office with my own private space. But today, I tested positive for COVID. So what’s the point?03-02-2022
3@username My first had muscle weakness on one side of her mouth. I had to pump & bottle feed. I thought this baby would be different but shes a preemie and can’t latch. I’m pumping again & supplementing with formula.05-21-2022
4@username @username I’m 26 and already there. I just don’t care anymore. I’m going to do what I want and what makes me happy.02-20-2021

Study Design and Participants

This retrospective cohort study used the Twitter timelines—all of the tweets posted over time—of users who reported their pregnancy on Twitter. We identified these users via an automated natural language processing (NLP) tool, Pregex [Klein AZ, Kunatharaju S, O’Connor K, et al. Pregex: rule-based detection and extraction of Twitter data in pregnancy. J Med Internet Res. Feb 9, 2023;25:e40569. [CrossRef] [Medline]18], that detects English-language tweets from the Twitter streaming application programming interface (API) that self-report a gestational age or due date of an ongoing pregnancy and, based on the tweets’ timestamp, extracts dates marking the 40-week prenatal period. For example, the first tweet in Table 1 self-reported a gestational age of 20 weeks, and, based on its timestamp, Pregex extracted the start date of pregnancy as August 13, 2021, and the due date as May 20, 2022. Although we did not limit the users by geographic location, our previous work [Golder S, Chiuve S, Weissenbacher D, et al. Pharmacoepidemiologic evaluation of birth defects from health-related postings in social media during pregnancy. Drug Saf. Mar 2019;42(3):389-400. [CrossRef] [Medline]15,Golder S, McRobbie-Johnson ACE, Klein A, et al. Social media and COVID-19 vaccination hesitancy during pregnancy: a mixed methods analysis. BJOG. Jun 2023;130(7):750-758. [CrossRef] [Medline]19] suggests that most of the users in this study are in the United States. At the time each user was identified, we collected all of their available past tweets and began collecting all of their subsequent tweets on an ongoing basis. The timelines used in this study included tweets that were posted through August 2022. We redeployed Pregex on each user’s full timeline in order to detect additional matching tweets and used the tweets containing the most precise unit of time as the basis for further analysis.

Exposures

The exposure for this study was COVID-19 infection during pregnancy. Therefore, we automatically excluded users with an extracted due date before March 11, 2020—the date that the World Health Organization (WHO) declared COVID-19 a pandemic. We deployed a deep neural network classifier to automatically detect tweets in the users’ timelines that self-reported COVID-19 infection [Klein AZ, Kunatharaju S, O’Connor K, et al. Automatically identifying self-reports of COVID-19 diagnosis on Twitter: an annotated data set, deep neural network classifiers, and a large-scale cohort. J Med Internet Res. Jul 3, 2023;25:e46484. [CrossRef] [Medline]20]. The classifier was designed to identify only tweets that indicate an actual diagnosis, including a positive test, clinical diagnosis, or hospitalization. We manually validated the tweets that were automatically classified as a COVID-19 diagnosis, such as the second tweet in Table 1, excluding false positives. For true positives, we also manually validated the corresponding users’ tweets that were automatically detected by Pregex. We then used the timestamp and temporal textual features (eg, today, now, just, yesterday, on Friday, 3 weeks ago, on April 24, at 36 weeks pregnant) of the COVID-19 tweets to manually determine the timing of infection—first trimester (1‐13 weeks), second trimester (14‐27 weeks), or third trimester (≥28 weeks)—and exclude users with an infection that was not during pregnancy. For example, the timestamp and text of the second tweet in Table 1 indicate that the infection was during pregnancy—in particular, the third trimester. For tweets lacking explicit temporal features, we used the timestamp itself as an approximation of infection onset if we could infer from the text a temporal reference to the present. We excluded users with COVID-19 infection that was during their pregnancy but for which we could not determine the specific timing.

Outcomes

The outcome for this study was preterm birth. Therefore, we excluded users with COVID-19 infection at ≥37 weeks of gestation—that is, beyond when preterm birth could occur. Among the manually validated users who were infected at <37 weeks of gestation, we deployed regular expressions—patterns of characters to search for matching text strings—to automatically detect tweets in their timelines that reported preterm birth [Klein AZ, Cai H, Weissenbacher D, et al. A natural language processing pipeline to advance the use of Twitter data for digital epidemiology of adverse pregnancy outcomes. J Biomed Inform. 2020;112S:100076. [CrossRef] [Medline]21], such as the third tweet in Table 1. For users who did not post matching tweets, we manually analyzed their timelines for other evidence of preterm birth, such as birth announcements with a timestamp or temporal textual features indicating that the baby was born more than 3 weeks before the due date. We excluded users who reported preterm birth before COVID-19 infection. Also, instead of assuming that the lack of tweets reporting preterm birth indicated that the pregnancy had reached term, we sought to mitigate this potential reporting bias by including term birth outcomes only for users with evidence of a gestational age ≥37 weeks. We automatically gathered some of this evidence by calculating the time difference between the due date and timestamp of the tweets that were detected by Pregex. For users who did not post a matching tweet within 21 days of their due date, we manually analyzed their timelines for other evidence of a gestational age ≥37 weeks.

Covariates and Control Group

A covariate for this study that potentially could be identified on Twitter was maternal age. We deployed an automated NLP pipeline, ReportAGE [Klein AZ, Magge A, Gonzalez-Hernandez G. ReportAGE: Automatically extracting the exact age of Twitter users based on self-reports in tweets. PLoS ONE. 2022;17(1):e0262087. [CrossRef] [Medline]22], to extract users’ age from explicit self-reports in their tweets, such as the fourth tweet in Table 1. We manually validated the extracted ages and used the timestamp of the tweets to determine the users’ age at the time of their due date. For tweets that did not indicate the users’ birthday, we used a heuristic of rounding to the next year if the age was reported more than 6 months apart from the due date. For example, the user in Table 1 was aged 27 years on February 20, 2022, which was less than 6 months away from a due date of May 20, 2022, so we estimated a maternal age of 27. To account for this covariate, we excluded users who did not post a tweet that was detected by ReportAGE, and we identified a 1:1 age-matched control group, consisting of users without COVID-19 infection during pregnancy. To select users for inclusion in the control group, we deployed ReportAGE on the timelines of users with an extracted due date before January 1, 2020. We manually validated the age matches among users with more recent due dates. We followed the same procedure for identifying outcomes in their timelines as we did for those who reported COVID-19 infection, excluding users without evidence of preterm birth or term birth. We re-sampled age-matched users until we identified an outcome for each one. In addition to maternal age, we accounted for the availability of COVID-19 vaccines by identifying the subsets of users with COVID-19 infection before or after December 8, 2020—that is, the earliest availability of COVID-19 vaccines outside of clinical trials.

Statistical Analysis

To assess the association between COVID-19 infection during pregnancy and preterm birth, we used the frequencies of these 2 categorical variables in a 2 × 2 contingency table to calculate the odds ratio (OR) with a 95% CI, where OR = (A/B) / (C/D) (Table 2). We also performed a subgroup analysis, calculating the ORs to compare the frequency of preterm birth in the control group with the frequency of preterm birth by the timing of infection: first trimester (weeks 1‐13), second trimester (weeks 14‐27), or third trimester (weeks 28‐36). We performed a sensitivity analysis to assess the potential impact of the availability of COVID-19 vaccines. We used the Fisher exact test to assess statistically significant differences in preterm birth, with P<.05 considered statistically significant.

Table 2. 2 × 2 Contingency table for categorical variables of COVID-19 infection during pregnancy (exposure) and preterm birth (outcome).
Preterm birthTerm birth
Pregnancy with COVID-19 infectionAB
Pregnancy without COVID-19 infectionCD

Through August 2022, we automatically identified 153,038 users who self-reported their pregnancy on Twitter [Klein AZ, Kunatharaju S, O’Connor K, et al. Pregex: rule-based detection and extraction of Twitter data in pregnancy. J Med Internet Res. Feb 9, 2023;25:e40569. [CrossRef] [Medline]18], and collected 1.1 billion tweets posted by these users. For 67,671 of these users, we automatically extracted a due date that was during the COVID-19 pandemic (Figure 1). Searching the timelines of these 67,671 users, we automatically detected 2075 tweets (1728 users) that self-reported a COVID-19 diagnosis [Golder S, McRobbie-Johnson ACE, Klein A, et al. Social media and COVID-19 vaccination hesitancy during pregnancy: a mixed methods analysis. BJOG. Jun 2023;130(7):750-758. [CrossRef] [Medline]19]. Through manual validation, we identified 501 users who were infected at <37 weeks of gestation. We identified a report of preterm birth for 38 (7.6%) of these 501 users [Klein AZ, Kunatharaju S, O’Connor K, et al. Automatically identifying self-reports of COVID-19 diagnosis on Twitter: an annotated data set, deep neural network classifiers, and a large-scale cohort. J Med Internet Res. Jul 3, 2023;25:e46484. [CrossRef] [Medline]20], but excluded 2 of them who reported preterm birth before COVID-19 infection. Among the other 465 users, we identified 336 (72.3%) of them with evidence of term birth. Searching the timelines of the 372 users with evidence of preterm birth or term birth, we automatically detected and manually validated an age for 298 (80.1%) of them [Klein AZ, Magge A, Gonzalez-Hernandez G. ReportAGE: Automatically extracting the exact age of Twitter users based on self-reports in tweets. PLoS ONE. 2022;17(1):e0262087. [CrossRef] [Medline]22]. The median maternal age for these 298 users was 26 years. Among these 298 users, 268 (89.9%) reported a positive test, 21 (7%) reported a hospitalization, and 9 (3%) reported a diagnosis. For our age-matched control group, we searched for outcomes in the timelines of 363 users with a due date before January 1, 2020, in order to identify 298 users with evidence of preterm birth or term birth (

Multimedia Appendix 1

Preterm birth and term birth outcomes for maternal age-matched Twitter users with and without COVID-19 infection during pregnancy, based on manually validated self-reports in tweets.

DOCX File, 59 KBMultimedia Appendix 1).

Among the 298 users with COVID-19 infection during pregnancy, 94 (31.5%) were infected during the first trimester (1‐13 weeks), 110 (36.9%) were infected during the second trimester (14‐27 weeks), and 95 (31.9%) were infected during the third trimester (28‐36 weeks), with one user reporting separate COVID-19 infections in both the first and second trimesters. In total, 26 (8.8%) of these 298 users reported preterm birth: 8 (8.5%) who were infected during the first trimester, 7 (6.4%) who were infected during the second trimester, and 12 (12.6%) who were infected during the third trimester. In the control group, 13 (4.4%) of the 298 users reported preterm birth. In general, the odds of preterm birth were significantly higher for pregnancies with COVID-19 infection compared to those without (OR 2.08, 95% CI 1.06‐4.28; P=.046). In particular, the odds of preterm birth were significantly higher for pregnancies with COVID-19 infection during the third trimester (OR 3.16, 95% CI 1.36‐7.29; P=.007). The odds of preterm birth were not significantly higher for pregnancies with COVID-19 infection during the first trimester (OR 2.05, 95% CI 0.78‐5.08; P=.12) or second trimester (OR 1.50, 95% CI 0.54‐3.82; P=.44) compared to the 298 pregnancies without infection in our control group (Table 3).

Among the 298 users with COVID-19 infection during pregnancy, only 35 (11.7%) were infected before the availability of COVID-19 vaccines: 3 (1%) who were infected during the first trimester, 16 (5.4%) who were infected during the second trimester, and 16 (5.4%) who were infected during the third trimester. Among these 35 users, 2 (5.7%) reported preterm birth: 1 (6.3%) who was infected during the second trimester, and 1 (6.3%) who was infected during the third trimester. Excluding these 35 users for a sensitivity analysis, the odds of preterm birth remained significantly higher for pregnancies with COVID-19 infection compared to those without (OR 2.19, 95% CI 1.10‐4.54; P=.03). Likewise, the odds of preterm birth remained significantly higher particularly for pregnancies with COVID-19 infection during the third trimester (OR 3.54, 95% CI 1.48‐8.32; P=.007). The odds of preterm birth were not significantly higher for pregnancies with COVID-19 infection during the first trimester (OR 2.12, 95% CI 0.81‐5.27; P=.11) or second trimester (OR 1.51, 95% CI 0.51‐4.00; P=.42) compared to the 298 pregnancies without infection in our control group (Table 4).

Figure 1. Study population selection.
Table 3. Odds ratios with 95% CIs comparing preterm birth for pregnancies with (N=298) and without (N=298) COVID-19 infection, by trimester of infection.
TrimesterInfections, n (%)Preterm birth, n (%)Odds ratio (95% CI)P value
First94 (31.5)8 (8.5)2.05 (0.78‐5.08).12
Second110 (36.9)7 (6.4)1.50 (0.54‐3.82).44
Third95 (31.9)12 (12.6)3.16 (1.36‐7.29).007
Totala298 (100)26 (8.8)2.08 (1.06‐4.28).046

aThe total number of pregnancies with COVID-19 infection (N=298) is less than the sum of infections by trimester (N=299) because one user reported separate infections in both the first and second trimesters.

Table 4. Odds ratios with 95% CIs comparing preterm birth for pregnancies with COVID-19 infection before the availability of COVID-19 vaccines (N=263) and pregnancies without COVID-19 infection (N=298), by trimester of infection.
TrimesterInfections, n (%)Preterm birth, n (%)Odds ratio (95% CI)P value
First91 (34.6)8 (8.8)2.12 (0.81‐5.27).11
Second94 (35.7)6 (6.4)1.51 (0.51‐4.00).42
Third79 (30)11 (13.9)3.54 (1.48‐8.32).007
Totala263 (100)24 (9.1)2.19 (1.10‐4.54).03

aThe total number of pregnancies with COVID-19 infection before the availability of COVID-19 vaccines (N=263) is less than the sum of infections by trimester (N=264) because one user reported separate infections in both the first and second trimesters.


Principal Results

Based on self-reports in large-scale social media data, the results of our study support that, in general, COVID-19 infection during pregnancy is associated with preterm birth. In particular, our results suggest that COVID-19 infection during the third trimester is associated with higher odds of preterm birth. We did not observe significantly higher odds of preterm birth for pregnancies with COVID-19 infection during the first or second trimester. Our findings did not appear to be impacted by the availability of COVID-19 vaccines. The observed association between COVID-19 infection during pregnancy and preterm birth in our social media data is consistent with recent systematic reviews and meta-analyses [Pathirathna ML, Samarasekara BPP, Dasanayake TS, et al. Adverse perinatal outcomes in COVID-19 infected pregnant women: a systematic review and meta-analysis. Healthcare (Basel). Jan 20, 2022;10(2):203. [CrossRef] [Medline]3-Jeong Y, Kim MA. The coronavirus disease 2019 infection in pregnancy and adverse pregnancy outcomes: a systematic review and meta-analysis. Obstet Gynecol Sci. Jul 2023;66(4):270-289. [CrossRef] [Medline]13]. Our results suggest, however, that the risk of preterm birth observed in previous studies may be driven by the preponderance of data on third-trimester COVID-19 infection [Marchand G, Patil AS, Masoud AT, et al. Systematic review and meta-analysis of COVID-19 maternal and neonatal clinical features and pregnancy outcomes up to June 3, 2021. AJOG Glob Rep. Feb 2022;2(1):100049. [CrossRef] [Medline]4,Smith ER, Oakley E, Grandner GW, et al. Adverse maternal, fetal, and newborn outcomes among pregnant women with SARS-CoV-2 infection: an individual participant data meta-analysis. BMJ Glob Health. Jan 2023;8(1):e009495. [CrossRef] [Medline]10,Sturrock S, Ali S, Gale C, et al. Neonatal outcomes and indirect consequences following maternal SARS-CoV-2 infection in pregnancy: a systematic review. BMJ Open. Mar 15, 2023;13(3):e063052. [CrossRef] [Medline]12] and may not be generalized to infection during the first and second trimesters. While the total number of COVID-19 infections in our study may be relatively small (N=298), to the best of our knowledge, our study included one of the largest samples of first-trimester infection (n=94). While the total number of COVID-19 infections in our study was coincidentally the same as that in Seif et al’s [Seif KE, Tadbiri H, Mangione M, et al. The impact of trimester of COVID-19 infection on pregnancy outcomes after recovery. J Perinat Med. Sep 26, 2023;51(7):868-873. [CrossRef] [Medline]23] study, the infections in our study were more evenly distributed across the trimesters and included nearly twice as many during the first trimester as theirs (n=48). Even among the 2352 pregnancies with COVID-19 infection in Metz et al’s [Metz TD, Clifton RG, Hughes BL, et al. Association of SARS-CoV-2 infection with serious maternal morbidity and mortality from obstetric complications. JAMA. Feb 22, 2022;327(8):748-759. [CrossRef] [Medline]24] study of one of the largest cohorts, only 54 (2%) of the infections were during the first trimester—approximately half the number of those in our study. Similarly, among the 1942 pregnancies with COVID-19 infection in Smith et al’s [Smith ER, Oakley E, Grandner GW, et al. Adverse maternal, fetal, and newborn outcomes among pregnant women with SARS-CoV-2 infection: an individual participant data meta-analysis. BMJ Glob Health. Jan 2023;8(1):e009495. [CrossRef] [Medline]10] meta-analysis of 12 studies, only 38 (2%) of the infections were known to be during the first trimester—less than half the number of those in our study.

Comparison With Previous Work

Our study was one of very few to assess the association of first-trimester COVID-19 infection with preterm birth. In one of the few other studies that did so, Fallach et al [Fallach N, Segal Y, Agassy J, et al. Pregnancy outcomes after SARS-CoV-2 infection by trimester: a large, population-based cohort study. PLoS One. 2022;17(7):e0270893. [CrossRef] [Medline]25] also found that pregnancies with third-trimester infection were at a higher risk of preterm birth, whereas pregnancies with first-trimester or second-trimester infection did not appear to be. In addition to supporting this finding, our study also complements theirs by addressing some of their noted limitations. As a limitation of their study, Fallach et al [Fallach N, Segal Y, Agassy J, et al. Pregnancy outcomes after SARS-CoV-2 infection by trimester: a large, population-based cohort study. PLoS One. 2022;17(7):e0270893. [CrossRef] [Medline]25] wrote that, in their noninfected group, “women without positive SARS-CoV-2 test results did not necessarily test negative.” Therefore, it is possible that COVID-19 infections without test results were included in the noninfected group. In contrast, our control group ensured comparison with a noninfected group because it consisted of pregnancies with a due date before the COVID-19 pandemic. Because their study population was limited to Israel, Fallach et al [Fallach N, Segal Y, Agassy J, et al. Pregnancy outcomes after SARS-CoV-2 infection by trimester: a large, population-based cohort study. PLoS One. 2022;17(7):e0270893. [CrossRef] [Medline]25] also noted that “[their] findings may have limited generalizability to countries populated with several races since there is evidence of disproportionate burden of COVID-19-related outcomes among different races.” While our study did not explicitly account for race, our previous work [Golder S, Chiuve S, Weissenbacher D, et al. Pharmacoepidemiologic evaluation of birth defects from health-related postings in social media during pregnancy. Drug Saf. Mar 2019;42(3):389-400. [CrossRef] [Medline]15] demonstrated that there are diverse races/ethnicities represented in our cohort of users who reported their pregnancy on Twitter, including White, Black, Asian, and Hispanic.

In contrast, Seif et al [Seif KE, Tadbiri H, Mangione M, et al. The impact of trimester of COVID-19 infection on pregnancy outcomes after recovery. J Perinat Med. Sep 26, 2023;51(7):868-873. [CrossRef] [Medline]23] found that COVID-19 infection during the first trimester was more likely to result in preterm birth than infection during the second or third trimester. They suggested that one possible explanation for their contrary findings could be related to the earlier time period of Fallach et al’s [Fallach N, Segal Y, Agassy J, et al. Pregnancy outcomes after SARS-CoV-2 infection by trimester: a large, population-based cohort study. PLoS One. 2022;17(7):e0270893. [CrossRef] [Medline]25] study: “Also, their study was conducted before July 2021 and, therefore, before the emergence of Delta and Omicron variants. Furthermore, due to the limited understanding of the disease process and how to manage it at the time of Fallach’s study, clinicians may have been more inclined to deliver COVID-19-infected patients nearing the end of their pregnancy.” However, the time period of our study—March 2020 to August 2022—extended even beyond that of Seif et al’s [Seif KE, Tadbiri H, Mangione M, et al. The impact of trimester of COVID-19 infection on pregnancy outcomes after recovery. J Perinat Med. Sep 26, 2023;51(7):868-873. [CrossRef] [Medline]23]. Therefore, their other explanation is more plausible: “One explanation could be that the study by Fallach et al included all pregnant patients with a positive SARS-CoV-2 test. In contrast, we only included those who recovered from COVID-19 at the delivery time.” More generally, Seif et al’s [Seif KE, Tadbiri H, Mangione M, et al. The impact of trimester of COVID-19 infection on pregnancy outcomes after recovery. J Perinat Med. Sep 26, 2023;51(7):868-873. [CrossRef] [Medline]23] findings may not be directly comparable because our and Fallach et al’s [Fallach N, Segal Y, Agassy J, et al. Pregnancy outcomes after SARS-CoV-2 infection by trimester: a large, population-based cohort study. PLoS One. 2022;17(7):e0270893. [CrossRef] [Medline]25] studies used a noninfected group for comparison, while Seif et al’s [Seif KE, Tadbiri H, Mangione M, et al. The impact of trimester of COVID-19 infection on pregnancy outcomes after recovery. J Perinat Med. Sep 26, 2023;51(7):868-873. [CrossRef] [Medline]23] study compared the trimesters to one another.

Limitations

Our study has several limitations related to the use of Twitter as a complementary data source to assess the association between COVID-19 infection during pregnancy and preterm birth. Some users reported their gestational age or due date only in terms of months. While some of these users may have been referring to a precise measure of time, others may have been approximating. In the case of the latter, the dates extracted by Pregex [Klein AZ, Kunatharaju S, O’Connor K, et al. Pregex: rule-based detection and extraction of Twitter data in pregnancy. J Med Internet Res. Feb 9, 2023;25:e40569. [CrossRef] [Medline]18] may have led us to misclassify the trimester of COVID-19 infection for some of these users. Similarly, because some users posted tweets reporting their age but not their birthday, using a heuristic to determine their age at the time of their due date involved a one-year margin of error for the age matches in our control group. While our control group selection—pregnancies with a due date before the COVID-19 pandemic—ensured comparison with a noninfected group, it did not allow us to match users based on the timing of their pregnancies and, in particular, account for potential pandemic-related confounders, such as increased maternal stress. Control group selection based on negative test results, for example, would have been more problematic, though, because of a potential reporting bias—namely, that users may have also tested positive but either did not report it or our methods did not detect it. Our data source also limited our ability to account for other potential confounders, such as illness severity, socioeconomic status, underlying diseases, parity, and whether the preterm birth was medically indicated (ie, infection was the reason for delivering preterm) or spontaneous. Finally, the rate of preterm birth reported in our control group (4.4%) was substantially lower than that in the general population around that time (10.2%) [Martin JA, Hamilton BE, Osterman MJK, et al. Births: final data for 2019. Natl Vital Stat Rep. Apr 2021;70(2):1-51. [Medline]26]. Although this discrepancy might lead one to argue that Twitter does not represent the general population, nearly half of adults aged 18‐29 years in the United States use Twitter [Auxier B, Anderson M. Social media use in 2021. Pew Research Center. URL: https://www.pewresearch.org/internet/2021/04/07/social-media-use-in-2021/ [Accessed 2024-09-03] 27], and approximately half of the births in the United States are to mothers in this age group [Martin JA, Hamilton BE, Osterman MJK, et al. Births: final data for 2019. Natl Vital Stat Rep. Apr 2021;70(2):1-51. [Medline]26]. Therefore, we believe that this discrepancy is more likely to reflect an underreporting of preterm birth on Twitter, and that this discrepancy did not differentially affect the infected and control groups.

Conclusions

In summary, we used large-scale social media data in a retrospective cohort study to assess the association between the trimester of COVID-19 infection and preterm birth. The results of our study suggest that COVID-19 infection particularly during the third trimester is associated with higher odds of preterm birth.

Acknowledgments

This work was supported by the National Library of Medicine (R01LM011176). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors thank Ivan Flores for contributing to software applications, and Kaelen Spiegel for contributing to data analysis.

Data Availability

The user IDs, pregnancy outcomes, and maternal ages of the 596 Twitter users that were included in this study are made available in

Multimedia Appendix 1

Preterm birth and term birth outcomes for maternal age-matched Twitter users with and without COVID-19 infection during pregnancy, based on manually validated self-reports in tweets.

DOCX File, 59 KBMultimedia Appendix 1. The user IDs can be used to request the users’ timelines through the Twitter application programming interface, if their accounts remain public. According to the Twitter Terms of Service, we cannot make the content (eg, text) of tweet objects publicly available, and the number of tweets that were analyzed in this study exceeds the total that we can make publicly available as tweet IDs. A limited number of tweet objects are permitted to be shared directly, and an unlimited number of tweet IDs are permitted to be shared on behalf of an academic institution for noncommercial research. Requests for tweet objects or tweet IDs can be sent to AZK (ariklein@pennmedicine.upenn.edu) or GGH (Graciela.GonzalezHernandez@csmc.edu).

Authors' Contributions

AZK and GGH conceptualized the study. AZK, JCF, and GGH designed the study. AZK collected the data. AZK and SK analyzed the data. AZK and LDL interpreted the results. AZK and SG performed the statistical analysis. SG performed the literature search. AZK drafted the manuscript. All authors reviewed the manuscript.

Conflicts of Interest

None declared.

Multimedia Appendix 1

Preterm birth and term birth outcomes for maternal age-matched Twitter users with and without COVID-19 infection during pregnancy, based on manually validated self-reports in tweets.

DOCX File, 59 KB

  1. Liu L, Oza S, Hogan D, et al. Global, regional, and national causes of child mortality in 2000-13, with projections to inform post-2015 priorities: an updated systematic analysis. Lancet. Jan 31, 2015;385(9966):430-440. [CrossRef] [Medline]
  2. Murphy SL, Kochanek KD, Xu J, et al. Mortality in the United States, 2020. NCHS Data Brief. Dec 2021;427(427):1-8. [Medline]
  3. Pathirathna ML, Samarasekara BPP, Dasanayake TS, et al. Adverse perinatal outcomes in COVID-19 infected pregnant women: a systematic review and meta-analysis. Healthcare (Basel). Jan 20, 2022;10(2):203. [CrossRef] [Medline]
  4. Marchand G, Patil AS, Masoud AT, et al. Systematic review and meta-analysis of COVID-19 maternal and neonatal clinical features and pregnancy outcomes up to June 3, 2021. AJOG Glob Rep. Feb 2022;2(1):100049. [CrossRef] [Medline]
  5. Allotey J, Stallings E, Bonet M, et al. Clinical manifestations, risk factors, and maternal and perinatal outcomes of coronavirus disease 2019 in pregnancy: living systematic review and meta-analysis. BMJ. Sep 1, 2020;370:m3320. [CrossRef] [Medline]
  6. Karaçam Z, Kizilca-Çakaloz D, Güneş-Öztürk G, et al. Maternal and perinatal outcomes of pregnancy associated with COVID-19: systematic review and meta-analysis. Eur J Midwifery. 2022;6:42. [CrossRef] [Medline]
  7. Ramirez Ubillus GC, Sedano Gelvet EE, Neira Montoya CR. Gestational complications associated with SARS-CoV-2 infection in pregnant women during 2020-2021: systematic review of longitudinal studies. J Perinat Med. Mar 28, 2023;51(3):291-299. [CrossRef] [Medline]
  8. Deng J, Ma Y, Liu Q, et al. Association of infection with different SARS-CoV-2 variants during pregnancy with maternal and perinatal outcomes: a systematic review and meta-analysis. Int J Environ Res Public Health. Nov 29, 2022;19(23):15932. [CrossRef] [Medline]
  9. Wang X, Chen X, Zhang K. Maternal infection with COVID-19 and increased risk of adverse pregnancy outcomes: a meta-analysis. J Matern Fetal Neonatal Med. Dec 2022;35(25):9368-9375. [CrossRef] [Medline]
  10. Smith ER, Oakley E, Grandner GW, et al. Adverse maternal, fetal, and newborn outcomes among pregnant women with SARS-CoV-2 infection: an individual participant data meta-analysis. BMJ Glob Health. Jan 2023;8(1):e009495. [CrossRef] [Medline]
  11. Simbar M, Nazarpour S, Sheidaei A. Evaluation of pregnancy outcomes in mothers with COVID-19 infection: a systematic review and meta-analysis. J Obstet Gynaecol. Dec 2023;43(1):2162867. [CrossRef] [Medline]
  12. Sturrock S, Ali S, Gale C, et al. Neonatal outcomes and indirect consequences following maternal SARS-CoV-2 infection in pregnancy: a systematic review. BMJ Open. Mar 15, 2023;13(3):e063052. [CrossRef] [Medline]
  13. Jeong Y, Kim MA. The coronavirus disease 2019 infection in pregnancy and adverse pregnancy outcomes: a systematic review and meta-analysis. Obstet Gynecol Sci. Jul 2023;66(4):270-289. [CrossRef] [Medline]
  14. Hernández-Díaz S, Smith LH, Dollinger C, et al. International Registry of Coronavirus Exposure in Pregnancy (IRCEP): cohort description and methodological considerations. Am J Epidemiol. May 20, 2022;191(6):967-979. [CrossRef] [Medline]
  15. Golder S, Chiuve S, Weissenbacher D, et al. Pharmacoepidemiologic evaluation of birth defects from health-related postings in social media during pregnancy. Drug Saf. Mar 2019;42(3):389-400. [CrossRef] [Medline]
  16. Nettleman MD, Brewer J, Stafford M. Scheduling the first prenatal visit: office-based delays. Am J Obstet Gynecol. Sep 2010;203(3):207. [CrossRef] [Medline]
  17. Tsao SF, Chen H, Tisseverasinghe T, et al. What social media told us in the time of COVID-19: a scoping review. Lancet Digit Health. Mar 2021;3(3):e175-e194. [CrossRef] [Medline]
  18. Klein AZ, Kunatharaju S, O’Connor K, et al. Pregex: rule-based detection and extraction of Twitter data in pregnancy. J Med Internet Res. Feb 9, 2023;25:e40569. [CrossRef] [Medline]
  19. Golder S, McRobbie-Johnson ACE, Klein A, et al. Social media and COVID-19 vaccination hesitancy during pregnancy: a mixed methods analysis. BJOG. Jun 2023;130(7):750-758. [CrossRef] [Medline]
  20. Klein AZ, Kunatharaju S, O’Connor K, et al. Automatically identifying self-reports of COVID-19 diagnosis on Twitter: an annotated data set, deep neural network classifiers, and a large-scale cohort. J Med Internet Res. Jul 3, 2023;25:e46484. [CrossRef] [Medline]
  21. Klein AZ, Cai H, Weissenbacher D, et al. A natural language processing pipeline to advance the use of Twitter data for digital epidemiology of adverse pregnancy outcomes. J Biomed Inform. 2020;112S:100076. [CrossRef] [Medline]
  22. Klein AZ, Magge A, Gonzalez-Hernandez G. ReportAGE: Automatically extracting the exact age of Twitter users based on self-reports in tweets. PLoS ONE. 2022;17(1):e0262087. [CrossRef] [Medline]
  23. Seif KE, Tadbiri H, Mangione M, et al. The impact of trimester of COVID-19 infection on pregnancy outcomes after recovery. J Perinat Med. Sep 26, 2023;51(7):868-873. [CrossRef] [Medline]
  24. Metz TD, Clifton RG, Hughes BL, et al. Association of SARS-CoV-2 infection with serious maternal morbidity and mortality from obstetric complications. JAMA. Feb 22, 2022;327(8):748-759. [CrossRef] [Medline]
  25. Fallach N, Segal Y, Agassy J, et al. Pregnancy outcomes after SARS-CoV-2 infection by trimester: a large, population-based cohort study. PLoS One. 2022;17(7):e0270893. [CrossRef] [Medline]
  26. Martin JA, Hamilton BE, Osterman MJK, et al. Births: final data for 2019. Natl Vital Stat Rep. Apr 2021;70(2):1-51. [Medline]
  27. Auxier B, Anderson M. Social media use in 2021. Pew Research Center. URL: https://www.pewresearch.org/internet/2021/04/07/social-media-use-in-2021/ [Accessed 2024-09-03]


API: application programming interface
NLP: natural language processing
OR: odds ratio
WHO: World Health Organization


Edited by Amaryllis Mavragani; submitted 04.09.24; peer-reviewed by Chun-Hsiang Chan, Nidhi Gupta, Patrick J Arena; final revised version received 01.05.25; accepted 02.05.25; published 09.07.25.

Copyright

© Ari Z Klein, Shriya Kunatharaju, Su Golder, Lisa D Levine, Jane C Figueiredo, Graciela Gonzalez-Hernandez. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 9.7.2025.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.