TY  - JOUR
AU  - Bhavnani, Suresh K
AU  - Zhang, Weibin
AU  - Bao, Daniel
AU  - Raji, Mukaila
AU  - Ajewole, Veronica
AU  - Hunter, Rodney
AU  - Kuo, Yong-Fang
AU  - Schmidt, Susanne
AU  - Pappadis, Monique R
AU  - Smith, Elise
AU  - Bokov, Alex
AU  - Reistetter, Timothy
AU  - Visweswaran, Shyam
AU  - Downer, Brian
PY  - 2025
DA  - 2025/2/11
TI  - Subtyping Social Determinants of Health in the "All of Us" Program: Network Analysis and Visualization Study
JO  - J Med Internet Res
SP  - e48775
VL  - 27
KW  - social determinants of health
KW  - All of Us
KW  - bipartite networks
KW  - financial resources
KW  - health care
KW  - health outcomes
KW  - precision medicine
KW  - decision support
KW  - health industry
KW  - clinical implications
KW  - machine learning methods
AB  - Background: Social determinants of health (SDoH), such as financial resources and housing stability, account for between 30% and 55% of people’s health outcomes. While many studies have identified strong associations between specific SDoH and health outcomes, little is known about how SDoH co-occur to form subtypes critical for designing targeted interventions. Such analysis has only now become possible through the All of Us program. Objective: This study aims to analyze the All of Us dataset for addressing two research questions: (1) What are the range of and responses to survey questions related to SDoH? and (2) How do SDoH co-occur to form subtypes, and what are their risks for adverse health outcomes? Methods: For question 1, an expert panel analyzed the range of and responses to SDoH questions across 6 surveys in the full All of Us dataset (N=372,397; version 6). For question 2, due to systematic missingness and uneven granularity of questions across the surveys, we selected all participants with valid and complete SDoH data and used inverse probability weighting to adjust their imbalance in demographics. Next, an expert panel grouped the SDoH questions into SDoH factors to enable more consistent granularity. To identify the subtypes, we used bipartite modularity maximization for identifying SDoH biclusters and measured their significance and replicability. Next, we measured their association with 3 outcomes (depression, delayed medical care, and emergency room visits in the last year). Finally, the expert panel inferred the subtype labels, potential mechanisms, and targeted interventions. Results: The question 1 analysis identified 110 SDoH questions across 4 surveys covering all 5 domains in Healthy People 2030. As the SDoH questions varied in granularity, they were categorized by an expert panel into 18 SDoH factors. The question 2 analysis (n=12,913; d=18) identified 4 biclusters with significant biclusteredness (Q=0.13; random-Q=0.11; z=7.5; P<.001) and significant replication (real Rand index=0.88; random Rand index=0.62; P<.001). Each subtype had significant associations with specific outcomes and had meaningful interpretations and potential targeted interventions. For example, the Socioeconomic barriers subtype included 6 SDoH factors (eg, not employed and food insecurity) and had a significantly higher odds ratio (4.2, 95% CI 3.5-5.1; P<.001) for depression when compared to other subtypes. The expert panel inferred implications of the results for designing interventions and health care policies based on SDoH subtypes. Conclusions: This study identified SDoH subtypes that had statistically significant biclusteredness and replicability, each of which had significant associations with specific adverse health outcomes and with translational implications for targeted SDoH interventions and health care policies. However, the high degree of systematic missingness requires repeating the analysis as the data become more complete by using our generalizable and scalable machine learning code available on the All of Us workbench.
SN  - 1438-8871
UR  - https://www.jmir.org/2025/1/e48775
UR  - https://doi.org/10.2196/48775
UR  - http://www.ncbi.nlm.nih.gov/pubmed/39932771
DO  - 10.2196/48775
ID  - info:doi/10.2196/48775
ER  - 