%0 Journal Article
%@ 1438-8871
%I JMIR Publications
%V 27
%N
%P e59101
%T Predicting the Risk of HIV Infection and Sexually Transmitted Diseases Among Men Who Have Sex With Men: Cross-Sectional Study Using Multiple Machine Learning Approaches
%A Lin,Bing
%A Liu,Jiaxiu
%A Li,Kangjie
%A Zhong,Xiaoni
%+ School of Public Health, Chongqing Medical University, No.1 Medical College Road, Yuzhong District, Chongqing, 400016, China, 86 13527545050, zhongxiaoni@cqmu.edu.cn
%K HIV
%K sexually transmitted diseases
%K men who have sex with men
%K machine learning
%K web application
%K risk stratification
%D 2025
%7 20.2.2025
%9 Original Paper
%J J Med Internet Res
%G English
%X Background: Men who have sex with men (MSM) are at high risk for HIV infection and sexually transmitted diseases (STDs). However, there is a lack of accurate and convenient tools to assess this risk. Objective: This study aimed to develop machine learning models and tools to predict and assess the risk of HIV infection and STDs among MSM. Methods: We conducted a cross-sectional study that collected individual characteristics of 1999 MSM with negative or unknown HIV serostatus in Western China from 2013 to 2023. MSM self-reported their STD history and were tested for HIV. We compared the accuracy of 6 machine learning methods in predicting the risk of HIV infection and STDs using 7 parameters for a comprehensive assessment, ranking the methods according to their performance in each parameter. We selected data from the Sichuan MSM for external validation. Results: Of the 1999 MSM, 72 (3.6%) tested positive for HIV and 146 (7.3%) self-reported a history of previous STD infection. After taking the results of the intersection of the 3 feature screening methods, a total of 7 and 5 predictors were screened for predicting HIV infection and STDs, respectively, and multiple machine learning prediction models were constructed. Extreme gradient boost models performed optimally in predicting the risk of HIV infection and STDs, with area under the curve values of 0.777 (95% CI 0.639-0.915) and 0.637 (95% CI 0.541-0.732), respectively, demonstrating stable performance in both internal and external validation. The highest combined predictive performance scores of HIV and STD models were 33 and 39, respectively. Interpretability analysis showed that nonadherence to condom use, low HIV knowledge, multiple male partners, and internet dating were risk factors for HIV infection. Low degree of education, internet dating, and multiple male and female partners were risk factors for STDs. The risk stratification analysis showed that the optimal model effectively distinguished between high- and low-risk MSM. MSM were classified into HIV (predicted risk score <0.506 and ≥0.506) and STD (predicted risk score <0.479 and ≥0.479) risk groups. In total, 22.8% (114/500) were in the HIV high-risk group, and 43% (215/500) were in the STD high-risk group. HIV infection and STDs were significantly higher in the high-risk groups (P<.001 and P=.05, respectively), with higher predicted probabilities (P<.001 for both). The prediction results of the optimal model were displayed in web applications for probability estimation and interactive computation. Conclusions: Machine learning methods have demonstrated strengths in predicting the risk of HIV infection and STDs among MSM. Risk stratification models and web applications can facilitate clinicians in accurately assessing the risk of infection in individuals with high risk, especially MSM with concealed behaviors, and help them to self-monitor their risk for targeted, timely diagnosis and interventions to reduce new infections.
%R 10.2196/59101
%U https://www.jmir.org/2025/1/e59101
%U https://doi.org/10.2196/59101