The Impact of Prostate-Specific Antigen Screening on Prostate Cancer Incidence and Mortality in China: 13-Year Prospective Population-Based Cohort Study

Background: The status of prostate-specific antigen (PSA) screening is unclear in China. Evidence regarding the optimal frequency and interval of serial screening for prostate cancer (PCa) is disputable.

Objective: This study aimed to depict the status of PSA screening and to explore the optimal screening frequency for PCa in China.

Methods: A 13-year prospective cohort study was conducted using the Chinese Electronic Health Records Research in Yinzhou study’s data set. A total of 420,941 male participants aged ≥45 years were included between January 2009 and June 2022. Diagnosis of PCa, cancer-specific death, and all-cause death were obtained from the electronic health records and vital statistic system. Hazard ratios (HRs) with 95% CIs were estimated using Cox regression analysis.

Results: The cumulative rate of ever PSA testing was 17.9% with an average annual percent change (AAPC) of 8.7% (95% CI 3.6%-14.0%) in the past decade in China. People with an older age, a higher BMI, higher waist circumference, tobacco smoking and alcohol drinking behaviors, higher level of physical activity, medication use, and comorbidities were more likely to receive PSA screening, whereas those with a lower education level and a widowed status were less likely to receive the test. People receiving serial screening ≥3 times were at a 67% higher risk of PCa detection (HR 1.67; 95% CI 1.48-1.88) but a 64% lower risk of PCa-specific mortality (HR 0.36; 95% CI 0.18-0.70) and a 28% lower risk of overall mortality (HR 0.72; 95% CI 0.67-0.77). People following a serial screening strategy at least once every 4 years were at a 25% higher risk of PCa detection (HR 1.25; 95% CI 1.13-1.36) but 70% (HR 0.30; 95% CI 0.16-0.57) and 23% (HR 0.77; 95% CI 0.73-0.82) lower risks of PCa-specific and all-cause mortality, respectively.

Conclusions: This study reveals a low coverage of PSA screening in China and provides the first evidence of its benefits in the general Chinese population. The findings of this study indicate that receiving serial screening at least once every 4 years is beneficial for overall and PCa-specific survival. Further studies based on a nationwide population and with long-term follow-up are warranted to identify the optimal screening interval in China.
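The percentage statements in the Results follow directly from the hazard ratios: an HR of 0.36 corresponds to a (1 − 0.36) × 100 = 64% lower hazard, and an HR of 1.67 to a 67% higher hazard. A minimal sketch of that arithmetic is given below; the function name and the dictionary labels are illustrative assumptions, and only the HR values come from the abstract.

```python
def percent_change_from_hr(hr: float) -> float:
    """Relative change in hazard (in percent) implied by a hazard ratio."""
    if hr <= 0:
        raise ValueError("a hazard ratio must be positive")
    return (hr - 1.0) * 100.0


# Hazard ratios quoted in the abstract; the labels are shorthand, not source text.
abstract_hrs = {
    "PCa detection, >=3 screens": 1.67,
    "PCa-specific mortality, >=3 screens": 0.36,
    "Overall mortality, >=3 screens": 0.72,
    "PCa detection, at least once every 4 years": 1.25,
    "PCa-specific mortality, at least once every 4 years": 0.30,
    "All-cause mortality, at least once every 4 years": 0.77,
}

for label, hr in abstract_hrs.items():
    change = percent_change_from_hr(hr)
    direction = "higher" if change > 0 else "lower"
    print(f"{label}: HR {hr:.2f} -> {abs(change):.0f}% {direction} risk")
```

The same conversion applied to the confidence limits gives the range of percentage changes compatible with each estimate.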


Effects of prostate-specific antigen screening on prostate cancer incidence and mortality: a population-based cohort study in China

Appendix 1: Data Quality Assessment Report
We use the 3 × 3 matrix data quality assessment (DQA) framework proposed by Weiskopf [1] to assess the data quality of the present study. This framework allows researchers to decide more easily whether a dataset, in particular an electronic health record (EHR)-based dataset, is of sufficient quality through a logical and coherent check and report. Domain-specific assessments are summarized in Table 1 and detailed as follows.

Table 1. 3×3 DQA framework for the present study

1A: There are sufficient data for each patient.
Assessment: High quality
Report: The core variables include uptake and date of prostate-specific antigen (PSA) testing, first onset and date of prostate cancer (PCa), and occurrence and date of death. 100% of subjects have data on ever PSA uptake. 5‰ of subjects have data on PCa diagnosis date, while 95‰ of subjects are recorded for right censoring on PCa onset. 5% of subjects have data on mortality, while 95% of subjects are recorded for right censoring on death (Figure 1).

2A: There are sufficient data for each variable.
Assessment: High quality
Report: The percentage of missing values is 0% for the three core variables. For the other covariates, it is 0% for age, 0% for baseline PSA value, 3% for education level, 43% for marital status, 50% for body mass index, 48% for waist circumference, 46% for smoking, 46% for drinking, 48% for physical activity, 35% for medication use, and 35% for comorbidity (Table 2). Missing covariate values are replaced using multiple imputation by chained equations (MICE). This method is commonly used in EHR-based studies and has been proposed as the standard imputation method for missing values in the CHERRY study protocol.

Table 2. Missing proportions of variables of interest in this study

3A: There are sufficient data for each time.

1B: The distribution of values is plausible across patients.
Assessment: High quality
Report: As 45 years is the starting age for PCa screening recommended by the National Cancer Center of China [2], we accordingly include male participants aged ≥45 years in the present study. The median age of the study subjects is 55 years. The proportion of PSA uptake is 17.9%, which is consistent with prior Asian studies [3,4]. The distributions of other characteristic variables are plausible across patients.

2B: There is concordance between variables.
Assessment: High quality
Report: Two criteria are set to examine the concordance between variables in this study: (1) if PSA uptake = ever, then sex = male; (2) if number of PSA tests ≥1, then PSA uptake = ever. After checking, all records containing PSA testing indicate that the subjects are male, and all records containing one or more PSA tests indicate that the subjects have ever had PSA uptake. Thus, 100% of subjects have correct and concordant data according to the above criteria.

3B: The progression of data over time is plausible.
Assessment: High quality
Report: For this dimension, we do not require each variable to be recorded at different time points. For the overall dataset, we expect only one variable, the first uptake of PSA, to increase cumulatively across years, and Figure 2 shows the increase as expected.

Figure 2. Cumulative PSA uptake rate across years

1C: All data were recorded during the timeframe of interest.
Assessment: High quality
Report: The established timeframe for this study is between enrollment (Jan 1, 2009) and last follow-up (June 15, 2022). According to the DQA guideline, demographic variables are not required to be recorded within the date range. Other data are restricted to the timeframe. 4% of subjects have a recorded PSA test date prior to the enrolment date. Considering the true exposure time of PSA testing for this small proportion of subjects (<5%), we allow such deviation.

2C: Variables were recorded in the desired order.
Assessment: Medium quality
Report: Three criteria are set to examine the desired sequence between variables in this study: (1) enrolment date is prior to death date; (2) PCa diagnosis date is prior to death date; (3) no additional records should be generated after death. For criterion 1, 574 (0.1%) subjects have a recorded death date prior to the enrolment date (Jan 1, 2009) of this cohort. Such deviation is normal, because the electronic health system in Yinzhou was established before Jan 1, 2009, and subjects' medical information has been consecutively recorded since they consented to be recorded in the system. Thus, we exclude these ineligible subjects according to the inclusion criteria. For criterion 2, we do not detect any anomalies. For criterion 3, we detect 19 (<0.01‰) subjects with additional records (PSA test result date) after the death date. After checking these 19 subjects, two reasons may explain such anomaly: (a) assuming a lag of 2 weeks in reporting the PSA lab value for patients admitted in critical condition, a PSA test result that follows death by <2 weeks is plausible; (b) the date was wrongly documented due to manual input. Data quality issues are nearly inevitable in up to 91.7% of EHR-based and real-world studies, as reported [5]. Given the small proportion of such implausible documentation (<0.01‰) in our study, we believe the dataset is reliable and well-documented in general.

3C: Data were recorded with the desired regularity over time.
Assessment: N/A
Report: This dimension is not applicable to the present study, since we do not require any variables of interest to be recorded at regular intervals.

Abbreviations (Table S8): PSA, prostate-specific antigen; PCa, prostate cancer; LL, E-value for the lower limit of the confidence interval; UL, E-value for the upper limit of the confidence interval.
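The concordance criteria under 2B and the ordering criteria under 2C in Appendix 1 are simple rule checks, and could be sketched roughly as follows. This is an illustration only, not the study's actual code: the record layout and all field names are hypothetical, and same-day events are tolerated here by assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class SubjectRecord:
    # Hypothetical schema; the source does not publish its variable names.
    subject_id: str
    sex: str
    psa_ever: bool
    n_psa_tests: int
    enrol_date: date
    pca_dx_date: Optional[date] = None
    death_date: Optional[date] = None
    last_record_date: Optional[date] = None


def concordance_violations(rec: SubjectRecord) -> List[str]:
    """Check the two 2B criteria for one subject."""
    issues = []
    if rec.psa_ever and rec.sex != "male":
        issues.append("2B-1: PSA uptake recorded for a non-male subject")
    if rec.n_psa_tests >= 1 and not rec.psa_ever:
        issues.append("2B-2: PSA tests counted but uptake is not 'ever'")
    return issues


def order_violations(rec: SubjectRecord) -> List[str]:
    """Check the three 2C criteria; they only apply when a death date exists."""
    issues = []
    if rec.death_date is None:
        return issues
    if rec.death_date < rec.enrol_date:
        issues.append("2C-1: death date precedes enrolment date")
    if rec.pca_dx_date is not None and rec.pca_dx_date > rec.death_date:
        issues.append("2C-2: PCa diagnosis date follows death date")
    if rec.last_record_date is not None and rec.last_record_date > rec.death_date:
        issues.append("2C-3: record generated after death date")
    return issues


# Two made-up subjects: one clean, one with several anomalies.
clean = SubjectRecord("A", "male", True, 3, date(2009, 1, 1))
flawed = SubjectRecord(
    "B", "male", False, 2, date(2009, 1, 1),
    death_date=date(2008, 6, 1), last_record_date=date(2008, 6, 10),
)
for rec in (clean, flawed):
    print(rec.subject_id, concordance_violations(rec) + order_violations(rec))
```

Counting the subjects flagged by each rule and dividing by the cohort size would give proportions of the kind reported above.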

Table S2. Crude Cox regression model for PSA screening on PCa incidence
PSA, prostate-specific antigen; PCa, prostate cancer; PY, person-years; HR, hazard ratio; CI, confidence interval.
* Statistics were restricted to those who had ever had a PSA test.

Table S3. Crude Cox regression model for PSA screening on PCa-specific mortality
* Statistics were restricted to those who had ever had a PSA test.

Table S4. Crude Cox regression model for PSA screening on overall mortality
PSA, prostate-specific antigen; PCa, prostate cancer; PY, person-years; HR, hazard ratio; CI, confidence interval.
* Statistics were restricted to those who had ever had a PSA test.

Table S5. Adjusted Cox regression model for PSA screening on PCa incidence
* Statistics were restricted to those who had ever had a PSA test.
Model I: stratified only by age at PCa risk.
Model II: Model I + adjustment for education, marital status, BMI, high WC, smoking, drinking, physical activity, medication use, and comorbidity.
Model III: Model II + additional adjustment for baseline PSA value and age at first PSA test.

Table S6. Adjusted Cox regression model for PSA screening on PCa-specific mortality
* Statistics were restricted to those who had ever had a PSA test.
Model I: stratified only by age at PCa death risk.
Model II: Model I + adjustment for education, marital status, BMI, high WC, smoking, drinking, physical activity, medication use, and comorbidity.
Model III: Model II + additional adjustment for baseline PSA value and age at first PSA test.

Table S7. Adjusted Cox regression model for PSA screening on overall mortality
PSA, prostate-specific antigen; PCa, prostate cancer; HR, hazard ratio; CI, confidence interval.
* Statistics were restricted to those who had ever had a PSA test.
Model I: stratified only by age at death risk.
Model II: Model I + adjustment for education, marital status, BMI, high WC, smoking, drinking, physical activity, medication use, and comorbidity.
Model III: Model II + additional adjustment for baseline PSA value and age at first PSA test.
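As a reading aid for the nested model definitions in Tables S5 to S7, a stratified Cox model can be written in the generic textbook form below. This notation is not taken from the source: the stratum index $s$, the screening exposure $E$, and the covariate vectors $Z$ (Model II covariates) and $W$ (baseline PSA value and age at first PSA test) are symbols assumed here for illustration, and the source does not state exactly how age enters the stratification.

```latex
\begin{align}
\text{Model I:}\quad   & h_s(t \mid E) = h_{0s}(t)\,\exp(\beta E) \\
\text{Model II:}\quad  & h_s(t \mid E, Z) = h_{0s}(t)\,\exp\!\left(\beta E + \gamma^{\top} Z\right) \\
\text{Model III:}\quad & h_s(t \mid E, Z, W) = h_{0s}(t)\,\exp\!\left(\beta E + \gamma^{\top} Z + \delta^{\top} W\right)
\end{align}
```

In each model, $h_{0s}(t)$ is a separate baseline hazard for stratum $s$, and the reported hazard ratio for screening is $\mathrm{HR} = \exp(\beta)$.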

Table S8. E-values for the primary effect sizes of PSA screening on PCa incidence and mortality
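For readers unfamiliar with E-values, the standard formula for a ratio estimate RR ≥ 1 is E = RR + sqrt(RR × (RR − 1)), with estimates below 1 inverted first. The sketch below applies it to a hazard ratio and to each confidence limit, under the assumption that the HR approximates a risk ratio (rare outcome); the source does not state which variant was used for Table S8, and the function names are illustrative.

```python
import math


def e_value(ratio: float) -> float:
    """E-value for a ratio estimate, treating an HR as an approximate risk ratio."""
    if ratio <= 0:
        raise ValueError("a ratio estimate must be positive")
    rr = ratio if ratio >= 1.0 else 1.0 / ratio  # protective effects are inverted
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_for_limit(hr: float, limit: float) -> float:
    """E-value for one confidence limit of an HR.

    Returns 1.0 when the limit lies at or beyond the null on the opposite
    side of the point estimate, i.e. no confounding is needed to reach it.
    """
    if hr == 1.0:
        return 1.0
    if (hr > 1.0 and limit <= 1.0) or (hr < 1.0 and limit >= 1.0):
        return 1.0
    return e_value(limit)


# Example using an HR and 95% CI quoted in the abstract (PCa-specific mortality).
hr, ll, ul = 0.36, 0.18, 0.70
print(f"point: {e_value(hr):.2f}")
print(f"LL:    {e_value_for_limit(hr, ll):.2f}")
print(f"UL:    {e_value_for_limit(hr, ul):.2f}")
```

For a protective estimate such as this one, the upper limit is the one closer to the null, so its E-value is the more conservative of the two.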