Mining Real-World Big Data to Characterize Adverse Drug Reaction Quantitatively: Mixed Methods Study

doi:10.2196/48572

Original Paper

¹State Key Laboratory of Cellular Stress Biology, School of Life Sciences, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, China

²National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China

³Fujian Provincial Key Laboratory of Chemical Biology, Xiamen University, Xiamen, China

*these authors contributed equally

Corresponding Author:

Zhi-Liang Ji, PhD

State Key Laboratory of Cellular Stress Biology, School of Life Sciences

Faculty of Medicine and Life Sciences

Xiamen University

Number 4221 Xiang'an South Road

Xiang'an District

Xiamen, 361102

China

Phone: 86 0592 2182897

Email: appo@xmu.edu.cn

Background: Adverse drug reactions (ADRs), which are the phenotypic manifestations of clinical drug toxicity in humans, are a major concern in precision clinical medicine. A comprehensive evaluation of ADRs is helpful for unbiased supervision of marketed drugs and for discovering new drugs with high success rates.

Objective: In current practice, drug safety evaluation is often oversimplified to the occurrence or nonoccurrence of ADRs. Given the limitations of current qualitative methods, there is an urgent need for a quantitative evaluation model to improve pharmacovigilance and the accurate assessment of drug safety.

Methods: In this study, we developed a mathematical model, namely the Adverse Drug Reaction Classification System (ADReCS) severity-grading model, for the quantitative characterization of ADR severity, a crucial feature for evaluating the impact of ADRs on human health. The model was constructed by mining millions of real-world historical adverse drug event reports. A new parameter called Severity_score was introduced to measure the severity of ADRs, and upper and lower score boundaries were determined for 5 severity grades.

Results: The ADReCS severity-grading model exhibited excellent consistency (99.22%) with the expert-grading system, the Common Terminology Criteria for Adverse Events. Hence, we graded the severity of 6277 standard ADRs for 129,407 drug-ADR pairs. Moreover, we calculated the occurrence rates of 6272 distinct ADRs for 127,763 drug-ADR pairs in large patient populations by mining real-world medication prescriptions. With the quantitative features, we demonstrated example applications in systematically elucidating ADR mechanisms and thereby discovered a list of drugs with improper dosages.

Conclusions: In summary, this study represents the first comprehensive determination of both ADR severity grades and ADR frequencies. This endeavor establishes a strong foundation for future artificial intelligence applications in discovering new drugs with high efficacy and low toxicity. It also heralds a paradigm shift in clinical toxicity research, moving from qualitative description to quantitative evaluation.

J Med Internet Res 2024;26:e48572

Keywords

clinical drug toxicity; adverse drug reaction; ADR severity; ADR frequency; mathematical model

In recent years, drug toxicity has emerged as a serious concern in clinical medicine, pharmacology, and sociology. Consequently, drug regulatory agencies worldwide are making continuous efforts to monitor marketed drugs and assess their potential risks in large populations [1,2]. The precision medicine projects launched around the world have also drawn significant attention to the adverse effects of drug therapy, and high-risk drugs have been subject to warnings or even market withdrawal. In the field of new drug discovery, the focus of drug safety evaluation has shifted from early-stage cell or animal toxicity to later-stage clinical toxicity [3].

Adverse drug reactions (ADRs) represent the clinical manifestations of drug toxicity in humans. Fundamentally, information on ADRs can be obtained from the 4-phase clinical trials, primarily through postmarketing surveillance. To evaluate the benefits and risks of medicines, monitoring and reporting systems, such as the US Food and Drug Administration Adverse Event Reporting System (FAERS) and the European database of suspected adverse drug reaction reports (EudraVigilance), were established by regulatory authorities. These systems enable researchers to access real-world individual responses to drug therapy. Intelligent tools can be built to mine drug-ADR associations, illustrate drug toxicity mechanisms, and predict novel ADRs. In addition, some leading-edge projects like Tox21 and eTRANSAFE have been launched to develop integrative data infrastructure and innovative computational methods. These projects aim to enhance translational safety assessment during the drug development process. High-end applications often use machine learning or artificial intelligence algorithms to automatically correlate chemicals with toxicity, even in cases where the exact molecular mechanisms underlying toxicity are not known [4-6]. For example, Kuang et al [7] built machine learning models using topological information from the drug-ADR associations network, drug chemical structures, and drug Anatomical Therapeutic Chemical (ATC) classification information to discover new drug-ADR associations. Anjani and colleagues [8] constructed a convolutional neural network model solely using drug chemical structures for the prediction of ADR occurrence. However, due to the absence of high-dimensional toxicity information such as ADR severity and frequency, current ADR prediction models are insufficient to comprehensively assess the true impact of drug toxicity on human health [9].

ADR severity is a critical indicator of how seriously an ADR affects human health, while ADR frequency (occurrence rate) is a quantitative parameter that reflects how often the ADR occurs in the population receiving drug therapy. These 2 parameters are crucial for accurately characterizing drug toxicity in humans. Previously, Hartwig and his colleagues [10] proposed 7 levels of ADR severity and graded 367 ADRs in 1992. Gottlieb et al [11] used nonprofessional crowdsourcing to rank the severity of 2929 ADRs. In 2010, a total of 11 French hospitals collaborated to investigate and determine the severity and frequency of ADRs resulting from self-medication among emergency department patients [12]. Ferreira et al [13] conducted a cross-sectional study to determine the severity and frequency of ADRs based on reports from drug treatments for Alzheimer disease in a Brazilian city. In 2017, the National Cancer Institute released version 5.0 of the Common Terminology Criteria for Adverse Events (CTCAE) [14], which includes 837 distinct adverse event terms observed in cancer therapy. Regrettably, many of these efforts were qualitative in nature.

Undoubtedly, qualitative descriptions of ADRs have become insufficient to support advanced computational algorithms and reduce the use of animals in preclinical toxicity testing. This insufficiency also hampers the widespread adoption of quantitative methods for cautious pharmacovigilance and prospective assessment of clinical drug toxicity. Therefore, in this study, we aimed to develop mathematical models for estimating ADR severity and frequency by mining millions of historical medical reports over a 10-year period at the population level. Additionally, we created a benchmark data set of drug-ADR relations with quantitative features to support advanced computational applications in drug toxicity research.

Rationale

The rationale for quantifying ADR severity is illustrated in Figure 1, and we briefly describe the principles here. A cohort of patients (M_Di–Aj ∈ M) takes the drug D_i (D_i ∈ D, D = {D₁, D₂, …, D_n}) and consequently develops the ADR A_j (A_j ∈ A, A = {A₁, A₂, …, A_m}), forming the D_i-A_j pair. Treating the patients with A_j may have 5 clinical outcomes in FAERS: recovered, recovering, not recovered, resolved with sequelae, and fatal (corresponding to O = {O₁, O₂, O₃, O₄, O₅}, respectively). While clinical outcomes in response to ADR treatment are influenced by multiple factors, such as primary medical status, treatment location (in or outside the hospital), individual genetic variation, and others, the primary determinant remains the severity of the ADR itself. This inspired us to develop a penalized model that uses a statistical score to quantitatively characterize the primary outcome (that is, the most frequently occurring outcome) of ADR treatment in large patient populations. In the model, the outcomes of ADR treatment are penalized differently, with the “recovered” outcome receiving the highest penalty and the “fatal” outcome receiving the lowest. The model accumulates the effects of ADR treatment through these penalties, with the primary outcome contributing the most. Summing the penalties across the large population M_Di–Aj then reveals the primary treatment outcome of ADR A_j induced by the drug D_i. The penalty score (later referred to as Severity_score) can also be used to indicate the severity of the ADR in the D_i-A_j pair. In most cases, an ADR (A_j ∈ A) can be induced by multiple drugs in D; hence, A_j may have several scores (corresponding to different D_i-A_j pairs), indicating varying severities for A_j.

To grade ADR severity, we assume a congruent relationship between ADR treatment outcomes and ADR severity. Therefore, we divide the penalty scores of all D_i-A_j pairs in large populations M into 5 zones, corresponding to 5 severity grades. The upper and lower score boundaries for the 5 zones in the entire score distribution can be theoretically determined by calculating the scores for the extreme cases where only 1 of the 5 treatment outcomes occurs. Accordingly, severity grades for A_j can be assigned by determining the zones where the penalty scores of A_j fall.

**Figure 1.** Schematic illustration of the ADR severity quantification process. (A) Mining and processing of drug-ADR relations; (B) Estimation of the ADR *Severity_score*; (C) Determination of the boundaries of the *Severity_score*; (D) Assignment of the ADR severity grade. ADR: adverse drug reaction.

Mining, Processing, and Normalization of Drug-ADR Relations

The drug-ADR associations used for model construction were extracted from the extensive real-world historical adverse drug event (ADE) reports in the FAERS. The FAERS is a shared database maintained by the US Food and Drug Administration. It contains reports of adverse events, including ADRs, medication errors, and drug product problems. These reports are submitted by health care professionals, consumers, and manufacturers during the postmarketing use of drug and therapeutic biologic products. Since 2012, information on ADR treatment outcomes has been recorded in the FAERS. Therefore, data since 2012 can be used for the quantitative study of ADR. The FAERS data files, dating from the first quarter of 2012 to the first quarter of 2020, were downloaded in JSON format from openFDA [15]. These files included 9,146,439 ADEs.

To ensure the reliability of drug-ADR associations, the following operations were performed: (1) qualified ADE reports were obtained by excluding redundant, unreliable, and unrelated reports. Reports submitted by nonprofessionals, such as consumers and patients, were considered unreliable. Additionally, nondrug-induced ADE reports under the System Organ Class (SOC) categories of “congenital, familial and genetic disorders,” “surgical and medical procedures,” “social circumstances,” “product issues,” and “injury, poisoning, and procedural complications” were considered unrelated. Only ADE reports involving single-ingredient small molecule drugs were included in this study; (2) to reduce the data bias caused by occasional reports, misinformation, or underreporting, we excluded ADRs that were reported fewer than 2 times and drugs that were reported fewer than 26 times in all reports; (3) drug-ADR pairs that appeared in fewer than 2 separate reports were also discarded; and (4) the reporting odds ratio (ROR) for ADR A_j induced by drug D_i was calculated based on the two-by-two contingency table. In this study, the ROR was determined using the formula:

$$\mathrm{ROR} = \frac{a/b}{c/d} = \frac{ad}{bc}$$

where a represents the number of reports in which A_j is caused by D_i, b represents the number of reports in which other ADRs are caused by D_i, c represents the number of reports in which A_j is induced by other drugs, and d represents the number of reports in which other ADRs are caused by other drugs. A value of ROR>1.0 typically indicates that the drug is a risk factor for the ADR, suggesting a reliable drug-ADR association. A 95% CI was used for statistical analysis, and a significance level of P<.05 was applied.
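As a minimal sketch of this step, the ROR for a two-by-two table is the cross-product ratio ad/(bc). The text states only that a 95% CI was used; the log-scale (Woolf) interval below is a common choice and is an assumption here, as are the function name and the example counts.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Return (ROR, lower, upper) for a two-by-two table of report counts.

    a: reports of the ADR with the drug      b: other ADRs with the drug
    c: reports of the ADR with other drugs   d: other ADRs with other drugs
    The interval uses the log-scale (Woolf) standard error, an assumption.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    ror = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper


# Hypothetical counts, for illustration only.
ror, lower, upper = reporting_odds_ratio(20, 80, 100, 800)
```

Under this sketch, a pair would be retained when the ROR exceeds 1.0; requiring the lower CI bound to exceed 1.0 as well is a stricter variant that the text does not specify.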

Before constructing the model, the reliable drug-ADR pairs underwent preprocessing and normalization to eliminate redundancy. The drugs were standardized based on their main active ingredients. Pharmacological and chemical data about the drugs, including drug description, indication, synonyms, structure, PubChem ID, DrugBank ID, KEGG (Kyoto Encyclopedia of Genes and Genomes) ID, ATC code, National Drug Code product code, and targets, were extracted from publicly available medical repositories. These repositories included the Unified Medical Language System [16], DrugBank [17], PubChem [18], KEGG [19], and the ATC classification system [20]. The drug names were consolidated by cross-referencing with the DrugBank database. The ADRs were standardized by associating them with the standard ADR terms in the Adverse Drug Reaction Classification System (ADReCS) [21] using self-coded scripts. When encountering novel ADR terms, we adhered to the ADR standardization protocol outlined in ADReCS (version 1.2). This involved standardizing the ADR terms, establishing an ADR hierarchy, and assigning digital IDs.

Construction of the ADReCS Severity-Grading System for ADR Severity Estimation

Constructing the Penalized Model to Estimate the ADR Severity_score

To quantitatively characterize the primary outcome of ADR treatment in large populations, we developed a penalized model. We represented the drugs in the qualified ADE reports of FAERS as D = {D₁, D₂, …, D_n}, the collection of ADRs as A = {A₁, A₂, …, A_m}, the distinct ADE reports as M (which was nearly equal to the size of patient populations, with different events of the same patient treated as a separate report), and the 5 outcomes of ADR treatment as O = {O₁, O₂, O₃, O₄, O₅}, corresponding to the clinical end points of recovered, recovering, not recovered, resolved with sequelae, and fatal, respectively. The model penalized A_j according to the clinical outcomes in a reciprocal manner. The “recovered” outcome was penalized the most, while the “fatal” outcome was penalized the least. We defined the penalty scheme as Penalty(O_k) = {5, 4, 3, 2, 1}, corresponding to the outcomes O one-by-one in order. Hence, for an ADR A_j (A_j ∈ A) induced by the drug D_i (D_i ∈ D) (the D_i-A_j pair), the overall penalty scores of ADR treatment outcomes can be determined by:

To eliminate potential data bias in FAERS (eg, the number of cases may vary greatly between A_j-O_k combinations), we introduced 2 parameters: (1) the conditional probability of A_j with treatment outcome O_k, that is, P(O_k|A_j) in M, which was determined by counting the number of A_j-O_k cases against all A_j-O cases; and (2) the association strength of the D_i-A_j pair (ROR), which was further normalized with the sigmoid function to a range of 0.5 to 1.0.
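The 2 correction terms can be sketched as follows. The conditional probability is a simple ratio of counts. For the normalization, the sketch assumes the sigmoid is applied to log2(ROR), which is inferred from the term 1 + e^(–log2(ROR)) used in the boundary derivation and maps any ROR ≥ 1 into the range 0.5 to 1.0. The function names and the input layout are hypothetical.

```python
import math
from collections import Counter

OUTCOMES = (
    "recovered",
    "recovering",
    "not recovered",
    "resolved with sequelae",
    "fatal",
)


def outcome_probabilities(outcomes_for_adr):
    """Estimate P(O_k | A_j) from the outcomes recorded for one ADR."""
    counts = Counter(outcomes_for_adr)
    total = sum(counts[o] for o in OUTCOMES)
    if total == 0:
        raise ValueError("no recognized outcomes for this ADR")
    return {o: counts[o] / total for o in OUTCOMES}


def normalized_ror(ror):
    """Sigmoid of log2(ROR); an ROR of 1 maps to 0.5, large RORs approach 1."""
    if ror <= 0:
        raise ValueError("ROR must be positive")
    return 1.0 / (1.0 + math.exp(-math.log2(ror)))


# Hypothetical outcome list for a single ADR term.
probs = outcome_probabilities(["recovered", "recovered", "fatal", "recovering"])
```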

To emphasize severe ADRs, we further introduced a weight parameter w_k (w_k = {1, 2, 3, 4, 5}). Since the primary treatment outcomes contributed most to the sum-up penalty score, we could use the penalty score as the indicator of the primary outcome of A_j and further infer the A_j severity as well. Hence, we defined the ADR severity score, Severity_score_Aj, which can be determined by:

The Severity_score ranges from 0 to 5.0; the larger the score is, the more severe the ADR will be.

Assignment of ADR Severity Grade by the Severity_score

Most current expert systems follow generally accepted principles or rules for ADR severity grading; however, the grades and rules differ between systems, and the rules are sometimes difficult for experts to follow exactly. In this study, we used the CTCAE’s 5-grade architecture, including mild, moderate, severe, life-threatening, and death. To determine the appropriate Severity_score boundaries for grading ADR severity, we assumed that there was a congruent relationship between ADR treatment outcomes and ADR severity (in practice, they were not fully matched). Hence, we calculated the Severity_scores for the extreme cases, that is, when the ADRs in FAERS have only 1 of the 5 treatment outcomes (P(O_k|A_j) = 1, ROR → ∞, and 1 + e^(–log2(ROR)) ≈ 1). Accordingly, we can denote the theoretical upper and lower boundaries of Severity_scores for ADR severity grades:

Thereby, we determined the threshold scores for each severity grade as given in Table 1.

Table 1. Architecture comparison of ADR^a severity-grading systems.

ADReCS^b severity-grading system^c	Vertigo (ADReCS)^d	CTCAE^e version 5.0	Vertigo (CTCAE version 5.0)^f
Mild (0.000-0.387)	Mild (0.196-0.385)	Grade 1: mild	Mild symptoms
Moderate (0.387-0.861)	Moderate (0.394-0.839)	Grade 2: moderate	Moderate symptoms; limiting instrumental ADL^g
Severe (0.861-1.500)	Severe (0.875-1.179)	Grade 3: severe but not immediately life-threatening	Severe symptoms; limiting self-care ADL
Life-threatening (1.500-2.524)	—^h	Grade 4: life-threatening consequences	—
Death (2.524-5.000)	—	Grade 5: death related to AEⁱ	—

^aADR: adverse drug reaction.

^bADReCS: Adverse Drug Reaction Classification System.

^cThe value in the brackets stands for the upper-lower boundaries of Severity_score. Values are limited to 3 decimal places.

^dUsing “Vertigo” as an example. The value in the brackets represents the range of Severity_score for vertigo estimation using the ADReCS severity-grading system.

^eCTCAE: Common Terminology Criteria for Adverse Events.

^fUsing “Vertigo” as an example.

^gADL: activities of daily living.

^hAn em dash (—) indicates that a grade is not available.

ⁱAE: adverse event.
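Given the boundaries in Table 1, assigning a grade reduces to a threshold lookup. The sketch below uses the 4 interior boundaries from the table. How a score lying exactly on a boundary is graded is not stated in the text, so assigning it to the higher grade is an assumption, as is the function name.

```python
from bisect import bisect_right

# Interior Severity_score boundaries taken from Table 1.
BOUNDARIES = (0.387, 0.861, 1.500, 2.524)
GRADES = ("mild", "moderate", "severe", "life-threatening", "death")


def severity_grade(score):
    """Map a Severity_score in [0, 5] to one of the 5 severity grades.

    A score equal to a boundary goes to the higher grade (an assumption).
    """
    if not 0.0 <= score <= 5.0:
        raise ValueError("Severity_score must lie between 0 and 5")
    return GRADES[bisect_right(BOUNDARIES, score)]


# The vertigo score ranges listed in Table 1 fall into 3 grades.
vertigo_grades = {severity_grade(s) for s in (0.196, 0.385, 0.394, 0.839, 0.875, 1.179)}
```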

Evaluation of the ADReCS Severity-Grading System

The applicability of the ADReCS severity-grading system in estimating ADR severity was evaluated by comparing it with the widely recognized expert-based system, CTCAE. CTCAE classifies ADR severity into 5 grades: mild, moderate, severe, life-threatening, and death. As CTCAE grades are primarily designed for cancer therapy, the evaluation focused on the mutually preferred terms (PTs) shared by both grading systems. CTCAE, version 5.0, includes 837 MedDRA Lowest Level Terms, corresponding to 729 PTs, of which 658 terms align with those in ADReCS. However, 71 terms in CTCAE could not be matched with terms in ADReCS. These unmatched ADRs often stem from clinical treatments other than drug therapy, such as pulmonary valve disease, vaccination site lymphadenopathy, and vaccination complications.

The evaluation was conducted on the mutual 658 PTs by comparing the consistency of ADR severity grades assigned by both systems. The correspondence of ADR severity grades between the 2 systems is summarized in Table 1. Considering that an ADR may have several grades subject to different drug therapies, we classified the comparison results of severity grades into three states (Figure 2A): (1) “consistent” if the CTCAE grades were fully matched or covered by the ADReCS severity grades, (2) “partially consistent” if the CTCAE grades overlapped with the ADReCS severity grades, and (3) “inconsistent” when the grades of both systems were exclusive to each other.
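The 3 states can be read as set relations between the grades that each system assigns to a term. The sketch below takes “fully matched or covered” to mean that the CTCAE grades form a subset of the ADReCS grades; this reading, the function name, and the integer grade encoding are assumptions.

```python
def compare_grades(ctcae_grades, adrecs_grades):
    """Classify the agreement between 2 sets of severity grades for one ADR."""
    ctcae, adrecs = set(ctcae_grades), set(adrecs_grades)
    if not ctcae or not adrecs:
        raise ValueError("both systems must assign at least one grade")
    if ctcae <= adrecs:  # fully matched or covered
        return "consistent"
    if ctcae & adrecs:  # some, but not all, CTCAE grades are shared
        return "partially consistent"
    return "inconsistent"  # no grade in common


# Vertigo has grades 1 to 3 in both systems (Table 1).
vertigo_state = compare_grades({1, 2, 3}, {1, 2, 3})
```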

**Figure 2.** Schematic illustration of the results of ADR severity grading. (A) Evaluation of the ADReCS severity-grading system with the expert system CTCAE, along with illustrated rules and evaluation outcomes. (B) Statistics of ADRs with the information on severity grade and frequency by SOCs. (C) Density distribution of drug-ADR pairs based on Severity_scores. (D) Distribution of drug-ADR pairs categorized by ADR severity grades. ADR: adverse drug reaction; ADReCS: Adverse Drug Reaction Classification System; CTCAE: Common Terminology Criteria for Adverse Events; PT: Preferred Term; SOC: System Organ Class.

Estimation of ADR Frequency by Cross-Mining the Big Data of Historical Medical Reports

Overview

Theoretically, ADR frequency can be determined by dividing the number of reported ADRs by the number of drug prescriptions. However, the FAERS, like many other spontaneous reporting systems, faces challenges related to underreporting and bias in reporting. Typically, serious events are more likely to be reported than nonserious ones [22]. Previous studies have estimated that the average underreporting rate (URR) in FAERS is around 94%, implying that the reporting rate is only 6%. For particularly severe events, the URR drops to 77% [22,23].

Determination of the Average Annual Prescription (AAP)

To address the potential bias resulting from underreporting, we acquired real-world drug prescription data from the Medical Expenditure Panel Survey (MEPS) [24]. The MEPS is a publicly available repository that gathers information on health services and expenditures in the United States through surveys conducted among households and individuals. We downloaded the SAS/XLSX files containing household-reported prescription medicines from the MEPS, which amounted to 2,545,184 records and covered the period from 2012 to 2019. This timeframe closely aligns with the period of FAERS reports analyzed in this study. The MEPS records were preprocessed to consolidate the drug name with the FAERS, intermediated by the DrugBank ID, through three routes (Figure 3A): for each MEPS record, (1) the standard generic name, brand name, and active ingredient name of the drug were retrieved from the US Food and Drug Administration National Drug Code Directory via the drug National Drug Code record (RXNDC). Subsequently, the standard drug name was mapped to DrugBank to obtain the unique DrugBank ID; (2) in cases where the RXNDC retrieval failed, the drug name (RXNAME and RXDRGNAM) in the MEPS record underwent an exact keyword search in DrugBank for direct consolidation; and (3) for the remaining reports that failed in the previous 2 routes, the RXNAME or RXDRGNAM names underwent cleaning by removing extraneous words such as “mg,” “ophthalmic,” “tablets,” “chewable,” etc. The clean drug names were then mapped to DrugBank again. The correspondence of the drug name to the DrugBank ID was manually checked for validation.
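The 3 consolidation routes can be sketched as a fall-through lookup. Everything below is illustrative: the two dictionaries stand in for the National Drug Code Directory and DrugBank, the stop-word list contains only the 4 examples named in the text, and the identifier in the usage example is a placeholder rather than a real DrugBank ID.

```python
import re

# Only the examples given in the text; the full list is not specified.
EXTRANEOUS = {"mg", "ophthalmic", "tablets", "chewable"}


def clean_drug_name(raw):
    """Lowercase a drug name and drop digits, punctuation, and extraneous words."""
    tokens = re.findall(r"[a-z]+", raw.lower())
    return " ".join(t for t in tokens if t not in EXTRANEOUS)


def resolve_drugbank_id(record, ndc_to_name, drugbank_ids):
    """Try the 3 routes in order; return a DrugBank ID or None.

    record: dict with optional RXNDC, RXNAME, and RXDRGNAM fields.
    ndc_to_name: hypothetical map from NDC code to standard drug name.
    drugbank_ids: hypothetical map from lowercase drug name to DrugBank ID.
    """
    names = [record.get(f) for f in ("RXNAME", "RXDRGNAM") if record.get(f)]
    # Route 1: standard name retrieved via the NDC code.
    standard = ndc_to_name.get(record.get("RXNDC"))
    if standard and standard.lower() in drugbank_ids:
        return drugbank_ids[standard.lower()]
    # Route 2: exact match on the raw names.
    for raw in names:
        if raw.lower() in drugbank_ids:
            return drugbank_ids[raw.lower()]
    # Route 3: match after cleaning the raw names.
    for raw in names:
        cleaned = clean_drug_name(raw)
        if cleaned in drugbank_ids:
            return drugbank_ids[cleaned]
    return None


record = {"RXNDC": "00000000000", "RXNAME": "IBUPROFEN 200 MG CHEWABLE TABLETS"}
resolved = resolve_drugbank_id(record, {}, {"ibuprofen": "DB-PLACEHOLDER"})
```

Stripping all digits is a simplification that would damage names containing numbers, which is one reason a manual check of the final mapping, as described above, remains useful.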

For drugs with confirmed DrugBank IDs, the annual prescriptions were calculated based on the respondent person weights in the MEPS records. Subsequently, the AAPs were determined for the period spanning 2012 to 2019. In instances where drugs were not listed in the MEPS, their AAPs were estimated by inferring from drugs in the same ATC class (at the 2nd level, which corresponds to the therapeutic subgroup; Figure 3B).
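A rough sketch of the AAP calculation follows. It assumes that each MEPS record contributes its person weight to the annual total of the drug and that a missing AAP is inferred as the mean AAP of drugs sharing the same ATC second-level code (the first 3 characters of an ATC code). The text does not state the aggregation rule used for the inference, so the mean, the record layout, and all names are assumptions.

```python
from collections import defaultdict
from statistics import mean


def average_annual_prescriptions(records, years):
    """Compute the AAP per drug from (drug_id, year, person_weight) records."""
    annual = defaultdict(float)
    for drug_id, year, weight in records:
        annual[(drug_id, year)] += weight
    drugs = {drug_id for drug_id, _ in annual}
    return {
        d: sum(annual.get((d, y), 0.0) for y in years) / len(years)
        for d in drugs
    }


def infer_aap_from_atc(atc_code, aap_by_drug, atc_by_drug):
    """Infer an AAP as the mean over drugs in the same ATC second-level class."""
    subgroup = atc_code[:3]  # eg, "A10" is the therapeutic subgroup of "A10BA02"
    peers = [
        aap_by_drug[d]
        for d, code in atc_by_drug.items()
        if code[:3] == subgroup and d in aap_by_drug
    ]
    return mean(peers) if peers else None


# Hypothetical records and ATC assignments, for illustration only.
records = [("A", 2012, 100.0), ("A", 2012, 50.0), ("A", 2013, 150.0), ("B", 2012, 40.0)]
aap = average_annual_prescriptions(records, years=(2012, 2013))
inferred = infer_aap_from_atc("A10BH01", aap, {"A": "A10BA02", "B": "A10BB01"})
```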

**Figure 3.** Evaluation of ADR frequency by mining historical medical records data. (A) Schematic illustration of obtaining the AAPs from the MEPS database. (B) Estimation of AAPs for drugs not mapped in the MEPS through ATC code inference. (C) Density distribution of drug-ADR pairs by ADR frequency. (D) Distribution of frequency for common and very common ADRs with ADR severity. AAP: average annual prescription; ADR: adverse drug reaction; ATC: Anatomical Therapeutic Chemical; FDA: Food and Drug Administration; MEPS: Medical Expenditure Panel Survey; NDC: National Drug Code.

Estimation of ADR Frequency

For an ADR A_j (A_j ∈ A) induced by a drug D_i (D_i ∈ D), the frequency Freq_Di–Aj can be calculated as follows:

Here, count(D_i–A_j) represents the number of times that the D_i-A_j pair was reported in the ADE collection M of FAERS, and “yr” stands for the period of this estimation, which was 8 years (corresponding to quarter 1, 2012, to quarter 1, 2020, of FAERS). The URR was estimated to be 77% for very serious ADRs (life-threatening and death) and 94% for other ADRs (mild, moderate, and severe).
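One plausible reading of these quantities, offered as an assumption rather than as the formula used in the study, is that the reported count is first scaled up by the reporting rate (1 − URR) and then divided by the total number of prescriptions over the period (AAP × yr). The function and parameter names are hypothetical.

```python
def estimate_frequency(report_count, aap, grade, years=8):
    """Estimate ADR frequency from a report count under an assumed formula.

    Assumed form: count / (1 - URR) / (AAP * years), with URR = 0.77 for
    life-threatening and death grades and 0.94 for all other grades.
    """
    if aap <= 0 or years <= 0:
        raise ValueError("AAP and years must be positive")
    urr = 0.77 if grade in {"life-threatening", "death"} else 0.94
    corrected_count = report_count / (1.0 - urr)
    return corrected_count / (aap * years)


# Hypothetical inputs: 48 reports of a mild ADR, 1 million prescriptions per year.
freq = estimate_frequency(48, 1_000_000, "mild")
```

With these hypothetical inputs, 48 reports correspond to about 800 underlying events over 8 million prescriptions, that is, a frequency of about 0.0001.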

Association Analysis Between the Targets and the ADRs

The association analysis was conducted on the 129,407 drug-ADR pairs with quantitative features. The information on therapeutic targets was obtained from the DrugBank, and the drug-ADR pairs were associated with therapeutic targets to generate a list of distinct drug-ADR–target entries. Singleton drug-ADR–target entries, where the ADR was associated with only 1 target or vice versa, were excluded from the analysis. The strength of association between the ADR/ADR group (fatal or nonfatal) (A_j) and the target (T_p) was assessed using the odds ratio (OR) through the construction of a two-by-two contingency table:

$$\mathrm{OR} = \frac{a/b}{c/d} = \frac{ad}{bc}$$

where a represents the number of drug-ADR–target entries that involve both A_j and T_p, b represents the number of entries that involve T_p but not A_j, c represents the number of entries that involve A_j but not T_p, and d represents the number of entries that involve neither T_p nor A_j. A 95% CI was used for statistical analysis, and P<.05 was considered statistically significant.
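The 4 cells can be tallied directly from the list of drug-ADR–target entries, after which the OR is the cross-product ratio ad/(bc). The sketch below is illustrative: the tuple layout, the function name, and the choice to return None when the ratio is undefined are assumptions.

```python
def target_adr_odds_ratio(entries, adr, target):
    """Tally the two-by-two table for one ADR-target pair and return the OR.

    entries: iterable of (drug, adr, target) tuples.
    Returns None when b or c is zero, because the ratio is then undefined.
    """
    a = b = c = d = 0
    for _, entry_adr, entry_target in entries:
        has_adr = entry_adr == adr
        has_target = entry_target == target
        if has_adr and has_target:
            a += 1
        elif has_target:
            b += 1
        elif has_adr:
            c += 1
        else:
            d += 1
    if b == 0 or c == 0:
        return None
    return (a * d) / (b * c)


# Hypothetical entries, for illustration only.
entries = [
    ("d1", "x", "t"), ("d2", "x", "t"), ("d3", "y", "t"),
    ("d4", "x", "u"), ("d5", "y", "u"), ("d6", "y", "u"),
]
odds_ratio = target_adr_odds_ratio(entries, adr="x", target="t")
```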

Ethical Considerations

The FAERS and MEPS databases are freely available to the public, and patient information is anonymous and deidentified. Therefore, this study does not require ethical review and informed consent.

Performance Evaluation of the ADReCS Severity-Grading Model

After conducting data preprocessing and normalization, we obtained 1,058,727 qualified ADE reports from FAERS. From these reports, we extracted 129,407 reliable and distinct drug-ADR relations, encompassing 774 drugs and 6277 standard ADR terms (PTs). Using these reports, we developed the ADReCS severity-grading model and introduced a new parameter, Severity_score, to quantify the severity of ADRs. We also determined the boundaries of the Severity_score parameter to classify the ADRs into 5 severity grades (Table 1 and Figure 2C). To evaluate the reliability of the ADReCS severity-grading system, we compared it with the expert system CTCAE (version 5.0). Out of a total of 635 mutual ADR PTs between the 2 systems, 287 (45.20%) terms were assigned identical severity grades, 343 (54.02%) were partially consistent, and only 5 (0.79%) had completely different grade assignments (Figure 2A). This result indicates the reliability of the ADReCS severity-grading system. Furthermore, we conducted additional comparisons of the ADReCS severity-grading system with other severity-grading systems or related works (Table 2). The ADReCS severity-grading system outperformed others in almost all aspects of data size and data width.

After applying the ADReCS severity-grading system, severity grades were determined for 6277 standard ADRs, involving a total of 129,407 drug-ADR pairs and 774 single-active ingredient drugs. The distribution of severity-graded ADRs was summarized by SOCs, as shown in Figure 2B, using all ADR standard terms in ADReCS as the background. Additionally, we counted the drug-ADR pairs based on ADR Severity_scores. The majority of the ADR Severity_scores fell within the range of 0 to 1.5, corresponding to the intervals for mild, moderate, and severe ADRs (Figures 2C and 2D). This outcome offers supporting evidence for the general consensus that the safety of most marketed drugs has undergone meticulous evaluation.

Table 2. Comparison of the ADReCS^a severity-grading system with previous works.

	ADReCS severity-grading system	CTCAE^b version 5.0	Ferreira et al [13]	Gottlieb et al [11]	Hartwig et al’s study [10]
Number of ADRs^c	6277	837	1149	2929	367
Method	Quantitative (model)	Qualitative (expert)	Qualitative (expert)	Qualitative (expert)	Qualitative (expert)
Architecture of grades	5 grades	5 grades	4 grades	Rank without grading	7 grades
Graded by drug-ADR pair	129,407	—^d	—	—	—
Corresponding drugs	774	—	—	—	—
Latest update	March 2023	November 2017	December 2020	March 2015	September 1992

^aADReCS: Adverse Drug Reaction Classification System.

^bCTCAE: Common Terminology Criteria for Adverse Events.

^cADR: adverse drug reaction.

^d—: not available.

Estimation of ADR Frequency in Large Patient Populations

Mining the MEPS data, we extracted 2,064,016 qualified prescription records of 743 drugs. Based on these records, we calculated the AAPs for 774 drugs in the FAERS. For 438 drugs, AAPs were directly computed from the MEPS data, while for 315 drugs, AAPs were estimated via ATC inference. However, for 21 drugs, the AAPs could not be estimated due to the absence of ATC classification for 17 drugs and failed ATC mapping for 4 drugs. Consequently, we obtained the frequency of 6272 ADRs, covering 127,763 drug-ADR pairs and 753 drugs.

Following conventional rules [25], we classified the 127,763 drug-ADR pairs into 5 groups based on their estimated frequency: 87.87% were classified as very rare (frequency<0.0001), 10.78% were classified as rare (0.0001<frequency<0.001), 1.27% were classified as uncommon (0.001<frequency<0.01), 0.08% were classified as common (0.01<frequency<0.1), and 0.002% were classified as very common (0.1<frequency; Figure 3C). This distribution is consistent with previous estimates indicating that most ADRs are low-probability events [5,21]. More specifically, among the 127,763 drug-ADR pairs, only 75 distinct ADRs were categorized as common and very common, and they were involved in 98 drug-ADR pairs. The majority of the very common ADRs were mild, such as diarrhea, fatigue, headache, nausea, rash, weight gain, cough, and dizziness. It is noteworthy that 3 drugs, namely paricalcitol, clozapine, and lenalidomide, exhibited a high incidence of death (Figure 3D). Paricalcitol is a vitamin D receptor activator used to treat secondary hyperparathyroidism. Studies have reported that treatment with paricalcitol can induce life-threatening ADRs, including hypercalcemia, hyperphosphatemia, and cardiovascular diseases [26-28]. Clozapine is an antipsychotic medication used to treat treatment-resistant schizophrenia. However, it is known to cause potentially life-threatening side effects such as arrhythmias, agranulocytosis, myocarditis, seizures, and nonsuicidal death in the patient population [29,30]. Lenalidomide is an immunomodulatory and antitumor agent used to treat multiple myeloma. According to reports, lenalidomide has been associated with rare instances of severe acute liver injury or acute liver failure, which can lead to fatal outcomes [31]. In summary, the estimated ADR frequency is reasonable.

Storage and Distribution of the Quantitative Features

The ADR data obtained from this study have been integrated into the ADReCS. This effort added 2831 novel ADR terms and 33,271 synonyms to the ADReCS, resulting in a total of 9375 distinct standard ADR terms and 68,067 synonyms. Meanwhile, the number of single-active ingredient drugs increased significantly by approximately 86.4%, from 1355 in the previous ADReCS version to 2526. Additionally, the number of nonredundant drug-ADR relations increased by about 6-fold, from 134,022 to 809,346. More importantly, the quantitative features of ADR severity and frequency were also incorporated. Of all 7570 ADR PTs in the updated ADReCS, approximately 82.92% (6277 ADRs) were assigned a severity grade and 82.85% (6272 ADRs) were estimated for frequency (Figure 4). This enhancement makes ADReCS the most information-rich database of drug-ADR interactions, providing unique quantitative data for multi-scale drug safety assessment and drug discovery purposes.

The quantitative parameters of ADRs can be obtained using the BROWSE or the keyword search function of ADReCS [32] (Figure 4). The complete data set of drug-ADR relations with quantitative features is available for download from the DOWNLOAD page of ADReCS [33].

**Figure 4.** Statistics and retrieval of the quantitative features of ADRs from the ADReCS. ADR: adverse drug reaction; ADReCS: Adverse Drug Reaction Classification System; HLT: High-Level Term; HLGT: High-Level Group Term; PT: Preferred Term; SOC: System Organ Class.

Potential Applications

Elucidation of ADR Mechanism

The quantitative features can aid in revealing potential mechanisms underlying ADRs. For instance, we conducted an association analysis between the ADRs and the therapeutic targets based on 549,670 distinct drug-target–ADR relations, involving 689 drugs, 6050 ADRs with quantitative features, and 1082 therapeutic targets. All ADR-target relations were roughly categorized into 4 zones based on the association strength (the OR) and the frequency of ADRs (Figure 5A). These 4 zones broadly corresponded to 4 types of ADR mechanisms. Zones 1 and 2 included the majority (about 89.59%) of ADR-target relations, in which the ADRs were most likely induced in an on-target way or via the overdose mechanism. On-target ADRs are predictable by assessing the pharmacological activity of drugs [34]. Usually, a well-designed dosage can prevent the high occurrence of ADRs in large patient populations (zone 1). For example, cinacalcet is a calcium-sensing receptor agonist used to treat secondary hyperparathyroidism; occasionally, cinacalcet can cause hypocalcemia in a dose-dependent manner [35]. In contrast, an improper dosage regimen in normal drug therapy is prone to cause dose-dependent ADRs (zone 2). For instance, zoledronic acid, which was designed to target farnesyl pyrophosphate synthase for the treatment of osteolytic bone disorders, is often accompanied by osteonecrosis of the jaws [36,37]. In this regard, there is still room to optimize the therapeutic dosage of zoledronic acid to balance efficacy and toxicity. A selected list of drugs with “improper” dosages is provided in Multimedia Appendix 1. This list also suggests potential ADR mechanisms by providing the drug-target–ADR associations. Zones 3 and 4 accounted for approximately 10.41% of total ADR-target relations. The ADRs in these 2 zones were likely induced by off-target effects, with unclear underlying mechanisms. Thus, these ADRs were usually unpredictable based on pharmacological principles, and some of them could be direct immune-mediated ADRs [34,38]. For example, clozapine-induced neutropenia has been associated with the carriage of specific human leukocyte antigen risk alleles [34,39] rather than with excessive action on histamine H1 receptor, the anticipated therapeutic target of clozapine in treating psychotic diseases. The drug-ADR pairs within zone 4 are listed in Multimedia Appendix 2.
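The zoning idea can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the cutoff values are hypothetical placeholders, the function name is invented, and the mapping of zones 1 and 2 to strong associations (with zone 2 being the high-frequency one) is our reading of the description above. Because the text does not state how zones 3 and 4 differ from each other, the sketch does not separate them.

```python
def assign_zone(
    odds_ratio: float,
    frequency: float,
    or_cutoff: float = 2.0,       # hypothetical cutoff, not taken from the study
    freq_cutoff: float = 0.001,   # hypothetical cutoff, not taken from the study
) -> str:
    """Place one ADR-target relation on the OR-by-frequency plane."""
    if odds_ratio < 0 or not 0.0 <= frequency <= 1.0:
        raise ValueError("need odds_ratio >= 0 and 0 <= frequency <= 1")
    if odds_ratio > or_cutoff:
        # Strong ADR-target association: read here as on-target or overdose.
        # Zone 1 is the low-frequency case, zone 2 the high-frequency case.
        return "zone 2" if frequency >= freq_cutoff else "zone 1"
    # Weak association: read here as likely off-target (zones 3 and 4).
    return "zone 3 or 4"


if __name__ == "__main__":
    print(assign_zone(5.0, 0.00001))
    print(assign_zone(5.0, 0.05))
    print(assign_zone(0.8, 0.05))
```

Exposing both cutoffs as parameters makes it easy to test how sensitive the zone proportions are to the chosen thresholds.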

Identifying the Risk Factors of Fatal ADRs

Fatal ADRs are of paramount concern in new drug discovery. Identifying the risk factors of fatal ADRs could substantially enhance the success rate of designing “high efficacy and low toxicity” drugs. In this study, we conducted an association analysis to identify the risk factors of fatal ADRs (ADRs graded as life-threatening or death). We found 104 targets for 94 drugs that were associated with fatal ADRs (OR>2). Many of these targets have been prioritized for combating complex diseases such as cancers, immunological diseases, blood diseases, and cardiovascular diseases (Figure 5B). The protein-protein interaction network analysis revealed that many of these “toxic” targets were interconnected in a highly dense subnetwork, which centered around several well-known targets such as epidermal growth factor receptor, AKT1 (RAC-alpha serine/threonine-protein kinase), and TP53 (cellular tumor antigen p53; Figure 5C). The functional analysis further showed that these targets were enriched in the PI3K-Akt signaling pathway, one of the major cell signaling pathways involved in regulating various cellular processes such as cell proliferation, growth, cell size, metabolism, and motility (Figure 5D). Indeed, some PI3K/Akt/mTOR inhibitors have been reported to frequently induce severe ADRs such as cardiac toxicity, liver toxicity, immunosuppression, and pneumonia [40,41]. In this regard, the discovery of novel targets or drugs for safe cancer therapy remains a significant challenge. Similar approaches can be applied to discover the potential risk factors for a specific severe ADR.
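For readers unfamiliar with the OR used as the association measure, the following minimal Python sketch computes an OR and an approximate 95% CI from a 2x2 table. The cell definitions, the 0.5 continuity correction for empty cells, and the Wald interval are standard textbook choices assumed here for illustration; the text does not specify how the study counted observations or handled sparse tables.

```python
import math
from typing import Tuple


def odds_ratio(a: int, b: int, c: int, d: int) -> Tuple[float, float, float]:
    """Return (OR, lower, upper) for a 2x2 table, with a 95% Wald interval.

    Assumed cell layout (illustrative only):
      a: target involved, fatal ADR observed
      b: target involved, fatal ADR not observed
      c: target not involved, fatal ADR observed
      d: target not involved, fatal ADR not observed
    """
    cells = [float(x) for x in (a, b, c, d)]
    if min(cells) < 0:
        raise ValueError("cell counts must be non-negative")
    if 0.0 in cells:
        # Haldane-Anscombe correction: add 0.5 to every cell to avoid
        # division by zero when any cell is empty.
        cells = [x + 0.5 for x in cells]
    a_, b_, c_, d_ = cells
    estimate = (a_ * d_) / (b_ * c_)
    # Standard error of log(OR).
    se = math.sqrt(1 / a_ + 1 / b_ + 1 / c_ + 1 / d_)
    z = 1.96  # normal quantile for a two-sided 95% interval
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


if __name__ == "__main__":
    estimate, lower, upper = odds_ratio(10, 5, 20, 40)
    print(f"OR={estimate:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

Under a screening rule such as OR>2, a target would be flagged when the estimate exceeds 2; requiring the lower confidence bound to exceed 1 as well would be a stricter, optional variant.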

Principal Findings

An ADR is more than a binary issue of occurring or not occurring; instead, it is a multidimensional concept. For instance, a drug may not trigger a given ADR in all cases of drug therapy. The occurrence of an ADR is influenced by multiple factors, such as drug dosage, treatment course, individual genetic variation, and the physiological and pathological states of patients. Moreover, different ADRs, or even the same ADR induced in the treatment of different diseases, may exhibit varying impacts on patient health. In drug discovery, common ADRs such as nausea and itching are often considered tolerable as they are not severe. However, when it comes to combating life-threatening diseases for which there are no available drugs, drug candidates that may cause severe but rare ADRs may still have a chance to enter the market. Therefore, information on ADR severity and frequency is essential for fair characterization of drug toxicity in humans, and a simple counting of ADRs would be inadequate and biased for precise drug safety assessment. To break the qualitative constraints, we took the first step in measuring ADR severity quantitatively by learning from the big data of historical ADE reports in this study. Furthermore, we also estimated ADR frequency in large patient populations by cross-mining real-world prescription records. These attempts could be a significant step forward for the community of clinical pharmacology and toxicology, moving current ADR research beyond the binary dimension into a multidimensional space. Meanwhile, these multidimensional features can enrich the vectorized representation of ADRs, providing machine learning applications with richer input information on ADRs.

Limitations

This work has several limitations. The ADReCS severity-grading system is based on the assumption that ADEs have been fully and unbiasedly reported to the FAERS. However, in real-world clinical practice, clinicians tend to report severe ADRs rather than mild ADRs. As a consequence, the severity of serious ADRs could be overestimated. Incorporating more ADE sources, such as EudraVigilance, into severity grading and frequency estimation could partially rectify this reporting bias. Moreover, optimization of the ADReCS severity-grading model or deployment of new algorithms is also desirable to improve the quantitative characterization of ADRs.

Conclusions

In summary, quantitative estimation of ADR severity and frequency enriches current knowledge of the clinical phenotypes caused by drug toxicity in both depth and breadth. It also addresses data gaps by providing high-quality data sets of drug-ADR relations for multiscale drug safety assessment and drug discovery using advanced artificial intelligence algorithms. Last but not least, it prompts current drug safety research to shift from qualitative description to quantitative analysis.

Acknowledgments

This study was funded by the National Key Research and Development Program of China (2018YFC1003601) and the National Natural Science Foundation of China (30873159).

Data Availability

The complete data set with quantitative features is available for download [42].

Authors' Contributions

QXY, RFD, WHC, LYW, and KL conducted data collection, preprocessing, and analysis. QXY and RFD developed the algorithms. QXY and WHC prepared the figures and tables. QXY and ZLJ wrote this paper. ZLJ designed and supervised this whole study. All authors approved the final manuscript.

Conflicts of Interest

None declared.

Multimedia Appendix 1

A selected list of drugs with “improper” dosages.

XLSX File (Microsoft Excel File), 9 KB

Multimedia Appendix 2

The drug-adverse drug reaction pairs within zone 4.

XLSX File (Microsoft Excel File), 13 KB

de Vries ST, van der Sar MJM, Coleman AM, Escudero Y, Pascual AR, Martínez MÁM, et al. Safety communication tools and healthcare professionals' awareness of specific drug safety issues in Europe: a survey study. Drug Saf. 2018;41(7):713-724. [FREE Full text] [CrossRef] [Medline]
Zhang JD, Sach-Peltason L, Kramer C, Wang K, Ebeling M. Multiscale modelling of drug mechanism and safety. Drug Discov Today. 2020;25(3):519-534. [FREE Full text] [CrossRef] [Medline]
Pognan F, Steger-Hartmann T, Díaz C, Blomberg N, Bringezu F, Briggs K, et al. The eTRANSAFE project on translational safety assessment through integrative knowledge management: achievements and perspectives. Pharmaceuticals (Basel). 2021;14(3):237. [FREE Full text] [CrossRef] [Medline]
Minerali E, Foil DH, Zorn KM, Lane TR, Ekins S. Comparing machine learning algorithms for predicting Drug-Induced Liver Injury (DILI). Mol Pharm. 2020;17(7):2628-2637. [FREE Full text] [CrossRef] [Medline]
Xiang YP, Liu K, Cheng XY, Cheng C, Gong F, Pan JB, et al. Rapid assessment of adverse drug reactions by statistical solution of gene association network. IEEE/ACM Trans Comput Biol Bioinform. 2015;12(4):844-850. [CrossRef] [Medline]
Santín EP, Solana RR, García MG, Suárez MDMG, Díaz GDB, Cabal MDC, et al. Toxicity prediction based on artificial intelligence: a multidisciplinary overview. WIREs Comput Mol Sci. 2021;11(5):e1516. [CrossRef]
Kuang Q, Wang M, Li R, Dong Y, Li Y, Li M. A systematic investigation of computation models for predicting Adverse Drug Reactions (ADRs). PLoS One. 2014;9(9):e105889. [FREE Full text] [CrossRef] [Medline]
Mantripragada AS, Teja SP, Katasani RR, Joshi P, Ramesh R, Masilamani V. Prediction of adverse drug reactions using drug convolutional neural networks. J Bioinform Comput Biol. 2021;19(1):2050046. [CrossRef] [Medline]
Lavertu A, Hamamsy T, Altman RB. Quantifying the severity of adverse drug reactions using social media: network analysis. J Med Internet Res. 2021;23(10):e27714. [FREE Full text] [CrossRef] [Medline]
Hartwig SC, Siegel J, Schneider PJ. Preventability and severity assessment in reporting adverse drug reactions. Am J Hosp Pharm. 1992;49(9):2229-2232. [Medline]
Gottlieb A, Hoehndorf R, Dumontier M, Altman RB. Ranking adverse drug reactions with crowdsourcing. J Med Internet Res. 2015;17(3):e80. [FREE Full text] [CrossRef] [Medline]
Asseray N, Ballereau F, Trombert-Paviot B, Bouget J, Foucher N, Renaud B, et al. Frequency and severity of adverse drug reactions due to self-medication: a cross-sectional multicentre survey in emergency departments. Drug Saf. 2013;36(12):1159-1168. [FREE Full text] [CrossRef] [Medline]
Ferreira TR, Lopes LC, de Càssia Bergamaschi C. Frequency and severity of adverse drug reactions to medications prescribed for Alzheimer's disease in a Brazilian city: cross-sectional study. Front Pharmacol. 2020;11:538095. [FREE Full text] [CrossRef] [Medline]
Trotti A, Colevas AD, Setser A, Basch E. Patient-reported outcomes and the evolution of adverse event reporting in oncology. J Clin Oncol. 2007;25(32):5121-5127. [CrossRef] [Medline]
openFDA. URL: https://open.fda.gov/data/downloads [accessed 2020-12-10]
Bodenreider O. The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res. 2004;32(Database issue):D267-D270. [FREE Full text] [CrossRef] [Medline]
Wishart DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 2018;46(D1):D1074-D1082. [FREE Full text] [CrossRef] [Medline]
Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, et al. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res. 2021;49(D1):D1388-D1395. [FREE Full text] [CrossRef] [Medline]
Kanehisa M, Furumichi M, Tanabe M, Sato Y, Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017;45(D1):D353-D361. [FREE Full text] [CrossRef] [Medline]
ATC. Norwegian Institute of Public Health. URL: https://www.whocc.no/atc [accessed 2022-06-05]
Cai MC, Xu Q, Pan YJ, Pan W, Ji N, Li YB, et al. ADReCS: an ontology database for aiding standardization and hierarchical classification of adverse drug reaction terms. Nucleic Acids Res. 2015;43(Database issue):D907-D913. [FREE Full text] [CrossRef] [Medline]
Alatawi YM, Hansen RA. Empirical estimation of under-reporting in the U.S. Food and Drug Administration Adverse Event Reporting System (FAERS). Expert Opin Drug Saf. 2017;16(7):761-767. [CrossRef] [Medline]
Hazell L, Shakir SAW. Under-reporting of adverse drug reactions: a systematic review. Drug Saf. 2006;29(5):385-396. [CrossRef] [Medline]
Medical Expenditure Panel Survey (MEPS). Agency for Healthcare Research and Quality. URL: https://meps.ahrq.gov/mepsweb [accessed 2022-04-08]
Eriksson R, Aagaard L, Jensen LJ, Borisova L, Hørlück D, Brunak S, et al. Discrepancies in listed adverse drug reactions in pharmaceutical product information supplied by the regulatory authorities in Denmark and the USA. Pharmacol Res Perspect. 2014;2(3):e00038. [FREE Full text] [CrossRef] [Medline]
Hansen D, Rasmussen K, Danielsen H, Meyer-Hofmann H, Bacevicius E, Lauridsen TG, et al. No difference between alfacalcidol and paricalcitol in the treatment of secondary hyperparathyroidism in hemodialysis patients: a randomized crossover trial. Kidney Int. 2011;80(8):841-850. [FREE Full text] [CrossRef] [Medline]
Liu Y, Liu LY, Jia Y, Wu MY, Sun YY, Ma FZ. Efficacy and safety of paricalcitol in patients undergoing hemodialysis: a meta-analysis. Drug Des Devel Ther. 2019;13:999-1009. [FREE Full text] [CrossRef] [Medline]
Geng X, Shi E, Wang S, Song Y. A comparative analysis of the efficacy and safety of paricalcitol versus other vitamin D receptor activators in patients undergoing hemodialysis: a systematic review and meta-analysis of 15 randomized controlled trials. PLoS One. 2020;15(5):e0233705. [FREE Full text] [CrossRef] [Medline]
Nielsen J, Correll CU, Manu P, Kane JM. Termination of clozapine treatment due to medical reasons: when is it warranted and how can it be avoided? J Clin Psychiatry. 2013;74(6):603-613. [FREE Full text] [CrossRef] [Medline]
Rose E, Chen S, Turrion C, Jenkins C, Cardinal RN, Fernandez-Egea E. Causes of death in clozapine-treated patients in a catchment area: a 10-year retrospective case-control study. Eur Neuropsychopharmacol. 2020;36:160-166. [CrossRef] [Medline]
Hussain S, Browne R, Chen J, Parekh S. Lenalidomide-induced severe hepatotoxicity. Blood. 2007;110(10):3814. [FREE Full text] [CrossRef] [Medline]
ADReCS. URL: http://www.bio-add.org/ADReCS [accessed 2023-03-30]
ADReCS. URL: http://www.bio-add.org/ADReCS/download.jsp [accessed 2024-04-19]
Redwood AJ, Pavlos RK, White KD, Phillips EJ. HLAs: key regulators of T-cell-mediated drug hypersensitivity. HLA. 2018;91(1):3-16. [FREE Full text] [CrossRef] [Medline]
Liu YJ, Sun LN, Cheng ZP, Qian Y, Ma ZQ, Zhang XH, et al. Pharmacogenetic and safety analysis of cinacalcet hydrochloride in healthy Chinese subjects. Ann Transl Med. 2020;8(21):1385. [FREE Full text] [CrossRef] [Medline]
Günaldi M, Afsar CU, Duman BB, Kara IO, Tatli U, Sahin B. Effect of the cumulative dose of zoledronic acid on the pathogenesis of osteonecrosis of the jaws. Oncol Lett. 2015;10(1):439-442. [FREE Full text] [CrossRef] [Medline]
Marini F, Tonelli P, Cavalli L, Cavalli T, Masi L, Falchetti A, et al. Pharmacogenetics of bisphosphonate-associated osteonecrosis of the jaw. Front Biosci (Elite Ed). 2011;3(1):364-370. [FREE Full text] [CrossRef] [Medline]
White KD, Chung WH, Hung SL, Mallal S, Phillips EJ. Evolving models of the immunopathogenesis of T cell-mediated drug allergy: the role of host, pathogens, and drug response. J Allergy Clin Immunol. 2015;136(2):219-234. [FREE Full text] [CrossRef] [Medline]
Saito T, Ikeda M, Mushiroda T, Ozeki T, Kondo K, Shimasaki A, et al. Pharmacogenomic study of clozapine-induced agranulocytosis/granulocytopenia in a Japanese population. Biol Psychiatry. 2016;80(8):636-642. [FREE Full text] [CrossRef] [Medline]
Lee JJ, Loh K, Yap YS. PI3K/Akt/mTOR inhibitors in breast cancer. Cancer Biol Med. 2015;12(4):342-354. [FREE Full text] [Medline]
Alzahrani AS. PI3K/Akt/mTOR inhibitors in cancer: at the bench and bedside. Semin Cancer Biol. 2019;59:125-132. [CrossRef] [Medline]
ADReCS drug ADR relations quantification. URL: http://www.bio-add.org/ADReCS/download/ADReCS_Drug_ADR_relations_quantification.txt.gz [accessed 2024-04-18]

AAP: average annual prescription

ADE: adverse drug event

ADR: adverse drug reaction

ADReCS: Adverse Drug Reaction Classification System

AKT1: RAC-alpha serine/threonine-protein kinase

ATC: Anatomical Therapeutic Chemical

CTCAE: Common Terminology Criteria for Adverse Events

FAERS: Food and Drug Administration Adverse Event Reporting System

MEPS: Medical Expenditure Panel Survey

OR: odds ratio

PT: preferred term

ROR: reporting odds ratio

SOC: System Organ Class

URR: underreporting rate

Edited by Z Yin; submitted 02.05.23; peer-reviewed by LA Lee, SK Suthar, I Zakrocka; comments to author 10.03.24; accepted 18.03.24; published 03.05.24.

©Qi-Xuan Yue, Ruo-Fan Ding, Wei-Hao Chen, Lv-Ying Wu, Ke Liu, Zhi-Liang Ji. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 03.05.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
