Background: Patients with rheumatoid arthritis (RA) treated with biologic and targeted synthetic disease-modifying antirheumatic drugs (b/ts DMARDs) face an increased risk of serious infections, a leading cause of morbidity and mortality [1]. Current evidence on infection risk stratification across different b/ts DMARDs is fragmented, and predictive tools are lacking.
Objectives: This study aimed to develop a machine learning (ML) model leveraging electronic medical records (EMR) to predict serious infection risk among b/ts DMARD users.
Methods: Data from the Hong Kong Clinical Data Analysis and Reporting System (CDARS) were used to develop the model, with external validation performed using the U.S.-based All of Us database. Adult RA patients initiating b/ts DMARDs between 2010 and 2023 were included. Serious infection was defined as an infection-related hospitalization requiring anti-infective drugs within ±14 days of admission. Predictive features encompassed demographics, comorbidities, laboratory tests, prior infection history, and medication use. Six ML algorithms were evaluated: random forest, XGBoost, support vector machine, neural network, LASSO regression, and logistic regression. The model with the highest area under the receiver operating characteristic curve (AUROC) was selected as the final model. SHAP (SHapley Additive exPlanations) values were used to interpret the clinical features, and biomarker thresholds for infection risk were determined to help identify high-risk patients.
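The AUROC-based model selection described in the Methods can be sketched as follows. This is a minimal illustration in plain Python, not the study's pipeline: the function names (`auroc`, `select_best_model`), the toy labels, and the toy scores are all hypothetical, and the abstract does not state how AUROC was computed or whether cross-validation was used.

```python
from typing import Dict, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_best_model(
    labels: Sequence[int], scores_by_model: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return the (hypothetical) model name with the highest AUROC and that AUROC."""
    aurocs = {name: auroc(labels, s) for name, s in scores_by_model.items()}
    best = max(aurocs, key=aurocs.get)
    return best, aurocs[best]


# Toy data for illustration only; these are not study results.
toy_labels = [1, 0, 1, 0, 0, 1]  # 1 = serious infection, 0 = none
toy_scores = {
    "random_forest": [0.9, 0.2, 0.7, 0.4, 0.1, 0.8],
    "logistic_regression": [0.6, 0.5, 0.4, 0.7, 0.2, 0.9],
}
best_name, best_auroc = select_best_model(toy_labels, toy_scores)
print(best_name, round(best_auroc, 2))
```

The pairwise formulation is quadratic in cohort size and is used here only because it makes the definition of AUROC explicit; a rank-based implementation would be preferable for cohorts of several thousand patients.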
Results: A total of 3159 and 1845 b/ts DMARD users were identified in the CDARS and All of Us cohorts, respectively. The random forest model performed best in the training cohort (AUROC: 0.84; accuracy: 0.76; sensitivity: 0.73; specificity: 0.77) and achieved robust external validation in the All of Us database (AUROC: 0.73; accuracy: 0.61; sensitivity: 0.71; specificity: 0.60). Previous infection, the type of b/ts DMARD, and diabetes mellitus were the top predictors of infection risk. Among the b/ts DMARDs, rituximab was associated with the highest risk of infection, whereas tofacitinib, upadacitinib, and sarilumab were linked to the lowest risk. Elevated levels of CRP (>4.1 mg/dL), creatinine (>92.5 µmol/L), and ESR (>60.5 mm/hr), as well as lower levels of albumin (<30 g/L), lymphocyte proportion (<9.8%), and platelet count (<185 × 10⁹/L), were associated with increased infection risk.
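As an illustration of how the reported biomarker thresholds might be applied to a single patient's laboratory results, the sketch below lists which values fall on the higher-risk side of each cutoff. The cutoffs and units are taken from the Results; the dictionary keys, the function name `flag_biomarkers`, and the example values are hypothetical. The abstract does not describe how the thresholds are combined, so this sketch only lists flags and does not compute a risk score.

```python
from typing import Dict, List, Tuple

# Cutoffs as reported in the Results. The direction marks the side
# associated with increased infection risk. Key names are hypothetical.
THRESHOLDS: Dict[str, Tuple[str, float]] = {
    "crp_mg_dl": (">", 4.1),
    "creatinine_umol_l": (">", 92.5),
    "esr_mm_hr": (">", 60.5),
    "albumin_g_l": ("<", 30.0),
    "lymphocyte_pct": ("<", 9.8),
    "platelets_e9_l": ("<", 185.0),
}


def flag_biomarkers(labs: Dict[str, float]) -> List[str]:
    """Return the biomarkers whose values cross the reported cutoffs.

    Biomarkers missing from `labs` are skipped rather than imputed.
    """
    flagged = []
    for name, (direction, cutoff) in THRESHOLDS.items():
        if name not in labs:
            continue
        value = labs[name]
        if (direction == ">" and value > cutoff) or (
            direction == "<" and value < cutoff
        ):
            flagged.append(name)
    return flagged


# Hypothetical patient, for illustration only.
example_labs = {"crp_mg_dl": 5.0, "albumin_g_l": 35.0, "platelets_e9_l": 150.0}
example_flags = flag_biomarkers(example_labs)
print(example_flags)
```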
Conclusion: This study presents an effective ML-based tool for serious infection risk prediction among RA patients receiving b/ts DMARDs. The model showed good predictive capacity and external validity, offering the opportunity for personalized b/ts DMARD therapy and improved patient outcomes. The identified biomarker thresholds could refine infection risk stratification and guide early interventions for high-risk patients receiving b/ts DMARDs.
REFERENCES: [1] Ozen G, Pedro S, England BR, Mehta B, Wolfe F, Michaud K. Risk of Serious Infection in Patients With Rheumatoid Arthritis Treated With Biologic Versus Nonbiologic Disease-Modifying Antirheumatic Drugs. ACR Open Rheumatol. 2019;1(7):424-432.
[2] Mok CC, So H, Yim CW, et al. Safety of the JAK and TNF inhibitors in rheumatoid arthritis: real world data from the Hong Kong Biologics Registry. Rheumatology (Oxford). 2024;63(2):358-365.
Acknowledgements: NIL.
Disclosure of Interests: None declared.
© The Authors 2025. This abstract is an open access article published in Annals of Rheumatic Diseases under the CC BY-NC-ND license (