
Background: Remission is the central treatment goal in rheumatoid arthritis (RA), yet only a minority of patients achieve it under routine care. Predictive tools for remission in RA are limited: most focus on single drug classes, use short follow-up windows, or rely on biomarkers not routinely available. Machine learning (ML) could leverage real-world registry data to provide robust and generalizable prediction models, but its value compared to conventional approaches remains uncertain.
Objectives: To develop, externally validate, and interpret ML models predicting RA remission between 6 and 24 months following initiation of TNF inhibitors, JAK inhibitors, IL-6 inhibitors, abatacept, or rituximab, using real-world registry data from the international JAK-pot collaboration. We also compared model performance against traditional statistical approaches and explored the clinical plausibility of predictors.
Methods: The JAK-pot collaboration contributed 28,900 treatment courses from 11 national registries. Patients were adults with RA and at least one CDAI measurement between 6 and 24 months after treatment start. Data were split into training (21,675), internal validation (5,418), and external validation (1,807, Switzerland) cohorts. An Extreme Gradient Boosting (XGBoost) model using 63 baseline demographic and clinical variables was trained with stratified cross-validation. Discrimination and classification performance were assessed using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-scores. Calibration was evaluated using fixed thresholds derived from training. Predictor contributions were interpreted with Shapley additive explanations (SHAP). We also tested a simplified 10-predictor model and a logistic regression benchmark.
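The evaluation metrics named in the Methods can be sketched in a few lines of Python. This is a minimal illustration and not the study's code: the function names, the toy labels and probabilities, and the 0.5 threshold are assumptions made for the example, and the functions do not guard against cohorts with an empty class.

```python
from itertools import product


def auc(y_true, y_prob):
    """AUC as the chance that a random positive is scored above a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    # Ties count as half a win (Mann-Whitney U formulation).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, y_prob, threshold):
    """Confusion-matrix metrics at one fixed probability threshold."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, pred) if y == 1 and q == 1)
    tn = sum(1 for y, q in zip(y_true, pred) if y == 0 and q == 0)
    fp = sum(1 for y, q in zip(y_true, pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, pred) if y == 1 and q == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "f1_remission": 2 * tp / (2 * tp + fp + fn),
        "f1_non_remission": 2 * tn / (2 * tn + fp + fn),
    }


# Hypothetical toy data: 1 = remission, 0 = non-remission.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.8]

toy_auc = auc(y_true, y_prob)
toy_metrics = threshold_metrics(y_true, y_prob, threshold=0.5)
```

In this sketch the threshold is chosen once and then reused unchanged, mirroring the idea of applying thresholds derived from training to the validation cohorts.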
Results: Remission prevalence was 29.3% in training and internal validation, and 26.6% in the external cohort. In the internal validation dataset, the ML model achieved an AUC of 0.759 (95% CI 0.744–0.773), sensitivity 0.608, specificity 0.773, PPV 0.526, NPV 0.826, and F1-scores of 0.564 (remission) and 0.799 (non-remission) (Figure 1A). In the external validation dataset, performance was comparable, with AUC 0.797 (95% CI 0.774–0.821), sensitivity 0.804, specificity 0.650, PPV 0.454, NPV 0.902, and F1-scores of 0.580 (remission) and 0.755 (non-remission) (Figure 1B). Calibration remained acceptable across probability ranges. The simplified 10-predictor model achieved an AUC of 0.731 in internal validation and 0.802 in external validation, with high NPV (0.916) but limited PPV (0.432). Logistic regression performed similarly (external AUC 0.809), though with trade-offs in sensitivity vs. specificity. SHAP analysis identified patient global assessment, tender joint count, prior b/tsDMARD exposure, HAQ-DI, physician global assessment, age, CDAI, CRP, pain, and anti-CCP as most influential (Figure 2). These predictors align with established remission literature, reinforcing model plausibility.
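The reported predictive values and F1-scores follow from sensitivity, specificity, and prevalence via Bayes' rule, which gives a quick consistency check on the figures above. The sketch below is illustrative only; the function names are assumptions, and the inputs are the rounded values quoted in the Results.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence               # true-positive fraction
    fn = (1 - sensitivity) * prevalence         # false-negative fraction
    tn = specificity * (1 - prevalence)         # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)   # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded values taken from the Results (internal and external cohorts).
ppv_int, npv_int = predictive_values(0.608, 0.773, 0.293)
ppv_ext, npv_ext = predictive_values(0.804, 0.650, 0.266)

# F1 for remission uses PPV and sensitivity; F1 for non-remission
# uses NPV and specificity (precision and recall of the negative class).
f1_rem_int = f1(ppv_int, 0.608)
f1_non_int = f1(npv_int, 0.773)
f1_rem_ext = f1(ppv_ext, 0.804)
f1_non_ext = f1(npv_ext, 0.650)
```

Because the inputs are rounded to three decimals, the recomputed values agree with the reported ones to within about 0.001 rather than exactly.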
Conclusions: In the largest international ML study of RA remission prediction to date, we show that models trained on routine baseline data achieve moderate discrimination, with greatest value in ruling out remission (NPV >0.90 in the external cohort). However, simplified and logistic models performed similarly, highlighting the limited incremental value of ML without richer inputs. These findings establish a benchmark for registry-based remission prediction, underline the need for integrating biomarkers, imaging, or early treatment dynamics, and demonstrate the feasibility of externally validated, interpretable ML in RA.
Figure 1. Receiver operating characteristic (ROC) curves showing the performance of models for predicting the achievement of RA remission between 6 and 24 months after treatment initiation, using 63 predictors. A) Internal validation dataset. B) External validation dataset. Red points show the corresponding true positive rate (TPR) and false positive rate (FPR) for the selected thresholds. AUC: area under the curve.
Figure 2. Top 10 predictors of remission probability ranked by mean absolute SHAP (SHapley Additive exPlanations) values from the fine-tuned XGBoost model. Higher values indicate greater overall contribution of the predictor to the output of the model, irrespective of whether the effect increases or decreases remission probability. Prev b/ts-DMARDs: Previous biologic or targeted synthetic DMARDs (number of prior treatments); HAQ-DI: Health Assessment Questionnaire Disability Index; CDAI: Clinical Disease Activity Index; VAS: Visual Analogue Scale; Anti-CCP: Anti-cyclic citrullinated peptide antibodies.
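Ranking by mean absolute SHAP value, as used for this figure, can be illustrated with a short sketch. The feature names, the SHAP values, and the function name below are hypothetical and chosen only to show the computation; they are not study data.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean of |SHAP value| across patients."""
    n = len(shap_rows)
    means = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    # Largest mean absolute contribution first.
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical SHAP values: one row per patient, one column per feature.
features = ["patient_global", "tender_joints", "age"]
shap_rows = [
    [-0.40, 0.10, 0.05],
    [0.30, -0.20, -0.05],
    [-0.50, 0.30, 0.02],
]

ranking = rank_by_mean_abs_shap(shap_rows, features)
```

Taking absolute values before averaging is what makes the ranking independent of direction: a predictor that pushes some patients strongly toward remission and others strongly away from it still ranks high, whereas a signed mean would let those effects cancel.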
REFERENCES: NIL.
Acknowledgments: NIL.
Disclosure of Interests: Zubeyir Salis ZS owns 50% of the shares in Zuman International Pty Ltd, which receives royalties and payments for educational resources and services in adult weight management and research methodology, Denis Mongin: None declared, Romain Aymon: None declared, Denis Choquette: None declared, Louis Coupal: None declared, Catalin Codreanu: None declared, Florenzo Iannone FI reports speaking fees from AbbVie, Alfasigma, Amgen, AstraZeneca, CSL-Vifor, GSK, Janssen, Novartis, Eli Lilly, and UCB, FI reports consulting or advisory fees from AbbVie, Amgen, AstraZeneca, GSK, Janssen, Eli Lilly, and UCB, Roberto F. Caporali: None declared, Tore K. Kvien TKK reports speaking fees from Grünenthal, Janssen, and Sandoz, TKK reports consulting or advisory fees from AbbVie, Gilead, Janssen, Novartis, Pfizer, Sandoz, and UCB, TKK reports research grants from AbbVie, BMS, Galapagos, Novartis, Pfizer, and UCB, Sella Aarrestad Provan: None declared, Ruth Fritsch-Stork: None declared, Dan Nordström DN reports speaking fees from Pfizer and UCB, DN reports consulting or advisory fees from BMS, Eli Lilly, MSD, DN reports research grants from BMS, MSD, and UCB, Nina Trokovic: None declared, Karel Pavelka KP reports speaking fees from AbbVie, Eli Lilly, Sandoz, UCB, Medac, and Pfizer, Jakub Závada JZ reports speaking fees from AbbVie, Eli Lilly, Sandoz, Novartis, Egis, UCB, Sanofi, AstraZeneca, and Sobi, JZ reports consulting or advisory fees from AbbVie, Novartis, AstraZeneca, and Glaxo, Ana Maria Rodrigues: None declared, Ziga Rotar ZR has received speaker fees from AbbVie, Eli Lilly, Pfizer, ZR has received consultancy fees from AbbVie, Eli Lilly, Pfizer, Prodromos Sidiropoulos PS reports speaking and lecture fees from AbbVie, Pfizer, Eli Lilly, Novartis, and UCB, Irini Flouri: None declared, Céline LAMACCHIA: None declared, Michele Iudici: None declared, Delphine S Courvoisier: None declared, Kim Lauper KL reports speaking fees from AbbVie (paid to the institution), KL reports consulting or advisory fees from Pfizer and Novartis (paid to the institution), KL reports grants from AbbVie, Alfasigma S.p.A., Eli Lilly, Galapagos, and Pfizer (all paid to the institution), Axel Finckh AF reports speaking fees from AbbVie, Alfasigma, AstraZeneca, Eli Lilly, Pfizer, and UCB, AF reports consulting or advisory fees from AbbVie, AstraZeneca, Novartis, and Stada, AF reports grants from AbbVie, BMS, Eli Lilly, Galapagos, Pfizer, and Alfasigma (all paid to the institution).