
ABS0718 (2025)
AGE AS THE MAIN PREDICTOR FOR RHEUMATOID ARTHRITIS INITIAL TREATMENT: A MACHINE LEARNING ANALYSIS USING DATA FROM METEOR REGISTRY
Keywords: Artificial intelligence, Registries, Observational studies/ registry
D. Vega-Morales1, S. A. Bergstra2, A. Chopra3, W. M. Orzua-de la Fuente5, P. Machado4, E. Vieira-Sousa6, N. Govind7,8
1Instituto Mexicano del Seguro Social, Hospital General de Zona No. 17, Rheumatology and Infusion Center, Monterrey, Mexico
2Leiden University Medical Center, Department of Rheumatology, Leiden, Netherlands
3Center for Rheumatic Disease, Pune, India
4UCL Queen Square Institute of Neurology, University College London, Department of Neuromuscular Diseases, London, United Kingdom
5Centre for Population Health Research, National Institute of Public Health, Cuernavaca, Mexico
6Reuma.pt, Portuguese Society of Rheumatology, Lisbon, Portugal
7School of Clinical Medicine, Faculty of Health Sciences, University of the Witwatersrand, Division of Rheumatology, Johannesburg, South Africa
8Chris Hani Baragwanath Academic Hospital, Division of Rheumatology, Department of Internal Medicine, Johannesburg, South Africa

Background: Given the need to assess the complex interaction of various determinants of rheumatoid arthritis (RA) and its treatment, artificial intelligence (AI) is a useful tool for developing learning algorithms to better understand treatment choices for individual RA patients.


Objectives: To examine patient characteristics at diagnosis as predictors of the initial treatment of RA using machine learning algorithms.


Methods: We used data from the METEOR (Measurement of Efficacy of Treatment in the “Era of Outcome” in Rheumatology) initiative, an international observational registry. We included patients newly diagnosed with RA (<1 year from established diagnosis) from 20 countries (n=17,486) between 2014 and 2023. We excluded countries with fewer than 100 patients (n=624) and outliers for body mass index (BMI), defined as values outside the 1st and 99th percentiles (n=178). The main outcome was the treatment regimen recorded at the first registered visit. Predictor variables included demographic characteristics, anthropometric measures, smoking history, biomarkers and indicators of disease activity. The laboratory parameters erythrocyte sedimentation rate (ESR), rheumatoid factor (RF) and anti-citrullinated protein antibodies (ACPA), as well as the presence of joint erosions, were analyzed. Disease activity was assessed using the Disease Activity Score 28 with ESR (DAS28-ESR), the patient and physician global visual analogue scales (VAS), and the 28 tender and swollen joint counts; functional status was assessed with the Health Assessment Questionnaire Disability Index (HAQ-DI). Statistical analysis and machine learning model training were performed in RStudio (Posit Software, PBC, formerly RStudio, MA, USA). Missing data were treated as missing at random and were addressed using multiple imputation with classification and regression trees (5 imputation cycles). The dataset was divided into 70% to train the model and 30% to test it. To further validate the model's performance, 5-fold cross-validation was used. We examined the predictor variables using a multi-class classifier. We trained four random forests with the full dataset and stratified by country, with treatment class as the output variable. Different configurations were used: in random forests 1 and 2, age was treated as a continuous variable, while in random forests 3 and 4, age was categorized. Additionally, we separated the components of DAS28-ESR in random forests 2 and 4. The mean decrease in the Gini impurity coefficient was used to assess the importance of the variables for a correct class assignment. Accuracy, precision and recall were estimated to assess the classification performance of the models on the test sample.
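As an illustration of the variable-importance measure named above, the following minimal Python sketch computes the Gini impurity of a set of class labels and the decrease in impurity produced by a single split on one predictor. It is not the authors' R code: the function names (`gini_impurity`, `gini_decrease`), the toy ages, the treatment labels and the threshold of 50 years are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def gini_decrease(labels, values, threshold):
    """Impurity decrease from splitting the node on values <= threshold."""
    left = [y for y, x in zip(labels, values) if x <= threshold]
    right = [y for y, x in zip(labels, values) if x > threshold]
    n = len(labels)
    # Child impurities are weighted by the share of samples sent to each side.
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(labels) - weighted_children


# Hypothetical toy data (not registry data): age in years and treatment class.
ages = [30, 35, 42, 58, 63, 70]
treatments = ["MTX+GC", "MTX+GC", "MTX+GC", "HCQ+GC", "HCQ+GC", "HCQ+GC"]

parent_impurity = gini_impurity(treatments)               # two equal classes
decrease = gini_decrease(treatments, ages, threshold=50)  # split separates them fully
print(parent_impurity, decrease)
```

In a random forest, decreases of this kind are accumulated over every split that uses a given variable and then averaged across trees; the exact scaling of the resulting importance score depends on the software implementation.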


Results: The final set included 16,750 patients from 8 countries, with 16 drugs used as first treatment in a total of 75 drug combinations. Most patients were female (83.0%) and most were from India (82.1%). The mean age was 47.4 years (SD 12.9 years). The most common initial treatment regimen was the combination of methotrexate (MTX) and glucocorticoids (GC) (26.1%), followed by hydroxychloroquine (HCQ), MTX and GC (11.8%), and HCQ and GC (10.8%) (Table 1). Random forest 2 had the best performance metrics (accuracy 0.97, precision 0.94, recall 0.97). The most important predictors for any of the strategies used were individual characteristics (age, weight) and health status indicators (tender joint count [TJC] and HAQ-DI), with a mean decrease in Gini coefficient of over 3,000. Age was the main predictor, followed by ESR and weight (Figure 1).
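For readers less familiar with multi-class performance metrics, the sketch below shows one way accuracy, precision and recall can be computed from true and predicted treatment classes. The abstract does not state how precision and recall were averaged across classes, so both macro-averaging and support-weighted averaging are shown as assumptions; the function name `classification_metrics` and the toy labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro- and support-weighted precision/recall for multi-class labels."""
    classes = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n

    precisions, recalls, supports = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # correctly assigned to c
        fp = sum(t != c and p == c for t, p in pairs)  # wrongly assigned to c
        fn = sum(t == c and p != c for t, p in pairs)  # missed members of c
        precisions.append(tp / (tp + fp) if (tp + fp) else 0.0)
        recalls.append(tp / (tp + fn) if (tp + fn) else 0.0)
        supports.append(tp + fn)  # number of true members of class c

    k = len(classes)
    return {
        "accuracy": accuracy,
        "macro_precision": sum(precisions) / k,
        "macro_recall": sum(recalls) / k,
        "weighted_precision": sum(p * s for p, s in zip(precisions, supports)) / n,
        "weighted_recall": sum(r * s for r, s in zip(recalls, supports)) / n,
    }


# Hypothetical toy labels (not registry data): three treatment classes A, B, C.
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "C"]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Support-weighted recall is algebraically identical to accuracy, whereas macro-averaged recall generally is not, so the averaging scheme matters when comparing reported values across studies.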


Conclusion: We highlight the model’s effectiveness in accurately predicting first-line treatment choices in RA based on commonly available predictors, with age, ESR and weight as the main components. This provides insight into characteristics that are important in clinical decision-making.


REFERENCES: NIL.

Table 1. Patient characteristics

Characteristic                              Value

Age, %
  ≤ 39 years                                27.6
  40 - 49 years                             27.0
  50 - 59 years                             26.6
  ≥ 60 years                                18.8
Country, %
  India                                     82.1
  Netherlands                               2.5
  South Africa                              5.1
  England                                   3.1
  Portugal                                  2.4
  Ireland                                   1.9
  Mexico                                    1.5
  USA                                       1.3
BMI (kg/m2), mean (SD)                      26.3 (5.8)
Smoking status, %
  Current                                   5.0
  Former                                    2.4
Rheumatoid factor, %
  Positive                                  77.3
  Not performed                             21.2
ACPA, %
  Positive                                  46.5
  Not performed                             13.4
Erosions, %
  Positive                                  30.6
  Not performed                             27.6
ESR, med (IQR)                              65.0 (53.7)
VAS patient global health, med (IQR)        50.0 (25.0)
VAS patient pain, med (IQR)                 50.0 (25.0)
Tender joint count, med (IQR)               12.0 (19.0)
Swollen joint count, med (IQR)              4.0 (8.0)
DAS28-ESR, med (IQR)                        6.1 (2.1)
HAQ-DI, med (IQR)                           1.0 (0.7)
Treatment classes, %
  1. MTX + GC                               26.1
  2. MTX + HCQ + GC                         11.8
  3. HCQ + GC                               10.8
  4. MTX                                    8.7
  5. MTX + SSZ + GC                         8.0
  6. HCQ                                    8.0
  7. MTX + HCQ                              5.1
  8. Other regimens                         21.5

Table based on non-imputed data. BMI: body mass index; ACPA: anti-citrullinated protein antibodies; ESR: erythrocyte sedimentation rate; VAS: visual analogue scale; DAS28-ESR: disease activity score 28 with ESR; HAQ-DI: health assessment questionnaire disability index; MTX: methotrexate; GC: glucocorticoids; HCQ: hydroxychloroquine; SSZ: sulfasalazine.

Figure 1. Random forest. DAS28-ESR: disease activity score 28 with ESR; HAQ: health assessment questionnaire disability index; VAS: visual analogue scale; ACPA: anti-citrullinated protein antibodies; RF: rheumatoid factor; USA: United States. age_c1: ≤ 39 years; age_c2: 40 - 49 years; age_c3: 50 - 59 years; age_c4: ≥ 60 years. For ACPA, RF and erosions: 0 = negative, 1 = positive, 2 = not performed. For smoking: 0 = non-smoker, 1 = current smoker, 2 = former smoker.


Acknowledgements: NIL.


Disclosure of Interests: None declared.

© The Authors 2025. This abstract is an open access article published in Annals of Rheumatic Diseases under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/ ). Neither EULAR nor the publisher make any representation as to the accuracy of the content. The authors are solely responsible for the content in their abstract including accuracy of the facts, statements, results, conclusion, citing resources etc.


DOI: annrheumdis-2025-eular.B2685
Citation: Annals of the Rheumatic Diseases, volume 84, supplement 1, year 2025, page 2010
Session: Rheumatoid arthritis (Publication Only)