Background: Methotrexate (MTX) is currently the prevailing baseline treatment for rheumatoid arthritis (RA). Because RA is a highly heterogeneous disease, treatment response varies. An appealing hypothesis is that the factors driving this clinical heterogeneity can already be identified at baseline.
Objectives: To disentangle the clinical heterogeneity of RA patients at baseline in order to identify likely MTX failure during follow-up.
Methods: We constructed patient-specific profiles from baseline clinical measurements, split into three layers: 1) joint counts, 2) numerical hematology work-up, and 3) categorical features (binary serological markers (aCCP/RF) and the localization of joint inflammation and tenderness). We applied Z-score scaling to the numerical data and one-hot encoding to the categorical features. To identify hidden structure across these layers we used MAUI (Multi-omics Autoencoder Integration) [1], followed by Phenograph [2] to cluster patients within the extracted latent space. We examined the most discriminatory features post hoc with SHAP. We assessed MTX efficacy with Kaplan-Meier curves, using treatment switch as a proxy for failure, and calculated hazard ratios (HR) with univariate Cox regression.
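The two preprocessing steps named above can be illustrated with a minimal sketch. It assumes population standard deviation for the Z-score and alphabetical ordering of categories; the function names and toy values are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def z_score(values):
    """Scale numbers to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(labels):
    """Return (sorted categories, one 0/1 indicator row per label)."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical toy values, not study data.
esr = [10.0, 20.0, 30.0]
accp_status = ["neg", "pos", "pos"]

scaled_esr = z_score(esr)
categories, encoded = one_hot(accp_status)
```

Scaling puts numerical layers on a comparable footing before they enter the autoencoder, while one-hot encoding lets categorical features be treated as numeric inputs without implying an order.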
Results: We included 944 RA patients with baseline health record data. MAUI identified 23 latent factors from 335 baseline variables. Phenograph identified 6 RA subgroups, whose baseline characteristics are summarized in the table below.
Baseline characteristics of the different clusters
Characteristic | C1 | C2 | C3 | C4 | C5 | C6 |
N | 224 | 179 | 171 | 162 | 116 | 92 |
Sex (F) * | 131 (59) | 122 (68) | 125 (73) | 112 (69) | 71 (61) | 57 (62) |
RF * | 95 (42) | 89 (50) | 106 (62) | 107 (66) | 68 (59) | 55 (60) |
aCCP * | 82 (36.6) | 89 (50) | 110 (64) | 105 (65) | 65 (56) | 54 (59) |
DAS44(3) | 3.6 (2.7-4.2) | 2.7 (2.2-3.1) | 2.4 (1.9-2.9) | 2.1 (1.7-2.6) | 2.2 (1.7-2.6) | 2.8 (2.4-3.2) |
SJC | 15 (11-20) | 6 (3-8) | 9 (6-12) | 2 (1-5) | 4 (2-6) | 9 (6-12) |
TJC | 19 (14-27) | 9 (6-12) | 12 (9-18) | 4 (2-6) | 3 (2-6) | 11 (7-13) |
ESR (mm/hr) | 33 (14-53) | 33 (14-48) | 19 (9-35) | 28 (14-39) | 23 (11-38) | 25 (9-36) |
Age (yr) | 63 (14) | 60 (13) | 53 (16) | 59 (15) | 63 (13) | 58 (16) |
MTX prescription * | 192 (85) | 146 (81) | 138 (80) | 131 (80) | 88 (75) | 78 (85) |
Follow-up (days) | 1308 (743-2060) | 1458 (880-2567) | 1821 (982-2566) | 1590 (1022-2245) | 1566 (787-2000) | 1468 (832-2211) |
Symptom duration (days) | 124 (52-334) | 155 (46-537) | 155 (62-365) | 217 (77-775) | 186 (62-548) | 155 (62-365) |
* Binary variables are presented as n (%); continuous variables as median (Q1-Q3) or mean (SD).
Overview of the distinct RA clusters: A) 2D UMAP, B) Kaplan-Meier plot of the probability of remaining on MTX across 8.6 years (a horizon defined by the cluster with the shortest follow-up), C) SHAP plot of the most discriminatory features per cluster.
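For readers unfamiliar with the Kaplan-Meier curves shown in the figure, the following is a minimal sketch of the product-limit estimator, with treatment switch as the event and all other patients censored at their last follow-up. The function name and the toy data are hypothetical; this is not the analysis code of the study.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate as a list of (time, survival probability).

    durations: follow-up time per patient
    events:    1 if the event (here, a treatment switch) was observed,
               0 if the patient was censored
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        n_events = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        n_leaving = sum(1 for d in durations if d == t)
        if n_events > 0:
            # Multiply by the fraction of at-risk patients who did not switch at t.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Patients with an event or censoring at t leave the risk set.
        at_risk -= n_leaving
    return curve


# Hypothetical toy follow-up (days), not study data.
days = [100, 200, 200, 300, 400]
switched = [1, 1, 0, 1, 0]
curve = kaplan_meier(days, switched)
```

Censored patients contribute to the risk set until their last visit, which is why clusters with different follow-up lengths can still be compared on one plot.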
The baseline clusters (C) were each characterized by distinct joint involvement or laboratory values. C1 had low aCCP positivity (37%), a high median ESR of 33, and the most affected joints (primarily the small joints), with a swollen joint count (SJC) of 15 and a tender joint count (TJC) of 19. C2 had intermediate aCCP positivity (50%) and low joint counts (median SJC=6, TJC=9). C3 had MTP involvement, high aCCP positivity (64%), and a low ESR of 19 but relatively high joint counts (SJC=9, TJC=12). C4 had no wrist involvement, high aCCP positivity (65%), a high ESR of 28, and low joint counts (SJC=2, TJC=4). C5 had low lymphocyte numbers, a low median ESR of 23, SJC=4, and TJC=3. C6 had MCP1 involvement, was mostly aCCP-positive (59%), and had a slightly higher median ESR of 25, SJC=9, and TJC=11.
Clusters differed in MTX failure: 40%, 53%, 69%, 54%, 48% and 64% for C1 to C6, respectively (P=3.2e-2). Examining the differences between individual clusters, we observed the largest between C1 and C3 (HR 0.5, 95% CI 0.36-0.7, P=4.4e-5).
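As a reading aid, the reported hazard ratio, confidence interval, and P value can be checked for internal consistency. The sketch below assumes a Wald-type 95% CI that is symmetric on the log scale (half-width 1.96 standard errors) and a two-sided normal test; these are assumptions, since the abstract does not state how the interval was derived, and the inputs are the rounded values as reported.

```python
import math

# Values as reported for C1 versus C3 (rounded in the abstract).
hr, ci_low, ci_high = 0.5, 0.36, 0.70

# Assumption: Wald 95% CI, symmetric around log(HR) with half-width 1.96 * SE.
se_log_hr = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# Wald z-statistic for the null hypothesis HR = 1, i.e. log(HR) = 0.
z = math.log(hr) / se_log_hr

# Two-sided p-value from the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))
```

Under these assumptions the p-value comes out on the order of 1e-5, which is in line with the reported P=4.4e-5 given the rounding of the inputs.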
Conclusion: Using baseline data, we identified 6 putative novel RA subtypes that were associated with differences in MTX failure. Our study demonstrates the applicability of unsupervised deep learning and cluster analysis for elucidating hidden structure in multimodal electronic health record (EHR) data.
REFERENCES:
[1] Ronen J. doi:10.26508/lsa.201900517
[2] Levine JH. doi:10.1016/j.cell.2015.05.047
Disclosure of Interests: Tjardo Maarseveen: None declared, Marc Maurits: None declared, Thomas Huizinga: None declared, Marcel Reinders: None declared, Erik van den Akker: None declared, Rachel Knevel: Grant/research support from Pfizer.