Background: ANA-associated RMDs (ANA-RMDs) include SLE, Sjögren’s, scleroderma, myositis, and mixed and undifferentiated CTD. Although clinical and immunophenotypic features overlap, access to appropriate therapy differs significantly between ANA-RMDs. A robust, clinically meaningful reclassification of ANA-RMDs based on clinical features and biomarkers could provide more homogeneous cohorts for clinical trials.
Objectives: To develop and validate a new ANA-RMD classification by applying deep learning to clinical and multiomic biomarker data from two large, well-phenotyped ANA-RMD cohorts.
To characterise the clinical impact, immunophenotype and long-term outcomes of the resulting classes compared with legacy diagnoses.
Methods: For discovery, we trained a variational autoencoder within the European PRECISESADS cohort of 1257 ANA-RMD patients with available data. All analyses were carried out in R using the Keras package with a TensorFlow backend. Twenty-five input covariates were selected via a systematic survey of ANA-RMD specialists. Data were compressed to an 8-neuron latent space and then analysed with multiple clustering techniques. For validation, k-means centroids trained in PRECISESADS were exported to the DEFINITION dataset, and further clustering was carried out in 219 patients. Cluster durability was assessed by comparison of entropy and elbow plots and the relative cluster stability index. Gene expression data were analysed using heatmaps and summary statistics. Clinical impact was analysed cross-sectionally and longitudinally in DEFINITION using descriptive statistics, including PROs (e.g. SF36, ICECAP-A, EQ5D-5L, VAS & FACIT-Fatigue), physician assessments (e.g. BILAG-2004, SLEDAI, ESSDAI & PGA) and gene expression scores. Five-year follow-up outcomes included hospitalisation for any reason, emergency department attendances and outpatient appointment attendances. Kaplan-Meier and Sankey plots were generated using the survival and flipPlots R packages.
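The validation step, in which centroids learned in one cohort are applied to another, can be illustrated with a minimal sketch. It is written in Python with NumPy for brevity, whereas the analysis itself was carried out in R. The function name `assign_to_centroids`, the synthetic centroids and the simulated latent vectors are all hypothetical stand-ins and are not taken from the study. The sketch only shows the general idea of nearest-centroid assignment in an 8-dimensional latent space.

```python
import numpy as np


def assign_to_centroids(latent, centroids):
    """Return, for each row of `latent`, the index of the nearest centroid.

    latent:    array of shape (n_patients, n_latent_dims)
    centroids: array of shape (n_clusters, n_latent_dims)
    Distance is Euclidean, as in standard k-means assignment.
    """
    # Pairwise differences, shape (n_patients, n_clusters, n_latent_dims)
    diffs = latent[:, None, :] - centroids[None, :, :]
    # Squared Euclidean distance to every centroid
    sq_dist = np.sum(diffs ** 2, axis=2)
    return np.argmin(sq_dist, axis=1)


# Hypothetical data: 5 well-separated centroids in an 8-dimensional latent
# space, standing in for centroids fitted in a discovery cohort.
centroids = 10.0 * np.eye(5, 8)

# Simulated latent vectors for 219 validation patients, each drawn close to
# one of the centroids (an assumption made purely for illustration).
rng = np.random.default_rng(0)
true_labels = rng.integers(0, 5, size=219)
validation_latent = centroids[true_labels] + rng.normal(scale=0.1, size=(219, 8))

# Assign each validation patient to the nearest discovery centroid.
labels = assign_to_centroids(validation_latent, centroids)
```

In practice the validation latent vectors would come from passing the second cohort through the trained encoder; here they are simulated so that the example is self-contained.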
Results: Machine learning revealed five primary “vanguard” ANA-RMD classes; further subdivision of one of these gave a total of seven “precision” classes. Each class encompassed patients from various legacy diagnoses and no legacy diagnosis mapped to a single new class. The precision classes were: (i) glandular Sjogrenoid, mostly patients with a legacy diagnosis of pSS, SLE or UCTD; (ii) pain symptom dominant, with patient-reported pain and variable disease activity but no synovitis; (iii) active refractory synovitis; (iv) chronic sclerotic, mostly patients with a legacy diagnosis of scleroderma, myositis, SLE or UCTD; (v) low disease activity erythroplasmoid, which was distinctive at the gene expression level but had low clinical disease activity; (vi) myeloinflammatory, with the highest steroid use, healthcare impact and rate of nephritis but low plasmablast numbers; although this group had the worst disease impact, it contained substantial numbers of previously undifferentiated patients; (vii) high clinical and interferon activity (Figure 1). Five years of healthcare data revealed a significant difference in hospital admission rates (p=0.01) and a substantive difference in emergency attendance (p=0.1) between the vanguard classes, but not between legacy diagnoses (p=0.42 and p=0.14 respectively) (Table 1).
Conclusion: Using advanced deep learning, we developed and validated a robust new classification of ANA-RMDs. Our findings showed that (i) more of the ANA-RMD spectrum could be assigned a classification than with legacy diagnoses; (ii) immunophenotypic and clinical features were more homogeneous within these classes than within legacy diagnoses, suggesting suitability for the same therapies and outcome measures; (iii) these classes predicted long-term outcomes and healthcare utilisation better than legacy diagnoses. Clinical trials within these populations may produce larger effect sizes and provide evidence applicable to larger patient populations, thereby reducing healthcare inequality.
Figure 1. Sankey plot showing deep learning-derived cluster membership within vanguard and precision clusters.
Acknowledgements: NIL.
Disclosure of Interests: Jack Arnold: Paid speaker for Alumis, Guillermo Barturen: None declared, Samuel Relton: None declared, Lucy Marie Carter: Alumis, UCB, Md Yuzaiful Md Yusof: Alumis, Novartis, Roche, UCB, Aurinia, Zoe Wigston: None declared, Daniel Toro-Domínguez: None declared, Marta Alarcon-Riquelme: None declared, Edward M. Vital: UCB, Otsuka, AstraZeneca, UCB, Abbvie, Merck, Lilly, Pfizer, Otsuka, Novartis, AstraZeneca and Sandoz.