EULAR Abstract Archive

POS1085 (2025)

VALIDATION OF HUMAN PHENOTYPE ONTOLOGY AND DEVELOPMENT OF AN ARTIFICIAL-INTELLIGENCE BASED DIAGNOSTIC TOOL FOR AUTOINFLAMMATORY DISEASES USING THE EUROFEVER REGISTRY

Keywords: Real-world evidence; Telemedicine, digital health, and measuring health; Epitranscriptomics, epigenetics, and genetics

C. Matucci-Cerinic⁴, G. Fiorito¹, M. Vergani¹, G. Cavalca¹, R. Caorsi⁴, M. E. van Gijn², L. Pape², G. Boursier³, P. Uva¹, M. Gattorno⁴

¹IRCCS Istituto Giannina Gaslini, Clinical Bioinformatics, Genoa, Italy
²University of Groningen, Department of Genetics, Groningen, Netherlands
³University of Montpellier, Institute for Regenerative Medicine and Biotherapy, Montpellier, France
⁴IRCCS Istituto Giannina Gaslini, Rheumatology and Autoinflammatory Diseases, Genoa, Italy

Background: Accurate classification is crucial for the early diagnosis and therapy optimization of systemic autoinflammatory diseases (SAIDs), but their overlapping and complex phenotypes make classification challenging. The Human Phenotype Ontology (HPO) project provides a standardized terminology for describing the phenotypic features of genetic diseases. In 2022 the autoinflammatory diseases section was revised and updated, but the accuracy of the new terms has not yet been validated in real patients. Through its associated prediction tool, Phenomizer, HPO helps clinicians prioritize diagnoses by ranking diseases according to similarity scores between HPO terms and the patient’s symptoms. However, the tool’s accuracy has not yet been validated in real-world datasets.
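To make the idea of similarity-based ranking concrete, here is a minimal sketch. It is not Phenomizer's scoring method: it swaps in plain Jaccard overlap between term sets as a stand-in, and the disease names, term labels, and profiles are hypothetical placeholders rather than real HPO annotations.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared terms divided by all distinct terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_diseases(patient_terms: set[str],
                  disease_profiles: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Return (disease, score) pairs sorted from most to least similar."""
    scores = [(name, jaccard(patient_terms, terms))
              for name, terms in disease_profiles.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical profiles with placeholder labels (NOT real HPO annotations).
TOY_PROFILES = {
    "disease_A": {"term_1", "term_2", "term_3"},
    "disease_B": {"term_2", "term_4"},
    "disease_C": {"term_5"},
}

ranking = rank_diseases({"term_1", "term_2"}, TOY_PROFILES)
for disease, score in ranking:
    print(f"{disease}: {score:.2f}")
```

A set-overlap score ignores how informative each term is, which is one reason a real tool would likely weight terms rather than count them equally.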

Objectives: i) to evaluate the diagnostic accuracy of HPO terms in a cohort of real patients; ii) to compare the accuracy of Phenomizer with that of different machine-learning algorithms on a real-life patient dataset; iii) to develop a novel diagnostic tool for SAIDs.

Methods: Our dataset included 2,866 patients from the Eurofever Registry, diagnosed with Familial Mediterranean Fever (FMF), Cryopyrin-Associated Periodic Syndrome (CAPS), Mevalonate Kinase Deficiency (MKD), Periodic Fever, Aphthous Stomatitis, Pharyngitis, Adenitis (PFAPA), or Tumor Necrosis Factor Receptor-Associated Periodic Syndrome (TRAPS). Eurofever clinical variables were codified with HPO terms, and missing terms were retained. The patient dataset was split into training (n=2,005) and test (n=861) sets. Four machine-learning classifiers were evaluated: Elastic Net regression (EN), k-Nearest Neighbors (kNN), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost), and their performance was compared with that of Phenomizer.
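The workflow described above (encode each patient as HPO terms, hold out a test set, fit a classifier, score it) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: it uses a plain random shuffle for the split (the reported sizes correspond to roughly 70/30), a hand-written kNN with Hamming distance as the only classifier, and made-up placeholder terms and class labels.

```python
import random
from collections import Counter


def encode(patient_terms, vocabulary):
    """Binary vector: 1 if the patient has the term, else 0."""
    return [1 if term in patient_terms else 0 for term in vocabulary]


def split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and cut it into (train, test)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def hamming(u, v):
    """Number of positions at which two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def knn_predict(train, vector, k=3):
    """Majority label among the k training vectors closest to `vector`."""
    nearest = sorted(train, key=lambda rec: hamming(rec[0], vector))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of test records whose predicted label matches the true one."""
    hits = sum(knn_predict(train, vec, k) == label for vec, label in test)
    return hits / len(test)


# Hypothetical toy cohort: placeholder terms and labels, not registry data.
VOCAB = ["term_1", "term_2", "term_3", "term_4"]
records = (
    [(encode({"term_1", "term_2"}, VOCAB), "class_A") for _ in range(10)]
    + [(encode({"term_3", "term_4"}, VOCAB), "class_B") for _ in range(10)]
)

train, test = split(records)
print(len(train), len(test), accuracy(train, test))
```

The toy classes are perfectly separable, so the score here says nothing about real performance; the point is only the shape of the evaluation loop.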

Results: 224 Eurofever variables (215 clinical, 5 laboratory, 4 demographic) were codified into HPO terms. Of these, 195 had full HPO correspondence, 12 partial, and 17 no correspondence. XGBoost emerged as the best-performing algorithm in assigning the correct diagnosis to the analyzed patients, achieving an average accuracy of 0.80 and significantly outperforming Phenomizer, even when Phenomizer was trained on Eurofever HPO term frequencies. The addition of the terms “fever duration” and “ethnicity” (present in Eurofever but absent in HPO) improved the algorithm’s accuracy, highlighting the need for new HPO codes. Additionally, the number of HPO terms per patient showed an inverted U-shaped association with classification accuracy: both too few terms (low characterization) and too many terms (low specificity) reduced accuracy, underscoring the importance of clinicians carefully selecting HPO terms to optimize classification. Finally, based on the best-performing algorithm, we developed a user-friendly web app in which clinicians can input HPO terms to receive the probability of each SAID diagnosis (among those used to train the model).

Conclusion: The HPO database should be updated to include Eurofever patients’ term frequencies. The developed web app correctly identifies the two most probable SAIDs in over 85% of cases, offering a valuable tool for early diagnosis. Further updates will refine the model as additional data from underrepresented diseases become available.
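A claim of the form "the correct diagnosis is among the two most probable" is a top-2 accuracy. The sketch below shows one way such a metric could be computed from per-patient probability outputs; the function name and the probability values are illustrative assumptions, not output from the authors' model.

```python
def top_k_accuracy(probabilities, true_labels, k=2):
    """Fraction of cases whose true label is among the k highest-probability labels."""
    hits = 0
    for probs, truth in zip(probabilities, true_labels):
        # Sort label names by their predicted probability, highest first.
        top_k = sorted(probs, key=probs.get, reverse=True)[:k]
        hits += truth in top_k
    return hits / len(true_labels)


# Made-up predictions for four hypothetical patients (NOT model output).
TOY_PREDICTIONS = [
    {"FMF": 0.6, "CAPS": 0.3, "MKD": 0.1},
    {"FMF": 0.5, "CAPS": 0.2, "MKD": 0.3},
    {"FMF": 0.7, "CAPS": 0.2, "MKD": 0.1},
    {"FMF": 0.1, "CAPS": 0.2, "MKD": 0.7},
]
TOY_TRUTH = ["FMF", "MKD", "MKD", "MKD"]

print(top_k_accuracy(TOY_PREDICTIONS, TOY_TRUTH, k=1))
print(top_k_accuracy(TOY_PREDICTIONS, TOY_TRUTH, k=2))
```

Top-2 accuracy is always at least as high as top-1 accuracy on the same predictions, which is worth keeping in mind when comparing the two figures.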

REFERENCES: NIL.

Acknowledgements: NIL.

Disclosure of Interests: None declared.

© The Authors 2025. This abstract is an open access article published in Annals of Rheumatic Diseases under the CC BY-NC-ND license ( http://creativecommons.org/licenses/by-nc-nd/4.0/ ). Neither EULAR nor the publisher make any representation as to the accuracy of the content. The authors are solely responsible for the content in their abstract including accuracy of the facts, statements, results, conclusion, citing resources etc.

DOI: annrheumdis-2025-eular.B3708

Citation: Annals of the Rheumatic Diseases, volume 84, supplement 1, year 2025, page 1177

Session: Poster View VII (Poster View)
