EULAR Abstract Archive

POS0182 (2026)

CLOSING THE GAP BETWEEN NARRATIVE AND STRUCTURE: A WORKFLOW-INTEGRATED CLINICAL DOCUMENTATION IMPROVEMENT TOOL FOR ENHANCING SNOMED CT-CODED PROBLEM LISTS IN RHEUMATOLOGY PRACTICE.

Keywords: Interdisciplinary research, Artificial Intelligence, Public health

I. Hoffman¹,², I. De Laet³, J. Clukers⁴, D. Ioannidou², A. Omar Khader⁵, L. Rutsaert⁶, G. Baldini², M. Helbert⁷

¹ZAS, Ziekenhuis Aan de Stroom, Rheumatology, Antwerp, Belgium
²HealthSage AI, Utrecht, Netherlands
³ZAS, Ziekenhuis Aan de Stroom, Intensive Care, CMIO, Antwerp, Belgium
⁴ZAS, Ziekenhuis Aan de Stroom, Pulmonology, Antwerp, Belgium
⁵ZAS, Ziekenhuis Aan de Stroom, Cardiology, Antwerp, Belgium
⁶ZAS, Ziekenhuis Aan de Stroom, Haematology, Antwerp, Belgium
⁷ZAS, Ziekenhuis Aan de Stroom, Nephrology, CMIO, Antwerp, Belgium

Background: The European Health Data Space (EHDS) regulation mandates the cross-border sharing of specific health data across Europe by 2029. One of the first data elements that will need to be shared is a structured patient problem list, a list of relevant diagnoses and medical history items, coded using SNOMED CT. However, according to a survey performed by the Belgian CMIO network (unpublished data), more than two out of three Belgian hospitals are not ready or getting ready for this requirement by implementing a SNOMED CT-coded problem list in routine clinical practice. Artificial intelligence (AI) could help clinicians create and maintain such problem lists.

Objectives: To accomplish this, we aimed to develop a user-friendly tool, integrated into the clinician’s workflow and tailored to clinical needs. Using natural language processing, the system identifies diagnoses and procedures in free-text letter conclusions, compares them with the existing problem list, assigns appropriate SNOMED CT codes, and proposes updates for clinician validation. Validated items are then automatically added to, replaced in, or refined within the existing structured problem list.
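The final step of this workflow (applying only clinician-validated proposals to the structured problem list) can be illustrated with a minimal sketch. It is not the tool's implementation: the names `Problem`, `Proposal` and `apply_validated` are hypothetical, the codes are placeholders rather than real SNOMED CT identifiers, and "replace" and "refine" are assumed to behave identically (swap one existing entry for a new one).

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Problem:
    code: str   # placeholder identifier, NOT a real SNOMED CT code
    label: str


@dataclass(frozen=True)
class Proposal:
    action: str            # "add", "replace" or "refine" (assumed action names)
    new: Problem           # entry proposed by the model
    target_code: str = ""  # existing entry to swap out (replace/refine only)


def apply_validated(problem_list: List[Problem],
                    proposals: List[Proposal],
                    approved: Set[int]) -> List[Problem]:
    """Return a new problem list with only the clinician-approved proposals applied.

    `approved` holds the indices of the proposals the clinician accepted;
    everything else is ignored, so nothing enters the list without validation.
    """
    result = list(problem_list)
    for index, proposal in enumerate(proposals):
        if index not in approved:
            continue
        if proposal.action == "add":
            # Skip duplicates: only add codes not already on the list.
            if all(p.code != proposal.new.code for p in result):
                result.append(proposal.new)
        elif proposal.action in ("replace", "refine"):
            # Swap the targeted entry in place, keeping the list order.
            result = [proposal.new if p.code == proposal.target_code else p
                      for p in result]
        else:
            raise ValueError(f"unknown action: {proposal.action}")
    return result


# Example with placeholder codes: refine one entry, add one, reject one.
existing = [Problem("X-001", "arthritis"), Problem("X-002", "hypertension")]
proposed = [
    Proposal("refine", Problem("X-011", "rheumatoid arthritis"), "X-001"),
    Proposal("add", Problem("X-003", "osteoporosis")),
    Proposal("add", Problem("X-004", "gout")),
]
updated = apply_validated(existing, proposed, approved={0, 1})
```

In this example the third proposal is not approved, so `updated` holds three entries and the rejected item never reaches the list.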

Methods: The project comprised a large language model (LLM) training phase based on manual annotations, and a proof-of-concept (PoC) evaluating model performance. During the annotation phase, around 1000 real-world (anonymized) letters from six specialties (rheumatology, nephrology, haematology, pulmonology, intensive care and cardiology) were annotated by clinicians experienced in SNOMED CT coding of the respective specialties. The LLM was trained on these data. Model performance was first assessed in a limited PoC involving 50 cases from two specialties. This was followed by an extensive PoC using 363 newly collected letters from all six specialties to further evaluate tool performance. During the PoC phase, clinicians entered the anonymized free-text conclusion of a clinical letter, together with the corresponding problem list, into the application. The model then generated a proposed list of detected problems, which clinicians reviewed by rejecting incorrectly identified items and adding any missing problems. When rejecting items, the clinician could provide a predefined reason (‘hallucination’, ‘error’ or ‘unacceptable granularity in SNOMED CT coding’). In addition, clinicians rated the appropriateness of the SNOMED CT codes suggested by the tool as ‘incorrect’, ‘partially correct’ or ‘correct’.

The following performance metrics were assessed during the PoC:

Completeness (Recall): The percentage of relevant concepts mentioned in the letter that the tool correctly identified, according to a clinician. Any problems the clinician found the tool had missed reduce completeness.
Accuracy (Precision): The percentage of concepts identified by the tool that were actually present in the letter, as assessed by a clinician. Any items the clinician removed from the suggested list reduce accuracy. Clinicians could reject suggested items as ‘wrong’ due to hallucination, context error, incorrectly identified concepts, or unacceptable granularity in SNOMED CT coding.
SNOMED CT Mapping: Clinician-assessed accuracy of the SNOMED CT codes suggested by the tool. Only items retained after clinician review and items removed due to ‘unacceptable granularity in SNOMED CT coding’ are included in this metric.
Semantic accuracy: The percentage of concepts identified by the tool that were actually present in the letter AND were assigned a ‘correct’ or ‘partially correct’ SNOMED CT code, as assessed by a clinician.
Speed: Time in seconds required by the tool to generate a suggested list of problems.
Problem list increase: The relative increase in the number of problems in the list after using the tool. If, for example, the pre-existing problem list contains 3 problems, and the tool suggests 2 new problems that are validated as correct by the clinician, the increase would be approximately 67%.
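The count-based metrics above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the evaluation code used in the study: the names `Suggestion`, `poc_metrics` and `problem_list_increase` are hypothetical, retained items are treated as true positives, clinician-added items as missed concepts, and the SNOMED CT mapping score is assumed to be the share of items rated ‘correct’ within the pool described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Suggestion:
    kept: bool                           # clinician retained the suggested item
    reject_reason: Optional[str] = None  # "hallucination", "error" or "granularity"
    code_rating: Optional[str] = None    # "correct", "partially correct" or "incorrect"


def poc_metrics(suggestions: List[Suggestion], n_missed: int) -> Dict[str, float]:
    """Compute the four ratio metrics for one batch of reviewed suggestions.

    `n_missed` is the number of problems the clinician had to add by hand.
    Assumes at least one suggestion and at least one item in the mapping pool.
    """
    kept = [s for s in suggestions if s.kept]
    # Mapping pool: retained items plus items rejected only for coding granularity.
    pool = kept + [s for s in suggestions
                   if not s.kept and s.reject_reason == "granularity"]
    # Semantic accuracy: concept retained AND code at least partially correct.
    semantic_ok = [s for s in kept
                   if s.code_rating in ("correct", "partially correct")]
    return {
        "completeness": len(kept) / (len(kept) + n_missed),
        "accuracy": len(kept) / len(suggestions),
        "snomed_mapping": sum(s.code_rating == "correct" for s in pool) / len(pool),
        "semantic_accuracy": len(semantic_ok) / len(suggestions),
    }


def problem_list_increase(n_existing: int, n_new_validated: int) -> float:
    """Relative growth of the problem list after clinician validation."""
    if n_existing <= 0:
        raise ValueError("relative increase is undefined for an empty problem list")
    return n_new_validated / n_existing


# The worked example from the text: 3 existing problems, 2 validated additions.
increase = problem_list_increase(3, 2)  # 2/3, i.e. roughly 67%
```

Whether ‘partially correct’ codes count towards the mapping score is not stated in the definitions; changing the comparison in `snomed_mapping` is enough to explore the alternative.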

Results: Across all specialties, the tool achieved a completeness of 90%, an accuracy of 93%, a correct SNOMED CT mapping of 92%, and a semantic accuracy of 88%. For rheumatology, a total of 50 letters were evaluated during the PoC phase, resulting in a completeness of 94%, an accuracy of 93%, a SNOMED CT mapping quality of 93%, and a semantic accuracy of 90%. On average, the tool required approximately 10 seconds to generate a list of suggested problems for rheumatology cases. Finally, use of the tool increased the problem list length by 24% for rheumatology patients.

Conclusions: We developed an AI-driven tool, trained on an extensive real-world dataset, to assist clinicians in creating and maintaining a SNOMED CT–coded problem list. During the composition of the clinical letter, the tool identifies in real time any new diagnoses not yet included in the problem list and assigns the appropriate SNOMED CT code. Clinicians can validate the suggested items, which are then automatically added to, replaced in, or refined within the existing structured problem list.

REFERENCES: NIL.

Acknowledgments: NIL.

Disclosure of Interests: Ilse Hoffman: None declared, Employee of HealthSage AI; Inneke De Laet: None declared; Johan Clukers: None declared; Despoina Ioannidou: Employee of HealthSage AI; Aaram Omar Khader: None declared; Lynn Rutsaert: None declared; Giulia Baldini: Employee of HealthSage AI; Mark Helbert: None declared.

DOI: annrheumdis-2026-eular.B.1830

Citation: , volume 85, supplement 1, year 2026, page s451

Session: Clinical Poster Tours: Real-world evidence and disease-specific trajectories in RMDs (Poster Tours)
