Background: Triaging of newly referred patients at rheumatology outpatient clinics aims to prioritize patients who are most in need of urgent care, such as patients with rheumatoid arthritis (RA). Referral letters are frequently reviewed manually. With the recent surge of natural language processing (NLP) methods, the opportunity arises to automatically identify patients at high risk of RA based on the general practitioner's (GP) description.
Objectives: Our goal is to develop a predictive model for RA by leveraging NLP and machine learning on GP referral letters.
Methods: We acquired GP referral letters from patients who visited the rheumatology outpatient clinic at Reumazorg ZWN between 2015 and 2022. Among the 8,613 patients with a referral letter, 638 were classified as RA, defined by an RA ICD code, at least 3 months of follow-up, and a prescription for a conventional DMARD. The contents of the letters were processed with a term frequency-inverse document frequency (TF-IDF) transformation and randomly split into a development and a test set. We trained an eXtreme Gradient Boosting (XGB) [1] classifier with 5-fold cross-validation and a nested hyperparameter tuning procedure on 80% of the data. The performance of our classifier was evaluated with the area under the receiver operating characteristic curve (AUC-ROC) and the precision-recall curve (AUC-PRC) on the 20% hold-out set. We used Shapley Additive Explanations (SHAP) to identify the key words and their impact.
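The TF-IDF transformation above can be sketched in a few lines; this is a minimal pure-Python illustration of the weighting scheme, not the study's actual implementation (which is unspecified; library implementations such as scikit-learn's additionally smooth and normalize the weights). The example "letters" and their tokens are hypothetical.

```python
import math
from collections import Counter

def tfidf(corpus):
    """Compute TF-IDF weights for a tokenized corpus.

    TF = raw term count within a document; IDF = log(N / df),
    where df is the number of documents containing the term.
    Terms occurring in every document receive weight 0.
    """
    n_docs = len(corpus)
    df = Counter()
    for doc in corpus:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in corpus:
        tf = Counter(doc)
        weights.append({term: count * math.log(n_docs / df[term])
                        for term, count in tf.items()})
    return weights

# Toy corpus of two tokenized "referral letters" (hypothetical tokens)
letters = [["pain", "wrist", "swelling"],
           ["pain", "back"]]
w = tfidf(letters)
# "pain" appears in both letters, so its IDF is log(2/2) = 0,
# while letter-specific terms like "wrist" keep a positive weight.
```

The resulting per-letter weight dictionaries correspond to the sparse feature vectors fed to the classifier: terms common to most letters are down-weighted, and terms distinctive for a letter are emphasized.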
Results: The patients selected with our predefined criteria constituted a typical RA group of seropositive, middle-aged, female patients (Table 1). Our RA classifier achieved an AUC-ROC of 0.77 (CI: 0.74-0.81) and an AUC-PRC of 0.29 (CI: 0.23-0.37) in the hold-out test set. According to SHAP, the most important predictors were the mention of fibromyalgia, RA, or the ICPC code for rheumatic musculoskeletal complaints (L88). Notably, the location of pain was also informative: hand, wrist, and metacarpophalangeal (MCP) joints were positively associated with RA, whereas back pain was negatively associated.
Table 1. Baseline characteristics of RA and non-RA patients
* Binary variables are presented as n (%) and continuous variables as median (Q1-Q3) or mean (SD).
Overview of the classifier showing performance according to the receiver operating characteristic curve (a) and the precision-recall curve (b), calibration (c), and the most important predictive features according to SHAP (d). The ROC curve (a) shows the diagnostic ability of the model compared to random assignment (dashed line), whereas the PR curve (b) shows the trade-off between sensitivity and precision. A perfect classifier would have its ROC curve reaching the top-left and its PR curve reaching the top-right. The calibration plot (c) shows the agreement between the predicted probabilities of the classifier and the actual outcomes. The SHAP plot (d) shows the impact and direction of the most informative words in the referral letter per patient (= dot). If a word occurs frequently for that patient, the dot is colored pink; if it is absent, the dot is colored blue.
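The AUC-ROC discussed above has a direct probabilistic reading: it is the probability that a randomly chosen RA patient receives a higher score than a randomly chosen non-RA patient. A minimal sketch using this Mann-Whitney U equivalence, with hypothetical labels and predicted probabilities:

```python
def auc_roc(y_true, y_score):
    """AUC-ROC as the probability that a random positive outranks
    a random negative (ties count as 0.5), via the Mann-Whitney U
    equivalence. Equivalent to the area under the ROC curve."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = RA, 0 = non-RA, with hypothetical scores
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]
print(auc_roc(labels, scores))  # 0.75: 3 of 4 positive-negative pairs ranked correctly
```

A random classifier scores about 0.5 under this definition (the dashed line in panel a), and a perfect ranking scores 1.0, which is why 0.77 indicates useful but imperfect discrimination.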
Conclusion: Our XGB machine learning model can differentiate RA from non-RA patients based on GP referral letters. This demonstrates the predictive value of referral letters in the early detection of RA, which can consequently facilitate early care and reduce the workload of clinicians (e.g., by creating RA fast tracks). Our next step is to investigate whether this performance results from a correct diagnosis by the primary care physician or whether our algorithm detects other structures in the text related to RA.
REFERENCES: [1] Chen T, et al. xgboost: Extreme Gradient Boosting. R package version 0.4-2. 2015;14:1-4.
Acknowledgements: This project has received funding from the European Union via the Horizon grant for SPIDERR (activity No. 101080711) and via the EIT Health body for DigiPrevent (funded in April 2022). The study has further received funding from the ZonMw Klinische Fellow grant, No. 40-00703-97-19069, as well as the ZonMw Open Competitie, No. 09120012110075.
Disclosure of Interests: None declared.