AB1114 (2020)
HANDWORK VS MACHINE; A COMPARISON OF RHEUMATOID ARTHRITIS PATIENT POPULATIONS AS IDENTIFIED BY MACHINE-LEARNING AND CRITERIA-BASED CHART REVIEW.
T. Maarseveen1, M. Maurits1, E. Niemantsverdriet1, T. Huizinga1, A. Van der Helm - van Mil1, R. Knevel1,2
1Leiden University Medical Center (LUMC), Rheumatology, Leiden, Netherlands
2Brigham and Women’s Hospital, Rheumatology, Boston, United States of America

Background: Electronic Medical Records (EMRs) offer a wealth of observational data, which is currently often reviewed manually to identify patient cases. Machine-Learning (ML) methods are highly efficient tools for data extraction and can process the information-rich free-text physician notes in EMRs. [1]

We postulate that incorporating these data into an ML pipeline will enable the high-throughput identification of RA case cohorts qualitatively equal to those obtained by conventional EULAR/ACR criteria-based chart review.


Objectives: To investigate the comparability of rheumatoid arthritis (RA) cases identified through EMR free-text enriched machine-learning algorithms with the field’s gold standard (GS) of criteria-based chart review.


Methods: We developed a classification algorithm using a Support Vector Machine (SVM) trained on 1000 manually reviewed patient EMRs and validated on a separate 1000 records. [2]

The SVM assigns a probability of being an RA case to each patient at a speed of >5000 records per second. The probability cutoff for case identification can be tailored to studies’ needs, optimizing on sensitivity, specificity, etc. We optimized on PPV, resulting in a cutoff of 0.48 with a PPV of 0.96.
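The cutoff selection described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not state how the optimization on PPV was carried out, so the rule used here (take the lowest cutoff whose PPV reaches a target) is an assumption, and the function names, probabilities and labels are hypothetical toy values.

```python
def ppv_and_sens(probs, labels, cutoff):
    """Return (PPV, sensitivity) when a case is predicted at prob >= cutoff."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= cutoff and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= cutoff and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < cutoff and y == 1)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    sens = tp / (tp + fn) if tp + fn else float("nan")
    return ppv, sens


def lowest_cutoff_reaching(probs, labels, target_ppv):
    """Scan the observed probabilities in ascending order and return the
    first cutoff whose PPV meets the target (assumed selection rule)."""
    for cutoff in sorted(set(probs)):
        ppv, _ = ppv_and_sens(probs, labels, cutoff)
        if ppv >= target_ppv:
            return cutoff
    return None  # no cutoff reaches the target PPV


# Hypothetical toy data: model probabilities and chart-review labels (1 = RA).
probs = [0.05, 0.20, 0.35, 0.45, 0.50, 0.62, 0.80, 0.91]
labels = [0, 0, 1, 0, 1, 1, 1, 1]

cutoff = lowest_cutoff_reaching(probs, labels, target_ppv=0.95)
ppv, sens = ppv_and_sens(probs, labels, cutoff)
print(cutoff, ppv, sens)
```

Taking the lowest qualifying cutoff keeps as many true cases as possible once the PPV target is met; a study prioritizing sensitivity or specificity would swap the criterion inside the loop.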

We used the 1987 and 2010 EULAR/ACR criteria-based diagnoses of RA at one year after inclusion in a prospective arthritis cohort as GSs. We compared the baseline characteristics of ML-identified RA cases with each GS separately and with the patients fulfilling at least one set of criteria, using Pearson chi-squared and Mann-Whitney U tests (α = 0.05).
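As an illustration of the proportion comparison, the sketch below runs a Pearson chi-squared test on a 2x2 table using only the Python standard library. The counts are assumptions: they are back-calculated from the rounded CCP2 proportions and cohort sizes reported in this abstract (0.65 of 335 ML cases, 0.51 of 399 cases meeting the 1987 criteria), they ignore any missing values, and they treat the two groups as independent. They are not the authors' actual data, and the result should not be read as a reproduction of their analysis.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the
    2x2 table [[a, b], [c, d]]; returns (statistic, p_value)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Approximate counts reconstructed from rounded proportions (assumption).
ml_pos, ml_neg = 218, 117      # about 0.65 of 335 ML-identified cases
gs_pos, gs_neg = 203, 196      # about 0.51 of 399 cases meeting 1987 criteria

stat, p = chi2_2x2(ml_pos, ml_neg, gs_pos, gs_neg)
print(f"chi2 = {stat:.2f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

The medians would be compared with a Mann-Whitney U test instead, which needs the patient-level values and is therefore not sketched here.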


Results: At the 0.48 cutoff, the ML model performs well on the annotated set (accuracy 0.98) and when compared to the criteria-based GSs (accuracy 0.76 to 0.82) (Table 1).

Since patients diagnosed with RA do not necessarily meet classification criteria, it is not surprising that the ML cases and GSs do not completely overlap. The ML cases overlap to a larger degree with the 1987 GS than with the 2010 GS. Clinically, the ML-identified cases do not differ from the 2010 and 1987 GS cohorts except for a higher CCP2 positivity compared to the 1987 GS (65% vs 51%) and the combined criteria (65% vs 56%) (Tables 1 and 2).

Table 1. Performance of the ML model in predicting cases (prob. ≥ 0.48) in the annotated data (N = 1000) and in the 1987, 2010 or either criteria-based data* (N = 1235, 1218 and 1244, respectively). TP: True Positive; FP: False Positive; TN: True Negative; FN: False Negative; PPV: Positive Predictive Value; NPV: Negative Predictive Value; Spec: Specificity; Sens: Sensitivity.

                               Accuracy  PPV   NPV   Spec  Sens
Annotated Cases                0.98      0.96  0.98  1.00  0.74
Criteria Definite Cases 1987*  0.82      0.79  0.83  0.92  0.61
Criteria Definite Cases 2010*  0.78      0.81  0.77  0.93  0.55
Either Criteria Cases*         0.76      0.90  0.72  0.95  0.51
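
For reference, the five metrics in this table follow the standard confusion-matrix definitions, written here with the abbreviations from the caption. The abstract reports only the resulting values, not the underlying TP, FP, TN and FN counts.

```latex
\begin{align*}
\text{Accuracy} &= \frac{TP + TN}{TP + FP + TN + FN} \\
\text{PPV}      &= \frac{TP}{TP + FP} \\
\text{NPV}      &= \frac{TN}{TN + FN} \\
\text{Spec}     &= \frac{TN}{TN + FP} \\
\text{Sens}     &= \frac{TP}{TP + FN}
\end{align*}
```

Read this way, the high specificity (0.92 to 1.00) alongside the lower sensitivity (0.51 to 0.74) is what one would expect from a cutoff chosen to favor PPV: few false positives at the cost of more missed cases.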

Conclusion: ML algorithms processing clinician notes enable fast and efficient selection of cases that are clinically similar to cases selected by criteria-based chart review. This allows a substantial reduction of the time and effort required to construct high-quality research cohorts.

Table 2. Comparison of baseline characteristics between the ML-defined cohort and the 3 criteria-based gold standards. Not statistically tested; * significantly different from the ML-selected cases at α = 0.05 (Pearson chi-squared for proportions, Mann-Whitney U for medians).

                                                Predicted Definite Case  1987 Criteria Based Case  2010 Criteria Based Case  Either Criteria Based Case
N                                               335                      399                       609                       666
Proportion Women                                0.67                     0.63                      0.65                      0.65
Proportion CCP2 Positive                        0.65                     0.51*                     0.61                      0.56*
Proportion RF Positive                          0.67                     0.60                      0.69                      0.64
Median BMI                                      26.0                     25.5                      25.6                      25.6
Median BSE                                      28                       29                        26                        26
Median CRP                                      9.6                      11                        9                         9
Median Age at Inclusion                         57.7                     58.7                      57.2                      58.3
Median Symptom Duration at Diagnosis (in Days)  95.5                     90                        91                        89
Median Number of Swollen Joints                 5                        6                         6                         5

REFERENCES:

[1] Maarseveen et al. ARD 2019;78

[2] https://github.com/levrex/DiagnosisExtraction_ML


Acknowledgments: T. Maarseveen and M. Maurits contributed equally.


Disclosure of Interests: Tjardo Maarseveen: None declared, Marc Maurits: None declared, Ellis Niemantsverdriet: None declared, Thomas Huizinga Grant/research support from: Ablynx, Bristol-Myers Squibb, Roche, Sanofi, Consultant of: Ablynx, Bristol-Myers Squibb, Roche, Sanofi, Annette van der Helm - van Mil: None declared, Rachel Knevel: None declared


Citation: Ann Rheum Dis, volume 79, supplement 1, year 2020, page 1842
Session: Diagnostics and imaging procedures (Abstracts Accepted for Publication)