
Background: Axial spondyloarthritis (axSpA) is a chronic inflammatory disease in which structural spinal damage plays a central role in diagnosis and long-term outcome. Conventional radiography remains widely used in routine clinical practice, particularly during the evaluation of patients with chronic back pain. However, differentiating axSpA-related new bone formation from degenerative spinal changes is often challenging. Radiographic features may overlap, especially in early or intermediate stages of disease, contributing to diagnostic delay and considerable variability in image interpretation.
Objectives: To develop and externally validate an automated model for classification of spinal radiographic changes, with a specific focus on distinguishing axSpA-related structural lesions from degenerative alterations in patients presenting with chronic back pain.
Methods: This retrospective study included 827 patients undergoing diagnostic evaluation for chronic back pain and suspected axSpA, each contributing one lateral spinal radiograph. A development cohort of 677 patients from a single institution was divided at the patient level into training (n=582) and internal validation (n=95) sets. External validation was performed in an independent cohort of 150 patients from a separate institution. Radiographs of the lumbosacral (n=301), thoracic (n=283), and cervical (n=243) spine were analysed. Images were classified into four ordered categories representing a continuum from degenerative to inflammatory structural changes: normal findings, osteophytes, parasyndesmophytes, and syndesmophytes. Model performance was assessed using overall accuracy, macro-averaged F1-score, balanced accuracy, quadratic weighted kappa, and mean absolute error between predicted and reference categories. Ninety-five percent confidence intervals were estimated in the external validation cohort using non-parametric bootstrap resampling at the patient level.
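For readers who wish to reproduce this type of evaluation, the metrics and the patient-level bootstrap named above can be sketched as follows. This is a minimal illustration, not the study's code. The integer coding of the four categories (0 = normal to 3 = syndesmophytes), the percentile method for the interval, the number of bootstrap replicates, and all function names are assumptions made for the example. Because each patient contributes one radiograph, resampling radiographs is equivalent to resampling patients.

```python
import random

# Assumed ordinal coding (illustrative only):
# 0 = normal, 1 = osteophytes, 2 = parasyndesmophytes, 3 = syndesmophytes
K = 4

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def _class_counts(y_true, y_pred, c):
    # True positives, false positives, false negatives for class c.
    tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
    fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
    fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
    return tp, fp, fn

def balanced_accuracy(y_true, y_pred, k=K):
    # Mean recall over the classes that occur in the reference labels.
    recalls = []
    for c in range(k):
        tp, _, fn = _class_counts(y_true, y_pred, c)
        if tp + fn:
            recalls.append(tp / (tp + fn))
    return sum(recalls) / len(recalls)

def macro_f1(y_true, y_pred, k=K):
    # Unweighted mean of per-class F1 over classes seen in labels or predictions.
    f1s = []
    for c in range(k):
        tp, fp, fn = _class_counts(y_true, y_pred, c)
        if tp + fp + fn:
            f1s.append(2 * tp / (2 * tp + fp + fn))
    return sum(f1s) / len(f1s)

def quadratic_weighted_kappa(y_true, y_pred, k=K):
    # 1 - (observed squared disagreement) / (disagreement expected by chance).
    # The usual 1/(k-1)^2 weight normalisation cancels in the ratio.
    n = len(y_true)
    row = [sum(t == i for t in y_true) for i in range(k)]
    col = [sum(p == j for p in y_pred) for j in range(k)]
    observed = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    expected = sum((i - j) ** 2 * row[i] * col[j] / n
                   for i in range(k) for j in range(k))
    if expected == 0:
        return 1.0  # degenerate single-class case; this convention is an assumption
    return 1.0 - observed / expected

def mean_absolute_error(y_true, y_pred):
    # Average distance, in category steps, between prediction and reference.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def bootstrap_ci(metric, y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    # Percentile bootstrap: resample patients with replacement, recompute the metric.
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi

if __name__ == "__main__":
    # Toy labels for demonstration only; these are not study data.
    y_true = [0, 0, 1, 1, 2, 2, 3, 3]
    y_pred = [0, 0, 1, 2, 2, 2, 3, 2]
    for name, fn in [("accuracy", accuracy), ("macro F1", macro_f1),
                     ("balanced accuracy", balanced_accuracy),
                     ("quadratic weighted kappa", quadratic_weighted_kappa),
                     ("mean absolute error", mean_absolute_error)]:
        print(name, round(fn(y_true, y_pred), 3), bootstrap_ci(fn, y_true, y_pred))
```

Quadratic weighted kappa and mean absolute error both treat the four categories as ordered, so a prediction two or three steps away from the reference is penalised more heavily than an adjacent-category error. Accuracy and F1 treat all errors alike.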
Results: In the external validation cohort, the model achieved an overall accuracy of 0.90 (95% CI 0.85–0.95) and a macro-averaged F1-score of 0.90 (95% CI 0.84–0.95). Balanced accuracy was 0.92 (95% CI 0.86–0.96). Agreement between model predictions and reference grading was high (quadratic weighted kappa 0.91; 95% CI 0.83–0.96), with a low mean absolute error of 0.13 (95% CI 0.07–0.20). Accuracy within one adjacent structural category reached 0.98 (95% CI 0.95–1.00). Misclassifications were mainly observed between neighbouring categories, without systematic confusion between normal radiographs and advanced syndesmophytes.
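The adjacent-category accuracy and the check for confusion between the two extreme categories can be illustrated with a short sketch. It assumes the same hypothetical integer coding (0 = normal, 1 = osteophytes, 2 = parasyndesmophytes, 3 = syndesmophytes). The function names and toy labels are illustrative and are not taken from the study.

```python
def confusion_matrix(y_true, y_pred, k=4):
    # Rows: reference category; columns: predicted category.
    m = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def within_one_accuracy(y_true, y_pred):
    # Share of predictions that are exact or off by one adjacent category.
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)

def extreme_confusions(m):
    # Count of normal radiographs predicted as syndesmophytes, and the reverse.
    return m[0][-1] + m[-1][0]

if __name__ == "__main__":
    # Toy labels for demonstration only; these are not study data.
    y_true = [0, 0, 1, 1, 2, 2, 3, 3]
    y_pred = [0, 0, 1, 2, 2, 2, 3, 2]
    m = confusion_matrix(y_true, y_pred)
    for row in m:
        print(row)
    print("within-one accuracy:", within_one_accuracy(y_true, y_pred))
    print("normal vs syndesmophyte confusions:", extreme_confusions(m))
```

In such a matrix, errors confined to the cells next to the diagonal correspond to misclassification between neighbouring categories, whereas counts in the two corner cells would indicate confusion between normal radiographs and syndesmophytes.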
Conclusions: In patients with chronic back pain evaluated for suspected axSpA, this externally validated model showed robust and clinically coherent performance in distinguishing axSpA-related structural changes from degenerative spinal alterations on conventional radiographs. Such an approach may support radiographic interpretation in daily practice and contribute to more standardized assessment of structural involvement in axSpA.
REFERENCES: NIL.
Acknowledgments: NIL.
Disclosure of Interests: None declared.