Background: Systemic lupus erythematosus (SLE) is a chronic autoimmune disease that often requires glucocorticoid (GC) therapy to control disease flares [1]. Although GCs help control inflammation, they can increase healthcare costs by causing adverse events such as severe infections and by necessitating additional monitoring [2]. Previous research using observational data can shed light on the relationship between GC dosage and costs, but these data may omit important clinical details such as BMI and smoking history [3]. Failure to account for these unobserved factors can make GC dosage endogenous [4], leading to biased estimates of the causal effect of GCs on healthcare costs. Traditional instrumental variable (IV) methods attempt to address endogeneity but require strong external instruments, and valid instruments can be difficult to find. Moreover, purely linear models may fail to capture a non-linear underlying relationship between GC dosage and costs. To overcome these challenges, we propose a two-stage copula approach within a generalized additive model (GAM) [5]. This approach offers a versatile framework for addressing endogeneity without requiring an IV.
Objectives: Our primary objective was to estimate the causal effect of GC dosage on annual healthcare costs in incident SLE patients. To evaluate the robustness of our machine-learning-based copula-GAM method, we compared it against two alternative approaches: a naive ordinary least squares (OLS) regression, which ignores endogeneity and assumes linearity, and an IV GAM, which uses physician prescribing preference as a potential IV and serves as a cross-check on our method.
Methods: We used British Columbia’s administrative health databases (1996–2023) to identify an incident cohort of SLE patients. Patients were included if they were 18 years or older, had at least two distinct SLE diagnostic codes within a two-year period, and had no SLE-related codes during the prior seven years. The date of the second diagnostic code was defined as the index date for incident SLE. All patients had at least one prescription for GCs recorded in pharmacy claims. We computed each patient’s total GC dosage by multiplying the drug strength by the quantity dispensed and summing these values over the entire follow-up. We then converted this total dosage into an annual dosage by dividing by the total days of follow-up and multiplying by 365. Annual healthcare costs were calculated by summing inpatient, outpatient, and medication-related expenses, standardized to 2023 Canadian dollars. Inpatient costs were derived using resource-intensity weights that reflect the relative complexity of hospitalizations. Outpatient costs were based on physician billing data, and medication costs were based on pharmacy claims data, which included drug and dispensing fees. We handled endogeneity through a two-stage copula approach, which uses a Gaussian copula model to capture the dependence between GC dosage and the unmeasured confounders. In the first stage, GC dosage and all observed covariates were transformed using a Gaussian copula to estimate a residual that captured unmeasured confounding. In the second stage, we added this residual term as a covariate in a GAM for annual healthcare costs; the GAM accommodates non-linear relationships with age and other relevant continuous factors.
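The dose annualization and the two-stage correction described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the rank-based empirical CDF, the simulated data, and the linear second stage (used here as a simple stand-in for the GAM) are all assumptions introduced for illustration.

```python
import numpy as np
from statistics import NormalDist


def annual_dose(strengths, quantities, follow_up_days):
    """Total dispensed dose (strength x quantity), rescaled to a 365-day year."""
    total = float(np.dot(strengths, quantities))
    return total / follow_up_days * 365


def normal_scores(x):
    """Gaussian scores of a sample: Phi^{-1}(rank / (n + 1))."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1  # ranks 1..n; ties broken by position
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(r / (len(x) + 1)) for r in ranks])


def copula_residual(dose, covariates):
    """Stage 1: regress the Gaussian score of dose on those of the covariates."""
    dose_star = normal_scores(dose)
    cov_star = np.column_stack(
        [normal_scores(col) for col in np.asarray(covariates).T]
    )
    design = np.column_stack([np.ones(len(dose_star)), cov_star])
    beta, *_ = np.linalg.lstsq(design, dose_star, rcond=None)
    return dose_star - design @ beta


def second_stage(cost, dose, covariates, residual=None):
    """Stage 2 (linear stand-in for the GAM): return the dose coefficient."""
    columns = [np.ones(len(dose)), dose, covariates]
    if residual is not None:
        columns.append(residual)  # control term for unmeasured confounding
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, cost, rcond=None)
    return coef[1]


# Simulated data (hypothetical): 'severity' plays the unmeasured confounder.
rng = np.random.default_rng(0)
n = 2000
age = rng.normal(size=n)
severity = rng.normal(size=n)
dose = np.exp(0.5 * age + 0.8 * severity + rng.normal(size=n))
cost = 4.0 * dose + 10.0 * age + 20.0 * severity + rng.normal(size=n)

W = age.reshape(-1, 1)
naive = second_stage(cost, dose, W)
corrected = second_stage(cost, dose, W, copula_residual(dose, W))
print(f"naive: {naive:.2f}, copula-corrected: {corrected:.2f}")
```

In a fuller analysis, the second stage would be fitted with a GAM library so that smooth terms replace the linear covariate effects. Because the first-stage residual is itself estimated, standard errors would typically be obtained by bootstrapping both stages together.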
Results: We identified 10,746 incident SLE patients who had received at least one GC prescription. A naive OLS model, which did not account for endogeneity, indicated that each additional unit of GC dosage was associated with an increase of 8.94 Canadian dollars in annual healthcare costs (standard error 1.51, p<0.001). The naive OLS estimate is likely biased upward by unmeasured confounding. For instance, individuals with more severe disease (unmeasured in the administrative database) tend to have both higher GC dosage and greater healthcare costs, leading to a spurious positive association between GC dosage and healthcare costs. An IV GAM using physician prescribing preference, which can correct this upward bias, produced a substantially lower estimate of 2.47 dollars, but with a large standard error of 13.11 (p=0.85) because the IV was weak. In contrast, our copula-GAM correction approach yielded a more moderate effect of 4.40 dollars with a much smaller standard error of 2.31 (p=0.058), suggesting that our proposed copula approach corrects confounding bias without being affected by the weak IV problem.
Conclusion: This study demonstrates the importance of addressing endogeneity in observational research on SLE and provides a novel, flexible strategy that does not require an IV. By embedding a two-stage copula correction into a GAM, we isolated the impact of GC dosage on healthcare costs while accounting for bias from unmeasured confounding and accommodating non-linear patterns in complex administrative data. This method is valuable in other real-world contexts where unobserved confounding is present and valid IVs are lacking, helping researchers draw causal inferences from observational data.
REFERENCES: [1] Basta F, Fasola F, Triantafyllias K, Schwarting A. Systemic Lupus Erythematosus (SLE) Therapy: The Old and the New. Rheumatol Ther. 2020 Sep 1;7(3):433–46.
[2] Chen SY, Choi CB, Li Q, Yeh WS, Lee YC, Kao AH, et al. Glucocorticoid Use in Patients With Systemic Lupus Erythematosus: Association Between Dose and Health Care Utilization and Costs. Arthritis Care Res. 2015 Aug;67(8):1086–94.
[3] Rice JB, White AG, Johnson M, Wagh A, Qin Y, Bartels-Peculis L, et al. Healthcare resource use and cost associated with varying dosages of extended corticosteroid exposure in a US population. J Med Econ. 2018 Sep;21(9):846–52.
[4] Papies D, Ebbes P, van Heerde H. Addressing Endogeneity in Marketing Models. In 2017. p. 581–627.
[5] Hastie T, Tibshirani R. Generalized Additive Models. Stat Sci. 1986 Aug;1(3):297–310.
Acknowledgements: NIL.
Disclosure of Interests: None declared.
© The Authors 2025. This abstract is an open access article published in Annals of Rheumatic Diseases under the CC BY-NC-ND license.