Publication record


Publication date

November 2025

Journal

Clinical and Translational Science

Authors

Identified members of Cancéropôle Est:
Pr QUANTIN Catherine


All authors:
Duong CH, Escolano S, Demailly R, Thiebaut A, Cottenet J, Quantin C, Tubert-Bitter P, Ahmed I

Abstract

With the growing availability of large healthcare databases for clinical science, mitigating unmeasured confounding has emerged as a major issue in pharmacoepidemiologic studies. Extensions of causal inference methods to high-dimensional settings could help address this problem, but studies comparing their performance in real-world databases are still lacking. This study aims to compare the ability of three causal inference methods to reduce measured and indirectly measured confounding. Each method was adapted to a real-world high-dimensional database using a machine learning LASSO algorithm: G-computation (GC), Targeted Maximum Likelihood Estimation (TMLE), and Propensity Score with overlap weighting or stabilized inverse probability of treatment weighting. This large-scale empirical study was based on the French National Healthcare Claims Database (SNDS), consisting of 2,172,702 pregnancies ≥ 22 weeks of gestation over the period 2011-2014. We used a set of 42 negative and 13 positive reference drugs related to prematurity risk. For each reference drug, the logarithm of the odds ratio for prematurity and its 95% confidence interval were estimated using each method. The proportions of false positive and true positive associations were calculated and compared between the methods. All methods yielded fewer false positives than a crude model adjusted for a minimal set of covariates. TMLE produced the lowest proportion of false positives (45.2%), followed by GC (47.6%). GC yielded the highest proportion of true positives (92.3%). Our results confirm the value of causal inference methods that exploit the wealth of data in healthcare databases, especially GC in terms of performance and ease of implementation.
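As an illustration of the evaluation step described in the abstract, the sketch below computes the proportions of false positive and true positive associations from per-drug log odds ratio estimates. It is not the authors' code. The `Estimate` class, the function names, and in particular the signal rule (lower bound of the 95% confidence interval of the log odds ratio above 0) are assumptions made for this example; the abstract does not state the exact decision rule.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Estimate:
    """Hypothetical container for one reference drug's estimate."""
    drug: str
    log_or: float               # estimated log odds ratio for prematurity
    ci_low: float               # lower bound of the 95% CI (log scale)
    ci_high: float              # upper bound of the 95% CI (log scale)
    is_positive_control: bool   # True for positive, False for negative reference drugs


def is_signal(e: Estimate) -> bool:
    # ASSUMPTION: an association is flagged when the whole 95% CI of the
    # log odds ratio lies above 0 (i.e. an increased risk is detected).
    return e.ci_low > 0.0


def signal_proportions(estimates: Iterable[Estimate]) -> Tuple[float, float]:
    """Return (false positive proportion, true positive proportion)."""
    estimates = list(estimates)
    negatives = [e for e in estimates if not e.is_positive_control]
    positives = [e for e in estimates if e.is_positive_control]
    if not negatives or not positives:
        raise ValueError("need at least one negative and one positive reference drug")
    # False positives: flagged drugs among the negative reference set.
    fp = sum(is_signal(e) for e in negatives) / len(negatives)
    # True positives: flagged drugs among the positive reference set.
    tp = sum(is_signal(e) for e in positives) / len(positives)
    return fp, tp
```

With 42 negative and 13 positive reference drugs, the reported percentages are arithmetically consistent with 19/42 (45.2%) and 20/42 (47.6%) flagged negative controls, and 12/13 (92.3%) flagged positive controls.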

Keywords

g‐computation, machine learning, preterm delivery, propensity score, targeted learning, unmeasured confounders

Reference

Clin Transl Sci. 2025 Nov;18(11):e70394