1. Post-selection inference for causal effects after causal discovery
- Author
-
Chang, Ting-Hsuan, Guo, Zijian, and Malinsky, Daniel
- Subjects
Statistics - Methodology - Abstract
Algorithms for constraint-based causal discovery select graphical causal models among a space of possible candidates (e.g., all directed acyclic graphs) by executing a sequence of conditional independence tests. These may be used to inform the estimation of causal effects (e.g., average treatment effects) when there is uncertainty about which covariates ought to be adjusted for, or which variables act as confounders versus mediators. However, naively using the data twice, for model selection and estimation, would lead to invalid confidence intervals. Moreover, if the selected graph is incorrect, the inferential claims may apply to a selected functional that is distinct from the actual causal effect. We propose an approach to post-selection inference that is based on a resampling and screening procedure, which essentially performs causal discovery multiple times with randomly varying intermediate test statistics. Then, an estimate of the target causal effect and corresponding confidence sets are constructed from a union of individual graph-based estimates and intervals. We show that this construction has asymptotically correct coverage for the true causal effect parameter. Importantly, the guarantee holds for a fixed population-level effect, not a data-dependent or selection-dependent quantity. Most of our exposition focuses on the PC-algorithm for learning directed acyclic graphs and the multivariate Gaussian case for simplicity, but the approach is general and modular, so it may be used with other conditional independence based discovery algorithms and distributional families.
- Published
- 2024