This paper studies the robustness of estimated policy effects to changes in the distribution of covariates. Robustness to covariate shifts is important, for example, when evaluating the external validity of (quasi-)experimental results, which are often used as a benchmark for evidence-based policy-making. I propose a novel scalar robustness metric. This metric measures the magnitude of the smallest covariate shift needed to invalidate a claim about the policy effect (for example, \( ATE \geq 0 \)) supported by the (quasi-)experimental evidence. My metric links the heterogeneity of policy effects to robustness in a flexible, nonparametric way that requires no functional form assumptions. I cast the estimation of the robustness metric as a de-biased GMM problem. This approach guarantees a parametric convergence rate for the robustness metric while allowing for machine learning-based estimators of policy effect heterogeneity (for example, lasso, random forest, boosting, neural nets). I apply my procedure to the Oregon Health Insurance Experiment. I study the robustness of the estimated policy effects on health-care utilization and financial strain outcomes to a shift in the distribution of context-specific covariates. Such covariates are likely to differ across US states, making the quantification of robustness an important exercise for the adoption of the insurance policy in states other than Oregon. I find that the effect on outpatient visits is the most robust among the measures of health-care utilization considered.
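To make the idea of a "smallest invalidating covariate shift" concrete, the following is a minimal sketch under assumptions that the text above does not state. It assumes the size of a shift is measured by the Kullback-Leibler divergence from the sample covariate distribution, and it takes estimated conditional treatment effects `tau` at the sample points as given (a plain plug-in, not the de-biased GMM estimator). The function names `tilted_weights` and `smallest_kl_shift` are hypothetical. Under these assumptions, the least-divergent reweighting that pushes the ATE to the boundary of the claim \( ATE \geq 0 \) is an exponential tilt of the sample weights.

```python
import numpy as np


def tilted_weights(tau, lam):
    """Exponentially tilted weights, proportional to exp(-lam * tau)."""
    z = -lam * tau
    z = z - z.max()  # stabilize the exponentials
    w = np.exp(z)
    return w / w.sum()


def smallest_kl_shift(tau, tol=1e-10, max_iter=200):
    """Smallest KL divergence of a reweighting that drives the ATE to zero.

    Assumption (not from the source): shifts are measured by
    KL(shifted || sample), and `tau` holds plug-in effect estimates.
    Returns (kl, weights).
    """
    tau = np.asarray(tau, dtype=float)
    n = tau.size
    if tau.mean() < 0:
        raise ValueError("the claim ATE >= 0 already fails in the sample")
    if tau.min() >= 0:
        # No reweighting of these points can make the ATE negative.
        return np.inf, np.full(n, 1.0 / n)

    # The tilted ATE decreases in lam; bracket its root, then bisect.
    lo, hi = 0.0, 1.0
    while tilted_weights(tau, hi) @ tau > 0:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if tilted_weights(tau, mid) @ tau > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    w = tilted_weights(tau, 0.5 * (lo + hi))
    pos = w > 0  # skip weights that underflowed to zero
    kl = float(np.sum(w[pos] * np.log(w[pos] * n)))
    return kl, w


# Toy illustration with made-up effect values (not data from the paper).
kl, w = smallest_kl_shift([2.0, -1.0])
print(kl, w)
```

In this sketch a larger value means the claim survives larger covariate shifts, and an infinite value means no reweighting of the observed covariate values can overturn it. The value is finite only when the estimated effects take both signs, which is one way to see how effect heterogeneity and robustness are linked.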