|
2446 | 2446 | "metadata": {}, |
2447 | 2447 | "source": [ |
2448 | 2448 | "The forest plot below compares posterior estimates of the treatment effect ($\\alpha$) and the confounding correlation ($\\rho$) across model specifications when \n", |
2449 | | - "$\\rho = .6$ in the data-generating process.\n", |
| 2449 | + "$\\rho = .6$ in the data-generating process. The baseline normal model (which places diffuse priors on all parameters) clearly reflects the presence of endogeneity. Its posterior mean for $\\alpha$ is biased upward relative to the true value of 3, and the estimated $\\rho$ is positive, confirming that the model detects correlation between treatment and outcome disturbances. This behaviour mirrors the familiar bias of OLS under confounding: without structural constraints or informative priors, the model attributes part of the outcome variation caused by unobserved factors to the treatment itself. This inflates and corrupts our treatment effect estimate. \n", |
2450 | 2450 | "\n", |
2451 | | - "The baseline normal model—which places diffuse priors on all parameters—clearly reflects the presence of endogeneity. Its posterior mean for $\\alpha$ is biased upward relative to the true value of 3, and the estimated $\\rho$ is positive, confirming that the model detects correlation between treatment and outcome disturbances. This behaviour mirrors the familiar bias of OLS under confounding: without structural constraints or informative priors, the model attributes part of the outcome variation caused by unobserved factors to the treatment itself.\n", |
| 2451 | + "By contrast, models that introduce structure through priors—either by tightening the prior range on $\\rho$ or imposing shrinkage on the regression coefficients—perform noticeably better. The tight-$\\rho$ models regularize the latent correlation, effectively limiting the extent to which endogeneity can distort inference, while spike-and-slab and horseshoe priors perform selective shrinkage on the covariates, allowing the model to emphasize variables that genuinely predict the treatment. This helps isolate more valid “instrument-like” components of variation, pulling the posterior of $\\alpha$ closer to the true causal effect. \n", |
2452 | 2452 | "\n", |
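| | + "As a rough illustration of the shrinkage idea, a horseshoe prior on the treatment-equation coefficients could be sketched as below. This is a minimal, hedged sketch rather than the notebook's exact implementation: `n_covariates` is a placeholder and PyMC (v5) is assumed as the modelling library.\n", |
| | + "\n", |
| | + "```python\n", |
| | + "import pymc as pm\n", |
| | + "\n", |
| | + "n_covariates = 5  # placeholder dimension; in practice set from the design matrix\n", |
| | + "\n", |
| | + "with pm.Model():\n", |
| | + "    tau = pm.HalfCauchy(\"tau\", beta=1.0)                      # global shrinkage\n", |
| | + "    lam = pm.HalfCauchy(\"lam\", beta=1.0, shape=n_covariates)  # local, per-coefficient shrinkage\n", |
| | + "    # coefficients with strong support keep their scale; weak predictors are pulled toward zero\n", |
| | + "    beta_t = pm.Normal(\"beta_t\", mu=0.0, sigma=tau * lam, shape=n_covariates)\n", |
| | + "```\n", |
| | + "\n", |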
2453 | | - "By contrast, models that introduce structure through priors—either by tightening the prior range on $\\rho$ or imposing shrinkage on the regression coefficients—perform noticeably better. The tight-$\\rho$ models regularize the latent correlation, effectively limiting the extent to which endogeneity can distort inference, while spike-and-slab and horseshoe priors perform selective shrinkage on the covariates, allowing the model to emphasize variables that genuinely predict the treatment. This helps isolate more valid “instrument-like” components of variation, pulling the posterior of $\\alpha$ closer to the true causal effect.\n", |
2454 | | - "\n", |
2455 | | - "The exclusion-restriction specification, which enforces prior beliefs about which covariates affect only the treatment or only the outcome, performs best overall. The imposed restrictions recover both the correct treatment effect and a realistic estimate of residual correlation.\n", |
| 2453 | + "The exclusion-restriction specification, which enforces prior beliefs about which covariates affect only the treatment or only the outcome, performs well too. The imposed restrictions recover both the correct treatment effect and a tight estimate of residual correlation. It may be wishful thinking that this precise instrument structure is available to an analyst in the applied setting, but instrument variable designs and their imposed exclusion restrictions should be motivated by theory. Where that theory is plausible we can hope for such precise estimates.\n", |
2456 | 2454 | "\n", |
2457 | 2455 | "Together, these results illustrate the power of Bayesian joint modelling: even in the presence of confounding, appropriate prior structure enables partial recovery of causal effects. Importantly, the priors do not simply “fix” the bias—they make explicit the trade-offs between flexibility and identification. This transparency is one of the key advantages of Bayesian causal inference over traditional reduced-form methods." |
2458 | 2456 | ] |
|
2599 | 2597 | "\n", |
2600 | 2598 | "In the baseline normal model, the posteriors of $\\alpha$ and $\\rho$ exhibit a strong negative association: as the inferred residual correlation decreases, the estimated treatment effect increases. This pattern is characteristic of endogeneity. Part of the treatment’s apparent effect on the outcome is actually explained by unobserved factors that simultaneously drive both. The normal model correctly detects confounding but cannot disentangle its consequences without additional structure, leaving the treatment effect biased.\n", |
2601 | 2599 | "\n", |
2602 | | - "Introducing tight-$\\rho$ priors fundamentally changes this relationship. By constraining the allowable range of to moderate values, we effectively impose an analyst’s belief that the degree of confounding, while nonzero, is not overwhelming. This acts as a form of structural regularization: the posterior of $\\alpha$ stabilizes around the true causal effect. In practice, this mirrors what applied analysts often do implicitly — anchoring the model with plausible bounds on endogeneity rather than assuming perfect exogeneity or unbounded correlation." |
| 2600 | + "One other feature evident from the spike and slab and horseshoe models is that the posterior distribution is somewhat bi-modal. The evidence pulls in two ways. There is not sufficient evidence in the data alone for the model to decisively characterise the $\\rho$ parameter and this induces a schizphrenic posterior distribution in the $\\alpha$ values estimated with these models. \n", |
| 2601 | + "\n", |
| 2602 | + "Introducing tight-$\\rho$ priors fundamentally changes this relationship. By constraining the allowable range of to moderate values, we effectively impose an analyst’s belief that the degree of confounding, while nonzero, is not overwhelming. This acts as a form of structural regularization: the posterior of $\\alpha$ stabilizes around the true causal effect. In practice, this mirrors what applied analysts often do implicitly. By imposing a weakly informative prior we anchor the model with plausible bounds on endogeneity rather than assuming perfect exogeneity or unbounded correlation. The preference for weakly informative priors here improves the sampling geometry but also clarifies the theoretical position of the analyst. " |
2603 | 2603 | ] |
2604 | 2604 | }, |
2605 | 2605 | { |
|