
Commit d4a4c9b

Update documentation
1 parent 4e8d6ca commit d4a4c9b

29 files changed: +120 -122 lines changed

01-Introduction-To-Causality.html

Lines changed: 1 addition & 1 deletion
@@ -680,7 +680,7 @@ <h2>When Association IS Causation<a class="headerlink" href="#when-association-i
680680
<p><span class="math notranslate nohighlight">\(Y_{1i}\)</span> is the potential outcome for <strong>the same</strong> unit i with the treatment.</p>
681681
<p>Sometimes you might see potential outcomes represented as functions <span class="math notranslate nohighlight">\(Y_i(t)\)</span>, so beware. <span class="math notranslate nohighlight">\(Y_{0i}\)</span> could be <span class="math notranslate nohighlight">\(Y_i(0)\)</span> and <span class="math notranslate nohighlight">\(Y_{1i}\)</span> could be <span class="math notranslate nohighlight">\(Y_i(1)\)</span>. Here, we will use the subscript notation most of the time.</p>
682682
<p><img alt="img" src="_images/potential_outcomes.png" /></p>
683-
<p>Back to our example, <span class="math notranslate nohighlight">\(Y_{1i}\)</span> is the academic performance for student i if he or she is in a classroom with tablets. Whether this is or not the case, it doesn’t matter for <span class="math notranslate nohighlight">\(Y_{1i}\)</span>. It is the same regardless. If student i gets the tablet, we can observe <span class="math notranslate nohighlight">\(Y_{1i}\)</span>. If not, we can observe <span class="math notranslate nohighlight">\(Y_{0i}\)</span>. Notice how in this last case, <span class="math notranslate nohighlight">\(Y_{1i}\)</span> is still defined, we just can’t see it. In this case, it is a counterfactual potential outcome.</p>
683+
<p>Back to our example, <span class="math notranslate nohighlight">\(Y_{1i}\)</span> is the academic performance for student i if he or she is in a classroom with tablets. Whether or not this is the case, it doesn’t matter for <span class="math notranslate nohighlight">\(Y_{1i}\)</span>. It is the same regardless. If student i gets the tablet, we can observe <span class="math notranslate nohighlight">\(Y_{1i}\)</span>. If not, we can observe <span class="math notranslate nohighlight">\(Y_{0i}\)</span>. Notice how in this last case, <span class="math notranslate nohighlight">\(Y_{1i}\)</span> is still defined, we just can’t see it. In this case, it is a counterfactual potential outcome.</p>
684684
<p>With potential outcomes, we can define the individual treatment effect:</p>
685685
<p><span class="math notranslate nohighlight">\(Y_{1i} - Y_{0i}\)</span></p>
686686
<p>Of course, due to the fundamental problem of causal inference, we can never know the individual treatment effect because we only observe one of the potential outcomes. For the time being, let’s focus on something easier than estimating the individual treatment effect. Instead, let’s focus on the <strong>average treatment effect</strong>, which is defined as follows.</p>
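
To make the notation above concrete, here is a minimal simulation sketch (made-up numbers, not data from the book): both potential outcomes are generated for every student, so the individual and average treatment effects can be computed directly, but only one outcome per student is ever revealed by the treatment.

import numpy as np
import pandas as pd

# Minimal potential-outcomes simulation (illustrative numbers only).
np.random.seed(123)
n = 1000
y0 = np.random.normal(70, 10, n)           # performance without tablets
y1 = y0 + 5                                # performance with tablets: constant effect of 5
t = np.random.binomial(1, 0.5, n)          # randomly assigned treatment

df = pd.DataFrame({"y0": y0, "y1": y1, "t": t})
df["y"] = np.where(df["t"] == 1, df["y1"], df["y0"])  # only this column is observable in practice

ite = df["y1"] - df["y0"]                  # individual treatment effect (never observable in real data)
print(f"ATE: {ite.mean():.2f}")            # average treatment effect, ~5 here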

02-Randomised-Experiments.html

Lines changed: 6 additions & 6 deletions
@@ -477,8 +477,8 @@ <h1 class="site-logo" id="site-title">Causal Inference for the Brave and True</h
477477
<nav id="bd-toc-nav" aria-label="Page">
478478
<ul class="visible nav section-nav flex-column">
479479
<li class="toc-h2 nav-item toc-entry">
480-
<a class="reference internal nav-link" href="#the-golden-standard">
481-
The Golden Standard
480+
<a class="reference internal nav-link" href="#the-gold-standard">
481+
The Gold Standard
482482
</a>
483483
</li>
484484
<li class="toc-h2 nav-item toc-entry">
@@ -531,8 +531,8 @@ <h2> Contents </h2>
531531
<nav aria-label="Page">
532532
<ul class="visible nav section-nav flex-column">
533533
<li class="toc-h2 nav-item toc-entry">
534-
<a class="reference internal nav-link" href="#the-golden-standard">
535-
The Golden Standard
534+
<a class="reference internal nav-link" href="#the-gold-standard">
535+
The Gold Standard
536536
</a>
537537
</li>
538538
<li class="toc-h2 nav-item toc-entry">
@@ -577,8 +577,8 @@ <h2> Contents </h2>
577577

578578
<section class="tex2jax_ignore mathjax_ignore" id="randomised-experiments">
579579
<h1>02 - Randomised Experiments<a class="headerlink" href="#randomised-experiments" title="Permalink to this headline">#</a></h1>
580-
<section id="the-golden-standard">
581-
<h2>The Golden Standard<a class="headerlink" href="#the-golden-standard" title="Permalink to this headline">#</a></h2>
580+
<section id="the-gold-standard">
581+
<h2>The Gold Standard<a class="headerlink" href="#the-gold-standard" title="Permalink to this headline">#</a></h2>
582582
<p>In the previous session, we saw why and how association is different from causation. We also saw what is required to make association be causation.</p>
583583
<p><span class="math notranslate nohighlight">\(
584584
E[Y|T=1] - E[Y|T=0] = \underbrace{E[Y_1 - Y_0|T=1]}_{ATT} + \underbrace{\{ E[Y_0|T=1] - E[Y_0|T=0] \}}_{BIAS}
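
The decomposition above can be checked numerically. The sketch below uses simulated data (illustrative only, not the chapter's example): when the treatment is handed to students with better baseline performance, the naive difference in means splits exactly into the ATT plus the bias term E[Y_0|T=1] - E[Y_0|T=0].

import numpy as np

# Numerical check of: E[Y|T=1] - E[Y|T=0] = ATT + BIAS (simulated, illustrative data).
np.random.seed(42)
n = 100_000
y0 = np.random.normal(70, 10, n)
y1 = y0 + 5                                        # true effect is 5 for everyone

# Confounded assignment: students with better baseline performance are more likely to be treated.
t = (np.random.uniform(size=n) < (y0 - y0.min()) / (y0.max() - y0.min())).astype(int)
y = np.where(t == 1, y1, y0)                       # observed outcome

naive = y[t == 1].mean() - y[t == 0].mean()
att = (y1[t == 1] - y0[t == 1]).mean()
bias = y0[t == 1].mean() - y0[t == 0].mean()

print(f"{naive:.2f} = {att:.2f} (ATT) + {bias:.2f} (BIAS)")
# Under random assignment the bias term is ~0 and the naive difference recovers the ATT.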

03-Stats-Review-The-Most-Dangerous-Equation.html

Lines changed: 4 additions & 4 deletions
@@ -1101,11 +1101,11 @@ <h2>Key Ideas<a class="headerlink" href="#key-ideas" title="Permalink to this he
11011101
<span class="n">z_stats</span> <span class="o">=</span> <span class="p">(</span><span class="n">diff</span><span class="o">-</span><span class="n">h0</span><span class="p">)</span><span class="o">/</span><span class="n">se_diff</span>
11021102
<span class="n">p_value</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="n">norm</span><span class="o">.</span><span class="n">cdf</span><span class="p">(</span><span class="n">z_stats</span><span class="p">)</span>
11031103

1104-
<span class="k">def</span> <span class="nf">critial</span><span class="p">(</span><span class="n">se</span><span class="p">):</span> <span class="k">return</span> <span class="o">-</span><span class="n">se</span><span class="o">*</span><span class="n">stats</span><span class="o">.</span><span class="n">norm</span><span class="o">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">-</span> <span class="n">confidence</span><span class="p">)</span><span class="o">/</span><span class="mi">2</span><span class="p">)</span>
1104+
<span class="k">def</span> <span class="nf">critical</span><span class="p">(</span><span class="n">se</span><span class="p">):</span> <span class="k">return</span> <span class="o">-</span><span class="n">se</span><span class="o">*</span><span class="n">stats</span><span class="o">.</span><span class="n">norm</span><span class="o">.</span><span class="n">ppf</span><span class="p">((</span><span class="mi">1</span> <span class="o">-</span> <span class="n">confidence</span><span class="p">)</span><span class="o">/</span><span class="mi">2</span><span class="p">)</span>
11051105

1106-
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Test </span><span class="si">{</span><span class="n">confidence</span><span class="o">*</span><span class="mi">100</span><span class="si">}</span><span class="s2">% CI: </span><span class="si">{</span><span class="n">mu1</span><span class="si">}</span><span class="s2"> +- </span><span class="si">{</span><span class="n">critial</span><span class="p">(</span><span class="n">se1</span><span class="p">)</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
1107-
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Control </span><span class="si">{</span><span class="n">confidence</span><span class="o">*</span><span class="mi">100</span><span class="si">}</span><span class="s2">% CI: </span><span class="si">{</span><span class="n">mu2</span><span class="si">}</span><span class="s2"> +- </span><span class="si">{</span><span class="n">critial</span><span class="p">(</span><span class="n">se2</span><span class="p">)</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
1108-
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Test-Control </span><span class="si">{</span><span class="n">confidence</span><span class="o">*</span><span class="mi">100</span><span class="si">}</span><span class="s2">% CI: </span><span class="si">{</span><span class="n">diff</span><span class="si">}</span><span class="s2"> +- </span><span class="si">{</span><span class="n">critial</span><span class="p">(</span><span class="n">se_diff</span><span class="p">)</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
1106+
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Test </span><span class="si">{</span><span class="n">confidence</span><span class="o">*</span><span class="mi">100</span><span class="si">}</span><span class="s2">% CI: </span><span class="si">{</span><span class="n">mu1</span><span class="si">}</span><span class="s2"> +- </span><span class="si">{</span><span class="n">critical</span><span class="p">(</span><span class="n">se1</span><span class="p">)</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
1107+
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Control </span><span class="si">{</span><span class="n">confidence</span><span class="o">*</span><span class="mi">100</span><span class="si">}</span><span class="s2">% CI: </span><span class="si">{</span><span class="n">mu2</span><span class="si">}</span><span class="s2"> +- </span><span class="si">{</span><span class="n">critical</span><span class="p">(</span><span class="n">se2</span><span class="p">)</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
1108+
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Test-Control </span><span class="si">{</span><span class="n">confidence</span><span class="o">*</span><span class="mi">100</span><span class="si">}</span><span class="s2">% CI: </span><span class="si">{</span><span class="n">diff</span><span class="si">}</span><span class="s2"> +- </span><span class="si">{</span><span class="n">critical</span><span class="p">(</span><span class="n">se_diff</span><span class="p">)</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
11091109
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Z Statistic </span><span class="si">{</span><span class="n">z_stats</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
11101110
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;P-Value </span><span class="si">{</span><span class="n">p_value</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
11111111
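
For reference, here is a self-contained version of the snippet above. The variable names mirror the original, but the group means, standard errors, and the standard-error-of-a-difference formula are illustrative stand-ins for the chapter's test and control data.

import numpy as np
from scipy import stats

confidence = 0.95
h0 = 0                                   # null hypothesis: no difference between the groups

mu1, se1 = 71.0, 0.9                     # test group mean and standard error (illustrative)
mu2, se2 = 74.0, 0.8                     # control group mean and standard error (illustrative)

diff = mu1 - mu2
se_diff = np.sqrt(se1**2 + se2**2)       # standard error of a difference of independent means

z_stats = (diff - h0) / se_diff
p_value = stats.norm.cdf(z_stats)        # one-sided p-value, as in the snippet above

def critical(se):
    return -se * stats.norm.ppf((1 - confidence) / 2)

print(f"Test {confidence*100}% CI: {mu1} +- {critical(se1):.2f}")
print(f"Control {confidence*100}% CI: {mu2} +- {critical(se2):.2f}")
print(f"Test-Control {confidence*100}% CI: {diff} +- {critical(se_diff):.2f}")
print(f"Z Statistic {z_stats:.2f}")
print(f"P-Value {p_value:.4f}")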

05-The-Unreasonable-Effectiveness-of-Linear-Regression.html

Lines changed: 2 additions & 2 deletions
@@ -720,7 +720,7 @@ <h2>Regression Theory<a class="headerlink" href="#regression-theory" title="Perm
720720
<p><span class="math notranslate nohighlight">\(
721721
\kappa = \dfrac{Cov(Y_i, \tilde{T_i})}{Var(\tilde{T_i})}
722722
\)</span></p>
723-
<p>where <span class="math notranslate nohighlight">\(\tilde{T_i}\)</span> is the residual from a regression of all other covariates <span class="math notranslate nohighlight">\(X_{1i} + ... + X_{ki}\)</span> on <span class="math notranslate nohighlight">\(T_i\)</span>. Now, let’s appreciate how cool this is. It means that the coefficient of a multivariate regression is the bivariate coefficient of the same regressor <strong>after accounting for the effect of other variables in the model</strong>. In causal inference terms, <span class="math notranslate nohighlight">\(\kappa\)</span> is the bivariate coefficient of <span class="math notranslate nohighlight">\(T\)</span> after having used all other variables to predict it.</p>
723+
<p>where <span class="math notranslate nohighlight">\(\tilde{T_i}\)</span> is the residual from a regression of <span class="math notranslate nohighlight">\(T_i\)</span> on all other covariates <span class="math notranslate nohighlight">\(X_{1i}, ..., X_{ki}\)</span>. Now, let’s appreciate how cool this is. It means that the coefficient of a multivariate regression is the bivariate coefficient of the same regressor <strong>after accounting for the effect of other variables in the model</strong>. In causal inference terms, <span class="math notranslate nohighlight">\(\kappa\)</span> is the bivariate coefficient of <span class="math notranslate nohighlight">\(T\)</span> after having used all other variables to predict it.</p>
724724
<p>This has a nice intuition behind it. If we can predict <span class="math notranslate nohighlight">\(T\)</span> using other variables, it means it’s not random. However, we can make it so that <span class="math notranslate nohighlight">\(T\)</span> is as good as random once we control for other available variables. To do so, we use linear regression to predict it from the other variables and then we take the residuals of that regression <span class="math notranslate nohighlight">\(\tilde{T}\)</span>. By definition, <span class="math notranslate nohighlight">\(\tilde{T}\)</span> cannot be predicted by the other variables <span class="math notranslate nohighlight">\(X\)</span> that we’ve already used to predict <span class="math notranslate nohighlight">\(T\)</span>. Quite elegantly, <span class="math notranslate nohighlight">\(\tilde{T}\)</span> is a version of the treatment that is not associated with any other variable in <span class="math notranslate nohighlight">\(X\)</span>.</p>
725725
<p>By the way, this is also a property of linear regression. The residuals are always orthogonal to, or uncorrelated with, any of the variables in the model that created them:</p>
726726
<div class="cell docutils container">
@@ -1070,7 +1070,7 @@ <h2>Omitted Variable or Confounding Bias<a class="headerlink" href="#omitted-var
10701070
<p><span class="math notranslate nohighlight">\(
10711071
\dfrac{Cov(Wage_i, Educ_i)}{Var(Educ_i)} = \kappa + \beta'\delta_{Ability}
10721072
\)</span></p>
1073-
<p>where <span class="math notranslate nohighlight">\(\delta_{A}\)</span> is the vector of coefficients from the regression of <span class="math notranslate nohighlight">\(A\)</span> on <span class="math notranslate nohighlight">\(Educ\)</span></p>
1073+
<p>where <span class="math notranslate nohighlight">\(\delta_{A}\)</span> is the vector of coefficients from the regression of <span class="math notranslate nohighlight">\(Educ\)</span> on <span class="math notranslate nohighlight">\(A\)</span></p>
10741074
<p>The key point here is that it won’t be exactly the <span class="math notranslate nohighlight">\(\kappa\)</span> that we want. Instead, it comes with this extra annoying term <span class="math notranslate nohighlight">\(\beta'\delta_{A}\)</span>. This term is <span class="math notranslate nohighlight">\(\beta\)</span>, the impact of the omitted <span class="math notranslate nohighlight">\(A\)</span> on <span class="math notranslate nohighlight">\(Wage\)</span>, times <span class="math notranslate nohighlight">\(\delta_{A}\)</span>, the coefficient from regressing the omitted on the included <span class="math notranslate nohighlight">\(Educ\)</span>. This is so important for economists that Joshua Angrist made a mantra out of it so that his students can recite it in meditation:</p>
10751075
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="s2">&quot;Short equals long </span>
10761076
<span class="n">plus</span> <span class="n">the</span> <span class="n">effect</span> <span class="n">of</span> <span class="n">omitted</span>

08-Instrumental-Variables.html

Lines changed: 2 additions & 2 deletions
@@ -668,7 +668,7 @@ <h2>Going Around Omitted Variable Bias<a class="headerlink" href="#going-around-
668668
<div class="cell_output docutils container">
669669
<img alt="_images/08-Instrumental-Variables_4_0.svg" src="_images/08-Instrumental-Variables_4_0.svg" /></div>
670670
</div>
671-
<p>If we have such a variable, we can recover the causal effect <span class="math notranslate nohighlight">\(\kappa\)</span> with what we will see as the IV formula. To do so, let’s think about the ideal equation we want to run. Using more general terms like <span class="math notranslate nohighlight">\(T\)</span> for the treatment and <span class="math notranslate nohighlight">\(W\)</span> for the confounders, here is want we want:</p>
671+
<p>If we have such a variable, we can recover the causal effect <span class="math notranslate nohighlight">\(\kappa\)</span> with what we will see as the IV formula. To do so, let’s think about the ideal equation we want to run. Using more general terms like <span class="math notranslate nohighlight">\(T\)</span> for the treatment and <span class="math notranslate nohighlight">\(W\)</span> for the confounders, here is what we want:</p>
672672
<p><span class="math notranslate nohighlight">\(
673673
Y_i = \beta_0 + \kappa \ T_i + \pmb{\beta}W_i + u_i
674674
\)</span></p>
@@ -703,7 +703,7 @@ <h2>Going Around Omitted Variable Bias<a class="headerlink" href="#going-around-
703703
</section>
704704
<section id="quarter-of-birth-and-the-effect-of-education-on-wage">
705705
<h2>Quarter of Birth and the Effect of Education on Wage<a class="headerlink" href="#quarter-of-birth-and-the-effect-of-education-on-wage" title="Permalink to this headline">#</a></h2>
706-
<p>So far, we’ve been treating these instruments as some magical variable <span class="math notranslate nohighlight">\(Z\)</span> which have the miraculous propriety of only affecting the outcome through the treatment. To be honest, good instruments are so hard to come by that we might as well consider them miracles. Let’s just say it is not for the faint of heart. Rumor has it that the cool kids at Chicago School of Economics talk about how they come up with this or that instrument at the bar.</p>
706+
<p>So far, we’ve been treating these instruments as some magical variable <span class="math notranslate nohighlight">\(Z\)</span> which has the miraculous property of only affecting the outcome through the treatment. To be honest, good instruments are so hard to come by that we might as well consider them miracles. Let’s just say it is not for the faint of heart. Rumor has it that the cool kids at the Chicago School of Economics talk about how they come up with this or that instrument at the bar.</p>
707707
<p><img alt="img" src="_images/good-iv.png" /></p>
708708
<p>Still, we do have some interesting examples of instruments to make things a little more concrete. We will again try to estimate the effect of education on wage. To do so, we will use the person’s quarter of birth as the instrument Z.</p>
709709
<p>This idea takes advantage of US compulsory attendance laws. Usually, they state that a kid must have turned 6 years old by January 1 of the year they enter school. For this reason, kids that are born at the beginning of the year will enter school at an older age. Compulsory attendance law also requires students to be in school until they turn 16, at which point they are legally allowed to drop out. The result is that people born later in the year have, on average, more years of education than those born at the beginning of the year.</p>
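
To preview how such an instrument gets used, here is a toy simulation (illustrative only, not the census data analysed later in the chapter): ability confounds education and wage, while the instrument shifts education but has no direct path to wage, so the ratio of the reduced form to the first stage recovers the true effect even though plain OLS does not.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy instrumental variable simulation (illustrative data only).
np.random.seed(7)
n = 100_000
ability = np.random.normal(size=n)                        # unobserved confounder
z = np.random.binomial(1, 0.5, n)                         # instrument, e.g. born late in the year
educ = 12 + 0.3 * z + ability + np.random.normal(size=n)  # instrument nudges schooling a little
wage = 200 * educ + 500 * ability + np.random.normal(0, 100, n)  # true effect of educ is 200
df = pd.DataFrame({"wage": wage, "educ": educ, "z": z})

ols = smf.ols("wage ~ educ", data=df).fit().params["educ"]          # confounded by ability

reduced_form = smf.ols("wage ~ z", data=df).fit().params["z"]       # effect of z on wage
first_stage = smf.ols("educ ~ z", data=df).fit().params["z"]        # effect of z on educ
iv = reduced_form / first_stage                                     # IV (Wald) estimate

print(f"OLS: {ols:.1f}   IV: {iv:.1f}")   # OLS is biased upward; IV is close to 200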
