Commit 4e8d6ca

Update documentation
1 parent 4269a07 commit 4e8d6ca

88 files changed

+3692
-1104
lines changed

.buildinfo

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
11
# Sphinx build info version 1
22
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
3-
config: 0a85976861bf9fc55bff888562c121eb
3+
config: 25651f6452cb78245dd6d545c81197d0
44
tags: 645f666f9bcd5a90fca523b33c5a78b7

01-Introduction-To-Causality.html

Lines changed: 21 additions & 21 deletions
@@ -618,7 +618,7 @@ <h2>Answering a Different Kind of Question<a class="headerlink" href="#answering
618618
</section>
619619
<section id="when-association-is-causation">
620620
<h2>When Association IS Causation<a class="headerlink" href="#when-association-is-causation" title="Permalink to this headline">#</a></h2>
621-
<p>Intuitively, we kind of know why the association is not causation. If someone tells you that schools that give tablets to their students perform better than those that don’t, you can quickly point out that it is probably the case that those schools with the tablets are wealthier. As such, they would do better than average even without the tablets. Because of this, we can’t conclude that giving tablets to kids during classes will cause an increase in their academic performance. We can only say that tablets in school are associated with high academic performance.</p>
621+
<p>Intuitively, we kind of know why the association is not causation. If someone tells you that schools that give tablets to their students perform better than those that don’t, you can quickly point out that it is probably the case that those schools with the tablets are wealthier. As such, they would do better than average even without the tablets. Because of this, we can’t conclude that giving tablets to kids during classes will cause an increase in their academic performance. We can only say that tablets in school are associated with high academic performance, as measured by ENEM (sort of the SAT in Brazil, which stands for National High School Exam):</p>
622622
<div class="cell tag_hide-input docutils container">
623623
<div class="cell_input docutils container">
624624
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="nn">pd</span>
@@ -692,11 +692,11 @@ <h2>When Association IS Causation<a class="headerlink" href="#when-association-i
692692
<div class="cell_input docutils container">
693693
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="nb">dict</span><span class="p">(</span>
694694
<span class="n">i</span><span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">],</span>
695-
<span class="n">y0</span><span class="o">=</span><span class="p">[</span><span class="mi">500</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">800</span><span class="p">,</span><span class="mi">700</span><span class="p">],</span>
696-
<span class="n">y1</span><span class="o">=</span><span class="p">[</span><span class="mi">450</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">750</span><span class="p">],</span>
697-
<span class="n">t</span><span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">],</span>
698-
<span class="n">y</span><span class="o">=</span> <span class="p">[</span><span class="mi">500</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">750</span><span class="p">],</span>
699-
<span class="n">te</span><span class="o">=</span><span class="p">[</span><span class="o">-</span><span class="mi">50</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="o">-</span><span class="mi">200</span><span class="p">,</span><span class="mi">50</span><span class="p">],</span>
695+
<span class="n">Y0</span><span class="o">=</span><span class="p">[</span><span class="mi">500</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">800</span><span class="p">,</span><span class="mi">700</span><span class="p">],</span>
696+
<span class="n">Y1</span><span class="o">=</span><span class="p">[</span><span class="mi">450</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">750</span><span class="p">],</span>
697+
<span class="n">T</span><span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">],</span>
698+
<span class="n">Y</span><span class="o">=</span> <span class="p">[</span><span class="mi">500</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">750</span><span class="p">],</span>
699+
<span class="n">TE</span><span class="o">=</span><span class="p">[</span><span class="o">-</span><span class="mi">50</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="o">-</span><span class="mi">200</span><span class="p">,</span><span class="mi">50</span><span class="p">],</span>
700700
<span class="p">))</span>
701701
</pre></div>
702702
</div>
@@ -721,11 +721,11 @@ <h2>When Association IS Causation<a class="headerlink" href="#when-association-i
721721
<tr style="text-align: right;">
722722
<th></th>
723723
<th>i</th>
724-
<th>y0</th>
725-
<th>y1</th>
726-
<th>t</th>
727-
<th>y</th>
728-
<th>te</th>
724+
<th>Y0</th>
725+
<th>Y1</th>
726+
<th>T</th>
727+
<th>Y</th>
728+
<th>TE</th>
729729
</tr>
730730
</thead>
731731
<tbody>
@@ -778,11 +778,11 @@ <h2>When Association IS Causation<a class="headerlink" href="#when-association-i
778778
<div class="cell_input docutils container">
779779
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="nb">dict</span><span class="p">(</span>
780780
<span class="n">i</span><span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">],</span>
781-
<span class="n">y0</span><span class="o">=</span><span class="p">[</span><span class="mi">500</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">],</span>
782-
<span class="n">y1</span><span class="o">=</span><span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">750</span><span class="p">],</span>
783-
<span class="n">t</span><span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">],</span>
784-
<span class="n">y</span><span class="o">=</span> <span class="p">[</span><span class="mi">500</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">750</span><span class="p">],</span>
785-
<span class="n">te</span><span class="o">=</span><span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">],</span>
781+
<span class="n">Y0</span><span class="o">=</span><span class="p">[</span><span class="mi">500</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">],</span>
782+
<span class="n">Y1</span><span class="o">=</span><span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">750</span><span class="p">],</span>
783+
<span class="n">T</span><span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">],</span>
784+
<span class="n">Y</span><span class="o">=</span> <span class="p">[</span><span class="mi">500</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">600</span><span class="p">,</span><span class="mi">750</span><span class="p">],</span>
785+
<span class="n">TE</span><span class="o">=</span><span class="p">[</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">],</span>
786786
<span class="p">))</span>
787787
</pre></div>
788788
</div>
@@ -807,11 +807,11 @@ <h2>When Association IS Causation<a class="headerlink" href="#when-association-i
807807
<tr style="text-align: right;">
808808
<th></th>
809809
<th>i</th>
810-
<th>y0</th>
811-
<th>y1</th>
812-
<th>t</th>
813-
<th>y</th>
814-
<th>te</th>
810+
<th>Y0</th>
811+
<th>Y1</th>
812+
<th>T</th>
813+
<th>Y</th>
814+
<th>TE</th>
815815
</tr>
816816
</thead>
817817
<tbody>
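
Reviewer note on the hunks above: the renamed columns `Y0`, `Y1`, `T`, `Y`, `TE` are tied together by simple identities, which the hard-coded lists in the diff already satisfy. A minimal sketch, assuming only the four units shown in the diff, that derives `Y` and `TE` from the potential outcomes and contrasts the true average effect with the naive difference in means (the names `ate` and `naive_diff` are illustrative, not from the source):

```python
import numpy as np
import pandas as pd

# Potential outcomes and treatment assignment, copied from the diff above.
df = pd.DataFrame(dict(
    i=[1, 2, 3, 4],
    Y0=[500, 600, 800, 700],   # outcome each unit would have without treatment
    Y1=[450, 600, 600, 750],   # outcome each unit would have with treatment
    T=[0, 0, 1, 1],            # treatment actually received
))

# Observed outcome: Y1 for treated units, Y0 for untreated units.
df["Y"] = np.where(df["T"] == 1, df["Y1"], df["Y0"])

# Individual treatment effect (only computable because both potential
# outcomes are known in this toy table).
df["TE"] = df["Y1"] - df["Y0"]

# True average treatment effect versus the naive treated-minus-untreated gap.
ate = df["TE"].mean()
naive_diff = df.loc[df["T"] == 1, "Y"].mean() - df.loc[df["T"] == 0, "Y"].mean()

print(df)
print("ATE:", ate, "naive difference in means:", naive_diff)
```

In the second table of the diff, the unobserved potential outcomes are replaced by `np.nan`, so `TE` can no longer be computed this way and only the naive comparison remains available.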

03-Stats-Review-The-Most-Dangerous-Equation.html

Lines changed: 3 additions & 3 deletions
@@ -578,7 +578,7 @@ <h2> Contents </h2>
578578
<section class="tex2jax_ignore mathjax_ignore" id="stats-review-the-most-dangerous-equation">
579579
<h1>03 - Stats Review: The Most Dangerous Equation<a class="headerlink" href="#stats-review-the-most-dangerous-equation" title="Permalink to this headline">#</a></h1>
580580
<p>In his famous article of 2007, Howard Wainer writes about very dangerous equations:</p>
581-
<p>“Some equations are dangerous if you know them, and others are dangerous if you do not. The first category may pose danger because the secrets within its bounds open doors behind which lies terrible peril. The obvious winner in this is Einstein’s iconic equation <span class="math notranslate nohighlight">\(E = MC^2\)</span>, for it provides a measure of the enormous energy hidden within ordinary matter. […] Instead I am interested in equations that unleash their danger not when we know about them, but rather when we do not. Kept close at hand, these equations allow us to understand things clearly, but their absence leaves us dangerously ignorant.”</p>
581+
<p>“Some equations are dangerous if you know them, and others are dangerous if you do not. The first category may pose danger because the secrets within its bounds open doors behind which lies terrible peril. The obvious winner in this is Einstein’s iconic equation <span class="math notranslate nohighlight">\(E = mc^2\)</span>, for it provides a measure of the enormous energy hidden within ordinary matter. […] Instead I am interested in equations that unleash their danger not when we know about them, but rather when we do not. Kept close at hand, these equations allow us to understand things clearly, but their absence leaves us dangerously ignorant.”</p>
582582
<p>The equation he talks about is Moivre’s equation:</p>
583583
<p><span class="math notranslate nohighlight">\(
584584
SE = \dfrac{\sigma}{\sqrt{n}}
@@ -857,7 +857,7 @@ <h2>Confidence Intervals<a class="headerlink" href="#confidence-intervals" title
857857
</div>
858858
</div>
859859
<p>Of course, we don’t need to restrict ourselves to the 95% confidence interval. We could generate the 99% interval by finding what we need to multiply the standard deviation by so the interval contains 99% of the mass of a normal distribution.</p>
860-
<p>The function <code class="docutils literal notranslate"><span class="pre">ppf</span></code> in python gives us the inverse of the CDF. Instead of multiplying the standard error by 2 like we did to find the 95% CI, we will multiply it by <code class="docutils literal notranslate"><span class="pre">z</span></code>, which will result in the 99% CI. So, <code class="docutils literal notranslate"><span class="pre">ppf(0.5)</span></code> will return 0.0, saying that 50% of the mass of the standard normal distribution is below 0.0. By the same token, if we plug 99.5%, we will have the value <code class="docutils literal notranslate"><span class="pre">z</span></code>, such that 99.5% of the distribution mass falls below this value. In other words, 0.5% of the mass falls above this value.</p>
860+
<p>The function <code class="docutils literal notranslate"><span class="pre">ppf</span></code> in Python gives us the inverse of the CDF. Instead of multiplying the standard error by 2 like we did to find the 95% CI, we will multiply it by <code class="docutils literal notranslate"><span class="pre">z</span></code>, which will result in the 99% CI. So, <code class="docutils literal notranslate"><span class="pre">ppf(0.5)</span></code> will return 0.0, saying that 50% of the mass of the standard normal distribution (mean 0 and standard deviation 1) is below 0.0. By the same token, if we plug in 99.5%, we will have the value <code class="docutils literal notranslate"><span class="pre">z</span></code>, such that 99.5% of the distribution mass falls below this value. In other words, 0.5% of the mass falls above this value.</p>
861861
<div class="cell docutils container">
862862
<div class="cell_input docutils container">
863863
<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">scipy</span> <span class="kn">import</span> <span class="n">stats</span>
@@ -950,7 +950,7 @@ <h2>Hypothesis Testing<a class="headerlink" href="#hypothesis-testing" title="Pe
950950
\mu_{diff} = \mu_1 - \mu_2
951951
\)</span></p>
952952
<p><span class="math notranslate nohighlight">\(
953-
SE_{diff} = \sqrt{SE_1 + SE_2} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}
953+
SE_{diff} = \sqrt{SE^2_1 + SE^2_2} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}
954954
\)</span></p>
955955
<p>Let’s return to our classroom example. We will construct this distribution of the difference. Of course, once we have it, building the 95% CI is straightforward.</p>
956956
<div class="cell docutils container">
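
Reviewer note on the two formula fixes in this file: the corrected line squares the standard errors before summing, which is what makes it agree with the right-hand side, since \(SE_k^2 = \sigma_k^2/n_k\). A small sketch, using made-up summary statistics (the `mu_*`, `sigma_*` and `n_*` values are hypothetical and not taken from the classroom data), that checks the identity numerically and uses `scipy.stats.norm.ppf` to get the 99% multiplier described in the prose:

```python
import numpy as np
from scipy import stats

# Multiplier for a two-sided 99% interval: 99.5% of the standard normal
# mass lies below z, leaving 0.5% in each tail.
z = stats.norm.ppf(0.995)

# Hypothetical summary statistics for two groups (illustrative only).
mu_1, sigma_1, n_1 = 74.0, 2.0, 40
mu_2, sigma_2, n_2 = 72.0, 3.0, 50

# Standard error of each mean.
se_1 = sigma_1 / np.sqrt(n_1)
se_2 = sigma_2 / np.sqrt(n_2)

# Standard error of the difference: variances add, so square before summing.
se_diff = np.sqrt(se_1**2 + se_2**2)
se_diff_direct = np.sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)

# 99% confidence interval for the difference in means.
mu_diff = mu_1 - mu_2
ci_99 = (mu_diff - z * se_diff, mu_diff + z * se_diff)

print("z:", z)
print("SE_diff:", se_diff, "direct:", se_diff_direct)
print("99% CI:", ci_99)
```

The pre-fix expression \(\sqrt{SE_1 + SE_2}\) would not match `se_diff_direct` for these inputs, which is a quick way to confirm the correction.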

04-Graphical-Causal-Models.html

Lines changed: 4 additions & 4 deletions
@@ -862,7 +862,7 @@ <h2>Selection Bias<a class="headerlink" href="#selection-bias" title="Permalink
862862
<img alt="_images/04-Graphical-Causal-Models_19_0.svg" src="_images/04-Graphical-Causal-Models_19_0.svg" /></div>
863863
</div>
864864
<p>Imagine that investments and education take only 2 values to demonstrate why this is the case. Whether people invest or not. They are either educated or not. Initially, when we don’t control for investments, the bias term is zero: <span class="math notranslate nohighlight">\(E[Y_0|T=1] - E[Y_0|T=0] = 0\)</span> because the education was randomised. This means that the wage people would have if they didn’t receive education <span class="math notranslate nohighlight">\(Wage_0\)</span> is the same if they do or don’t receive the education treatment. But what happens if we condition on investments?</p>
865-
<p>Looking at those that invest, we probably have the case that <span class="math notranslate nohighlight">\(E[Y_0|T=0, I=1] &gt; E[Y_0|T=1, I=1]\)</span>. In words, among those that invest, those that manage to do so even without education are more independent of education to achieve high earnings. For this reason, the wage those people have, <span class="math notranslate nohighlight">\(Wage_0|T=0\)</span>, is probably higher than the wage the educated group would have if they didn’t have education, <span class="math notranslate nohighlight">\(Wage_0|T=1\)</span>. A similar reasoning can be applied to those that don’t invest, where we also probably have <span class="math notranslate nohighlight">\(E[Y_0|T=0, I=0] &gt; E[Y_0|T=1, I=0]\)</span>. Those who don’t invest even with education probably would have a lower wage, had they not got the education, than those who didn’t invest but didn’t have an education.</p>
865+
<p>Looking at those who invest, we probably have the case that <span class="math notranslate nohighlight">\(E[Y_0|T=0, I=1] &gt; E[Y_0|T=1, I=1]\)</span>. In words, among those who invest, the ones who manage to do so even without education are more likely to achieve high earnings, regardless of their education level. For this reason, the wage those people have, <span class="math notranslate nohighlight">\(Wage_0|T=0\)</span>, is probably higher than the wage the educated group would have if they didn’t have education, <span class="math notranslate nohighlight">\(Wage_0|T=1\)</span>. A similar reasoning can be applied to those who don’t invest, where we also probably have <span class="math notranslate nohighlight">\(E[Y_0|T=0, I=0] &gt; E[Y_0|T=1, I=0]\)</span>. Those who don’t invest even with education would probably have had a lower wage, had they not received the education, than those who didn’t invest and didn’t have an education.</p>
866866
<p>To use a purely graphical argument, if someone invests, knowing that they have high education explains away the second cause, which is wage. Conditioned on investing, higher education is associated with low wages and we have a negative bias <span class="math notranslate nohighlight">\(E[Y_0|T=0, I=i] &gt; E[Y_0|T=1, I=i]\)</span>.</p>
867867
<p>As a side note, all of this we’ve discussed is true if we condition on any descendent of a common effect.</p>
868868
<div class="cell tag_hide-input docutils container">
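
Reviewer note on the selection-bias paragraph in this hunk: the claim that conditioning on investments turns a zero bias term into a negative one can be checked with a quick simulation. This is only a sketch under assumed numbers: the effect size, the noise scales, and the rule that makes investing more likely for educated and higher-wage people are all invented for illustration and do not come from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Education is randomised, so it is independent of the untreated wage Y0.
T = rng.integers(0, 2, size=n)
Y0 = rng.normal(10, 2, size=n)      # wage without education (hypothetical scale)
Y = Y0 + 1.0 * T                    # assumed constant effect of education on wage

# Investment is a common effect: more likely with education and with higher wage.
score = 1.0 * T + 0.5 * (Y - 10) + rng.normal(0, 1, size=n)
I = (score > 0.5).astype(int)

def bias(mask):
    """E[Y0 | T=1] - E[Y0 | T=0] within the rows selected by mask."""
    return Y0[mask & (T == 1)].mean() - Y0[mask & (T == 0)].mean()

bias_uncond = bias(np.ones(n, dtype=bool))   # no conditioning
bias_invest = bias(I == 1)                   # conditioning on I = 1
bias_no_invest = bias(I == 0)                # conditioning on I = 0

print("unconditional bias:", bias_uncond)
print("bias given I=1:", bias_invest)
print("bias given I=0:", bias_no_invest)
```

Under these assumptions the unconditional bias is close to zero, while both conditional biases come out negative, matching \(E[Y_0|T=0, I=i] > E[Y_0|T=1, I=i]\) in the text.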
@@ -944,16 +944,16 @@ <h2>Contribute<a class="headerlink" href="#contribute" title="Permalink to this
944944
},
945945
codeMirrorConfig: {
946946
theme: "abcdef",
947-
mode: "causal-glory"
947+
mode: "conda-root-py"
948948
},
949949
kernelOptions: {
950-
kernelName: "causal-glory",
950+
kernelName: "conda-root-py",
951951
path: "./."
952952
},
953953
predefinedOutput: true
954954
}
955955
</script>
956-
<script>kernelName = 'causal-glory'</script>
956+
<script>kernelName = 'conda-root-py'</script>
957957

958958
</div>
959959
