Commit 15faa7e

Clarify indicator function and dummy variable usage in event study
Expanded documentation in both the EventStudy class and the event study PyMC notebook to explain the equivalence between indicator functions and dummy variables. Added details on how dummy variables are constructed for each event time, the omission of the reference period to avoid multicollinearity, and the interpretation of regression coefficients as ATT at each event time.
1 parent 5bdca9e commit 15faa7e

2 files changed: +24 -0 lines changed

causalpy/experiments/event_study.py

Lines changed: 12 additions & 0 deletions
@@ -54,6 +54,18 @@ class EventStudy(BaseExperiment):
 - :math:`E_{it} = t - G_i` is event time (time relative to treatment)
 - :math:`\\beta_k` are the dynamic treatment effects at event time k
 - :math:`k_0` is the reference (omitted) event time
+- :math:`\\mathbf{1}\\{E_{it} = k\\}` is the indicator function: equals 1 when the
+condition :math:`E_{it} = k` is true (i.e., when observation it is at event time k),
+and 0 otherwise
+
+**Implementation via dummy variables:** The indicator function notation is equivalent
+to creating dummy (binary) variables for each event time. Internally, this class
+creates one dummy variable for each event time k in the event window, where the dummy
+equals 1 for treated observations at that specific event time and 0 otherwise. One
+event time (the reference period, typically k=-1) is omitted to avoid perfect
+multicollinearity. The estimated regression coefficient :math:`\\beta_k` for each
+dummy variable represents the Average Treatment Effect on the Treated (ATT) at event
+time k, measured relative to the reference period.

 .. warning::
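
The dummy-variable construction described in the added docstring text can be sketched as follows. This is a minimal illustration with pandas, not the code the `EventStudy` class actually runs; the function name `add_event_time_dummies`, the column names `t` and `G`, and the `D_k` naming are all assumptions made for the example.

```python
import pandas as pd


def add_event_time_dummies(df, time_col="t", treat_time_col="G",
                           window=(-3, 2), reference=-1):
    """Add one 0/1 column per event time in the window, omitting the reference.

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    # Event time E_it = t - G_i; NaN for never-treated units (missing G_i).
    out["event_time"] = out[time_col] - out[treat_time_col]
    names = []
    for k in range(window[0], window[1] + 1):
        if k == reference:
            continue  # omitted to avoid perfect multicollinearity
        name = f"D_{k}"
        # NaN == k evaluates to False, so never-treated rows get 0.
        out[name] = (out["event_time"] == k).astype(int)
        names.append(name)
    return out, names


# Toy panel: unit 1 is treated at t=3, unit 2 is never treated.
df = pd.DataFrame({
    "unit": [1] * 6 + [2] * 6,
    "t": list(range(6)) * 2,
    "G": [3.0] * 6 + [float("nan")] * 6,
})
out, names = add_event_time_dummies(df)
```

In this toy panel each dummy switches on for exactly one row of the treated unit, and the row at the reference event time has all dummies equal to 0, just like the never-treated rows.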

docs/source/notebooks/event_study_pymc.ipynb

Lines changed: 12 additions & 0 deletions
@@ -35,6 +35,18 @@
 "- $\\lambda_t$ are time fixed effects\n",
 "- $\\beta_k$ are the dynamic treatment effects at event time $k$\n",
 "- $k_0$ is the reference (omitted) period, typically $k=-1$\n",
+"- $\\mathbf{1}\\{E_{it} = k\\}$ is the **indicator function**: it equals 1 when the condition inside the braces is true (i.e., when observation $it$ is at event time $k$), and 0 otherwise\n",
+"\n",
+"### Understanding the Indicator Function as Dummy Variables\n",
+"\n",
+"The indicator function notation $\\mathbf{1}\\{E_{it} = k\\}$ is mathematically equivalent to creating **dummy (binary) variables**. For each event time $k$ in the event window:\n",
+"\n",
+"- Create a dummy variable $D_k$ that equals 1 for treated units at event time $k$, and 0 otherwise\n",
+"- Omit one event time (the **reference period**, typically $k=-1$) to avoid perfect multicollinearity\n",
+"\n",
+"This means the summation $\\sum_{k \\neq k_0} \\beta_k \\cdot \\mathbf{1}\\{E_{it} = k\\}$ is equivalent to including dummy variables $D_{-5}, D_{-4}, ..., D_0, D_1, ...$ (excluding $D_{-1}$) in your regression.\n",
+"\n",
+"**Key insight:** The estimated regression coefficient $\\beta_k$ for each dummy variable represents the **Average Treatment Effect on the Treated (ATT)** at event time $k$, measured *relative to the reference period*. Since the reference period ($k=-1$) is omitted, its coefficient is implicitly zero, and all other coefficients show the difference from that baseline.\n",
 "\n",
 "**Interpretation:**\n",
 "- $\\beta_k$ for $k < 0$ (pre-treatment): Should be near zero if parallel trends hold\n",
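
The "relative to the reference period" reading of the coefficients can be checked on noise-free toy data. The sketch below uses plain NumPy least squares with unit and time dummies rather than the PyMC model from the notebook; the panel layout, the treatment time `G`, and the `true_dev` values are all hypothetical.

```python
import numpy as np

n_periods = 6                     # t = 0..5
G = 3                             # hypothetical treatment time
treated = [True, True, False, False]
reference = -1
event_times = [k for k in range(-3, 3) if k != reference]  # -3, -2, 0, 1, 2

# Hypothetical gap between treated and untreated outcomes at each event time.
# It is 0.5 at the reference period on purpose.
true_dev = {-3: 0.5, -2: 0.5, -1: 0.5, 0: 1.5, 1: 2.5, 2: 3.5}

rows, y = [], []
for i, is_treated in enumerate(treated):
    for t in range(n_periods):
        k = t - G
        unit_fe = np.eye(len(treated))[i]   # unit dummies (act as intercepts)
        time_fe = np.eye(n_periods)[t][1:]  # time dummies, first period dropped
        d = np.array([float(is_treated and k == e) for e in event_times])
        rows.append(np.concatenate([unit_fe, time_fe, d]))
        y.append(1.0 * i + 0.3 * t + (true_dev[k] if is_treated else 0.0))

X = np.array(rows)
coef, *_ = np.linalg.lstsq(X, np.array(y), rcond=None)

# The last len(event_times) coefficients are the event-time betas.
beta = dict(zip(event_times, coef[-len(event_times):]))
expected = {k: true_dev[k] - true_dev[reference] for k in event_times}
```

Each estimated `beta[k]` matches `true_dev[k] - true_dev[-1]` up to floating-point error: the 0.5 gap at the reference period is absorbed by the unit effects, and every other coefficient is a difference from that baseline.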
