Commit 8ff274b

Implement IV 2SLS basic code
1 parent f97fe53 commit 8ff274b

1 file changed

+212
-1
lines changed

book/ate/iv.ipynb

Lines changed: 212 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -16,6 +16,217 @@
16 16
"- Weak IV, DML applications, etc. (the causal-ml book presents a variety of scenarios; cover the practical material as much as possible)"
17 17
]
18 18
},
19+
{
20+
"cell_type": "code",
21+
"execution_count": null,
22+
"metadata": {},
23+
"outputs": [],
24+
"source": [
25+
"%pip install linearmodels"
26+
]
27+
},
28+
{
29+
"cell_type": "code",
30+
"execution_count": 3,
31+
"metadata": {},
32+
"outputs": [],
33+
"source": [
34+
"import pandas as pd\n",
35+
"import numpy as np\n",
36+
"from linearmodels.iv import IV2SLS"
37+
]
38+
},
39+
{
40+
"cell_type": "markdown",
41+
"metadata": {},
42+
"source": [
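43+
"The association between push delivered (whether the push message was actually delivered) and in-app purchases cannot be read as a causal effect, because income acts as a confounder: wealthier customers tend to own newer smartphones, so they are more likely to receive push messages, and they also spend more in-app.\n",
44+
"\n",
45+
"Using an IV requires the exclusion restriction. It cannot be verified quantitatively, but in this case push assigned (push assignment) is randomized and has no other channel to purchases, so the exclusion restriction is easy to argue. In other words, Push Assigned affects purchases only through Push Delivered."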
46+
]
47+
},
48+
{
49+
"cell_type": "code",
50+
"execution_count": 6,
51+
"metadata": {},
52+
"outputs": [
53+
{
54+
"data": {
55+
"text/html": [
56+
"<div>\n",
57+
"<style scoped>\n",
58+
" .dataframe tbody tr th:only-of-type {\n",
59+
" vertical-align: middle;\n",
60+
" }\n",
61+
"\n",
62+
" .dataframe tbody tr th {\n",
63+
" vertical-align: top;\n",
64+
" }\n",
65+
"\n",
66+
" .dataframe thead th {\n",
67+
" text-align: right;\n",
68+
" }\n",
69+
"</style>\n",
70+
"<table border=\"1\" class=\"dataframe\">\n",
71+
" <thead>\n",
72+
" <tr style=\"text-align: right;\">\n",
73+
" <th></th>\n",
74+
" <th>in_app_purchase</th>\n",
75+
" <th>push_assigned</th>\n",
76+
" <th>push_delivered</th>\n",
77+
" </tr>\n",
78+
" </thead>\n",
79+
" <tbody>\n",
80+
" <tr>\n",
81+
" <th>0</th>\n",
82+
" <td>47</td>\n",
83+
" <td>1</td>\n",
84+
" <td>1</td>\n",
85+
" </tr>\n",
86+
" <tr>\n",
87+
" <th>1</th>\n",
88+
" <td>43</td>\n",
89+
" <td>1</td>\n",
90+
" <td>0</td>\n",
91+
" </tr>\n",
92+
" <tr>\n",
93+
" <th>2</th>\n",
94+
" <td>51</td>\n",
95+
" <td>1</td>\n",
96+
" <td>1</td>\n",
97+
" </tr>\n",
98+
" <tr>\n",
99+
" <th>3</th>\n",
100+
" <td>49</td>\n",
101+
" <td>0</td>\n",
102+
" <td>0</td>\n",
103+
" </tr>\n",
104+
" <tr>\n",
105+
" <th>4</th>\n",
106+
" <td>79</td>\n",
107+
" <td>0</td>\n",
108+
" <td>0</td>\n",
109+
" </tr>\n",
110+
" </tbody>\n",
111+
"</table>\n",
112+
"</div>"
113+
],
114+
"text/plain": [
115+
" in_app_purchase push_assigned push_delivered\n",
116+
"0 47 1 1\n",
117+
"1 43 1 0\n",
118+
"2 51 1 1\n",
119+
"3 49 0 0\n",
120+
"4 79 0 0"
121+
]
122+
},
123+
"execution_count": 6,
124+
"metadata": {},
125+
"output_type": "execute_result"
126+
}
127+
],
128+
"source": [
129+
"data = pd.read_csv(\"../data/matheus_data/app_engagement_push.csv\")\n",
130+
"data.head()"
131+
]
132+
},
133+
{
134+
"cell_type": "markdown",
135+
"metadata": {},
136+
"source": [
137+
"### 1st Stage (Relevance)"
138+
]
139+
},
140+
{
141+
"cell_type": "code",
142+
"execution_count": 14,
143+
"metadata": {},
144+
"outputs": [
145+
{
146+
"name": "stdout",
147+
"output_type": "stream",
148+
"text": [
149+
"1st Stage: Push Assignment -> Push Delivered\n",
150+
" Parameter Estimates \n",
151+
"=================================================================================\n",
152+
" Parameter Std. Err. T-stat P-value Lower CI Upper CI\n",
153+
"---------------------------------------------------------------------------------\n",
154+
"Intercept 2.22e-16 \n",
155+
"push_assigned 0.7176 0.0064 112.07 0.0000 0.7050 0.7301\n",
156+
"=================================================================================\n",
157+
"Compliance rate: 71.76%\n"
158+
]
159+
},
160+
{
161+
"name": "stderr",
162+
"output_type": "stream",
163+
"text": [
164+
"/Users/sanakang/anaconda3/lib/python3.11/site-packages/linearmodels/iv/results.py:198: RuntimeWarning: invalid value encountered in sqrt\n",
165+
" std_errors = sqrt(diag(self.cov))\n"
166+
]
167+
}
168+
],
169+
"source": [
170+
"print(\"1st Stage: Push Assignment -> Push Delivered\")\n",
171+
"first_stage = IV2SLS.from_formula(\"push_delivered ~ 1 + push_assigned\", data).fit()\n",
172+
"print(first_stage.summary.tables[1])\n",
173+
"print(f\"Compliance rate: {first_stage.params['push_assigned']:.2%}\")"
174+
]
175+
},
176+
{
177+
"cell_type": "markdown",
178+
"metadata": {},
179+
"source": [
180+
"### 2SLS Estimation: LATE"
181+
]
182+
},
183+
{
184+
"cell_type": "code",
185+
"execution_count": 17,
186+
"metadata": {},
187+
"outputs": [
188+
{
189+
"name": "stdout",
190+
"output_type": "stream",
191+
"text": [
192+
" Parameter Estimates \n",
193+
"==================================================================================\n",
194+
" Parameter Std. Err. T-stat P-value Lower CI Upper CI\n",
195+
"----------------------------------------------------------------------------------\n",
196+
"Intercept 69.292 0.3624 191.22 0.0000 68.581 70.002\n",
197+
"push_delivered 3.2938 0.7165 4.5974 0.0000 1.8896 4.6981\n",
198+
"==================================================================================\n"
199+
]
200+
}
201+
],
202+
"source": [
203+
"iv_model = IV2SLS.from_formula(\"in_app_purchase ~ 1 + [push_delivered ~ push_assigned]\", data).fit()\n",
204+
"print(iv_model.summary.tables[1])"
205+
]
206+
},
207+
{
208+
"cell_type": "code",
209+
"execution_count": 18,
210+
"metadata": {},
211+
"outputs": [
212+
{
213+
"name": "stdout",
214+
"output_type": "stream",
215+
"text": [
216+
"LATE 추정치: 3.294\n",
217+
"95% 신뢰구간: [1.890, 4.698]\n"
218+
]
219+
}
220+
],
221+
"source": [
222+
"late_estimate = iv_model.params['push_delivered']\n",
223+
"ci_lower = late_estimate - 1.96 * iv_model.std_errors['push_delivered'] \n",
224+
"ci_upper = late_estimate + 1.96 * iv_model.std_errors['push_delivered']\n",
225+
"\n",
226+
"print(f\"LATE 추정치: {late_estimate:.3f}\")\n",
227+
"print(f\"95% 신뢰구간: [{ci_lower:.3f}, {ci_upper:.3f}]\")"
228+
]
229+
},
19 230
{
20 231
"cell_type": "markdown",
21 232
"metadata": {},
@@ -47,7 +258,7 @@
47 258
"name": "python",
48 259
"nbconvert_exporter": "python",
49 260
"pygments_lexer": "ipython3",
50-
"version": "3.12.7"
261+
"version": "3.11.4"
51 262
}
52 263
},
53 264
"nbformat": 4,

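With a single binary instrument and a single binary treatment, the 2SLS coefficient estimated in this notebook is numerically the Wald ratio: the effect of assignment on purchases (reduced form) divided by the effect of assignment on delivery (first stage, the compliance rate). The sketch below illustrates this on synthetic data using only the standard library. The data-generating process, the numbers in it, and the helpers `wald_late` and `simulate` are hypothetical and are not taken from the notebook or its CSV; only the roles of the variables mirror `in_app_purchase`, `push_delivered`, and `push_assigned`.

```python
import random
from statistics import mean


def wald_late(y, d, z):
    """Return (LATE, ITT, compliance) for a binary instrument z."""
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d1 = [di for di, zi in zip(d, z) if zi == 1]
    d0 = [di for di, zi in zip(d, z) if zi == 0]
    itt = mean(y1) - mean(y0)          # reduced form: assignment -> purchases
    compliance = mean(d1) - mean(d0)   # first stage: assignment -> delivery
    return itt / compliance, itt, compliance


def simulate(n=20000, true_effect=3.0, seed=0):
    """Hypothetical data: income confounds delivery and purchases."""
    rng = random.Random(seed)
    y, d, z = [], [], []
    for _ in range(n):
        income = rng.gauss(0.0, 1.0)   # unobserved confounder
        assigned = rng.randint(0, 1)   # randomized instrument
        # Assumption: richer users are more likely to receive an assigned push.
        p_receive = 0.8 if income > 0 else 0.6
        delivered = 1 if (assigned == 1 and rng.random() < p_receive) else 0
        # Assignment affects purchases only through delivery (exclusion restriction).
        purchase = 50.0 + 5.0 * income + true_effect * delivered + rng.gauss(0.0, 2.0)
        y.append(purchase)
        d.append(delivered)
        z.append(assigned)
    return y, d, z


y, d, z = simulate()
late, itt, compliance = wald_late(y, d, z)

# Naive comparison of delivered vs. not delivered, which is confounded by income.
naive = mean(yi for yi, di in zip(y, d) if di == 1) - mean(
    yi for yi, di in zip(y, d) if di == 0
)

print(f"compliance rate: {compliance:.2%}")
print(f"naive difference: {naive:.3f}")
print(f"Wald / LATE estimate: {late:.3f}")
```

In this simulated setup the naive difference should sit above the true effect of 3.0 because delivery is correlated with income, while the Wald ratio should land near it. On the notebook's data, the same ratio would be expected to match the `push_delivered` coefficient reported by `IV2SLS`.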