|
27 | 27 | }, |
28 | 28 | { |
29 | 29 | "cell_type": "code", |
30 | | - "execution_count": 1, |
| 30 | + "execution_count": 15, |
31 | 31 | "metadata": {}, |
32 | 32 | "outputs": [], |
33 | 33 | "source": [ |
|
51 | 51 | }, |
52 | 52 | { |
53 | 53 | "cell_type": "code", |
54 | | - "execution_count": 2, |
| 54 | + "execution_count": 16, |
55 | 55 | "metadata": {}, |
56 | 56 | "outputs": [], |
57 | 57 | "source": [ |
|
125 | 125 | }, |
126 | 126 | { |
127 | 127 | "cell_type": "code", |
128 | | - "execution_count": 64, |
| 128 | + "execution_count": 17, |
129 | 129 | "metadata": {}, |
130 | 130 | "outputs": [], |
131 | 131 | "source": [ |
|
183 | 183 | }, |
184 | 184 | { |
185 | 185 | "cell_type": "code", |
186 | | - "execution_count": 3, |
| 186 | + "execution_count": 18, |
187 | 187 | "metadata": {}, |
188 | 188 | "outputs": [], |
189 | 189 | "source": [ |
|
193 | 193 | }, |
194 | 194 | { |
195 | 195 | "cell_type": "code", |
196 | | - "execution_count": null, |
197 | | - "metadata": {}, |
198 | | - "outputs": [], |
199 | | - "source": [] |
200 | | - }, |
201 | | - { |
202 | | - "cell_type": "code", |
203 | | - "execution_count": 4, |
| 196 | + "execution_count": 19, |
204 | 197 | "metadata": {}, |
205 | 198 | "outputs": [], |
206 | 199 | "source": [ |
|
230 | 223 | }, |
231 | 224 | { |
232 | 225 | "cell_type": "code", |
233 | | - "execution_count": 5, |
| 226 | + "execution_count": 20, |
234 | 227 | "metadata": {}, |
235 | 228 | "outputs": [], |
236 | 229 | "source": [ |
|
278 | 271 | }, |
279 | 272 | { |
280 | 273 | "cell_type": "code", |
281 | | - "execution_count": 6, |
| 274 | + "execution_count": 21, |
282 | 275 | "metadata": {}, |
283 | 276 | "outputs": [ |
284 | 277 | { |
|
360 | 353 | }, |
361 | 354 | { |
362 | 355 | "cell_type": "code", |
363 | | - "execution_count": 7, |
| 356 | + "execution_count": 22, |
364 | 357 | "metadata": {}, |
365 | 358 | "outputs": [], |
366 | 359 | "source": [ |
|
387 | 380 | }, |
388 | 381 | { |
389 | 382 | "cell_type": "code", |
390 | | - "execution_count": null, |
| 383 | + "execution_count": 23, |
391 | 384 | "metadata": {}, |
392 | 385 | "outputs": [ |
393 | 386 | { |
|
472 | 465 | }, |
473 | 466 | { |
474 | 467 | "cell_type": "code", |
475 | | - "execution_count": 10, |
| 468 | + "execution_count": 24, |
476 | 469 | "metadata": {}, |
477 | 470 | "outputs": [], |
478 | 471 | "source": [ |
|
495 | 488 | }, |
496 | 489 | { |
497 | 490 | "cell_type": "code", |
498 | | - "execution_count": 13, |
| 491 | + "execution_count": 25, |
499 | 492 | "metadata": {}, |
500 | 493 | "outputs": [ |
501 | 494 | { |
502 | 495 | "name": "stdout", |
503 | 496 | "output_type": "stream", |
504 | 497 | "text": [ |
505 | | - "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.001017 seconds.\n", |
506 | | - "You can set `force_col_wise=true` to remove the overhead.\n", |
| 498 | + "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000162 seconds.\n", |
| 499 | + "You can set `force_row_wise=true` to remove the overhead.\n", |
| 500 | + "And if memory is not enough, you can set `force_col_wise=true`.\n", |
507 | 501 | "[LightGBM] [Info] Total Bins 996\n", |
508 | 502 | "[LightGBM] [Info] Number of data points in the train set: 15000, number of used features: 5\n", |
509 | 503 | "[LightGBM] [Info] Start training from score 0.200467\n", |
|
607 | 601 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
608 | 602 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
609 | 603 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
610 | | - "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000367 seconds.\n", |
611 | | - "You can set `force_row_wise=true` to remove the overhead.\n", |
612 | | - "And if memory is not enough, you can set `force_col_wise=true`.\n", |
| 604 | + "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000233 seconds.\n", |
| 605 | + "You can set `force_col_wise=true` to remove the overhead.\n", |
613 | 606 | "[LightGBM] [Info] Total Bins 958\n", |
614 | 607 | "[LightGBM] [Info] Number of data points in the train set: 3370, number of used features: 4\n", |
615 | 608 | "[LightGBM] [Info] Start training from score 0.237982\n", |
|
713 | 706 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
714 | 707 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
715 | 708 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
716 | | - "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000278 seconds.\n", |
| 709 | + "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000189 seconds.\n", |
717 | 710 | "You can set `force_col_wise=true` to remove the overhead.\n", |
718 | 711 | "[LightGBM] [Info] Total Bins 988\n", |
719 | 712 | "[LightGBM] [Info] Number of data points in the train set: 11630, number of used features: 4\n", |
|
818 | 811 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
819 | 812 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
820 | 813 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
821 | | - "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000223 seconds.\n", |
| 814 | + "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000328 seconds.\n", |
822 | 815 | "You can set `force_col_wise=true` to remove the overhead.\n", |
823 | 816 | "[LightGBM] [Info] Total Bins 958\n", |
824 | 817 | "[LightGBM] [Info] Number of data points in the train set: 3370, number of used features: 4\n", |
|
923 | 916 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
924 | 917 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
925 | 918 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
926 | | - "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000361 seconds.\n", |
| 919 | + "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000192 seconds.\n", |
927 | 920 | "You can set `force_col_wise=true` to remove the overhead.\n", |
928 | 921 | "[LightGBM] [Info] Total Bins 988\n", |
929 | 922 | "[LightGBM] [Info] Number of data points in the train set: 11630, number of used features: 4\n", |
|
1088 | 1081 | }, |
1089 | 1082 | { |
1090 | 1083 | "cell_type": "code", |
1091 | | - "execution_count": 14, |
| 1084 | + "execution_count": 26, |
1092 | 1085 | "metadata": {}, |
1093 | 1086 | "outputs": [ |
1094 | 1087 | { |
|
1157 | 1150 | "cell_type": "markdown", |
1158 | 1151 | "metadata": {}, |
1159 | 1152 | "source": [ |
1160 | | - "All models (OLS, S, T, X) performed far better than the baseline (ATE), and among them the S-Learner (LGBM) was marginally the best (AUUC: 0.0353). Unlike the ice cream data, the heterogeneity of the CATE in the email data is likely mostly linear or very simple. That OLS (AUUC: 0.0336) performed nearly on par with the other meta-learners means there was little additional heterogeneity for a complex nonlinear model to capture. This reconfirms that OLS is a strong baseline for CATE estimation." |
| 1153 | + "All models (OLS, S, T, X) performed far better than the baseline (ATE), and among them the X-Learner (LGBM) was marginally the best (AUUC: 0.0394). Unlike the ice cream data, the heterogeneity of the CATE in the email data is likely mostly linear or very simple. That OLS (AUUC: 0.0336) performed nearly on par with the other meta-learners means there was little additional heterogeneity for a complex nonlinear model to capture. This reconfirms that OLS is a strong baseline for CATE estimation." |
1161 | 1154 | ] |
1162 | 1155 | }, |
1163 | 1156 | { |
|
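The updated markdown cell above ranks the meta-learners by AUUC (area under the uplift curve). As a minimal, self-contained sketch of how such a score can be computed, the snippet below ranks units by a predicted CATE and accumulates the treated-minus-control outcome gap at each cutoff. This is an illustration, not the notebook's actual code: the `auuc` function, the synthetic data, and the n² normalization are all assumptions, and normalization conventions differ between uplift libraries.

```python
import numpy as np

def auuc(y, t, scores):
    """Approximate area under the uplift curve (illustrative convention).

    y: observed outcomes, t: 0/1 treatment flags, scores: predicted CATE.
    For each cutoff k (top-k units ranked by score), the curve value is
    k * (mean treated outcome - mean control outcome) within the top k.
    Dividing by n**2 makes the value comparable across sample sizes;
    libraries may normalize differently.
    """
    order = np.argsort(-np.asarray(scores))
    y = np.asarray(y, dtype=float)[order]
    t = np.asarray(t, dtype=float)[order]
    n = len(y)
    k = np.arange(1, n + 1)
    cum_t = np.cumsum(t)       # treated units among the top-k
    cum_c = k - cum_t          # control units among the top-k
    with np.errstate(divide="ignore", invalid="ignore"):
        mean_t = np.where(cum_t > 0, np.cumsum(y * t) / cum_t, 0.0)
        mean_c = np.where(cum_c > 0, np.cumsum(y * (1 - t)) / cum_c, 0.0)
    return float(np.sum((mean_t - mean_c) * k) / n**2)

# Synthetic check: ranking by the true effect should beat a noise ranking.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
t = rng.integers(0, 2, size=n)
tau = (x > 0).astype(float)                       # true CATE: 1 if x > 0
y = 0.5 * x + tau * t + rng.normal(scale=0.1, size=n)
print(auuc(y, t, tau) > auuc(y, t, rng.normal(size=n)))  # True
```

A model-agnostic metric like this is what lets OLS, S-, T-, and X-Learners be compared on one scale, as the cell's discussion does: whichever scorer sorts high-uplift customers to the front earns the larger area.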