|
27 | 27 | }, |
28 | 28 | { |
29 | 29 | "cell_type": "code", |
30 | | - "execution_count": 1, |
| 30 | + "execution_count": 15, |
31 | 31 | "metadata": {}, |
32 | 32 | "outputs": [], |
33 | 33 | "source": [ |
|
51 | 51 | }, |
52 | 52 | { |
53 | 53 | "cell_type": "code", |
54 | | - "execution_count": 2, |
| 54 | + "execution_count": 16, |
55 | 55 | "metadata": {}, |
56 | 56 | "outputs": [], |
57 | 57 | "source": [ |
|
125 | 125 | }, |
126 | 126 | { |
127 | 127 | "cell_type": "code", |
128 | | - "execution_count": 64, |
| 128 | + "execution_count": 17, |
129 | 129 | "metadata": {}, |
130 | 130 | "outputs": [], |
131 | 131 | "source": [ |
|
183 | 183 | }, |
184 | 184 | { |
185 | 185 | "cell_type": "code", |
186 | | - "execution_count": 3, |
| 186 | + "execution_count": 18, |
187 | 187 | "metadata": {}, |
188 | 188 | "outputs": [], |
189 | 189 | "source": [ |
|
193 | 193 | }, |
194 | 194 | { |
195 | 195 | "cell_type": "code", |
196 | | - "execution_count": null, |
197 | | - "metadata": {}, |
198 | | - "outputs": [], |
199 | | - "source": [] |
200 | | - }, |
201 | | - { |
202 | | - "cell_type": "code", |
203 | | - "execution_count": 4, |
| 196 | + "execution_count": 19, |
204 | 197 | "metadata": {}, |
205 | 198 | "outputs": [], |
206 | 199 | "source": [ |
|
230 | 223 | }, |
231 | 224 | { |
232 | 225 | "cell_type": "code", |
233 | | - "execution_count": 5, |
| 226 | + "execution_count": 20, |
234 | 227 | "metadata": {}, |
235 | 228 | "outputs": [], |
236 | 229 | "source": [ |
|
278 | 271 | }, |
279 | 272 | { |
280 | 273 | "cell_type": "code", |
281 | | - "execution_count": 6, |
| 274 | + "execution_count": 21, |
282 | 275 | "metadata": {}, |
283 | 276 | "outputs": [ |
284 | 277 | { |
|
360 | 353 | }, |
361 | 354 | { |
362 | 355 | "cell_type": "code", |
363 | | - "execution_count": 7, |
| 356 | + "execution_count": 22, |
364 | 357 | "metadata": {}, |
365 | 358 | "outputs": [], |
366 | 359 | "source": [ |
|
387 | 380 | }, |
388 | 381 | { |
389 | 382 | "cell_type": "code", |
390 | | - "execution_count": null, |
| 383 | + "execution_count": 23, |
391 | 384 | "metadata": {}, |
392 | 385 | "outputs": [ |
393 | 386 | { |
|
472 | 465 | }, |
473 | 466 | { |
474 | 467 | "cell_type": "code", |
475 | | - "execution_count": 10, |
| 468 | + "execution_count": 24, |
476 | 469 | "metadata": {}, |
477 | 470 | "outputs": [], |
478 | 471 | "source": [ |
|
495 | 488 | }, |
496 | 489 | { |
497 | 490 | "cell_type": "code", |
498 | | - "execution_count": 13, |
| 491 | + "execution_count": 25, |
499 | 492 | "metadata": {}, |
500 | 493 | "outputs": [ |
501 | 494 | { |
502 | 495 | "name": "stdout", |
503 | 496 | "output_type": "stream", |
504 | 497 | "text": [ |
505 | | - "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.001017 seconds.\n", |
506 | | - "You can set `force_col_wise=true` to remove the overhead.\n", |
| 498 | + "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000162 seconds.\n", |
| 499 | + "You can set `force_row_wise=true` to remove the overhead.\n", |
| 500 | + "And if memory is not enough, you can set `force_col_wise=true`.\n", |
507 | 501 | "[LightGBM] [Info] Total Bins 996\n", |
508 | 502 | "[LightGBM] [Info] Number of data points in the train set: 15000, number of used features: 5\n", |
509 | 503 | "[LightGBM] [Info] Start training from score 0.200467\n", |
|
607 | 601 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
608 | 602 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
609 | 603 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
610 | | - "[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000367 seconds.\n", |
611 | | - "You can set `force_row_wise=true` to remove the overhead.\n", |
612 | | - "And if memory is not enough, you can set `force_col_wise=true`.\n", |
| 604 | + "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000233 seconds.\n", |
| 605 | + "You can set `force_col_wise=true` to remove the overhead.\n", |
613 | 606 | "[LightGBM] [Info] Total Bins 958\n", |
614 | 607 | "[LightGBM] [Info] Number of data points in the train set: 3370, number of used features: 4\n", |
615 | 608 | "[LightGBM] [Info] Start training from score 0.237982\n", |
|
713 | 706 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
714 | 707 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
715 | 708 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
716 | | - "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000278 seconds.\n", |
| 709 | + "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000189 seconds.\n", |
717 | 710 | "You can set `force_col_wise=true` to remove the overhead.\n", |
718 | 711 | "[LightGBM] [Info] Total Bins 988\n", |
719 | 712 | "[LightGBM] [Info] Number of data points in the train set: 11630, number of used features: 4\n", |
|
818 | 811 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
819 | 812 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
820 | 813 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
821 | | - "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000223 seconds.\n", |
| 814 | + "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000328 seconds.\n", |
822 | 815 | "You can set `force_col_wise=true` to remove the overhead.\n", |
823 | 816 | "[LightGBM] [Info] Total Bins 958\n", |
824 | 817 | "[LightGBM] [Info] Number of data points in the train set: 3370, number of used features: 4\n", |
|
923 | 916 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
924 | 917 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
925 | 918 | "[LightGBM] [Warning] No further splits with positive gain, best gain: -inf\n", |
926 | | - "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000361 seconds.\n", |
| 919 | + "[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000192 seconds.\n", |
927 | 920 | "You can set `force_col_wise=true` to remove the overhead.\n", |
928 | 921 | "[LightGBM] [Info] Total Bins 988\n", |
929 | 922 | "[LightGBM] [Info] Number of data points in the train set: 11630, number of used features: 4\n", |
|
1088 | 1081 | }, |
1089 | 1082 | { |
1090 | 1083 | "cell_type": "code", |
1091 | | - "execution_count": 14, |
| 1084 | + "execution_count": 26, |
1092 | 1085 | "metadata": {}, |
1093 | 1086 | "outputs": [ |
1094 | 1087 | { |
|
1157 | 1150 | "cell_type": "markdown", |
1158 | 1151 | "metadata": {}, |
1159 | 1152 | "source": [ |
1160 | | - "All models (OLS, S, T, X) performed far better than the baseline (ATE), and among them the S-Learner (LGBM) was marginally the best (AUUC: 0.0353). Unlike the ice cream data, the heterogeneity of the CATE in the email data is likely mostly linear or very simple. That OLS (AUUC: 0.0336) performed nearly on par with the other meta-learners means there was little additional heterogeneity for a complex nonlinear model to capture. This reconfirms that OLS is a strong baseline for CATE estimation." |
| 1153 | + "All models (OLS, S, T, X) performed far better than the baseline (ATE), and among them the X-Learner (LGBM) was marginally the best (AUUC: 0.0394). Unlike the ice cream data, the heterogeneity of the CATE in the email data is likely mostly linear or very simple. That OLS (AUUC: 0.0336) performed nearly on par with the other meta-learners means there was little additional heterogeneity for a complex nonlinear model to capture. This reconfirms that OLS is a strong baseline for CATE estimation." |
1161 | 1154 | ] |
1162 | 1155 | }, |
1163 | 1156 | { |
|
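The updated markdown cell above ranks the meta-learners by AUUC (area under the uplift curve). As a minimal, self-contained sketch of how such a score can be computed, the snippet below ranks units by a predicted CATE and accumulates the treated-minus-control outcome gap at each cutoff. This is an illustration, not the notebook's actual code: the `auuc` function, the synthetic data, and the n² normalization are all assumptions, and normalization conventions differ between uplift libraries.

```python
import numpy as np

def auuc(y, t, scores):
    """Approximate area under the uplift curve (illustrative convention).

    y: observed outcomes, t: 0/1 treatment flags, scores: predicted CATE.
    For each cutoff k (top-k units ranked by score), the curve value is
    k * (mean treated outcome - mean control outcome) within the top k.
    Dividing by n**2 makes the value comparable across sample sizes;
    libraries may normalize differently.
    """
    order = np.argsort(-np.asarray(scores))
    y = np.asarray(y, dtype=float)[order]
    t = np.asarray(t, dtype=float)[order]
    n = len(y)
    k = np.arange(1, n + 1)
    cum_t = np.cumsum(t)       # treated units among the top-k
    cum_c = k - cum_t          # control units among the top-k
    with np.errstate(divide="ignore", invalid="ignore"):
        mean_t = np.where(cum_t > 0, np.cumsum(y * t) / cum_t, 0.0)
        mean_c = np.where(cum_c > 0, np.cumsum(y * (1 - t)) / cum_c, 0.0)
    return float(np.sum((mean_t - mean_c) * k) / n**2)

# Synthetic check: ranking by the true effect should beat a noise ranking.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
t = rng.integers(0, 2, size=n)
tau = (x > 0).astype(float)                       # true CATE: 1 if x > 0
y = 0.5 * x + tau * t + rng.normal(scale=0.1, size=n)
print(auuc(y, t, tau) > auuc(y, t, rng.normal(size=n)))  # True
```

A model-agnostic metric like this is what lets OLS, S-, T-, and X-Learners be compared on one scale, as the cell's discussion does: whichever scorer sorts high-uplift customers to the front earns the larger area.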