Commit 56d4124 (parent: c741820)

small changes for readability and changing to log likelihood loss

File tree

2 files changed: +19 −16 lines


tmle.Rmd

Lines changed: 11 additions & 11 deletions
@@ -20,10 +20,10 @@ Now that we have covered
 - outcome models (e.g., G-computation) and
 - exposure models (e.g., propensity score models),
 
-let us talk about Doubly robust (DR) estimators. DR has several important properties:
+let us talk about doubly robust (DR) estimators. DR has several important properties:
 
-* They use information from
-  - both the exposure and
+* They use information from both
+  - the exposure and
   - the outcome models.
 * They provide a **consistent estimator** if either of the above mentioned models is correctly specified.
   - consistent estimator means as the sample size increases, distribution of the estimates gets concentrated near the true parameter
@@ -108,7 +108,7 @@ Y.fit.sl <- SuperLearner(Y=ObsData$Y.bounded,
                          SL.library=c("SL.glm",
                                       "SL.glmnet",
                                       "SL.xgboost"),
-                         method="method.NNLS",
+                         method="method.CC_nloglik",
                          family="gaussian")
 ```
 

@@ -139,15 +139,15 @@ summary(ObsData$Pred.Y1)
 
 - $Q^0(A=0,L)$ predictions:
 
-### Get initial treatment effect estimate
-
 ```{r SL_out02, cache=TRUE}
 ObsData.noY$A <- 0
 ObsData$Pred.Y0 <- predict(Y.fit.sl, newdata = ObsData.noY,
                            type = "response")$pred
 summary(ObsData$Pred.Y0)
 ```
 
+### Get initial treatment effect estimate
+
 ```{r SL_out03, cache=cachex, echo = TRUE}
 ObsData$Pred.TE <- ObsData$Pred.Y1 - ObsData$Pred.Y0
 ```
@@ -201,7 +201,7 @@ PS.fit.SL <- SuperLearner(Y=ObsData$A,
                           SL.library=c("SL.glm",
                                        "SL.glmnet",
                                        "SL.xgboost"),
-                          method="method.NNLS",
+                          method="method.CC_nloglik",
                           family="binomial")
 ```
 

@@ -269,11 +269,11 @@ Aggregated or individual clever covariate components show slight difference in t
 - a vector with 2 components $\hat\epsilon_0$ and $\hat\epsilon_1$.
 - It is estimated through MLE, using a model with an offset based on the initial estimate, and clever covariates as independent variables [@gruber2009targeted]:
 
-$E(Y=1|A,L)(\epsilon) = \frac{1}{1+\exp(-\log\frac{\bar Q^0(A,L)}{(1-\bar Q^0(A,L))}-\epsilon \times H(A,L))}$
+$E(Y|A,L)(\epsilon) = \frac{1}{1+\exp(-\log\frac{\bar Q^0(A,L)}{(1-\bar Q^0(A,L))}-\epsilon \times H(A,L))}$
 
 ### $\hat\epsilon$ = $\hat\epsilon_0$ and $\hat\epsilon_1$
 
-This is more close to how how `tmle` package has implement clever covariates
+This is closer to how the `tmle` package implements clever covariates
 
 ```{r eestimate, cache=TRUE, warning=FALSE}
 eps_mod <- glm(Y.bounded ~ -1 + H.A1L + H.A0L +
@@ -285,7 +285,7 @@ epsilon["H.A1L"]
 epsilon["H.A0L"]
 ```
 
-Note that, if `init.Pred` includes -ve values, `NaNs` would be produced after applying `qlogis()`.
+Note that, if `init.Pred` includes negative values, `NaNs` would be produced after applying `qlogis()`.
 
 ### Only 1 $\hat\epsilon$
 
@@ -330,7 +330,7 @@ summary(ObsData$Pred.Y1.update1)
 summary(ObsData$Pred.Y0.update1)
 ```
 
-Note that, if `Pred.Y1` and `Pred.Y0` include -ve values, `NaNs` would be produced after applying `qlogis()`.
+Note that, if `Pred.Y1` and `Pred.Y0` include negative values, `NaNs` would be produced after applying `qlogis()`.
 
 ```{r hestimate, cache=TRUE, warning=FALSE, include = FALSE}
 # # clever covariates
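The updating formula being corrected in this file is, in code, just `plogis(qlogis(Q0) + eps * H)`. The following standalone R sketch (all numbers hypothetical, not the chapter's data) illustrates both the fluctuation step and the `qlogis()` caveat noted above:

```r
# Targeting step of TMLE: fluctuate the initial (bounded) outcome
# predictions Q0 along the clever covariate H by epsilon.
# qlogis(p) = log(p / (1 - p)) is the logit; plogis() is its inverse.
Q0  <- c(0.30, 0.55, 0.80)  # hypothetical initial predictions in (0, 1)
H   <- c(1.2, -0.8, 2.0)    # hypothetical clever covariate values
eps <- 0.05                 # hypothetical fluctuation parameter

Q1 <- plogis(qlogis(Q0) + eps * H)  # updated predictions, still in (0, 1)

# epsilon = 0 leaves the initial estimate unchanged
stopifnot(isTRUE(all.equal(plogis(qlogis(Q0) + 0 * H), Q0)))

# the note about negative values: qlogis() returns NaN outside (0, 1),
# which is why predictions must be bounded before the update
stopifnot(is.nan(suppressWarnings(qlogis(-0.05))))
```

Since `eps * H` only shifts the predictions on the logit scale, the update can never push them outside (0, 1), which is the point of working on that scale.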

software.Rmd

Lines changed: 8 additions & 5 deletions
@@ -60,7 +60,7 @@ SL.library = c("SL.glm",
 ```
 
 
-```{r tmlepkg33, cache=cachex, results='hide', message=FALSE, warning=FALSE}
+```{r tmlepkg33, cache=cachex, message=FALSE, warning=FALSE}
 tmle.fit <- tmle::tmle(Y = ObsData$Y_transf,
                        A = ObsData$A,
                        W = ObsData.noYA,
@@ -72,15 +72,17 @@ tmle.fit
 ```
 
 
-```{r tmlepkgtr2, cache=cachex, results='hide', message=FALSE, warning=FALSE}
+```{r tmlepkgtr2, cache=cachex, message=FALSE, warning=FALSE}
 summary(tmle.fit)
 ```
 
 
-```{r tmlepkgtr, cache=cachex, results='hide', message=FALSE, warning=FALSE}
+```{r tmlepkgtr, cache=cachex, message=FALSE, warning=FALSE}
 tmle_est_tr <- tmle.fit$estimates$ATE$psi
 # transform back the ATE estimate
 tmle_est <- (max.Y-min.Y)*tmle_est_tr
+
+tmle_est
 ```
 
 ```{r, cache=TRUE, echo = TRUE}
@@ -107,7 +109,6 @@ Notes about the _tmle_ package:
 * does not scale the outcome for you
 * can give some error messages when dealing with variable types it is not expecting
 * practically all steps are nicely packed up in one function, very easy to use but need to dig a little to truly understand what it does
-* at first was not straightforward to figure out how to use with a continuous outcome and log-likelihood loss function as the difference between several parameters relating to variable type and loss function was unclear
 
 Most helpful resources:
 
@@ -188,7 +189,7 @@ sl_disc <- Lrnr_sl$new(
 
 The SuperLearner is then trained on the sl3 task we created at the start and then it can be used to make predictions.
 
-```{r sl305, cache=cachexy, results='hide', message=FALSE, warning=FALSE}
+```{r sl305, cache=cachexy, message=FALSE, warning=FALSE}
 set.seed(1444)
 
 # train SL
@@ -202,6 +203,8 @@ sl3_data$sl_preds <- sl_fit$predict()
 
 sl3_est <- mean(sl3_data$sl_preds[sl3_data$A == 1]) -
   mean(sl3_data$sl_preds[sl3_data$A == 0])
+
+sl3_est
 ```
 
 ```{r, cache=TRUE, echo = TRUE}
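The added `tmle_est` line prints the back-transformed ATE. The rescaling logic it relies on can be sketched in a few lines of R (the outcome range and the estimate below are hypothetical; only the variable names follow the diff):

```r
# tmle() expects an outcome bounded in [0, 1], so a continuous Y is
# first rescaled: Y_transf = (Y - min.Y) / (max.Y - min.Y).
min.Y <- 10; max.Y <- 60          # hypothetical outcome range
Y <- c(12, 35, 58)                # hypothetical continuous outcomes
Y_transf <- (Y - min.Y) / (max.Y - min.Y)

# an ATE estimated on the transformed scale is shrunk by the factor
# (max.Y - min.Y); multiplying back recovers the original scale
tmle_est_tr <- 0.1                # hypothetical ATE on the [0, 1] scale
tmle_est <- (max.Y - min.Y) * tmle_est_tr
tmle_est                          # 5 on the original outcome scale
```

Because the transformation is linear, only the multiplicative factor matters for a difference of means; the `min.Y` shift cancels out.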
