
Commit 4afe596

formatting
1 parent 4478531 commit 4afe596

23 files changed: +681 -3392 lines changed

1RHC.Rmd

Lines changed: 6 additions & 2 deletions
@@ -188,7 +188,7 @@ Table 5 in @connors1996effectiveness showed that, after propensity score pair (1
 We also conduct propensity score pair matching analysis, as follows.

 ```{block, type='rmdcomment'}
-**Note**: In this workshop, we will not cover Propensity Score Matching (PSM) in this workshop. If you want to learn more about this, feel free to check out this other workshop: [Understanding Propensity Score Matching](https://ehsanx.github.io/psw/).
+**Note**: In this workshop, we will not cover Propensity Score Matching (PSM). If you want to learn more about this, feel free to check out this other workshop: [Understanding Propensity Score Matching](https://ehsanx.github.io/psw/) and the [video recording](https://www.youtube.com/watch?v=u4Nl7gnDEAY) on youtube.
 ```

 ```{r ps16854, cache=TRUE, echo = TRUE}
@@ -236,6 +236,8 @@ The love plot suggests satisfactory propensity score matching (all SMD < 0.1).

 #### PSM results

+##### p-value
+
 ```{r ps3, cache=TRUE, echo = TRUE}
 matched.data <- match.data(match.obj)
 tab1y <- CreateTableOne(vars = c("Y"),
@@ -246,9 +248,11 @@ print(tab1y, showAllLevels = FALSE,
 ```

 ```{block, type='rmdcomment'}
-Hence, our conclusion based on propensity score pair matched data ($p \lt 0.001$) is different than Table 5 in @connors1996effectiveness ($p = 0.14$). Variability in results for 1-to-1 matching is possible, and modelling choices may be different (we used caliper option here).
+Our conclusion based on propensity score pair matched data ($p \lt 0.001$) is different than Table 5 in @connors1996effectiveness ($p = 0.14$). Variability in results for 1-to-1 matching is possible, and modelling choices may be different (we used the caliper option here).
 ```

+##### Treatment effect
+
 - We can also estimate the effect of `RHC` on `length of stay` using the propensity score-matched sample:

 ```{r ps12ryy, cache=TRUE, echo = TRUE}
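
The chunks in this hunk use `match.obj` and mention the `caliper` option without showing the matching call itself. For reference, a minimal sketch of how such an object is typically built with `MatchIt`, assuming the workshop's `ObsData` and `ps.formula` objects; the caliper value here is illustrative, not taken from the book.

```r
# Illustrative sketch only (not part of this commit): 1-to-1 nearest-neighbour
# propensity score matching with a caliper. `ObsData` and `ps.formula` are assumed
# to be the workshop's analytic data and exposure-model formula.
library(MatchIt)
match.obj <- matchit(ps.formula, data = ObsData,
                     method   = "nearest",  # greedy 1-to-1 pair matching
                     distance = "glm",      # logistic-regression propensity score
                     caliper  = 0.2,        # max distance, in SDs of the propensity score
                     ratio    = 1)
matched.data <- match.data(match.obj)       # matched sample used by CreateTableOne()
```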

2gcomp.Rmd

Lines changed: 5 additions & 0 deletions
@@ -345,6 +345,11 @@ out.formula <- as.formula(paste("Y~ A +",
                 collapse = "+")))
 fit1 <- lm(out.formula, data = ObsData)
 ```
+
+```{block, type='rmdcomment'}
+$Q(A,L)$ is often used to represent the predictions from the G-comp model.
+```
+
 #### Step 2

 Extract outcome prediction for treated $\hat{Y}_{A=1}$ by setting all $A=1$
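
The new comment introduces $Q(A,L)$ as the outcome-model predictions. A minimal sketch of how those predictions feed the remaining G-computation steps, assuming `fit1` and `ObsData` from the chunk above; the object names added below are illustrative.

```r
# Illustrative sketch (not part of this commit): using the Q(A,L) model `fit1`
# to get counterfactual predictions and the G-computation treatment-effect estimate.
data.A1 <- ObsData
data.A1$A <- 1                                          # counterfactual: everyone treated
data.A0 <- ObsData
data.A0$A <- 0                                          # counterfactual: everyone untreated
ObsData$Pred.Y1 <- predict(fit1, newdata = data.A1)     # \hat{Y}_{A=1}
ObsData$Pred.Y0 <- predict(fit1, newdata = data.A0)     # \hat{Y}_{A=0}
mean(ObsData$Pred.Y1 - ObsData$Pred.Y0)                 # G-computation estimate of the ATE
```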

2gcomp2.Rmd

Lines changed: 5 additions & 2 deletions
@@ -289,7 +289,7 @@ Notice that the mean is very similar to the parametric G-computation method.
 ## G-comp using SuperLearner

 ```{block, type='rmdcomment'}
-SuperLearner is an ensemble MLtechnique, that uses **cross-validation** to find a weighted combination of estimates provided by different **candidate learners** (that help predict).
+SuperLearner is an ensemble ML technique that uses **cross-validation** to find a weighted combination of estimates provided by different **candidate learners** (that help predict).
 ```

 - There exist many candidate learners. Here we are using a combination of
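
As a companion to this comment, a minimal sketch of what a SuperLearner outcome-model fit can look like. Here `covariates` is an assumed name for the confounder columns, and the candidate-learner library shown is illustrative rather than the book's exact list.

```r
# Illustrative sketch (not from this commit): a SuperLearner fit for the outcome model.
# `covariates` is an assumed character vector of confounder column names in ObsData.
library(SuperLearner)
fit.sl <- SuperLearner(Y = ObsData$Y,
                       X = ObsData[ , c("A", covariates)],   # exposure + confounders
                       family = gaussian(),                   # continuous outcome (length of stay)
                       SL.library = c("SL.glm", "SL.glmnet", "SL.randomForest"),
                       cvControl = list(V = 10))              # 10-fold cross-validation
fit.sl$coef          # weight given to each candidate learner
```
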
@@ -446,13 +446,16 @@ scaled.coefs <- abs(coefs)/sum(abs(coefs))
 scaled.coefs
 ```

-
 Scaled coefs

 ```{r ML12stestcoef, cache=cachex, echo = TRUE}
 fit.sl$coef
 ```

+```{r ML12stestcoef2b, cache=cachex, echo = TRUE}
+sum(fit.sl$coef)
+```
+
 Hence, in creating superlearner prediction column,

 a. Linear regression has no contribution
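
The new chunk checks `sum(fit.sl$coef)`. A small sketch of how those weights relate to the ensemble prediction, assuming `fit.sl` is the fitted SuperLearner object from this chapter.

```r
# Illustrative sketch (not from this commit): relating the learner weights to the
# ensemble ("super learner") prediction, assuming `fit.sl` is a SuperLearner fit.
fit.sl$coef                   # weight of each candidate learner
sum(fit.sl$coef)              # with the default NNLS-based method, these weights sum to 1
# The ensemble prediction is the weighted average of the candidate predictions:
manual.pred <- as.numeric(fit.sl$library.predict %*% fit.sl$coef)
all.equal(manual.pred, as.numeric(fit.sl$SL.predict))
```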

3ipw.Rmd

Lines changed: 17 additions & 4 deletions
@@ -1,7 +1,9 @@
 # IPTW

+In this chapter, we will cover the propensity score (PS) and inverse probability of treatment weighting (IPTW, or simply IPW).
+
 ```{block, type='rmdcomment'}
-In this chapter, we are primarily interested about **exposure modelling** (e.g., fixing imbalance first, before doing outcome analysis).
+We are now primarily interested in **exposure modelling** (e.g., fixing imbalance first, before doing outcome analysis).
 ```

 ```{r setup01i, include=FALSE}
@@ -66,7 +68,10 @@ require(Publish)
 publish(PS.fit, format = "[u;l]")
 ```

-- Coef of PS model fit is not of concern
+```{block, type='rmdcomment'}
+The coefficients of the PS model fit are not of concern.
+```
+
 - Model can be rich: to the extent that prediction is better
 - But look for multi-collinearity issues
 - SE too high?
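
This hunk reports `PS.fit` without showing how it was fit. A minimal sketch of the usual logistic exposure model behind it, assuming `ps.formula` (exposure ~ confounders) and `ObsData` as defined earlier in the chapter.

```r
# Illustrative sketch (not part of this commit): the logistic exposure model that
# typically produces `PS.fit`, assuming `ps.formula` and `ObsData` exist.
PS.fit <- glm(ps.formula, family = binomial(), data = ObsData)
# Optional multicollinearity check on this rich model (requires the car package):
# car::vif(PS.fit)
```
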
@@ -77,6 +82,10 @@ Obtain the propensity score (PS) values from the fit
 ObsData$PS <- predict(PS.fit, type="response")
 ```

+```{block, type='rmdcomment'}
+These propensity score predictions (`PS`) are often represented as $g(A_i=1|L_i)$.
+```
+
 Check summaries:

 - enough overlap?
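
For the overlap check mentioned above, a minimal sketch using base R, assuming `ObsData$PS` and the exposure indicator `ObsData$A`.

```r
# Illustrative sketch (not from this commit): simple propensity score overlap checks
# by exposure group, assuming ObsData$PS and ObsData$A are available.
tapply(ObsData$PS, ObsData$A, summary)          # PS summaries within A = 0 and A = 1
plot(density(ObsData$PS[ObsData$A == 1]),
     main = "PS overlap", xlab = "Propensity score")
lines(density(ObsData$PS[ObsData$A == 0]), lty = 2)
legend("topright", legend = c("A = 1", "A = 0"), lty = c(1, 2))
```
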
@@ -101,14 +110,18 @@ Convert $PS$ to $IPW$ = $\frac{A}{PS} + \frac{1-A}{1-PS}$
 ```

 - Convert PS to IPW using the formula. We are using the formula for the average treatment effect (ATE).
-- It is possible to use alternative formulas, but we are using ATE formula for our illustration.
+
+```{block, type='rmdcomment'}
+It is possible to use alternative formulas, but we are using the ATE formula for our illustration.
+```
+

 ```{r psx2c, cache=TRUE, echo = TRUE}
 ObsData$IPW <- ObsData$A/ObsData$PS + (1-ObsData$A)/(1-ObsData$PS)
 summary(ObsData$IPW)
 ```

-Also possible to use pre-packged software packages to do the same:
+It is also possible to use pre-packaged software packages to do the same:

 ```{r psx2c2, cache=TRUE, echo = TRUE}
 require(WeightIt)
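
For reference, a minimal sketch of how the same ATE weights can be obtained with `WeightIt`, assuming `ps.formula` and `ObsData` as above; the book's actual call is not shown in this hunk.

```r
# Illustrative sketch (not part of this commit): ATE weights via the WeightIt package.
library(WeightIt)
W.out <- weightit(ps.formula, data = ObsData,
                  method = "glm",      # logistic-regression PS ("ps" in older WeightIt releases)
                  estimand = "ATE")    # weights A/PS + (1 - A)/(1 - PS)
summary(W.out$weights)                 # should match summary(ObsData$IPW) above
```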

3ipw2.Rmd

Lines changed: 6 additions & 1 deletion
@@ -47,7 +47,12 @@ This is the exposure model that we decided on:
 ps.formula
 ```

-Fit SuperLearner to estimate propensity scores. We again use the same candidate learners:
+```{block, type='rmdcomment'}
+Fit SuperLearner (SL) to estimate propensity scores.
+```
+
+
+We again use the same candidate learners:

 - linear model
 - LASSO
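
A minimal sketch of a SuperLearner propensity score fit with the two learners listed above; `covariates` is an assumed name for the confounder columns that appear in `ps.formula`.

```r
# Illustrative sketch (not from this commit): estimating g(A = 1 | L) with SuperLearner.
# `covariates` is an assumed character vector of confounder column names in ObsData.
library(SuperLearner)
sl.ps <- SuperLearner(Y = ObsData$A,
                      X = ObsData[ , covariates],
                      family = binomial(),
                      SL.library = c("SL.glm", "SL.glmnet"))  # linear model + LASSO
ObsData$PS.SL <- as.numeric(sl.ps$SL.predict)                 # ensemble propensity scores
```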

5software.Rmd

Lines changed: 1 addition & 1 deletion
@@ -399,7 +399,7 @@ kable(results,digits = 2)%>%
 ```

 ```{block, type='rmdcomment'}
-@keele2021comparing used superlearner based on an ensemble of 3 different learners: (1) GLM, (2) random forests, and (3) LASSO.
+@keele2021comparing used TMLE-SL based on an ensemble of 3 different learners: (1) GLM, (2) random forests, and (3) LASSO.
 ```

 ## Other packages
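
A minimal sketch of a `tmle` call whose SuperLearner library mirrors the ensemble described in this comment (GLM, random forest, LASSO); `covariates` is an assumed name for the confounder columns, and the outcome family should be set to match the outcome type.

```r
# Illustrative sketch (not from this commit): TMLE with a GLM / random forest / LASSO
# SuperLearner library. `covariates` is an assumed vector of confounder column names.
library(tmle)
SL.lib   <- c("SL.glm", "SL.randomForest", "SL.glmnet")
tmle.fit <- tmle(Y = ObsData$Y, A = ObsData$A,
                 W = ObsData[ , covariates],
                 family = "gaussian",          # use "binomial" for a binary outcome
                 Q.SL.library = SL.lib,        # outcome-model learners
                 g.SL.library = SL.lib)        # exposure-model learners
tmle.fit$estimates$ATE
```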

6final.Rmd

Lines changed: 11 additions & 3 deletions
@@ -81,7 +81,11 @@ knitr::include_graphics("images/dagpred.png")
 - Assuming all covariates are measured, **parametric models** such as linear and logistic regressions are very efficient, but rely on strong assumptions. In real-world scenarios, it is often hard (if not impossible) to guess the correct specification of the right hand side of the regression equation.
 - Machine learning (ML) methods are very helpful for prediction goals. They are also helpful in **identifying complex functions** (non-linearities and non-additive terms) of the covariates (again, assuming they are measured).
 - There are many ML methods, but the procedures are very different, and they come with their own advantages and disadvantages. In a given real data set, it is **hard to predict a priori which is the best ML algorithm** for a given problem.
-- That's where super learner is helpful in **combining strength from various algorithms**, and producing 1 prediction column that has **optimal statistical properties**.
+
+
+```{block, type='rmdcomment'}
+Super learner is helpful in **combining strength from various algorithms**, and producing 1 prediction column that has **optimal statistical properties**.
+```

 ### Causal inference

@@ -122,8 +126,12 @@ knitr::include_graphics("images/dagci.png")
 ```

 - For causal inference goals (when we have a primary exposure of interest), machine learning methods are often misleading. This is primarily because they usually do not have an inherent mechanism for focusing on the **primary exposure** (RHC in this example), and they treat the primary exposure like any other predictor.
-- When using g-computation with ML methods, estimation of variance becomes a difficult problem. Generalized procedures such as **robust SE or bootstrap methods** are not supported by theory.
-- That's where TMLE methods shine, with the help of it's important **statistical properties (double robustness, finite sample properties)**.
+- When using g-computation with ML methods, estimation of variance (with correct coverage) becomes a difficult problem. Generalized procedures such as **robust SE or bootstrap methods** are not supported by theory.
+
+
+```{block, type='rmdcomment'}
+The TMLE method shines, with the help of its important **statistical properties (double robustness, finite sample properties)**.
+```

 ### Identifiability assumptions
