docs/src/user_guide/estimands.md (+27, -15)
````diff
@@ -119,11 +119,7 @@ statisticalΨ = ATE(
 )
 ```
 
-### Factorial Treatments
-
-It is possible to generate a `ComposedEstimand` containing all linearly independent IATEs from a set of treatment values or from a dataset. For that purpose, use the `factorialEstimand` function.
-
-## The Interaction Average Treatment Effect
+## The Average Interaction Effect
 
 - Causal Question:
@@ -136,14 +132,14 @@ For a general higher-order definition, please refer to [Higher-order interaction
 For two points interaction with both treatment and control levels ``0`` and ``1`` for ease of notation:
@@ … @@
-It is possible to generate a `ComposedEstimand` containing all linearly independent IATEs from a set of treatment values or from a dataset. For that purpose, use the `factorialEstimand` function.
+It is possible to generate a `JointEstimand` containing all linearly independent AIEs from a set of treatment values or from a dataset. For that purpose, use the `factorialEstimand` function.
 
-## Composed Estimands
+## Joint And Composed Estimands
 
-As a result of Julia's automatic differentiation facilities, given a set of predefined estimands ``(\Psi_1, ..., \Psi_k)``, we can automatically compute an estimator for ``f(\Psi_1, ..., \Psi_k)``. This is done via the `ComposedEstimand` type.
-
-For example, the difference in ATE for a treatment with 3 levels (0, 1, 2) can be defined as follows:
+A `JointEstimand` is simply a list of one dimensional estimands that are grouped together. For instance, for a treatment `T` taking three possible values ``(0, 1, 2)``, we can define the two following Average Treatment Effects and a corresponding `JointEstimand`:
 
 ```julia
 ATE₁ = ATE(
@@ -201,5 +195,23 @@ ATE₂ = ATE(
 	treatment_values = (T = (control = 1, case = 2),),
 	treatment_confounders = [:W]
 )
-ATEdiff = ComposedEstimand(-, (ATE₁, ATE₂))
+joint_estimand = JointEstimand(ATE₁, ATE₂)
 ```
+
+You can easily generate joint estimands corresponding to Counterfactual Means, Average Treatment Effects or Average Interaction Effects by using the `factorialEstimand` function.
+
+To estimate a joint estimand, you can use any of the estimators defined in this package exactly as you would for a one dimensional estimand.
+
+There are two main use cases for them, which we now describe.
+
+### Joint Testing
+
+In some cases, like in factorial analyses where multiple versions of a treatment are tested, it may be of interest to know whether any of the versions has had an effect. This can be done via a Hotelling's T2 test, which is simply a multivariate generalisation of the Student's T test. This is the default returned by the `significance_test` function provided in TMLE.jl, and the result of the test is also printed to the REPL for any joint estimate.
+
+### Composition
+
+Once you have estimated a `JointEstimand` and have a `JointEstimate`, you may want to ask further questions, for instance whether two treatment versions have the same effect. This question is typically answered by testing whether the difference in Average Treatment Effects is 0. Using the Delta Method and Julia's automatic differentiation, you don't need to explicitly define a semi-parametric estimator for it. You can simply call `compose`:
````
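The diff is truncated right at the `compose` call. A minimal sketch of how that composition might be used in practice, assuming TMLE.jl is installed and `joint_estimand` and `dataset` are defined as in the docs above; the estimator construction and variable names here are illustrative, not taken from the diff:

```julia
using TMLE

# Estimate the joint estimand exactly like a one dimensional one (illustrative call):
joint_estimate, cache = TMLEE()(joint_estimand, dataset; verbosity=0)

# Delta-method estimate of ATE₁ - ATE₂ via automatic differentiation,
# then test whether the two treatment versions have the same effect:
diff_estimate = compose(-, joint_estimate)
significance_test(diff_estimate)
```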
docs/src/user_guide/estimation.md (+38, -27)
````diff
@@ -110,7 +110,7 @@ Again, required nuisance functions are fitted and stored in the cache.
 
 ## Specifying Models
 
-By default, TMLE.jl uses generalized linear models for the estimation of relevant and nuisance factors such as the outcome mean and the propensity score. However, this is not the recommended usage since the estimators' performance is closely related to how well we can estimate these factors. More sophisticated models can be provided using the `models` keyword argument of each estimator, which is essentially a `NamedTuple` mapping variables' names to their respective model.
+By default, TMLE.jl uses generalized linear models for the estimation of relevant and nuisance factors such as the outcome mean and the propensity score. However, this is not the recommended usage since the estimators' performance is closely related to how well we can estimate these factors. More sophisticated models can be provided using the `models` keyword argument of each estimator, which is a `Dict{Symbol, Model}` mapping variables' names to their respective model.
 
 Rather than specifying a specific model for each variable, it may be easier to override the default models using the `default_models` function:
@@ -121,9 +121,9 @@ using MLJXGBoostInterface
 xgboost_regressor = XGBoostRegressor()
 xgboost_classifier = XGBoostClassifier()
 models = default_models(
-  Q_binary=xgboost_classifier,
-  Q_continuous=xgboost_regressor,
-  G=xgboost_classifier
+  Q_binary = xgboost_classifier,
+  Q_continuous = xgboost_regressor,
+  G = xgboost_classifier
 )
 tmle_gboost = TMLEE(models=models)
 ```
@@ -140,19 +140,18 @@ stack_binary = Stack(
   lr=lr
 )
 
-models = (
-  T₁ = with_encoder(xgboost_classifier), # T₁ with XGBoost prepended with a Continuous Encoder
-  default_models( # For all other variables use the following defaults
-    Q_binary=stack_binary, # A Super Learner
-    Q_continuous=xgboost_regressor, # An XGBoost
+models = default_models( # For all non-specified variables use the following defaults
+  Q_binary = stack_binary, # A Super Learner
+  Q_continuous = xgboost_regressor, # An XGBoost
+  # T₁ with XGBoost prepended with a Continuous Encoder
+  T₁ = xgboost_classifier
   # Unspecified G defaults to Logistic Regression
-  )...
 )
 
 tmle_custom = TMLEE(models=models)
 ```
 
-Notice that `with_encoder` is simply a shorthand to construct a pipeline with a `ContinuousEncoder` and that the resulting `models` is simply a `NamedTuple`.
+Notice that `with_encoder` is simply a shorthand to construct a pipeline with a `ContinuousEncoder` and that the resulting `models` is simply a `Dict`.
 
 ## CV-Estimation
@@ -196,10 +195,10 @@ result₃
 nothing # hide
 ```
 
-This time only the model for `Y` is fitted again while reusing the models for `T₁` and `T₂`. Finally, let's see what happens if we estimate the `IATE` between `T₁` and `T₂`.
+This time only the model for `Y` is fitted again while reusing the models for `T₁` and `T₂`. Finally, let's see what happens if we estimate the `AIE` between `T₁` and `T₂`.
 
 ```@example estimation
-Ψ₄ = IATE(
+Ψ₄ = AIE(
   outcome=:Y,
   treatment_values=(
     T₁=(case=true, control=false),
@@ -218,18 +217,20 @@ nothing # hide
 
 All nuisance functions have been reused, only the fluctuation is fitted!
 
-## Composing Estimands
+## Joint Estimands and Composition
 
-By leveraging the multivariate Central Limit Theorem and Julia's automatic differentiation facilities, we can estimate any estimand which is a function of already estimated estimands. By default, TMLE.jl will use [Zygote](https://fluxml.ai/Zygote.jl/latest/) but since we are using [AbstractDifferentiation.jl](https://github.com/JuliaDiff/AbstractDifferentiation.jl) you can change the backend to your favorite AD system.
+As explained in [Joint And Composed Estimands](@ref), a joint estimand is simply a collection of estimands. Here, we will illustrate that an Average Interaction Effect is also defined as a difference in partial Average Treatment Effects.
 
-For instance, by definition of the ``IATE``, we should be able to retrieve:
+More precisely, we would like to see if the left-hand side of this equation is equal to the right-hand side:
@@ … @@
+The printed output is the result of a Hotelling's T2 test, which is the multivariate counterpart of the Student's T test. It tells us whether any of the components of this joint estimand is different from 0.
+
+Then we can formally test our hypothesis by leveraging the multivariate Central Limit Theorem and Julia's automatic differentiation.
+
+```@example estimation
+composed_result = compose((x, y, z) -> x - y - z, joint_estimate)
 isapprox(
   estimate(result₄),
-  estimate(composed_iate_result),
+  first(estimate(composed_result)),
   atol=0.1
 )
 ```
+
+By default, TMLE.jl will use [Zygote](https://fluxml.ai/Zygote.jl/latest/), but since we are using [AbstractDifferentiation.jl](https://github.com/JuliaDiff/AbstractDifferentiation.jl), you can change the backend to your favorite AD system.
````
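The claim above, that an Average Interaction Effect is a difference in partial Average Treatment Effects, can be checked with plain arithmetic. A toy sketch over invented counterfactual means ``E[Y(t_1, t_2)]`` (the numbers are made up for illustration, and the variable names are not from the docs):

```julia
# Toy counterfactual means E[Y(t₁, t₂)] for two binary treatments (invented values).
EY = Dict((0, 0) => 1.0, (0, 1) => 1.5, (1, 0) => 2.0, (1, 1) => 3.0)

# Two-point AIE definition:
aie = EY[(1, 1)] - EY[(1, 0)] - EY[(0, 1)] + EY[(0, 0)]

# Partial ATEs of T₁, holding T₂ fixed at each of its levels:
ate_within_t2_1 = EY[(1, 1)] - EY[(0, 1)]
ate_within_t2_0 = EY[(1, 0)] - EY[(0, 0)]

aie == ate_within_t2_1 - ate_within_t2_0  # true: the AIE is the difference in partial ATEs
```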
docs/src/walk_through.md (+6, -4)
````diff
@@ -108,10 +108,10 @@ marginal_ate_t1 = ATE(
 )
 ```
 
-- The Interaction Average Treatment Effect:
+- The Average Interaction Effect:
 
 ```@example walk-through
-iate = IATE(
+aie = AIE(
   outcome = :Y,
   treatment_values = (
     T₁=(case=1, control=0),
@@ -125,7 +125,7 @@ iate = IATE(
 Identification is the process by which a Causal Estimand is turned into a Statistical Estimand, that is, a quantity we may estimate from data. This is done via the `identify` function which also takes in the ``SCM``:
 
 ```@example walk-through
-statistical_iate = identify(iate, scm)
+statistical_aie = identify(aie, scm)
 ```
 
 Alternatively, you can also directly define the statistical parameters (see [Estimands](@ref)).
@@ -149,7 +149,7 @@ Statistical Estimands can be estimated without a ``SCM``, let's use the One-Step
 
 ```@example walk-through
 ose = OSE()
-result, cache = ose(statistical_iate, dataset)
+result, cache = ose(statistical_aie, dataset)
 result
 ```
@@ -160,3 +160,5 @@ Both TMLE and OSE asymptotically follow a Normal distribution. It means we can p
 ```@example walk-through
 OneSampleTTest(result)
 ```
+
+If the estimate is high-dimensional, a `OneSampleHotellingT2Test` should be performed instead. Alternatively, the `significance_test` function will automatically select the appropriate test for the estimate and return its result.
````
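The `OneSampleHotellingT2Test` added above comes from Julia's HypothesisTests.jl package. A minimal sketch of what it does on simulated data, assuming HypothesisTests.jl is installed; with TMLE.jl you would normally just call `significance_test(result)` and let it pick the test:

```julia
using HypothesisTests, Random

Random.seed!(0)
# 100 draws of a 3-dimensional quantity whose true mean is (0.5, 0.0, 0.2):
X = randn(100, 3) .+ [0.5 0.0 0.2]

# Hotelling's T² test of H₀: the mean vector equals zero,
# the multivariate counterpart of the one-sample t-test.
OneSampleHotellingT2Test(X, zeros(3))
```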