doc/spec/estimation/dml.rst: 5 additions & 5 deletions
@@ -72,7 +72,7 @@ Most of the methods provided make a parametric form assumption on the heterogene
 linear on some pre-defined; potentially high-dimensional; featurization). These methods include:
 :class:`.DML`, :class:`.LinearDML`,
 :class:`.SparseLinearDML`, :class:`.KernelDML`.
-For fullly non-parametric heterogeneous treatment effect models, check out the :class:`.NonParamDML`
+For fully non-parametric heterogeneous treatment effect models, check out the :class:`.NonParamDML`
 and the :class:`.CausalForestDML`.
 For more options of non-parametric CATE estimators,
 check out the :ref:`Forest Estimators User Guide <orthoforestuserguide>`
@@ -165,7 +165,7 @@ structure of the implemented CATE estimators is as follows.
 Below we give a brief description of each of these classes:
 
 * **DML.** The class :class:`.DML` assumes that the effect model for each outcome :math:`i` and treatment :math:`j` is linear, i.e. takes the form :math:`\theta_{ij}(X)=\langle\theta_{ij}, \phi(X)\rangle`, and allows for any arbitrary scikit-learn linear estimator to be defined as the final stage (e.g.
-  :class:`~sklearn.linear_model.ElasticNet`, :class:`~sklearn.linear_model.Lasso`, :class:`~sklearn.linear_model.LinearRegression` and their multi-task variations in the case where we have mulitple outcomes, i.e. :math:`Y` is a vector). The final linear model will be fitted on features that are derived by the Kronecker-product
+  :class:`~sklearn.linear_model.ElasticNet`, :class:`~sklearn.linear_model.Lasso`, :class:`~sklearn.linear_model.LinearRegression` and their multi-task variations in the case where we have multiple outcomes, i.e. :math:`Y` is a vector). The final linear model will be fitted on features that are derived by the Kronecker-product
   of the vectors :math:`T` and :math:`\phi(X)`, i.e. :math:`\tilde{T}\otimes\phi(X) = \mathtt{vec}(\tilde{T}\cdot\phi(X)^T)`. This regression will estimate the coefficients :math:`\theta_{ijk}`
   for each outcome :math:`i`, treatment :math:`j` and feature :math:`k`. The final model is minimizing a regularized empirical square loss of the form:
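The Kronecker-product featurization in the hunk above is easy to check concretely. Here is a minimal NumPy sketch (the vector values and dimensions are hypothetical, chosen only to verify that :math:`\tilde{T}\otimes\phi(X) = \mathtt{vec}(\tilde{T}\cdot\phi(X)^T)` holds):

```python
import numpy as np

# Hypothetical residualized treatment vector T~ (2 treatments)
# and featurized X, phi(X) (3 features) -- illustration only.
T_tilde = np.array([1.0, 2.0])
phi_X = np.array([0.5, 1.0, -1.0])

# Kronecker-product features: np.kron(T~, phi(X))
features = np.kron(T_tilde, phi_X)

# Same thing as vec(T~ . phi(X)^T): outer product, then flatten row-wise.
outer_vec = np.outer(T_tilde, phi_X).ravel()

assert np.allclose(features, outer_vec)
# One block of phi(X) per treatment coordinate, scaled by that coordinate:
# [0.5, 1.0, -1.0,  1.0, 2.0, -2.0]
```

The final-stage linear regression is then run on these 6 features, so each coefficient :math:`\theta_{ijk}` pairs one treatment coordinate with one feature of :math:`\phi(X)`.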
@@ -239,7 +239,7 @@ Below we give a brief description of each of these classes:
   [Nie2017]_. It approximates any function in the RKHS by creating random Fourier features. Then runs a ElasticNet
   regularized final model. Thus it approximately implements the results of [Nie2017], via the random fourier feature
   approximate representation of functions in the RKHS. Moreover, given that we use Random Fourier Features this class
-  asssumes an RBF kernel.
+  assumes an RBF kernel.
 
 * **NonParamDML.** The class :class:`.NonParamDML` makes no assumption on the effect model for each outcome :math:`i`.
   However, it applies only when the treatment is either binary or single-dimensional continuous. It uses the observation that for a single
@@ -350,7 +350,7 @@ Usage FAQs
   it does so in a manner that is robust to the estimation mistakes that these ML algorithms
   might be making.
 
-  Moreover, one may typically want to estimate treatment effect hetergoeneity,
+  Moreover, one may typically want to estimate treatment effect heterogeneity,
   which the above OLS approach wouldn't provide. One potential way of providing such heterogeneity
   is to include product features of the form :math:`X\cdot T` in the OLS model. However, then
   one faces again the same problems as above:
@@ -564,7 +564,7 @@ Usage FAQs
 - **How can I assess the performance of the CATE model?**
 
   Each of the DML classes have an attribute `score_` after they are fitted. So one can access that
-  attribute and compare the performance accross different modeling parameters (lower score is better):
+  attribute and compare the performance across different modeling parameters (lower score is better):
notebooks/Double Machine Learning Examples.ipynb: 1 addition & 1 deletion
@@ -927,7 +927,7 @@
 "source": [
 "### 2.4 Interpretability with SHAP Values\n",
 "\n",
-"Explain the hetergoeneity model for the constant marginal effect of the treatment using <a href=\"https://shap.readthedocs.io/en/latest/\">SHAP values</a>."
+"Explain the heterogeneity model for the constant marginal effect of the treatment using <a href=\"https://shap.readthedocs.io/en/latest/\">SHAP values</a>."
notebooks/Interpretability with SHAP.ipynb: 1 addition & 1 deletion
@@ -23,7 +23,7 @@
 "\n",
 "[SHAP](https://shap.readthedocs.io/en/latest/) is a popular open source library for interpreting black-box machine learning models using the [Shapley values methodology](https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html).\n",
 "\n",
-"Similar to how black-box predictive machine learning models can be explained with SHAP, we can also explain black-box effect heterogeneity models. This approach provides an explanation as to why a heterogeneous causal effect model produced larger or smaller effect values for particular segments of the population. Which were the features that lead to such differentiation? This question is easy to address when the model is succinctly described, such as the case of linear heterogneity models, where one can simply investigate the coefficients of the model. However, it becomes hard when one starts using more expressive models, such as Random Forests and Causal Forests to model effect hetergoeneity. SHAP values can be of immense help to understand the leading factors of effect hetergoeneity that the model picked up from the training data.\n",
+"Similar to how black-box predictive machine learning models can be explained with SHAP, we can also explain black-box effect heterogeneity models. This approach provides an explanation as to why a heterogeneous causal effect model produced larger or smaller effect values for particular segments of the population. Which were the features that lead to such differentiation? This question is easy to address when the model is succinctly described, such as the case of linear heterogeneity models, where one can simply investigate the coefficients of the model. However, it becomes hard when one starts using more expressive models, such as Random Forests and Causal Forests to model effect heterogeneity. SHAP values can be of immense help to understand the leading factors of effect heterogeneity that the model picked up from the training data.\n",
 "\n",
 "Our package offers seamless integration with the SHAP library. Every `CateEstimator` has a method `shap_values`, which returns the SHAP value explanation of the estimators output for every treatment and outcome pair. These values can then be visualized with the plethora of visualizations that the SHAP library offers. Moreover, whenever possible our library invokes fast specialized algorithms from the SHAP library, for each type of final model, which can greatly reduce computation times.\n",