doc/spec/estimation/dml.rst: 5 additions & 5 deletions
@@ -72,7 +72,7 @@ Most of the methods provided make a parametric form assumption on the heterogene
 linear on some pre-defined; potentially high-dimensional; featurization). These methods include:
 :class:`.DML`, :class:`.LinearDML`,
 :class:`.SparseLinearDML`, :class:`.KernelDML`.
-For fullly non-parametric heterogeneous treatment effect models, check out the :class:`.NonParamDML`
+For fully non-parametric heterogeneous treatment effect models, check out the :class:`.NonParamDML`
 and the :class:`.CausalForestDML`.
 For more options of non-parametric CATE estimators,
 check out the :ref:`Forest Estimators User Guide <orthoforestuserguide>`
@@ -165,7 +165,7 @@ structure of the implemented CATE estimators is as follows.
 Below we give a brief description of each of these classes:
 
 * **DML.** The class :class:`.DML` assumes that the effect model for each outcome :math:`i` and treatment :math:`j` is linear, i.e. takes the form :math:`\theta_{ij}(X)=\langle\theta_{ij}, \phi(X)\rangle`, and allows for any arbitrary scikit-learn linear estimator to be defined as the final stage (e.g.
-  :class:`~sklearn.linear_model.ElasticNet`, :class:`~sklearn.linear_model.Lasso`, :class:`~sklearn.linear_model.LinearRegression` and their multi-task variations in the case where we have mulitple outcomes, i.e. :math:`Y` is a vector). The final linear model will be fitted on features that are derived by the Kronecker-product
+  :class:`~sklearn.linear_model.ElasticNet`, :class:`~sklearn.linear_model.Lasso`, :class:`~sklearn.linear_model.LinearRegression` and their multi-task variations in the case where we have multiple outcomes, i.e. :math:`Y` is a vector). The final linear model will be fitted on features that are derived by the Kronecker-product
   of the vectors :math:`T` and :math:`\phi(X)`, i.e. :math:`\tilde{T}\otimes\phi(X) = \mathtt{vec}(\tilde{T}\cdot\phi(X)^T)`. This regression will estimate the coefficients :math:`\theta_{ijk}`
   for each outcome :math:`i`, treatment :math:`j` and feature :math:`k`. The final model is minimizing a regularized empirical square loss of the form:
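The Kronecker-product featurization in the hunk above is easy to check concretely. Here is a minimal NumPy sketch (the vector values and dimensions are hypothetical, chosen only to verify that :math:`\tilde{T}\otimes\phi(X) = \mathtt{vec}(\tilde{T}\cdot\phi(X)^T)` holds):

```python
import numpy as np

# Hypothetical residualized treatment vector T~ (2 treatments)
# and featurized X, phi(X) (3 features) -- illustration only.
T_tilde = np.array([1.0, 2.0])
phi_X = np.array([0.5, 1.0, -1.0])

# Kronecker-product features: np.kron(T~, phi(X))
features = np.kron(T_tilde, phi_X)

# Same thing as vec(T~ . phi(X)^T): outer product, then flatten row-wise.
outer_vec = np.outer(T_tilde, phi_X).ravel()

assert np.allclose(features, outer_vec)
# One block of phi(X) per treatment coordinate, scaled by that coordinate:
# [0.5, 1.0, -1.0,  1.0, 2.0, -2.0]
```

The final-stage linear regression is then run on these 6 features, so each coefficient :math:`\theta_{ijk}` pairs one treatment coordinate with one feature of :math:`\phi(X)`.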
@@ -239,7 +239,7 @@ Below we give a brief description of each of these classes:
   [Nie2017]_. It approximates any function in the RKHS by creating random Fourier features. Then runs a ElasticNet
   regularized final model. Thus it approximately implements the results of [Nie2017], via the random fourier feature
   approximate representation of functions in the RKHS. Moreover, given that we use Random Fourier Features this class
-  asssumes an RBF kernel.
+  assumes an RBF kernel.
 
 * **NonParamDML.** The class :class:`.NonParamDML` makes no assumption on the effect model for each outcome :math:`i`.
   However, it applies only when the treatment is either binary or single-dimensional continuous. It uses the observation that for a single
@@ -350,7 +350,7 @@ Usage FAQs
   it does so in a manner that is robust to the estimation mistakes that these ML algorithms
   might be making.
 
-  Moreover, one may typically want to estimate treatment effect hetergoeneity,
+  Moreover, one may typically want to estimate treatment effect heterogeneity,
   which the above OLS approach wouldn't provide. One potential way of providing such heterogeneity
   is to include product features of the form :math:`X\cdot T` in the OLS model. However, then
   one faces again the same problems as above:
@@ -564,7 +564,7 @@ Usage FAQs
 - **How can I assess the performance of the CATE model?**
 
   Each of the DML classes have an attribute `score_` after they are fitted. So one can access that
-  attribute and compare the performance accross different modeling parameters (lower score is better):
+  attribute and compare the performance across different modeling parameters (lower score is better):
notebooks/Double Machine Learning Examples.ipynb: 1 addition & 1 deletion
@@ -927,7 +927,7 @@
 "source": [
 "### 2.4 Interpretability with SHAP Values\n",
 "\n",
-"Explain the hetergoeneity model for the constant marginal effect of the treatment using <a href=\"https://shap.readthedocs.io/en/latest/\">SHAP values</a>."
+"Explain the heterogeneity model for the constant marginal effect of the treatment using <a href=\"https://shap.readthedocs.io/en/latest/\">SHAP values</a>."
notebooks/Interpretability with SHAP.ipynb: 1 addition & 1 deletion
@@ -23,7 +23,7 @@
 "\n",
 "[SHAP](https://shap.readthedocs.io/en/latest/) is a popular open source library for interpreting black-box machine learning models using the [Shapley values methodology](https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html).\n",
 "\n",
-"Similar to how black-box predictive machine learning models can be explained with SHAP, we can also explain black-box effect heterogeneity models. This approach provides an explanation as to why a heterogeneous causal effect model produced larger or smaller effect values for particular segments of the population. Which were the features that lead to such differentiation? This question is easy to address when the model is succinctly described, such as the case of linear heterogneity models, where one can simply investigate the coefficients of the model. However, it becomes hard when one starts using more expressive models, such as Random Forests and Causal Forests to model effect hetergoeneity. SHAP values can be of immense help to understand the leading factors of effect hetergoeneity that the model picked up from the training data.\n",
+"Similar to how black-box predictive machine learning models can be explained with SHAP, we can also explain black-box effect heterogeneity models. This approach provides an explanation as to why a heterogeneous causal effect model produced larger or smaller effect values for particular segments of the population. Which were the features that lead to such differentiation? This question is easy to address when the model is succinctly described, such as the case of linear heterogeneity models, where one can simply investigate the coefficients of the model. However, it becomes hard when one starts using more expressive models, such as Random Forests and Causal Forests to model effect heterogeneity. SHAP values can be of immense help to understand the leading factors of effect heterogeneity that the model picked up from the training data.\n",
 "\n",
 "Our package offers seamless integration with the SHAP library. Every `CateEstimator` has a method `shap_values`, which returns the SHAP value explanation of the estimators output for every treatment and outcome pair. These values can then be visualized with the plethora of visualizations that the SHAP library offers. Moreover, whenever possible our library invokes fast specialized algorithms from the SHAP library, for each type of final model, which can greatly reduce computation times.\n",