@@ -15,8 +15,8 @@ associated with at least one data point from the original training set.
    :align: center
 
    A binary decision tree trained on the dataset :math:`X = \{ \mathbf{x}_1,
-   \ldots, \mathbf{x}_{10} \}`. Each example in the dataset is a 5-dimensional
-   vector of real-valued features labeled :math:`x_1, \ldots, x_5`. Unshaded
+   \ldots, \mathbf{x}_{10} \}`. Each example in the dataset is a 4-dimensional
+   vector of real-valued features labeled :math:`x_1, \ldots, x_4`. Unshaded
    circles correspond to internal decision nodes, while shaded circles
    correspond to leaf nodes. Each leaf node is associated with a subset of the
    examples in `X`, selected based on the decision rules along the path from
@@ -52,13 +52,13 @@ impurity after a particular split is
 .. math::
 
     \Delta \mathcal{L} = \mathcal{L}(\text{Parent}) -
-        P_{left} \mathcal{L}(\text{Left child}) -
-        (1 - P_{left})\mathcal{L}(\text{Right child})
+        P_{\text{left}} \mathcal{L}(\text{Left child}) -
+        (1 - P_{\text{left}})\mathcal{L}(\text{Right child})
 
 where :math:`\mathcal{L}(x)` is the impurity of the dataset at node `x`,
-and :math:`P_{left}`/:math:`P_{right}` are the proportion of examples at the
-current node that are partitioned into the left / right children, respectively,
-by the proposed split.
+and :math:`P_{\text{left}}`/:math:`P_{\text{right}}` are the proportion of
+examples at the current node that are partitioned into the left / right
+children, respectively, by the proposed split.
 
 .. _`Decision trees`: https://en.wikipedia.org/wiki/Decision_tree_learning
 
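The impurity-reduction formula in the hunk above can be sketched directly in NumPy. This is a minimal illustration, not the repository's implementation: the helper names `gini` and `impurity_reduction` are assumptions, and Gini impurity stands in for :math:`\mathcal{L}` (entropy or MSE would slot in the same way).

```python
import numpy as np

def gini(labels):
    """Gini impurity of a collection of class labels: 1 - sum_k p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def impurity_reduction(parent, left, right):
    """Delta L = L(parent) - P_left * L(left) - (1 - P_left) * L(right)."""
    p_left = len(left) / len(parent)
    return gini(parent) - p_left * gini(left) - (1 - p_left) * gini(right)

# A split that perfectly separates a 50/50 mixed parent removes
# all of its impurity: Delta L = 0.5 - 0 - 0 = 0.5.
parent = np.array([0, 0, 0, 1, 1, 1])
left, right = parent[:3], parent[3:]
print(impurity_reduction(parent, left, right))  # 0.5
```

A tree learner evaluates this quantity for every candidate (feature, threshold) pair and greedily takes the split with the largest reduction.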
@@ -123,7 +123,7 @@ that proceeds by iteratively fitting a sequence of `m` weak learners such that:
 
 where `b` is a fixed initial estimate for the targets, :math:`\eta` is
 a learning rate parameter, and :math:`w_{i}` and :math:`g_{i}`
-denote the weights and predictions for ` i ` th learner.
+denote the weights and predictions of the :math:`i^{th}` learner.
 
 At each training iteration a new weak learner is fit to predict the negative
 gradient of the loss with respect to the previous prediction,
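The loop described above — fit a weak learner to the negative gradient, then take a small step :math:`\eta` in its direction — can be sketched with regression stumps under squared loss, where the negative gradient is simply the residual. This is a minimal NumPy sketch under those assumptions, not the repository's implementation; `fit_stump` and `gradient_boost` are illustrative names.

```python
import numpy as np

def fit_stump(X, r):
    """Fit a depth-1 regression tree (stump) to residuals r by exhaustive
    search over (feature, threshold) pairs, minimizing squared error."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            mask = X[:, j] <= t
            if mask.all() or not mask.any():
                continue  # degenerate split: one side empty
            left, right = r[mask].mean(), r[~mask].mean()
            err = np.sum((np.where(mask, left, right) - r) ** 2)
            if best is None or err < best[0]:
                best = (err, j, t, left, right)
    _, j, t, left, right = best
    return lambda Z, j=j, t=t, l=left, rr=right: np.where(Z[:, j] <= t, l, rr)

def gradient_boost(X, y, n_learners=50, eta=0.1):
    """Boosting under squared loss: each stump is fit to the negative
    gradient of the loss w.r.t. the current prediction (the residual),
    and added to the ensemble scaled by the learning rate eta."""
    b = y.mean()                     # fixed initial estimate for the targets
    pred = np.full(len(y), b)
    learners = []
    for _ in range(n_learners):
        g = fit_stump(X, y - pred)   # negative gradient of 0.5*(y - pred)^2
        pred += eta * g(X)
        learners.append(g)
    return b, learners, pred

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 1))
y = X[:, 0] ** 2
b, learners, pred = gradient_boost(X, y)
print(np.mean((pred - y) ** 2))  # training MSE, well below the variance of y
```

Other losses change only what "residual" means: the stump is always fit to the negative gradient of the loss at the current prediction, which is what makes the procedure gradient descent in function space.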