
Commit a377e24

20260131 - prediction
1 parent 0139b98 commit a377e24

2 files changed: 5 additions, 2 deletions

correlation.qmd

Lines changed: 1 addition & 1 deletion
@@ -1458,7 +1458,7 @@ plotly::ggplotly(plot_scatterplotWithRangeRestriction)
 As described in @sec-correlationCausation, correlation does not imply causation.
 There are several reasons (described in @sec-correlationCausation) that, just because `X` is correlated with `Y` does not necessarily mean that `X` causes `Y`.
 However, correlation can still be useful.
-In order for two processes to be causally related, they must be associated.
+In order for two processes to be causally related, they must be associated, as described in @sec-conditionsForCausality.
 That is, association is necessary but insufficient for causality.
 
 ## Conclusion {#sec-correlationConclusion}

machine-learning.qmd

Lines changed: 4 additions & 1 deletion
@@ -119,6 +119,9 @@ Machine learning is a class of algorithmic approaches that are used to identify
 Machine learning takes us away from focusing on [causal inference](#sec-causalInference).
 Machine learning does not care about which processes are causal—i.e., which processes influence the outcome.
 Instead, machine learning cares about prediction—it cares about a predictor variable to the extent that it increases predictive accuracy regardless of whether it is causally related to the outcome.
+Nevertheless, association is necessary (despite being insufficient) for causality, as described in @sec-conditionsForCausality.
+Thus, achieving strong prediction is important (even if insufficient) for the model to be useful.
+If a model explains only a small portion of variance, it is difficult for it to be useful.
 
 Machine learning can be useful for leveraging big data and many predictor variables to develop predictive models with greater accuracy.
 However, many machine learning techniques are black boxes—it is often unclear how or why certain predictions are made, which can make it difficult to interpret the model's decisions and understand the underlying relationships between variables.
@@ -179,7 +182,7 @@ This chapter discusses several key ones:
 Supervised learning involves learning from data where the correct classification or outcome is known (and the classification is thus part of the data).
 For instance, predicting how many points a player will score is a supervised learning task, because there is a ground truth—the actual number of points scored—that can be used to train and evaluate the model.
 If the outcome variable is categorical, the approach involves classification.
-If the outcome vairable is continuous, the approach involves regression.
+If the outcome variable is continuous, the approach involves regression.
 
 Unlike linear and logistic regression, various machine learning techniques can handle [multicollinearity](#sec-multipleRegressionMulticollinearity), including [LASSO regression](#sec-lasso), [ridge regression](#sec-ridgeRegression), and [elastic net regression](#sec-elasticNet) via regularization.
 Regularization involves penalizing model complexity to avoid [overfitting](#sec-overfitting) [@Ramasubramanian2016].
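
Note on the hunk above: the statement that regularization helps with multicollinearity can be sketched with a ridge penalty on two nearly identical predictors. This is a minimal illustration, not part of the commit, written in Python with only the standard library. The simulated data, the penalty value of 1.0, and the helpers `dot` and `ridge_two_predictors` are hypothetical. The sketch assumes mean-zero variables and omits the intercept, and it solves the ridge normal equations directly for the two-predictor case rather than calling a modeling library.

```python
import random

random.seed(1)  # arbitrary seed so the run is reproducible

n = 200
# Two nearly collinear predictors: x2 is x1 plus a tiny amount of noise.
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [a + random.gauss(0, 0.01) for a in x1]
# Simulated outcome in which each predictor has a true coefficient of 1.
y = [a + b + random.gauss(0, 1) for a, b in zip(x1, x2)]


def dot(u, v):
    """Sum of element-wise products of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def ridge_two_predictors(x1, x2, y, penalty):
    """Solve (X'X + penalty * I) b = X'y for two predictors, no intercept.

    A penalty of 0 gives the ordinary least squares solution.
    """
    a = dot(x1, x1) + penalty
    b = dot(x1, x2)
    d = dot(x2, x2) + penalty
    c1, c2 = dot(x1, y), dot(x2, y)
    det = a * d - b * b  # determinant of the 2x2 system
    return ((d * c1 - b * c2) / det, (a * c2 - b * c1) / det)


ols = ridge_two_predictors(x1, x2, y, penalty=0.0)
ridge = ridge_two_predictors(x1, x2, y, penalty=1.0)

print("unpenalized coefficients:", ols)
print("ridge coefficients:      ", ridge)
```

With predictors this similar, the unpenalized coefficients tend to be unstable and can land far from the simulated values, while the penalized coefficients are shrunk toward a more stable split of the shared effect. LASSO and elastic net use different penalties and are not shown here.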
