regression2.Rmd: 15 additions & 13 deletions
@@ -72,7 +72,7 @@ to draw the straight line of best fit through our existing data points.
 The small subset of data as well as the line of best fit are shown
 in Figure \@ref(fig:08-lin-reg1).

-```{r 08-lin-reg1, message = FALSE, warning = FALSE, echo = FALSE, fig.height = 4, fig.width = 5, fig.cap = "Scatter plot of sale price versus size with line of best fit for subset of the Sacramento housing data."}
+```{r 08-lin-reg1, message = FALSE, warning = FALSE, echo = FALSE, fig.height = 3.5, fig.width = 4.5, fig.cap = "Scatter plot of sale price versus size with line of best fit for subset of the Sacramento housing data."}
 library(tidyverse)
 library(tidymodels)
 library(scales)
@@ -122,7 +122,7 @@ above to evaluate the predicted sale price given the value we have for the
-```{r 08-lin-reg2, message = FALSE, warning = FALSE, echo = FALSE, fig.height = 4, fig.width = 5, fig.cap = "Scatter plot of sale price versus size with line of best fit and a red dot at the predicted sale price for a 2000 square foot home."}
+```{r 08-lin-reg2, message = FALSE, warning = FALSE, echo = FALSE, fig.height = 3.5, fig.width = 4.5, fig.cap = "Scatter plot of sale price versus size with line of best fit and a red dot at the predicted sale price for a 2000 square foot home."}
 small_model <- lm(price ~ sqft, data = small_sacramento)
@@ -150,7 +150,7 @@ exactly does simple linear regression choose the line of best fit? Many
 different lines could be drawn through the data points.
 Some plausible examples are shown in Figure \@ref(fig:08-several-lines).

-```{r 08-several-lines, echo = FALSE, message = FALSE, warning = FALSE, fig.height = 4, fig.width = 5, fig.cap = "Scatter plot of sale price versus size with many possible lines that could be drawn through the data points."}
+```{r 08-several-lines, echo = FALSE, message = FALSE, warning = FALSE, fig.height = 3.5, fig.width = 4.5, fig.cap = "Scatter plot of sale price versus size with many possible lines that could be drawn through the data points."}
 small_plot +
 geom_abline(intercept = -64542.23, slope = 190, color = "green") +
 geom_abline(intercept = -6900, slope = 175, color = "purple") +
@@ -165,7 +165,7 @@ accuracy of a simple linear regression model,
 we use RMSPE—the same measure of predictive performance we used with KNN regression.
 \index{RMSPE}

-```{r 08-verticalDistToMin, echo = FALSE, message = FALSE, warning = FALSE, fig.height = 4, fig.width = 5, fig.cap = "Scatter plot of sale price versus size with red lines denoting the vertical distances between the predicted values and the observed data points."}
+```{r 08-verticalDistToMin, echo = FALSE, message = FALSE, warning = FALSE, fig.height = 3.5, fig.width = 4.5, fig.cap = "Scatter plot of sale price versus size with red lines denoting the vertical distances between the predicted values and the observed data points."}
@@ -268,7 +268,7 @@ linear regression predicted line of best fit. By default `geom_smooth` adds some
 to the plot that we are not interested in at this point; we provide the argument `se = FALSE` to
 tell `geom_smooth` not to show that information. Figure \@ref(fig:08-lm-predict-all) displays the result.

-```{r 08-lm-predict-all, fig.height = 4, fig.width = 5, warning = FALSE, message = FALSE, fig.cap = "Scatter plot of sale price versus size with line of best fit for the full Sacramento housing data."}
+```{r 08-lm-predict-all, fig.height = 3.5, fig.width = 4.5, warning = FALSE, message = FALSE, fig.cap = "Scatter plot of sale price versus size with line of best fit for the full Sacramento housing data."}
 lm_plot_final <- ggplot(sacramento_train, aes(x = sqft, y = price)) +
lm_plot_outlier_large <- ggplot(sacramento_train, aes(x = sqft, y = price)) +
@@ -660,7 +662,7 @@ Since the two people are each slightly inaccurate, the two measurements might
 not agree exactly, but they are very strongly linearly related to each other,
 as shown in Figure \@ref(fig:08-lm-multicol).

-```{r 08-lm-multicol, fig.height = 4, fig.width = 5, warning = FALSE, echo = FALSE, fig.cap = "Scatter plot of the with possible outlier highlighted in red."}
+```{r 08-lm-multicol, fig.height = 3.5, fig.width = 4.5, warning = FALSE, echo = FALSE, fig.cap = "Scatter plot of house size (in square inches) versus house size (in square feet)."}
 sacramento_train <- sacramento_train |>
 mutate(sqft1 = sqft + 100 * sample(1000000,
 size=nrow(sacramento_train),
@@ -793,7 +795,7 @@ df <- df |>
 df
 ```

-```{r 08-predictor-design, message = FALSE, warning = FALSE, echo = FALSE, fig.height = 4, fig.width = 5, fig.cap = "Example of a data set with a nonlinear relationship between the predictor and the response."}
+```{r 08-predictor-design, message = FALSE, warning = FALSE, echo = FALSE, fig.height = 3.5, fig.width = 4.5, fig.cap = "Example of a data set with a nonlinear relationship between the predictor and the response."}
 curve_plt <- ggplot(df, aes(x = x, y = y)) +
 geom_point() +
 xlab("x") +
@@ -820,7 +822,7 @@ Note that none of the `y` response values have changed between Figures \@ref(fig
 and \@ref(fig:08-predictor-design-2); the only change is that the `x` values
 have been replaced by `z` values.

-```{r 08-predictor-design-2, message = FALSE, warning = FALSE, echo = FALSE, fig.height = 4, fig.width = 5, fig.cap = "Relationship between the transformed predictor and the response."}
+```{r 08-predictor-design-2, message = FALSE, warning = FALSE, echo = FALSE, fig.height = 3.5, fig.width = 4.5, fig.cap = "Relationship between the transformed predictor and the response."}