Commit 9975fb5
fixed figure captions for captions with underscores and merged dev
2 parents 3ceb826 + dad07d8 commit 9975fb5

File tree

8 files changed (+131, -41 lines)

classification1.Rmd

Lines changed: 8 additions & 2 deletions

````diff
@@ -11,6 +11,7 @@ knitr::opts_chunk$set(echo = TRUE,
 options(knitr.table.format = function() {
   if (knitr::is_latex_output()) 'latex' else 'pandoc'
 })
+reticulate::use_miniconda('r-reticulate')
 ```

 ## Overview
@@ -572,7 +573,7 @@ Based on $K=5$ nearest neighbors with these three predictors we would classify t
 Figure \@ref(fig:05-more) shows what the data look like when we visualize them
 as a 3-dimensional scatter with lines from the new observation to its five nearest neighbors.

-```{r 05-more, echo = FALSE, message = FALSE, fig.cap = "3D scatter plot of the standardized symmetry, concavity, and perimeter variables. Note that in general we recommend against using 3D visualizations; here we show the data in 3D only to illustrate what higher dimensions and nearest neighbors look like, for learning purposes.", fig.retina=2, out.width="80%"}
+```{r 05-more, echo = FALSE, message = FALSE, fig.cap = "3D scatter plot of the standardized symmetry, concavity, and perimeter variables. Note that in general we recommend against using 3D visualizations; here we show the data in 3D only to illustrate what higher dimensions and nearest neighbors look like, for learning purposes.", fig.retina=2, out.width="100%"}
 attrs <- c("Perimeter", "Concavity", "Symmetry")

 # create new scaled obs and get NNs
@@ -602,7 +603,7 @@ plot_3d <- scaled_cancer_3 |>
     z = ~Symmetry,
     color = ~Class,
     opacity = 0.4,
-    size = 150,
+    size = 2,
     colors = c("orange2", "steelblue2", "red"),
     symbol = ~Class, symbols = c('circle','circle','diamond'))

@@ -641,6 +642,11 @@ plot_3d <- plot_3d %>%
 if(!is_latex_output()){
   plot_3d
 } else {
+  # scene = list(camera = list(eye = list(x=2, y=2, z = 1.5)))
+  # plot_3d <- plot_3d %>% layout(scene = scene)
+  # save_image(plot_3d, "img/plot3d_knn_classification.png", scale = 10)
+  # cannot adjust size of points in this plot for pdf
+  # so using a screenshot for now instead
   knitr::include_graphics("img/plot3d_knn_classification.png")
 }
 ```
````

clustering.Rmd

Lines changed: 15 additions & 5 deletions

````diff
@@ -254,7 +254,9 @@ In the first cluster from the example, there are `r nrow(clus1)` data points. Th
 (`r paste("flipper_length_standardized =", round(mean(clus1$flipper_length_standardized),2))` and `r paste("bill_length_standardized =", round(mean(clus1$bill_length_standardized),2))`) highlighted
 in Figure \@ref(fig:10-toy-example-clus1-center).

-```{r 10-toy-example-clus1-center, echo = FALSE, warning = FALSE, fig.height = 4, fig.width = 4.35, fig.cap = "Cluster 1 from the `penguin_data` data set example. Observations are in blue, with the cluster center highlighted in red."}
+(ref:10-toy-example-clus1-center) Cluster 1 from the `penguin_data` data set example. Observations are in blue, with the cluster center highlighted in red.
+
+```{r 10-toy-example-clus1-center, echo = FALSE, warning = FALSE, fig.height = 4, fig.width = 4.35, fig.cap = "(ref:10-toy-example-clus1-center)"}
 base <- ggplot(data, aes(x = flipper_length_standardized, y = bill_length_standardized)) +
   geom_point() +
   xlab("Flipper Length (standardized)") +
@@ -299,7 +301,9 @@ S^2 = \left((x_1 - \mu_x)^2 + (y_1 - \mu_y)^2\right) + \left((x_2 - \mu_x)^2 + (

 These distances are denoted by lines in Figure \@ref(fig:10-toy-example-clus1-dists) for the first cluster of the penguin data example.

-```{r 10-toy-example-clus1-dists, echo = FALSE, warning = FALSE, fig.height = 4, fig.width = 4.35, fig.cap = "Cluster 1 from the `penguin_data` data set example. Observations are in blue, with the cluster center highlighted in red. The distances from the observations to the cluster center are represented as black lines."}
+(ref:10-toy-example-clus1-dists) Cluster 1 from the `penguin_data` data set example. Observations are in blue, with the cluster center highlighted in red. The distances from the observations to the cluster center are represented as black lines.
+
+```{r 10-toy-example-clus1-dists, echo = FALSE, warning = FALSE, fig.height = 4, fig.width = 4.35, fig.cap = "(ref:10-toy-example-clus1-dists)"}
 base <- ggplot(clus1) +
   geom_point(aes(y = bill_length_standardized,
                  x = flipper_length_standardized),
@@ -336,7 +340,9 @@ this means adding up all the squared distances for the 18 observations.
 These distances are denoted by black lines in
 Figure \@ref(fig:10-toy-example-all-clus-dists).

-```{r 10-toy-example-all-clus-dists, echo = FALSE, warning = FALSE, fig.height = 4, fig.width = 5, fig.cap = "All clusters from the `penguin_data` data set example. Observations are in orange, blue, and yellow with the cluster center highlighted in red. The distances from the observations to each of the respective cluster centers are represented as black lines."}
+(ref:10-toy-example-all-clus-dists) All clusters from the `penguin_data` data set example. Observations are in orange, blue, and yellow with the cluster center highlighted in red. The distances from the observations to each of the respective cluster centers are represented as black lines.
+
+```{r 10-toy-example-all-clus-dists, echo = FALSE, warning = FALSE, fig.height = 4, fig.width = 5, fig.cap = "(ref:10-toy-example-all-clus-dists)"}


 all_clusters_base <- data |>
@@ -431,7 +437,9 @@ There each row corresponds to an iteration,
 where the left column depicts the center update,
 and the right column depicts the reassignment of data to clusters.

-```{r 10-toy-kmeans-iter, echo = FALSE, warning = FALSE, fig.height = 16, fig.width = 8, fig.cap = "First four iterations of K-means clustering on the `penguin_data` example data set. Each row corresponds to an iteration, where the left column depicts the center update, and the right column depicts the reassignment of data to clusters. Cluster centers are indicated by larger points that are outlined in black."}
+(ref:10-toy-kmeans-iter) First four iterations of K-means clustering on the `penguin_data` example data set. Each row corresponds to an iteration, where the left column depicts the center update, and the right column depicts the reassignment of data to clusters. Cluster centers are indicated by larger points that are outlined in black.
+
+```{r 10-toy-kmeans-iter, echo = FALSE, warning = FALSE, fig.height = 16, fig.width = 8, fig.cap = "(ref:10-toy-kmeans-iter)"}
 list_plot_cntrs <- vector(mode = "list", length = 4)
 list_plot_lbls <- vector(mode = "list", length = 4)

@@ -557,7 +565,9 @@ plt_lbl

 Figure \@ref(fig:10-toy-kmeans-bad-iter) shows what the iterations of K-means would look like with the unlucky random initialization shown in Figure \@ref(fig:10-toy-kmeans-bad-init).

-```{r 10-toy-kmeans-bad-iter, echo = FALSE, warning = FALSE, fig.height = 20, fig.width = 8, fig.cap = "First five iterations of K-means clustering on the `penguin_data` example data set with a poor random initialization. Each row corresponds to an iteration, where the left column depicts the center update, and the right column depicts the reassignment of data to clusters. Cluster centers are indicated by larger points that are outlined in black."}
+(ref:10-toy-kmeans-bad-iter) First five iterations of K-means clustering on the `penguin_data` example data set with a poor random initialization. Each row corresponds to an iteration, where the left column depicts the center update, and the right column depicts the reassignment of data to clusters. Cluster centers are indicated by larger points that are outlined in black.
+
+```{r 10-toy-kmeans-bad-iter, echo = FALSE, warning = FALSE, fig.height = 20, fig.width = 8, fig.cap = "(ref:10-toy-kmeans-bad-iter)"}
 list_plot_cntrs <- vector(mode = "list", length = 5)
 list_plot_lbls <- vector(mode = "list", length = 5)
````

img/generate-pat_01.png (120 KB)

img/generate-pat_02.png (402 KB)

img/generate-pat_03.png (202 KB)

regression1.Rmd

Lines changed: 8 additions & 4 deletions

````diff
@@ -5,6 +5,7 @@ library(knitr)
 library(plotly)

 knitr::opts_chunk$set(fig.align = "center")
+reticulate::use_miniconda('r-reticulate')
 ```

 ## Overview
@@ -759,7 +760,7 @@ Figure \@ref(fig:07-knn-mult-viz) visualizes the model's predictions overlaid on
 time the predictions are a surface in 3D space, instead of a line in 2D space, as we have 2
 predictors instead of 1.

-```{r 07-knn-mult-viz, echo = FALSE, message = FALSE, warning = FALSE, fig.cap = "KNN regression model’s predictions represented as a surface in 3D space overlaid on top of the data using three predictors (price, house size, and the number of bedrooms). Note that in general we recommend against using 3D visualizations; here we use a 3D visualization only to illustrate what the surface of predictions looks like for learning purposes.", out.width="80%"}
+```{r 07-knn-mult-viz, echo = FALSE, message = FALSE, warning = FALSE, fig.cap = "KNN regression model’s predictions represented as a surface in 3D space overlaid on top of the data using three predictors (price, house size, and the number of bedrooms). Note that in general we recommend against using 3D visualizations; here we use a 3D visualization only to illustrate what the surface of predictions looks like for learning purposes.", out.width="100%"}
 xvals <- seq(from = min(sacramento_train$sqft),
              to = max(sacramento_train$sqft),
              length = 50)
@@ -780,12 +781,12 @@ plot_3d <- plot_ly() |>
     x = ~sqft,
     y = ~beds,
     z = ~price,
-    marker = list(size = 5, opacity = 0.4, color = "red")
+    marker = list(size = 2, opacity = 0.4, color = "red")
   ) |>
   layout(scene = list(
-    xaxis = list(title = "House size (square feet)"),
+    xaxis = list(title = "Size (sq ft)"),
     zaxis = list(title = "Price (USD)"),
-    yaxis = list(title = "Number of bedrooms")
+    yaxis = list(title = "Bedrooms")
   )) |>
   add_surface(
     x = ~xvals,
@@ -797,6 +798,9 @@ plot_3d <- plot_ly() |>
 if(!is_latex_output()){
   plot_3d
 } else {
+  scene = list(camera = list(eye = list(x = -2.1, y = -2.2, z = 0.75)))
+  plot_3d <- plot_3d |> layout(scene = scene)
+  save_image(plot_3d, "img/plot3d_knn_regression.png", scale = 10)
   knitr::include_graphics("img/plot3d_knn_regression.png")
 }
 ```
````
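The `else` branch added in the last hunk follows a common bookdown pattern for interactive plotly figures: show the live htmlwidget in HTML output and fall back to a static image for PDF (LaTeX) output, since htmlwidgets cannot be embedded in PDF. Note that `plotly::save_image()` relies on the kaleido Python backend, which is presumably why these commits also add `reticulate::use_miniconda('r-reticulate')` to the setup chunks. A minimal sketch of the fallback, with the file path illustrative:

```r
if (!knitr::is_latex_output()) {
  plot_3d  # interactive figure for HTML output
} else {
  # static export requires the kaleido backend via reticulate;
  # typically generated with plotly::save_image(plot_3d, "img/plot3d_example.png")
  knitr::include_graphics("img/plot3d_example.png")
}
```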

regression2.Rmd

Lines changed: 8 additions & 4 deletions

````diff
@@ -5,6 +5,7 @@ library(knitr)
 library(plotly)

 knitr::opts_chunk$set(fig.align = "center")
+reticulate::use_miniconda('r-reticulate')
 ```

 ## Overview
@@ -453,7 +454,7 @@ is `r format(round(lm_mult_test_results %>% filter(.metric == 'rmse') %>% pull(.
 In the case of two predictors, we can plot the predictions made by our linear regression creates a *plane* of best fit, as
 shown in Figure \@ref(fig:08-3DlinReg).

-```{r 08-3DlinReg, echo = FALSE, message = FALSE, warning = FALSE, fig.cap = "Linear regression plane of best fit overlaid on top of the data (using price, house size, and number of bedrooms as predictors). Note that in general we recommend against using 3D visualizations; here we use a 3D visualization only to illustrate what the regression plane looks like for learning purposes.", out.width="80%"}
+```{r 08-3DlinReg, echo = FALSE, message = FALSE, warning = FALSE, fig.cap = "Linear regression plane of best fit overlaid on top of the data (using price, house size, and number of bedrooms as predictors). Note that in general we recommend against using 3D visualizations; here we use a 3D visualization only to illustrate what the regression plane looks like for learning purposes.", out.width="100%"}
 xvals <- seq(from = min(sacramento_train$sqft),
              to = max(sacramento_train$sqft),
              length = 50)
@@ -474,12 +475,12 @@ plot_3d <- plot_ly() |>
     x = ~sqft,
     y = ~beds,
     z = ~price,
-    marker = list(size = 5, opacity = 0.4, color = "red")
+    marker = list(size = 2, opacity = 0.4, color = "red")
   ) |>
   layout(scene = list(
-    xaxis = list(title = "House size (square feet)"),
+    xaxis = list(title = "Size (sq ft)"),
     zaxis = list(title = "Price (USD)"),
-    yaxis = list(title = "Number of bedrooms")
+    yaxis = list(title = "Bedrooms")
   )) |>
   add_surface(
     x = ~xvals,
@@ -491,6 +492,9 @@ plot_3d <- plot_ly() |>
 if(!is_latex_output()){
   plot_3d
 } else {
+  scene = list(camera = list(eye = list(x = -2.1, y = -2.2, z = 0.75)))
+  plot_3d <- plot_3d %>% layout(scene = scene)
+  save_image(plot_3d, "img/plot3d_linear_regression.png", scale = 10)
   knitr::include_graphics("img/plot3d_linear_regression.png")
 }
 ```
````
