Commit 35b6a81

remaining additional resource sections
1 parent 37a1abd commit 35b6a81

7 files changed (+113, -20 lines)

classification2.Rmd

Lines changed: 21 additions & 2 deletions
@@ -1394,5 +1394,24 @@ please follow the instructions for computer setup needed to run the worksheets
 found in Chapter \@ref(move-to-your-own-machine).
 
 ## Additional resources
-- The [`tidymodels` website](https://tidymodels.org/packages) is an excellent reference for more details on, and advanced usage of, the functions and packages in the past two chapters. Aside from that, it also has a [nice beginner's tutorial](https://www.tidymodels.org/start/) and [an extensive list of more advanced examples](https://www.tidymodels.org/learn/) that you can use to continue learning beyond the scope of this book. It's worth noting that the `tidymodels` package does a lot more than just classification, and so the examples on the website similarly go beyond classification as well. In the next two chapters, you'll learn about another kind of predictive modeling setting, so it might be worth visiting the website only after reading through those chapters.
-- [*An Introduction to Statistical Learning*](https://www.statlearning.com/) [@james2013introduction] provides a great next stop in the process of learning about classification. Chapter 4 discusses additional basic techniques for classification that we do not cover, such as logistic regression, linear discriminant analysis, and naive Bayes. Chapter 5 goes into much more detail about cross-validation. Chapters 8 and 9 cover decision trees and support vector machines, two very popular but more advanced classification methods. Finally, Chapter 6 covers a number of methods for selecting predictor variables. Note that while this book is still a very accessible introductory text, it requires a bit more mathematical background than we require.
+- The [`tidymodels` website](https://tidymodels.org/packages) is an excellent
+  reference for more details on, and advanced usage of, the functions and
+  packages in the past two chapters. Aside from that, it also has a [nice
+  beginner's tutorial](https://www.tidymodels.org/start/) and [an extensive list
+  of more advanced examples](https://www.tidymodels.org/learn/) that you can use
+  to continue learning beyond the scope of this book. It's worth noting that the
+  `tidymodels` package does a lot more than just classification, so the examples
+  on the website go beyond classification as well. In the next two chapters,
+  you'll learn about another kind of predictive modeling setting, so it might be
+  worth visiting the website only after reading through those chapters.
+- *An Introduction to Statistical Learning* [@james2013introduction] provides a
+  great next stop in the process of learning about classification. Chapter 4
+  discusses additional basic techniques for classification that we do not cover,
+  such as logistic regression, linear discriminant analysis, and naive Bayes.
+  Chapter 5 goes into much more detail about cross-validation. Chapters 8 and 9
+  cover decision trees and support vector machines, two very popular but more
+  advanced classification methods. Finally, Chapter 6 covers a number of methods
+  for selecting predictor variables. Note that while this book is still a very
+  accessible introductory text, it assumes a bit more mathematical background
+  than we do.
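The K-nearest neighbors classifier behind the `tidymodels` workflows these bullets refer to can be illustrated in a few lines of base R. The sketch below is a hypothetical illustration, not code from this commit; `knn_predict` and the toy data are invented here.

```r
# Hypothetical base-R sketch of K-nearest neighbors classification.
knn_predict <- function(train_x, train_y, new_x, k = 3) {
  # Euclidean distance from the new point to every training point
  d <- sqrt(rowSums(sweep(train_x, 2, new_x)^2))
  # take the labels of the k closest points and predict the majority class
  nearest <- train_y[order(d)[seq_len(k)]]
  names(which.max(table(nearest)))
}

# Two well-separated toy classes
train_x <- rbind(c(0, 0), c(0, 1), c(1, 0), c(5, 5), c(5, 6), c(6, 5))
train_y <- c("a", "a", "a", "b", "b", "b")
knn_predict(train_x, train_y, c(5, 5), k = 3)   # "b"
```

A real analysis would instead use a `tidymodels` model specification as the chapters describe; this sketch only shows the distance-and-vote idea underneath.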

clustering.Rmd

Lines changed: 9 additions & 1 deletion
@@ -1106,4 +1106,12 @@ please follow the instructions for computer setup needed to run the worksheets
 found in Chapter \@ref(move-to-your-own-machine).
 
 ## Additional resources
-- Chapter 10 of [*An Introduction to Statistical Learning*](https://www.statlearning.com/) [@james2013introduction] provides a great next stop in the process of learning about clustering and unsupervised learning in general. In the realm of clustering specifically, it provides a great companion introduction to K-means, but also covers *hierarchical* clustering for when you expect there to be subgroups, and then subgroups within subgroups, etc., in your data. In the realm of more general unsupervised learning, it covers *principal components analysis (PCA)*, which is a very popular technique for reducing the number of predictors in a dataset.
+- Chapter 10 of *An Introduction to Statistical Learning* [@james2013introduction]
+  provides a great next stop in the process of learning about clustering and
+  unsupervised learning in general. In the realm of clustering specifically, it
+  provides a great companion introduction to K-means, but also covers
+  *hierarchical* clustering for when you expect there to be subgroups, and then
+  subgroups within subgroups, etc., in your data. In the realm of more general
+  unsupervised learning, it covers *principal components analysis (PCA)*, which
+  is a very popular technique for reducing the number of predictors in a dataset.
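All three techniques this bullet names have entry points in base R's `stats` package. A hedged sketch, with toy data invented here (none of this is code from the commit):

```r
# Toy data: two well-separated groups, 20 points each (invented for illustration)
set.seed(42)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 5), ncol = 2))

km <- kmeans(x, centers = 2, nstart = 10)  # K-means, as covered in this chapter
hc <- cutree(hclust(dist(x)), k = 2)       # hierarchical clustering (ISL Ch. 10)
pc <- prcomp(x, scale. = TRUE)             # principal components analysis

length(unique(km$cluster))   # 2 clusters found
ncol(pc$x)                   # 2 principal components for 2 input columns
```

`hclust` builds the full dendrogram and `cutree` slices it at a chosen number of groups, which is what makes it suitable for subgroups-within-subgroups structure.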

inference.Rmd

Lines changed: 17 additions & 2 deletions
@@ -1183,5 +1183,20 @@ found in Chapter \@ref(move-to-your-own-machine).
 
 ## Additional resources
 
-- Chapters 7 to 10 of [*Modern Dive*](https://moderndive.com/) provide a great next step in learning about inference. In particular, Chapters 7 and 8 cover sampling and bootstrapping using `tidyverse` and `infer` in a slightly more in-depth manner than the present chapter. Chapters 9 and 10 take the next step beyond the scope of this chapter and begin to provide some of the initial mathematical underpinnings of inference and more advanced applications of the concept of inference in testing hypotheses and performing regression. This material offers a great starting point for getting more into the technical side of statistics.
-- Chapters 4 to 7 of [*OpenIntro Statistics*](https://www.openintro.org/) provide a good next step after *Modern Dive*. Although it is still certainly an introductory text, things get a bit more mathematical here. Depending on your background, you may actually want to start going through Chapters 1 to 3 first, where you will learn some fundamental concepts in probability theory. Although it may seem like a diversion, probability theory is *the language of statistics*; if you have a solid grasp of probability, more advanced statistics will come naturally to you!
+- Chapters 7 to 10 of *Modern Dive* [@moderndive] provide a great next step in
+  learning about inference. In particular, Chapters 7 and 8 cover sampling and
+  bootstrapping using `tidyverse` and `infer` in a slightly more in-depth manner
+  than the present chapter. Chapters 9 and 10 take the next step beyond the
+  scope of this chapter and begin to provide some of the initial mathematical
+  underpinnings of inference and more advanced applications of inference in
+  testing hypotheses and performing regression. This material offers a great
+  starting point for getting more into the technical side of statistics.
+- Chapters 4 to 7 of *OpenIntro Statistics* [@openintro] provide a good next
+  step after *Modern Dive*. Although it is still certainly an introductory
+  text, things get a bit more mathematical here. Depending on your background,
+  you may actually want to start going through Chapters 1 to 3 first, where you
+  will learn some fundamental concepts in probability theory. Although it may
+  seem like a diversion, probability theory is *the language of statistics*; if
+  you have a solid grasp of probability, more advanced statistics will come
+  naturally to you!
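The bootstrapping that Chapters 7 and 8 of *Modern Dive* cover with `infer` boils down to resampling with replacement. A base-R sketch of the idea (the sample data here are invented, not from the commit):

```r
set.seed(123)
observed <- rnorm(40, mean = 10, sd = 2)   # stand-in for a real observed sample

# 1000 bootstrap resamples, each the same size as the original sample and
# drawn with replacement; record the mean of each resample
boot_means <- replicate(1000, mean(sample(observed, replace = TRUE)))

# percentile-method 95% confidence interval for the population mean
quantile(boot_means, c(0.025, 0.975))
```

The `infer` pipeline (`specify()`, `generate(type = "bootstrap")`, `calculate()`) wraps these same steps in a tidier grammar.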

jupyter.Rmd

Lines changed: 8 additions & 7 deletions
@@ -419,10 +419,11 @@ files using lower case characters and separating words by a dash (`-`) or an
 underscore (`_`).
 
 ## Additional resources
-- The [JupyterLab Documentation](https://jupyterlab.readthedocs.io/en/latest/) is a good
-next place to look for more information about working in Jupyter notebooks. This documentation
-goes into significantly more detail about all of the topics we covered in this chapter, and
-covers more advanced topics as well.
-- If you are keen to learn about the Markdown language for rich text formatting, two good places to
-start are [this Markdown cheatsheet](https://commonmark.org/help/)
-and [Markdown tutorial](https://commonmark.org/help/tutorial/), both provided by CommonMark.
+- The [JupyterLab Documentation](https://jupyterlab.readthedocs.io/en/latest/)
+  is a good next place to look for more information about working in Jupyter
+  notebooks. This documentation goes into significantly more detail about all of
+  the topics we covered in this chapter, and covers more advanced topics as well.
+- If you are keen to learn about the Markdown language for rich text
+  formatting, two good places to start are CommonMark's [Markdown
+  cheatsheet](https://commonmark.org/help/) and [Markdown
+  tutorial](https://commonmark.org/help/tutorial/).
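For readers who have not used Markdown before, the rich text formatting the bullet refers to looks like the following generic sample (not drawn from the commit):

```markdown
# A level-one heading

Plain text with *emphasis*, **strong emphasis**, and `inline code`.

- a bulleted list item
- [a link](https://commonmark.org)

1. a numbered list item
```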

references.bib

Lines changed: 14 additions & 0 deletions
@@ -349,6 +349,13 @@ @misc{stanfordhealthcare
   url = {https://stanfordhealthcare.org/medical-conditions/cancer/cancer.html}
 }
 
+@book{moderndive,
+  title = {Statistical Inference via Data Science: A {M}odern{D}ive into {R} and the {T}idyverse},
+  author = {Chester Ismay and Albert Kim},
+  year = {2020},
+  publisher = {Chapman and Hall/CRC Press},
+  url = {https://moderndive.com/}}
+
 @book{wickham2016r,
   title = {R for Data Science: Import, Tidy, Transform, Visualize, and Model Data},
   author = {Wickham, Hadley and Grolemund, Garrett},
@@ -415,3 +422,10 @@ @article{lubridatepaper
   volume = {40},
   number = {3},
   pages = {1--25}}
+
+@book{openintro,
+  title = {OpenIntro Statistics},
+  author = {David Diez and Mine \c{C}etinkaya-Rundel and Christopher Barr},
+  year = {2019},
+  publisher = {OpenIntro, Inc.},
+  url = {https://openintro.org/book/os/}}

regression2.Rmd

Lines changed: 25 additions & 3 deletions
@@ -902,6 +902,28 @@ please follow the instructions for computer setup needed to run the worksheets
 found in Chapter \@ref(move-to-your-own-machine).
 
 ## Additional resources
-- The [`tidymodels` website](https://tidymodels.org/packages) is an excellent reference for more details on, and advanced usage of, the functions and packages in the past two chapters. Aside from that, it also has a [nice beginner's tutorial](https://www.tidymodels.org/start/) and [an extensive list of more advanced examples](https://www.tidymodels.org/learn/) that you can use to continue learning beyond the scope of this book.
-- [*Modern Dive*](https://moderndive.com/) is another textbook that uses the `tidyverse` / `tidymodels` framework. Chapter 6 complements the material in the current chapter well; it covers some slightly more advanced concepts than we do without getting mathematical. Give this chapter a read before moving on to the next reference. It is also worth noting that this book takes a more "explanatory" / "inferential" approach to regression in general (in Chapters 5, 6, and 10), which provides a nice complement to the predictive tack we take in the present book.
-- [*An Introduction to Statistical Learning*](https://www.statlearning.com/) [@james2013introduction] provides a great next stop in the process of learning about regression. Chapter 3 covers linear regression at a slightly more mathematical level than we do here, but it is not too large a leap and so should provide a good stepping stone. Chapter 6 discusses how to pick a subset of "informative" predictors when you have a data set with many predictors, and you expect only a few of them to be relevant. Chapter 7 covers regression models that are more flexible than linear regression models but still enjoy the computational efficiency of linear regression. In contrast, the KNN methods we covered earlier are indeed more flexible but become very slow when given lots of data.
+- The [`tidymodels` website](https://tidymodels.org/packages) is an excellent
+  reference for more details on, and advanced usage of, the functions and
+  packages in the past two chapters. Aside from that, it also has a [nice
+  beginner's tutorial](https://www.tidymodels.org/start/) and [an extensive list
+  of more advanced examples](https://www.tidymodels.org/learn/) that you can use
+  to continue learning beyond the scope of this book.
+- *Modern Dive* [@moderndive] is another textbook that uses the
+  `tidyverse` / `tidymodels` framework. Chapter 6 complements the material in
+  the current chapter well; it covers some slightly more advanced concepts than
+  we do without getting mathematical. Give this chapter a read before moving on
+  to the next reference. It is also worth noting that this book takes a more
+  "explanatory" / "inferential" approach to regression in general (in Chapters 5,
+  6, and 10), which provides a nice complement to the predictive tack we take in
+  the present book.
+- *An Introduction to Statistical Learning* [@james2013introduction] provides a
+  great next stop in the process of learning about regression. Chapter 3 covers
+  linear regression at a slightly more mathematical level than we do here, but
+  it is not too large a leap and so should provide a good stepping stone.
+  Chapter 6 discusses how to pick a subset of "informative" predictors when you
+  have a data set with many predictors and you expect only a few of them to be
+  relevant. Chapter 7 covers regression models that are more flexible than
+  linear regression but still enjoy its computational efficiency. In contrast,
+  the KNN methods we covered earlier are indeed more flexible but become very
+  slow when given lots of data.
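The linear regression that ISL's Chapter 3 covers (and that this chapter fits via `tidymodels`) is available directly in base R as `lm()`. A small sketch with exactly linear data invented here, so the fitted coefficients are known in advance:

```r
# Invented data lying exactly on the line y = 2x + 1
d <- data.frame(x = 1:10, y = 2 * (1:10) + 1)

fit <- lm(y ~ x, data = d)                    # ordinary least squares
coef(fit)                                     # intercept 1, slope 2
predict(fit, newdata = data.frame(x = 11))    # 23
```

The `tidymodels` equivalent wraps the same engine in a `linear_reg()` model specification, which is what makes swapping in the more flexible models ISL's Chapter 7 describes straightforward.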

version-control.Rmd

Lines changed: 19 additions & 5 deletions
@@ -940,8 +940,22 @@ found in Chapter \@ref(move-to-your-own-machine).
 Now that you've picked up the basics of version control with Git and GitHub,
 you can expand your knowledge through the resources listed below:
 
-- GitHub's [guides website](https://guides.github.com/) and [YouTube channel](https://www.youtube.com/githubguides), and [*Happy Git with R*](https://happygitwithr.com/) are great resources to take the next steps in learning about Git and GitHub.
-- [Good enough practices in scientific computing](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510#sec014) [-@wilson2014best] provides more advice on useful workflows and "good enough" practices in data analysis projects.
-- In addition to [GitHub](https://github.com), there are other popular Git repository hosting services such as [GitLab](https://gitlab.com) and [BitBucket](https://bitbucket.org). Comparing all of these options is beyond the scope of this book, and until you become a more advanced user, you are perfectly fine to just stick with GitHub. Just be aware that you have options!
-- [GitHub's documentation on creating a personal access token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) and the *Happy Git with R* [cache credentials for HTTPS](https://happygitwithr.com/credential-caching.html) chapter are both excellent additional resources to consult if you need additional help generating and using personal access tokens.
+- GitHub's [guides website](https://guides.github.com/) and [YouTube
+  channel](https://www.youtube.com/githubguides), and [*Happy Git and GitHub
+  for the useR*](https://happygitwithr.com/) are great resources to take the
+  next steps in learning about Git and GitHub.
+- [Good enough practices in scientific
+  computing](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510#sec014)
+  [@wilson2014best] provides more advice on useful workflows and "good enough"
+  practices in data analysis projects.
+- In addition to [GitHub](https://github.com), there are other popular Git
+  repository hosting services such as [GitLab](https://gitlab.com) and
+  [BitBucket](https://bitbucket.org). Comparing all of these options is beyond
+  the scope of this book, and until you become a more advanced user, you are
+  perfectly fine to just stick with GitHub. Just be aware that you have options!
+- GitHub's [documentation on creating a personal access
+  token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token)
+  and the *Happy Git and GitHub for the useR* [personal access tokens
+  chapter](https://happygitwithr.com/https-pat.html) are both excellent
+  resources to consult if you need more help generating and using personal
+  access tokens.
