cleaning up additional resources reading wrangling

trevorcampbell · trevorcampbell · commit 22fe5aa091ec · 2021-12-18T20:19:37.000-08:00
diff --git a/reading.Rmd b/reading.Rmd
@@ -346,7 +346,8 @@ above, R assigns each column a name of `X1, X2, X3, X4, X5, X6`.
 It is best to rename your columns to help differentiate between them 
 (e.g., `X1, X2`, etc., are not very descriptive names and will make it more confusing as
 you code). To rename your columns, you can use the `rename` function
-\index{rename} from the `dplyr` \index{dplyr} package (one of the packages
+\index{rename} from [the `dplyr` R package](https://dplyr.tidyverse.org/) [@dplyr]
+ \index{dplyr} (one of the packages
 loaded with `tidyverse`, so we don't need to load it separately). The first
 argument is the data set, and in the subsequent arguments you 
 write `new_name = old_name` for the selected variables to 
@@ -1225,14 +1226,33 @@ please follow the instructions for computer setup needed to run the worksheets
 found in Chapter \@ref(move-to-your-own-machine).
 
 ## Additional resources
-- The [`readr` page on the Tidyverse website](https://readr.tidyverse.org/) is where you should look if you want to learn more about the functions in this chapter, the full set of arguments you can use, and other related functions. The site also provides a very nice cheat sheet that summarizes many of the data wrangling functions from this chapter.
-- Sometimes you might run into data in such poor shape that none of the reading functions we cover in this chapter works. In that case, you can consult the [data import chapter](https://r4ds.had.co.nz/data-import.html) from [R for Data Science](https://r4ds.had.co.nz/), which goes into a lot more detail about how R parses text from files into data frames.
-- The documentation for many of the reading functions we cover in this chapter can be found [on the Tidyverse website](https://readr.tidyverse.org/reference/read_delim.html). This site shows you the full set of arguments available for each function.
-- The [`here` package](https://cran.r-project.org/web/packages/here/index.html) provides a way for you to construct or find your files' paths. 
-- The [`readxl` documentation](https://readxl.tidyverse.org/) provides more details on reading data from Excel, such as reading in data with multiple sheets, or specifying the cells to read in. 
-- The [`rio` package](https://github.com/leeper/rio) provides an alternative set of tools for reading and writing data in R. It aims to be a "Swiss army knife" for data reading/writing/converting, and supports a wide variety of data types (including data formats generated by other statistical software like SPSS and SAS).
-- This [video](https://www.youtube.com/embed/ephId3mYu9o) from the [Udacity course "Linux Command Line Basics"](https://www.udacity.com/course/linux-command-line-basics--ud595) provides a good explanation of absolute versus relative paths.
-- If you read the subsection on obtaining data from the web via scraping and APIs, we provide two companion tutorial video links:
-    - [A brief video tutorial](https://www.youtube.com/embed/YdIWI6K64zo) on using the SelectorGadget tool to obtain desired CSS selectors for extracting the price and size data for apartment listings on Craigslist
-    - [Another brief video tutorial](https://www.youtube.com/embed/O9HKbdhqYzk) on using the SelectorGadget tool to obtain desired CSS selectors for extracting Canadian city names and 2016 census populations from Wikipedia
-- The [`polite` package](https://cran.r-project.org/web/packages/polite/index.html) provides a set of tools for responsibly scraping data from websites. 
+- The [`readr` documentation](https://readr.tidyverse.org/) 
+  describes many of the reading functions we cover in this chapter.
+  It is where you should look if you want to learn more about the functions in this
+  chapter, the full set of arguments you can use, and other related functions.
+  The site also provides a very nice cheat sheet that summarizes many of the data
+  wrangling functions from this chapter.
+- Sometimes you might run into data in such poor shape that none of the reading
+  functions we cover in this chapter work. In that case, you can consult the
+  [data import chapter](https://r4ds.had.co.nz/data-import.html) from *R for Data
+  Science* [@wickham2016r], which goes into a lot more detail about how R parses
+  text from files into data frames.
+- The [`here` R package](https://here.r-lib.org/) [@here]
+  provides a way for you to construct or find your files' paths. 
+- The [`readxl` documentation](https://readxl.tidyverse.org/) provides more
+  details on reading data from Excel, such as reading in data with multiple
+  sheets, or specifying the cells to read in. 
+- The [`rio` R package](https://github.com/leeper/rio) [@rio] provides an alternative
+  set of tools for reading and writing data in R. It aims to be a "Swiss army
+  knife" for data reading/writing/converting, and supports a wide variety of data
+  types (including data formats generated by other statistical software like SPSS
+  and SAS).
+- A [video](https://www.youtube.com/embed/ephId3mYu9o) from the Udacity
+  course *Linux Command Line Basics* provides a good explanation of absolute versus relative paths.
+- If you read the subsection on obtaining data from the web via scraping and
+  APIs, we provide two companion tutorial video links for how to use the
+  SelectorGadget tool to obtain desired CSS selectors for:
+    - [extracting the price and size data for apartment listings on Craigslist](https://www.youtube.com/embed/YdIWI6K64zo)
+    - [extracting Canadian city names and 2016 census populations from Wikipedia](https://www.youtube.com/embed/O9HKbdhqYzk)
+- The [`polite` R package](https://dmi3kno.github.io/polite/) [@polite] provides
+  a set of tools for responsibly scraping data from websites. 
diff --git a/references.bib b/references.bib
@@ -364,3 +364,33 @@ @book{wickham2019advanced
   publisher={CRC Press},
   url = {https://adv-r.hadley.nz/}
 }
+
+@Manual{here,
+  title = {{here R package}},
+  author = {Kirill M\"uller},
+  year = {2020},
+  url = {https://here.r-lib.org/}}
+
+@Manual{rio,
+  title = {{rio R package}},
+  author = {Thomas Leeper},
+  year = {2021},
+  url = {https://cloud.r-project.org/web/packages/rio/index.html}}
+
+@Manual{polite,
+  title = {{polite R package}},
+  author = {Dmytro Perepolkin},
+  year = {2021},
+  url = {https://dmi3kno.github.io/polite/}}
+
+@Manual{dplyr,
+  title = {{dplyr R package}},
+  author = {Hadley Wickham and Romain Fran\c{c}ois and Lionel Henry and Kirill M\"uller},
+  year = {2021},
+  url = {https://dplyr.tidyverse.org/}}
+
+@Manual{tidyselect,
+  title = {{tidyselect R package}},
+  author = {Lionel Henry and Hadley Wickham},
+  year = {2021},
+  url = {https://tidyselect.r-lib.org/}}
diff --git a/wrangling.Rmd b/wrangling.Rmd
@@ -1625,39 +1625,36 @@ found in Chapter \@ref(move-to-your-own-machine).
 
 ## Additional resources 
 
-  - As we mentioned earlier, `tidyverse` is actually an *R
-    meta package*: it installs and loads a collection of R packages that all
-    follow the tidy data philosophy we discussed above. One of the `tidyverse`
-    packages is `dplyr`&mdash;a data wrangling workhorse. You have already met many
-    of `dplyr`'s functions 
-    (`select`, `filter`, `mutate`, `arrange`, `summarize`, and `group_by`). 
-    To learn more about these functions and meet a few more useful
-    functions, we recommend you check out [this
-    chapter](https://stat545.com/block010_dplyr-end-single-table.html#where-were-we)
-    of the data wrangling, exploration, and analysis with R book.
-  - The [`dplyr` page on the Tidyverse website](https://dplyr.tidyverse.org/) is
-    another resource to learn more about the functions in this
-    chapter, the full set of arguments you can use, and other related functions.
-    The site also provides a very nice cheat sheet that summarizes many of the
-    data wrangling functions from this chapter.
-  - Check out the [`tidyselect` page](https://tidyselect.r-lib.org/reference/select_helpers.html) for a
-    comprehensive list of `select` helpers.
-  - [*R for Data Science*](https://r4ds.had.co.nz/) has a few chapters related to
-    data wrangling that go into more depth than this book. For example, the
-    [tidy data](https://r4ds.had.co.nz/tidy-data.html) chapter covers tidy data,
-    `pivot_longer`/`pivot_wider` and `separate`, but also covers missing values
-    and additional wrangling functions (like `unite`). The [data
-    transformation](https://r4ds.had.co.nz/transform.html) chapter covers
-    `select`, `filter`, `arrange`, `mutate`, and `summarize`. And the [`map`
-    functions](https://r4ds.had.co.nz/iteration.html#the-map-functions) chapter
-    provides more about the `map` functions.
-  - You will occasionally encounter a case where you need to iterate over items
-    in a data frame, but none of the above functions are flexible enough to do
-    what you want. In that case, you may consider using [a for
-    loop](https://r4ds.had.co.nz/iteration.html#iteration).
-  - There are many `select` helpers that can be used to efficiently subset 
-    columns in a data frame when paired with the `select` function, 
-    or other functions that also use the `tidyselect` syntax for column selection 
-    (e.g., `pivot-longer`). 
-    The [documentation for `select` helpers](https://tidyselect.r-lib.org/reference/select_helpers.html) 
-    is a useful reference to find the helper you need for your particular problem.
+- As we mentioned earlier, `tidyverse` is actually an *R
+  meta package*: it installs and loads a collection of R packages that all
+  follow the tidy data philosophy we discussed above. One of the `tidyverse`
+  packages is `dplyr`&mdash;a data wrangling workhorse. You have already met many
+  of `dplyr`'s functions 
+  (`select`, `filter`, `mutate`, `arrange`, `summarize`, and `group_by`). 
+  To learn more about these functions and meet a few more useful
+  functions, we recommend you check out Chapters 5-9 of the [STAT545 online notes](https://stat545.com/),
+  the online book on data wrangling, exploration, and analysis with R.
+- The [`dplyr` R package documentation](https://dplyr.tidyverse.org/) [@dplyr] is
+  another resource to learn more about the functions in this
+  chapter, the full set of arguments you can use, and other related functions.
+  The site also provides a very nice cheat sheet that summarizes many of the
+  data wrangling functions from this chapter.
+- Check out the [`tidyselect` R package page](https://tidyselect.r-lib.org/reference/select_helpers.html) 
+  [@tidyselect] for a comprehensive list of `select` helpers. 
+  These helpers can be used to choose columns in a data frame when paired with the `select` function 
+  (and other functions that use the `tidyselect` syntax, such as `pivot_longer`).
+  That page is a useful reference for finding the helper you need
+  for your particular problem.
+- *R for Data Science* [@wickham2016r] has a few chapters related to
+  data wrangling that go into more depth than this book. For example, the
+  [tidy data chapter](https://r4ds.had.co.nz/tidy-data.html) covers tidy data,
+  `pivot_longer`/`pivot_wider` and `separate`, but also covers missing values
+  and additional wrangling functions (like `unite`). The [data
+  transformation chapter](https://r4ds.had.co.nz/transform.html) covers
+  `select`, `filter`, `arrange`, `mutate`, and `summarize`. And the [`map`
+  functions chapter](https://r4ds.had.co.nz/iteration.html#the-map-functions)
+  provides more about the `map` functions.
+- You will occasionally encounter a case where you need to iterate over items
+  in a data frame, but none of the above functions are flexible enough to do
+  what you want. In that case, you may consider using [a for
+  loop](https://r4ds.had.co.nz/iteration.html#iteration).