208 changes: 145 additions & 63 deletions episodes/how-r-thinks-about-data.Rmd

Large diffs are not rendered by default.

161 changes: 117 additions & 44 deletions episodes/introduction-r-rstudio.Rmd

Large diffs are not rendered by default.

277 changes: 190 additions & 87 deletions episodes/visualizing-ggplot.Rmd

Large diffs are not rendered by default.

284 changes: 206 additions & 78 deletions episodes/working-with-data.Rmd

Large diffs are not rendered by default.

228 changes: 86 additions & 142 deletions instructors/instructor-notes.md

Large diffs are not rendered by default.

24 changes: 13 additions & 11 deletions learners/extra-challenges.Rmd
Expand Up @@ -4,7 +4,6 @@ teaching: 45
exercises: 3
---


```{r setup, include=FALSE}
knitr::opts_chunk$set(dpi = 200, out.height = 600, out.width = 600, R.options = list(max.print = 100))
```
Expand All @@ -14,22 +13,24 @@ library(tidyverse)
surveys <- read_csv("data/cleaned/surveys_complete_77_89.csv")
```

::::::::::::::::::::::::::::::::::::: challenge
::::::::::::::::::::::::::::::::::::: challenge

## Challenge: `ggplot2` syntax

There are some issues with these `ggplot2` examples. Can you figure out what is wrong with each one?
There are some issues with these `ggplot2` examples.
Can you figure out what is wrong with each one?

```{r, eval=FALSE}
ggplot(data = surveys,
mapping = aes(x = weight, y = hindfoot_length, color = "blue")) +
geom_point()
```

:::::::::::::::::::::::: solution

:::::::::::::::::::::::: solution

Our points don't actually turn out blue, because we defined the color inside of `aes()`. `aes()` is used for translating variables from the data into plot elements, like color. There is no variable in the data called "blue".
Our points don't actually turn out blue, because we defined the color inside of `aes()`.
`aes()` is used for translating variables from the data into plot elements, like color.
There is no variable in the data called "blue".
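
One possible fix can be sketched as follows. It assumes a small made-up data frame (`toy_surveys`, a hypothetical stand-in for `surveys`) so that it runs on its own: a constant color goes outside `aes()`, as an argument to the geom.

```r
library(ggplot2)

# Hypothetical stand-in for `surveys`, only to keep the example self-contained
toy_surveys <- data.frame(
  weight = c(40, 48, 29, 46),
  hindfoot_length = c(35, 37, 33, 36)
)

# A fixed color is set outside aes(), directly in the geom
p <- ggplot(data = toy_surveys,
            mapping = aes(x = weight, y = hindfoot_length)) +
  geom_point(color = "blue")
```

Typing `p` in the console draws the plot; the same change should apply to the full `surveys` data.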

::::::::::::::::::::::::

Expand All @@ -39,7 +40,7 @@ ggplot(data = surveys,
geom_point()
```

:::::::::::::::::::::::: solution
:::::::::::::::::::::::: solution

Variable names inside `aes()` should not be wrapped in quotes.

Expand All @@ -51,7 +52,7 @@ ggplot(data = surveys,
+ geom_point()
```

:::::::::::::::::::::::: solution
:::::::::::::::::::::::: solution

When adding things like `geom_` or `scale_` functions to a `ggplot()`, you have to end a line with `+`, not begin a line with it.

Expand All @@ -62,7 +63,7 @@ ggplot(data = surveys, x = weight, y = hindfoot_length) +
geom_point()
```

:::::::::::::::::::::::: solution
:::::::::::::::::::::::: solution

When translating variables from the data, like `weight` and `hindfoot_length`, to elements of the plot, like `x` and `y`, you must put them inside `aes()`.
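
A minimal sketch of the corrected call, using a made-up data frame (`toy_surveys`, hypothetical) in place of `surveys` so it can be run without the lesson data:

```r
library(ggplot2)

# Hypothetical stand-in for `surveys`
toy_surveys <- data.frame(
  weight = c(40, 48, 29, 46),
  hindfoot_length = c(35, 37, 33, 36)
)

# Columns are mapped to x and y inside aes()
p <- ggplot(data = toy_surveys,
            mapping = aes(x = weight, y = hindfoot_length)) +
  geom_point()
```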

Expand All @@ -75,9 +76,10 @@ ggplot(data = surveys,
scale_color_continuous(type = "viridis")
```

:::::::::::::::::::::::: solution
:::::::::::::::::::::::: solution

`species_id` is a categorical variable, but `scale_color_continuous()` supplies a continuous color scale. `scale_color_discrete()` would give a discrete/categorical scale.
`species_id` is a categorical variable, but `scale_color_continuous()` supplies a continuous color scale.
`scale_color_discrete()` would give a discrete/categorical scale.
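
As a rough illustration, assuming a made-up data frame (`toy_surveys`, hypothetical) with a character `species_id` column, the discrete scale can be attached like this:

```r
library(ggplot2)

# Hypothetical stand-in for `surveys` with three species codes
toy_surveys <- data.frame(
  weight = c(40, 48, 29, 46),
  hindfoot_length = c(35, 37, 33, 36),
  species_id = c("DM", "DO", "PF", "DM")
)

# species_id is categorical, so it gets a discrete color scale
p <- ggplot(data = toy_surveys,
            mapping = aes(x = weight, y = hindfoot_length,
                          color = species_id)) +
  geom_point() +
  scale_color_discrete()
```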

::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::
2 changes: 0 additions & 2 deletions learners/reference.md
Expand Up @@ -100,5 +100,3 @@ Cheat sheet of functions used in the lessons
- `inner_join()` # perform an inner join between two tables
- `src_sqlite()` # connect dplyr to a SQLite database file
- `copy_to()` # copy a data frame as a table into a database


115 changes: 63 additions & 52 deletions learners/setup.md
Expand Up @@ -4,90 +4,97 @@ title: Setup

## Preparations

Data Carpentry's teaching is hands-on, and to follow this lesson
learners must have R and RStudio installed on their computers. They also need
to be able to install a number of R packages, create directories, and download
files.
Data Carpentry's teaching is hands-on, and to follow this lesson learners must have R and RStudio installed on their computers.
They also need to be able to install a number of R packages, create directories, and download files.

To avoid troubleshooting during the lesson, learners should follow the
instructions below to download and install everything beforehand.
If the computer is managed by their organization's IT department
they might need help from an IT administrator.
To avoid troubleshooting during the lesson, learners should follow the instructions below to download and install everything beforehand.
If the computer is managed by their organization's IT department, they might need help from an IT administrator.

### Install R and RStudio

R and RStudio are two separate pieces of software:
R and RStudio are two separate pieces of software:

- **R** is a programming language and software used to run code written in R.
- **RStudio** is an integrated development environment (IDE) that makes using R easier. In this course we use RStudio to interact with R.

* **R** is a programming language and software used to run code written in R.
* **RStudio** is an integrated development environment (IDE) that makes using R easier. In this course we use RStudio to interact with R.

If you don't already have R and RStudio installed, follow the instructions for your operating system below.
You have to install R before you install RStudio.
You have to install R before you install RStudio.

::::::: spoiler

## For Windows

* Download R from the [CRAN website](https://cran.r-project.org/bin/windows/base/release.htm).
* Run the `.exe` file that was just downloaded
* Go to the [RStudio download page](https://www.rstudio.com/products/rstudio/download/#download)
* Under *Installers* select **Windows Vista 10/11 - RSTUDIO-xxxx.yy.z-zzz.exe** (where x = year, y = month, and z represent version numbers)
* Double click the file to install it
* Once it's installed, open RStudio to make sure it works and you don't get any error messages.
- Download R from the [CRAN website](https://cran.r-project.org/bin/windows/base/release.htm).
- Run the `.exe` file that was just downloaded
- Go to the [RStudio download page](https://www.rstudio.com/products/rstudio/download/#download)
- Under *Installers* select **Windows Vista 10/11 - RSTUDIO-xxxx.yy.z-zzz.exe** (where x = year, y = month, and z represent version numbers)
- Double click the file to install it
- Once it's installed, open RStudio to make sure it works and you don't get any error messages.

:::::::::::::::::::::::::

:::::::::::::::: spoiler

## For MacOS

* Download R from the [CRAN website](https://cran.r-project.org/bin/macosx/).
* Select the `.pkg` file for the latest R version
* Double click on the downloaded file to install R
* It is also a good idea to install [XQuartz](https://www.xquartz.org/) (needed by some packages)
* Go to the [RStudio download page](https://www.rstudio.com/products/rstudio/download/#download)
* Under *Installers* select **Mac OS 13+ - RSTUDIO-xxxx.yy.z-zzz.dmg** (where x = year, y = month, and z represent version numbers)
* Double click the file to install RStudio
* Once it's installed, open RStudio to make sure it works and you don't get any error messages.
- Download R from the [CRAN website](https://cran.r-project.org/bin/macosx/).
- Select the `.pkg` file for the latest R version
- Double click on the downloaded file to install R
- It is also a good idea to install [XQuartz](https://www.xquartz.org/) (needed by some packages)
- Go to the [RStudio download page](https://www.rstudio.com/products/rstudio/download/#download)
- Under *Installers* select **Mac OS 13+ - RSTUDIO-xxxx.yy.z-zzz.dmg** (where x = year, y = month, and z represent version numbers)
- Double click the file to install RStudio
- Once it's installed, open RStudio to make sure it works and you don't get any error messages.

::::::::::::::::

::::::: spoiler

## For Linux

* Click on your distribution in the [Linux folder of the CRAN website](https://cran.r-project.org/bin/linux/). Linux Mint users should follow instructions for Ubuntu.
* Go through the instructions for your distribution to install R.
* Go to the [RStudio download page](https://www.rstudio.com/products/rstudio/download/#download)
* Select the relevant installer for your Linux system (Ubuntu/Debian or Fedora)
* Double click the file to install RStudio
* Once it's installed, open RStudio to make sure it works and you don't get any error messages.
- Click on your distribution in the [Linux folder of the CRAN website](https://cran.r-project.org/bin/linux/). Linux Mint users should follow instructions for Ubuntu.
- Go through the instructions for your distribution to install R.
- Go to the [RStudio download page](https://www.rstudio.com/products/rstudio/download/#download)
- Select the relevant installer for your Linux system (Ubuntu/Debian or Fedora)
- Double click the file to install RStudio
- Once it's installed, open RStudio to make sure it works and you don't get any error messages.

::::::::::::::::

### Update R and RStudio

If you already have R and RStudio installed, first check if your R version is up to date:

* When you open RStudio your R version will be printed in the console on the bottom left. Alternatively, you can type `sessionInfo()` into the console. If your R version is 4.0.0 or later, you don't need to update R for this lesson. If your version of R is older than that, download and install the latest version of R from the R project website [for Windows](https://cran.r-project.org/bin/windows/base/), [for MacOS](https://cran.r-project.org/bin/macosx/), or [for Linux](https://cran.r-project.org/bin/linux/)
* It is not necessary to remove old versions of R from your system, but if you wish to do so you can check [How do I uninstall R?](https://cran.r-project.org/bin/windows/base/rw-FAQ.html#How-do-I-UNinstall-R_003f)
* After installing a new version of R, you will have to reinstall all your packages with the new version. For Windows, there is a package called `installr` that can help you with upgrading your R version and migrate your package library. A similar package called `pacman` can help with updating R packages across
To update RStudio to the latest version, open RStudio and click on
`Help > Check for Updates`. If a new version is available follow the
instruction on screen. By default, RStudio will also automatically notify you
of new versions every once in a while.
- When you open RStudio your R version will be printed in the console on the bottom left.
Alternatively, you can type `sessionInfo()` into the console.
If your R version is 4.0.0 or later, you don't need to update R for this lesson.
If your version of R is older than that, download and install the latest version of R from the R project website [for Windows](https://cran.r-project.org/bin/windows/base/), [for MacOS](https://cran.r-project.org/bin/macosx/), or [for Linux](https://cran.r-project.org/bin/linux/).
- It is not necessary to remove old versions of R from your system, but if you wish to do so you can check [How do I uninstall R?](https://cran.r-project.org/bin/windows/base/rw-FAQ.html#How-do-I-UNinstall-R_003f)
- After installing a new version of R, you will have to reinstall all your packages with the new version.
On Windows, a package called `installr` can help you upgrade your R version and migrate your package library.
A similar package called `pacman` can help with updating R packages across

Review comment (Contributor): Unfinished sentence?

To update RStudio to the latest version, open RStudio and click on

Review comment (Contributor), suggested change:
To update RStudio to the latest version, open RStudio and click on
To update RStudio to the latest version, open RStudio and click on `Help > Check for Updates`.

`Help > Check for Updates`.

Review comment (Contributor), suggested change:
`Help > Check for Updates`.

If a new version is available, follow the instructions on screen.
By default, RStudio will also automatically notify you of new versions every once in a while.
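
The R version check described above can also be done from the console. This is a minimal sketch using base R's `getRversion()`; the variable names are illustrative, and 4.0.0 is the minimum stated in this section.

```r
# Compare the running R version against the minimum stated above (4.0.0)
minimum_version <- "4.0.0"
current_version <- getRversion()

if (current_version >= minimum_version) {
  message("R ", as.character(current_version),
          " is recent enough for this lesson.")
} else {
  message("R ", as.character(current_version),
          " is older than ", minimum_version, "; consider updating.")
}
```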

::::::::::::::::::::::::::::: callout

The changes introduced by new R versions are usually backwards-compatible. That is, your old code should still work after updating your R version. However, if breaking changes happen, it is useful to know that you can have multiple versions of R installed in parallel and that you can switch between them in RStudio by going to `Tools > Global Options > General > Basic`.
The changes introduced by new R versions are usually backwards-compatible.
That is, your old code should still work after updating your R version.
However, if breaking changes happen, it is useful to know that you can have multiple versions of R installed in parallel and that you can switch between them in RStudio by going to `Tools > Global Options > General > Basic`.

While this may sound scary, it is **far more common** to run into issues due to using out-of-date versions of R or R packages. Keeping up with the latest versions of R, RStudio, and any packages you regularly use is a good practice.
While this may sound scary, it is **far more common** to run into issues due to using out-of-date versions of R or R packages.
Keeping up with the latest versions of R, RStudio, and any packages you regularly use is a good practice.

:::::::::::::::::::::::::::::

### Install required R packages

During the course we will need a number of R packages. Packages contain useful R code written by other people. We will use the packages `tidyverse`, and `ratdat`.
During the course we will need a number of R packages.
Packages contain useful R code written by other people.
We will use the packages `tidyverse` and `ratdat`.

To install these packages, open RStudio and copy and paste the following command into the console window (look for a blinking cursor on the bottom left), then press <kbd>Enter</kbd> (Windows and Linux) or <kbd>Return</kbd> (MacOS) to execute the command.

Expand All @@ -97,7 +104,7 @@ install.packages(c("tidyverse", "ratdat"))

Alternatively, you can install the packages using RStudio's graphical user interface by going to `Tools > Install Packages` and typing the names of the packages separated by a comma.

R tries to download and install the packages on your machine.
R tries to download and install the packages on your machine.

When the installation has finished, you can try to load the packages by pasting the following code into the console:

Expand All @@ -106,22 +113,26 @@ library(tidyverse)
library(ratdat)
```

If you do not see an error like `there is no package called ...` you are good to go!
If you do not see an error like `there is no package called '...'` you are good to go!
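
If you prefer a check that does not attach the packages, the following sketch reports which of the required packages are still missing. `missing_packages` is a hypothetical helper name, not part of any package; it relies only on base R's `requireNamespace()`.

```r
# Return the subset of `pkgs` whose namespace cannot be loaded on this machine.
# `missing_packages` is a hypothetical helper, defined here for illustration.
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

to_install <- missing_packages(c("tidyverse", "ratdat"))

if (length(to_install) == 0) {
  message("All required packages are installed.")
} else {
  message("Still missing: ", paste(to_install, collapse = ", "))
}
```

Anything listed as missing can be passed to `install.packages()` as shown earlier.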

### Updating R packages

Generally, it is recommended to keep your R version and all packages up to date, because new versions bring improvements and important bugfixes. To update the packages that you have installed, click `Update` in the `Packages` tab in the bottom right panel of RStudio, or go to `Tools > Check for Package Updates...`
Generally, it is recommended to keep your R version and all packages up to date, because new versions bring improvements and important bugfixes.
To update the packages that you have installed, click `Update` in the `Packages` tab in the bottom right panel of RStudio, or go to `Tools > Check for Package Updates...`

You should update **all of the packages** required for the lesson, even if you installed them relatively recently.

Sometimes, package updates introduce changes that break your old code, which can be very frustrating. To avoid this problem, you can use a package called `renv`. It locks the package versions you have used for a given project and makes it straightforward to reinstall those exact package version in a new environment, for example after updating your R version or on another computer. However, the details are outside of the scope of this lesson.
Sometimes, package updates introduce changes that break your old code, which can be very frustrating.
To avoid this problem, you can use a package called `renv`.
It locks the package versions you have used for a given project and makes it straightforward to reinstall those exact package versions in a new environment, for example after updating your R version or on another computer.
However, the details are outside of the scope of this lesson.

### Download the data

We will download the data directly from R during the lessons. However, if you are expecting problems with the network, it may be better to download the data beforehand and store it on your machine.
We will download the data directly from R during the lessons.
However, if you are expecting problems with the network, it may be better to download the data beforehand and store it on your machine.

The data files for the lesson can be downloaded manually:

- [cleaned data](../episodes/data/cleaned/surveys_complete_77_89.csv) and
- [zip file of raw data](../episodes/data/new_data.zip).

- [cleaned data](../episodes/data/cleaned/surveys_complete_77_89.csv) and
- [zip file of raw data](../episodes/data/new_data.zip).
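
After a manual download, you may want to confirm the cleaned file sits where the lesson code looks for it. The sketch below assumes the relative path `data/cleaned/surveys_complete_77_89.csv` inside your working directory; adjust it if you saved the file elsewhere.

```r
# Assumed location of the cleaned data, relative to the working directory
data_dir <- file.path("data", "cleaned")
csv_path <- file.path(data_dir, "surveys_complete_77_89.csv")

# Create the folder if it does not exist yet (silent if it already does)
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

if (file.exists(csv_path)) {
  message("Found ", csv_path)
} else {
  message("Not found: ", csv_path,
          " (move the downloaded file into ", data_dir, ")")
}
```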