
Commit 23e651d

working on adding colns in db using mutate
1 parent c9b277a commit 23e651d

1 file changed: +16 -5 lines changed

source/reading.Rmd

Lines changed: 16 additions & 5 deletions
@@ -616,19 +616,30 @@ response for us. So `dbplyr` does all the hard work of translating from R to SQL
 we can just stick with R!
 
 With our `lang_db` table reference for the 2016 Canadian Census data in hand, we
-can mostly continue onward as if it were a regular data frame. For example,
-we can use the `filter` function
-to obtain only certain rows. Below we filter the data to include only Aboriginal languages.
+can mostly continue onward as if it were a regular data frame. For example, let's do the same exercise
+from Chapter \@ref(intro): we will obtain only those rows corresponding to Aboriginal languages, and keep only
+the `language` and `mother_tongue` columns.
+We can use the `filter` function to obtain only certain rows. Below we filter the data to include only Aboriginal languages.
 
 ```{r}
 aboriginal_lang_db <- filter(lang_db, category == "Aboriginal languages")
 aboriginal_lang_db
 ```
 
 Above you can again see the hints that this data is not actually stored in R yet:
-the source is a `lazy query [?? x 6]` and the output says `... with more rows` at the end
+the source is `SQL [?? x 6]` and the output says `... more rows` at the end
 (both indicating that R does not know how many rows there are in total!),
-and a database type `sqlite 3.36.0` is listed.
+and a database type `sqlite` is listed.
+We didn't use the `collect` function because we are not ready to bring the data into R yet. \index{collect}
+We can still use the database to do some work to obtain *only* the small amount of data we want to work with locally
+in R Let's add the second part of our database query: selecting only the `language` and `mother_tongue` columns
+using the `select` function.
+
+```{r}
+aboriginal_lang_selected_db <- select(aboriginal_lang_db, language, mother_tongue)
+aboriginal_lang_selected_db
+```
+
 In order to actually retrieve this data in R as a data frame,
 we use the `collect` function. \index{filter}
 Below you will see that after running `collect`, R knows that the retrieved
