<li>use a Jupyter notebook to execute provided R code</li>
438
438
<li>edit code and markdown cells in a Jupyter notebook</li>
439
439
<li>create new code and markdown cells in a Jupyter notebook</li>
440
-
<li>load the <code>tidyverse</code> library into R</li>
440
+
<li>load the <code>tidyverse</code> package into R</li>
441
441
<li>create new variables and objects in R using the assignment symbol</li>
442
442
<li>use the help and documentation tools in R</li>
443
-
<li>match the names of the following functions from the <code>tidyverse</code> library to their documentation descriptions:
443
+
<li>match the names of the following functions from the <code>tidyverse</code> package to their documentation descriptions:
444
444
<ul>
445
445
<li><code>read_csv</code></li>
446
446
<li><code>select</code></li>
@@ -502,8 +502,8 @@ <h2><span class="header-section-number">1.3</span> Loading a spreadsheet-like da
502
502
<li>does not have row names.</li>
503
503
</ul>
504
504
<p>Below you’ll see the code used to load the data into R using the <code>read_csv</code> function. But there is one extra step we need to do first. Since <code>read_csv</code> is not included in the base installation of R,
505
-
to be able to use it we have to load it from somewhere else: a collection of useful functions known as a <em>library</em>. The <code>read_csv</code> function in particular
506
-
is in the <code>tidyverse</code> library (more on this later), which we load using the <code>library</code> function.</p>
505
+
to be able to use it we have to load it from somewhere else: a collection of useful functions known as a <em>package</em>. The <code>read_csv</code> function in particular
506
+
is in the <code>tidyverse</code> package (more on this later), which we load using the <code>library</code> function.</p>
507
507
<p>Next, we call the <code>read_csv</code> function and pass it a single argument: the name of the file, <code>"can_lang.csv"</code>. We have to put quotes around filenames and other letters and words that we
508
508
use in our code to distinguish them from the special words that make up the R programming language. This is the only argument we need to provide for this file, because our file satisfies everything else
509
509
the <code>read_csv</code> function expects in the default use-case (which we just discussed). Later in the course, we’ll learn more about how to deal with more complicated files where the default arguments are not
<li><p>choose the appropriate <code>tidyverse</code> <code>read_*</code> function and function arguments to load a given plain text tabular data set into R</p></li>
400
-
<li><p>use the <code>readxl</code> library’s <code>read_excel</code> function and arguments to load a sheet from an Excel file into R</p></li>
401
-
<li><p>connect to a database using the <code>DBI</code> library’s <code>dbConnect</code> function</p></li>
402
-
<li><p>list the tables in a database using the <code>DBI</code> library’s <code>dbListTables</code> function</p></li>
403
-
<li><p>create a reference to a database table that is queryable using the <code>tbl</code> function from the <code>dbplyr</code> library</p></li>
404
-
<li><p>retrieve data from a database query and bring it into R using the <code>collect</code> function from the <code>dbplyr</code> library</p></li>
400
+
<li><p>use the <code>readxl</code> package’s <code>read_excel</code> function and arguments to load a sheet from an Excel file into R</p></li>
401
+
<li><p>connect to a database using the <code>DBI</code> package’s <code>dbConnect</code> function</p></li>
402
+
<li><p>list the tables in a database using the <code>DBI</code> package’s <code>dbListTables</code> function</p></li>
403
+
<li><p>create a reference to a database table that is queryable using the <code>tbl</code> function from the <code>dbplyr</code> package</p></li>
404
+
<li><p>retrieve data from a database query and bring it into R using the <code>collect</code> function from the <code>dbplyr</code> package</p></li>
405
405
<li><p>use <code>write_csv</code> to save a data frame to a <code>.csv</code> file</p></li>
406
406
<li><p>(<em>optional</em>) scrape data from the web</p>
407
407
<ul>
@@ -471,19 +471,19 @@ <h2><span class="header-section-number">2.4</span> Reading tabular data from a p
471
471
Non-Official & Non-Aboriginal languages,American Sign Language,2685,3020,1145,21930
472
472
Non-Official & Non-Aboriginal languages,Amharic,22465,12785,200,33670</code></pre>
473
473
<p>And here is a review of how we can use <code>read_csv</code> to load it into R. First we
474
-
load the <code>tidyverse</code> library to gain access to useful functions for reading the
474
+
load the <code>tidyverse</code> package to gain access to useful functions for reading the
<p>Note: it is normal and expected that a message is printed out after
479
-
loading the <code>tidyverse</code> and some libraries. Generally, this message lets you
480
-
know if functions from the different libraries that were loaded share the same name
479
+
loading the <code>tidyverse</code> and some packages. Generally, this message lets you
480
+
know if functions from the different packages that were loaded share the same name
481
481
(which is confusing to R), and if so, which one you can access using just its
482
-
name (and for which one you need the library name and the function name to
482
+
name (and for which one you need the package name and the function name to
483
483
refer to it; this is called masking). Additionally, the <code>tidyverse</code> is a special
484
-
R library - it is a meta-library or meta-package that bundles together several
485
-
related and commonly used packages. Because of this it lists the libraries it
486
-
does the job of loading. In future when we load this library in this book we
484
+
R package - it is a meta-package that bundles together several
485
+
related and commonly used packages. Because of this it lists the packages it
486
+
does the job of loading. In future when we load this package in this book we
487
487
will silence these messages to help with readability of the book.</p>
488
488
</blockquote>
489
489
<p>Next we use <code>read_csv</code> to load the data into R, and in that call we specify the
@@ -769,7 +769,7 @@ <h4><span class="header-section-number">2.6.1.1</span> Reading data from a SQLit
769
769
<p>Although it looks like we just got a data frame from the database, we didn’t! It’s a <em>reference</em>, showing us data that is still in the SQLite database (note the first two lines of the output).
770
770
It does this because databases are often more efficient at selecting, filtering and joining large data sets than R. And typically, the database will not even be
771
771
stored on your computer, but rather on a more powerful machine somewhere on the web. So R is lazy and waits to bring this data into memory until you explicitly tell
772
-
it to do so using the <code>collect</code> function from the <code>dbplyr</code> library.</p>
772
+
it to do so using the <code>collect</code> function from the <code>dbplyr</code> package.</p>
773
773
<p>Here we will filter for only rows in the Aboriginal languages category according to the 2016 Canada Census, and then use <code>collect</code> to finally bring this data into R as a data frame.</p>
@@ -831,7 +831,7 @@ <h4><span class="header-section-number">2.6.1.2</span> Reading data from a Postg
831
831
<li><code>user</code> - the username for accessing the database</li>
832
832
<li><code>password</code> - the password for accessing the database</li>
833
833
</ul>
834
-
<p>Additionally, we must use the <code>RPostgres</code> library instead of <code>RSQLite</code> in the <code>dbConnect</code> function call.
834
+
<p>Additionally, we must use the <code>RPostgres</code> package instead of <code>RSQLite</code> in the <code>dbConnect</code> function call.
835
835
Below we demonstrate how to connect to a version of the <code>can_mov_db</code> database, which contains information about Canadian movies (<em>note - this is a synthetic, or artificial, database</em>).</p>
<h2><span class="header-section-number">2.7</span> Writing data from R to a <code>.csv</code> file</h2>
907
907
<p>In the middle and at the end of a data analysis, we often want to write a data frame that has changed (either through filtering, selecting, mutating or summarizing) to a file
908
-
to share it with others or use it for another step in the analysis. The most straightforward way to do this is to use the <code>write_csv</code> function from the <code>tidyverse</code> library.
908
+
to share it with others or use it for another step in the analysis. The most straightforward way to do this is to use the <code>write_csv</code> function from the <code>tidyverse</code> package.
909
909
The default arguments for this function are to use a comma (<code>,</code>) as the delimiter and include column names. Below we demonstrate creating a new version of the Canadian languages data set without the official languages category according to the Canadian 2016 Census, and then writing this to a <code>.csv</code> file:</p>
<p>Then we send the page object to the <code>html_nodes</code> function. We also provide that function with the CSS selectors we obtained from the selectorgadget tool. These should be surrounded by quotation marks. The <code>html_nodes</code> function selects nodes from the HTML document using CSS selectors. Nodes are the HTML tag pairs as well as the content between the tags. For our CSS selector <code>td:nth-child(5)</code>, an example node that would be selected would be: <code>&lt;td style="text-align:left;background:#f0f0f0;"&gt;&lt;a href="/wiki/London,_Ontario" title="London, Ontario"&gt;London&lt;/a&gt;&lt;/td&gt;</code></p>
994
+
<p>We will use <code>head()</code> here to limit the print output of these vectors to 6 lines.</p>