Commit 627c236

ToC and headings' fix
1 parent 2864e34 commit 627c236

2 files changed

+29
-11
lines changed

content/post/2023-01-30-r-basic-advanceds-variables-and-names-in-dplyr/index.Rmd

Lines changed: 7 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -5,9 +5,13 @@ date: '2023-01-30'
55
slug: r-basic-advanceds-variables-and-names-in-dplyr
66
categories: ['Tutorial']
77
tags: ['r', 'tutorial', 'dplyr', 'environments', 'rlang']
8+
description: "TODO."
9+
output:
10+
  blogdown::html_page:
11+
    toc: true
812
---
913

10-
# Intro
14+
## Intro
1115

1216
Hello everyone! After an extended hiatus for various reasons (from graduating college to navigating job changes and legal challenges), we're back and eager to breathe new life into this blog. Given my deep interest in the fundamentals of advanced methods, today we're delving into an essential topic every dplyr user will eventually face.
1317

@@ -23,7 +27,7 @@ library(dplyr)
2327
iris <- iris %>% slice(1:5)
2428
```
2529

26-
# Problem 1: Symbols vs. strings with names
30+
## Problem 1: Symbols vs. strings with names
2731

2832
Let's compare how we select columns in a data frame using base R versus dplyr:
2933

@@ -119,7 +123,7 @@ my_subset_with_symbols(iris, Petal.Length, Sepal.Width)
119123

120124
This tells dplyr to pass `my_var_as_symbol` through exactly as the user provided it. We can think of embracing as a cut-and-paste operation. We tell dplyr: "Take what the user provided in place of `my_var_as_symbol` in the function call and plug it directly into `select`, without creating any intermediate variables." The call to `my_subset_with_symbols()` is essentially replaced with what lies inside it.
121125

122-
# Problem 3: dynamic columns in purrr formulas in `across`
126+
## Problem 3: dynamic columns in purrr formulas in `across`
123127

124128
While the above solutions work seamlessly with functions like `dplyr::select()`, challenges arise when operations grow complex. Suppose we wish to craft a function, `do_magic`, that takes data, a special `column`, and several `others` columns. This function should add the special column to all others.
125129
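
Note on the embracing paragraph in this hunk: the body of `my_subset_with_symbols()` is not visible in the diff, so the definition below is only a minimal sketch reconstructed from the call and the prose. The second parameter name `other_var` and the helper data frame `iris_small` are assumptions made for illustration; `select()`, `slice()`, and the `{{ }}` operator are regular dplyr/rlang features.

```r
library(dplyr)

# Same five-row subset the post works with, kept under a separate name here.
iris_small <- iris %>% slice(1:5)

# Hypothetical reconstruction: the real function body is not shown in this diff.
my_subset_with_symbols <- function(data, my_var_as_symbol, other_var) {
  # {{ }} forwards the caller's unquoted column names straight into select(),
  # so no intermediate variable holding a name is created.
  data %>% select({{ my_var_as_symbol }}, {{ other_var }})
}

result <- my_subset_with_symbols(iris_small, Petal.Length, Sepal.Width)
names(result)
```

Under these assumptions the call behaves as if `select(Petal.Length, Sepal.Width)` had been written directly on the data, which is the "cut-and-paste" reading the paragraph describes.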

content/post/2023-01-30-r-basic-advanceds-variables-and-names-in-dplyr/index.html

Lines changed: 22 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -5,19 +5,33 @@
55
slug: r-basic-advanceds-variables-and-names-in-dplyr
66
categories: ['Tutorial']
77
tags: ['r', 'tutorial', 'dplyr', 'environments', 'rlang']
8+
description: "TODO."
9+
output:
10+
  blogdown::html_page:
11+
    toc: true
812
---
913

1014

15+
<div id="TOC">
16+
<ul>
17+
<li><a href="#intro">Intro</a></li>
18+
<li><a href="#problem-1-symbols-vs.-strings-with-names">Problem 1: Symbols vs. strings with names</a></li>
19+
<li><a href="#problem-2-passing-column-names-as-arguments-to-custom-functions">Problem 2: Passing column names as arguments to custom functions</a></li>
20+
<li><a href="#problem-3-dynamic-columns-in-purrr-formulas-in-across">Problem 3: dynamic columns in purrr formulas in <code>across</code></a></li>
21+
<li><a href="#summary-next-steps">Summary &amp; Next Steps</a></li>
22+
<li><a href="#dive-deeper-resources-for-the-curious-minds">Dive Deeper: Resources for the Curious Minds:</a></li>
23+
</ul>
24+
</div>
1125

12-
<div id="intro" class="section level1">
13-
<h1>Intro</h1>
26+
<div id="intro" class="section level2">
27+
<h2>Intro</h2>
1428
<p>Hello everyone! After an extended hiatus for various reasons (from graduating college to navigating job changes and legal challenges), we’re back and eager to breathe new life into this blog. Given my deep interest in the fundamentals of advanced methods, today we’re delving into an essential topic every dplyr user will eventually face.</p>
1529
<p>dplyr is meticulously designed with the primary goal of making code workflows read as naturally and close to plain language as possible. This design philosophy manifests in two critical dimensions: <em>semantic</em> and <em>syntactic</em>.</p>
1630
<p>Semantically, the emphasis is on <strong>employing words with intuitive and easily understood meanings</strong>. For instance, dplyr and its friends adhere to a robust naming convention where function names typically take on verb forms, elucidating the action they perform.</p>
1731
<p>Syntactically, the <strong>arrangement and combination of these descriptive words is paramount</strong>. Arguably, this is even more critical to the user experience. One of the most evident manifestations of this syntactical approach is the tidyverse’s hallmark feature: <strong>the pipe operator</strong>. But we are not going to tackle this today. I will look into caveats of another essential and intuitive syntactic feature: the <strong>use of symbols instead of strings to refer to variables within datasets</strong>. This offers a more natural-feeling mode of interaction but, as I have found out over many years of using R, this feature can lead to some problems.</p>
1832
</div>
19-
<div id="problem-1-symbols-vs.-strings-with-names" class="section level1">
20-
<h1>Problem 1: Symbols vs. strings with names</h1>
33+
<div id="problem-1-symbols-vs.-strings-with-names" class="section level2">
34+
<h2>Problem 1: Symbols vs. strings with names</h2>
2135
<p>Let’s compare how we select columns in a data frame using base R versus dplyr:</p>
2236
<pre class="r"><code># base
2337
iris[, c(&quot;Sepal.Length&quot;, &quot;Sepal.Width&quot;)]
@@ -74,6 +88,7 @@ <h1>Problem 1: Symbols vs. strings with names</h1>
7488
## 3 4.7 3.2
7589
## 4 4.6 3.1
7690
## 5 5.0 3.6</code></pre>
91+
</div>
7792
<div id="problem-2-passing-column-names-as-arguments-to-custom-functions" class="section level2">
7893
<h2>Problem 2: Passing column names as arguments to custom functions</h2>
7994
<p>Differentiating between passing a variable name or a symbol becomes trickier when constructing functions that internally use dplyr verbs. Consider:</p>
@@ -100,9 +115,8 @@ <h2>Problem 2: Passing column names as arguments to custom functions</h2>
100115
my_subset_with_symbols(iris, Petal.Length, Sepal.Width)</code></pre>
101116
<p>This tells dplyr to pass <code>my_var_as_symbol</code> through exactly as the user provided it. We can think of embracing as a cut-and-paste operation. We tell dplyr: “Take what the user provided in place of <code>my_var_as_symbol</code> in the function call and plug it directly into <code>select</code>, without creating any intermediate variables.” The call to <code>my_subset_with_symbols()</code> is essentially replaced with what lies inside it.</p>
102117
</div>
103-
</div>
104-
<div id="problem-3-dynamic-columns-in-purrr-formulas-in-across" class="section level1">
105-
<h1>Problem 3: dynamic columns in purrr formulas in <code>across</code></h1>
118+
<div id="problem-3-dynamic-columns-in-purrr-formulas-in-across" class="section level2">
119+
<h2>Problem 3: dynamic columns in purrr formulas in <code>across</code></h2>
106120
<p>While the above solutions work seamlessly with functions like <code>dplyr::select()</code>, challenges arise when operations grow complex. Suppose we wish to craft a function, <code>do_magic</code>, that takes data, a special <code>column</code>, and several <code>others</code> columns. This function should add the special column to all others.</p>
107121
<p>Leveraging <code>dplyr::mutate(dplyr::across())</code> can achieve this. Its syntax is:</p>
108122
<pre class="r"><code>mutate(across(columns_to_mutate, function_to_apply))</code></pre>
@@ -148,6 +162,7 @@ <h4>Tip: when <code>all_of()</code> does not work, use <code>.data</code></h4>
148162
## 5 3.6 2.2 1.4 0.2 setosa</code></pre>
149163
<p>When you need to reference the underlying data within the context of functions, the <code>.data</code> pronoun comes to the rescue. As demonstrated, it operates similarly to directly accessing the data.</p>
150164
</div>
165+
</div>
151166
<div id="summary-next-steps" class="section level2">
152167
<h2>Summary &amp; Next Steps</h2>
153168
<p>Throughout this post, we ventured deep into some of the intricacies of dplyr. We’ve unraveled how the package strives to make our code read naturally, both semantically and syntactically, all while simplifying complex operations. The power of symbols and the utility of tools like <code>all_of()</code> and <code>.data</code> demonstrate just how dynamic and adaptable dplyr can be, especially when working with variable column names. While we’ve covered much ground, the world of dplyr is vast and constantly evolving. We are aware that all these <em>embracing</em> and <em>tidyselect</em> rules might be intimidating, but we will continue to explore more facets of the tidyverse in future posts of “basic advanceds”, aiming to empower you with advanced techniques that enhance your data analysis journey.</p>
@@ -158,4 +173,3 @@ <h2>Summary &amp; Next Steps</h2>
158173
<h2>Dive Deeper: Resources for the Curious Minds:</h2>
159174
<p>For those wishing to delve further or who may have lingering questions: <a href="https://dplyr.tidyverse.org/articles/programming.html">Dplyr official programming guide</a></p>
160175
</div>
161-
</div>
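
Note on Problem 3 and the `.data` tip: the hunks above describe `do_magic` only in prose, so the following is a hedged sketch of one way such a function could look, not the post's actual implementation. It assumes `column` and `others` are passed as strings; the data frame name `iris_small` is also an assumption. `all_of()` selects the columns named in `others`, and `.data[[column]]` looks up the special column by name inside the lambda.

```r
library(dplyr)

iris_small <- iris %>% slice(1:5)

# Hypothetical signature: `column` is a single string, `others` a character vector.
do_magic <- function(data, column, others) {
  data %>%
    # For every column listed in `others`, add the values of the special column.
    mutate(across(all_of(others), ~ .x + .data[[column]]))
}

result <- do_magic(iris_small, "Petal.Length", c("Sepal.Length", "Sepal.Width"))
result
```

The special column itself is left unchanged because it is not listed in `others`, so every addition uses its original values.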
