Commit 40303ce

Fixed outdated example in reshape vignette (#7150)
1 parent 0e5f928 commit 40303ce

1 file changed: +13 -13 lines changed

vignettes/datatable-reshape.Rmd

Lines changed: 13 additions & 13 deletions
@@ -142,31 +142,31 @@ So far we've seen features of `melt` and `dcast` that are implemented efficientl
 However, there are situations we might run into where the desired operation is not expressed in a straightforward manner. For example, consider the `data.table` shown below:

 ```{r}
-s2 <- "family_id age_mother dob_child1 dob_child2 dob_child3 gender_child1 gender_child2 gender_child3
-1 30 1998-11-26 2000-01-29 NA 1 2 NA
-2 27 1996-06-22 NA NA 2 NA NA
-3 26 2002-07-11 2004-04-05 2007-09-02 2 2 1
-4 32 2004-10-10 2009-08-27 2012-07-21 1 1 1
-5 29 2000-12-05 2005-02-28 NA 2 1 NA"
+s2 <- "family_id age_mother name_child1 name_child2 name_child3 gender_child1 gender_child2 gender_child3
+1 30 Ben Anna NA 1 2 NA
+2 27 Tom NA NA 2 NA NA
+3 26 Lia Sam Amy 2 2 1
+4 32 Max Zoe Joe 1 1 1
+5 29 Dan Eva NA 2 1 NA"
 DT <- fread(s2)
 DT
 ## 1 = female, 2 = male
 ```

-And you'd like to combine (`melt`) all the `dob` columns together, and `gender` columns together. Using the old functionality, we could do something like this:
+And you'd like to combine (`melt`) all the `name` columns together, and `gender` columns together. Using the old functionality, we could do something like this:

 ```{r}
 DT.m1 = melt(DT, id.vars = c("family_id", "age_mother"))
 DT.m1[, c("variable", "child") := tstrsplit(variable, "_", fixed = TRUE)]
 DT.c1 = dcast(DT.m1, family_id + age_mother + child ~ variable, value.var = "value")
 DT.c1

-str(DT.c1) ## gender column is class IDate now!
+str(DT.c1) ## gender column is character type now!
 ```

 #### Issues

-1. What we wanted to do was to combine all the `dob` and `gender` type columns together respectively. Instead, we are combining *everything* together, and then splitting them again. I think it's easy to see that it's quite roundabout (and inefficient).
+1. What we wanted to do was to combine all the `name` and `gender` type columns together respectively. Instead, we are combining *everything* together, and then splitting them again. I think it's easy to see that it's quite roundabout (and inefficient).

 As an analogy, imagine you've a closet with four shelves of clothes and you'd like to put together the clothes from shelves 1 and 2 together (in 1), and 3 and 4 together (in 3). What we are doing is more or less to combine all the clothes together, and then split them back on to shelves 1 and 3!

@@ -189,9 +189,9 @@ Since we'd like for `data.table`s to perform this operation straightforward and
 The idea is quite simple. We pass a list of columns to `measure.vars`, where each element of the list contains the columns that should be combined together.

 ```{r}
-colA = paste0("dob_child", 1:3)
+colA = paste0("name_child", 1:3)
 colB = paste0("gender_child", 1:3)
-DT.m2 = melt(DT, measure.vars = list(colA, colB), value.name = c("dob", "gender"))
+DT.m2 = melt(DT, measure.vars = list(colA, colB), value.name = c("name", "gender"))
 DT.m2

 str(DT.m2) ## col type is preserved
@@ -206,7 +206,7 @@ str(DT.m2) ## col type is preserved
 Usually in these problems, the columns we'd like to melt can be distinguished by a common pattern. We can use the function `patterns()`, implemented for convenience, to provide regular expressions for the columns to be combined together. The above operation can be rewritten as:

 ```{r}
-DT.m2 = melt(DT, measure.vars = patterns("^dob", "^gender"), value.name = c("dob", "gender"))
+DT.m2 = melt(DT, measure.vars = patterns("^name", "^gender"), value.name = c("name", "gender"))
 DT.m2
 ```

@@ -305,7 +305,7 @@ We can now provide **multiple `value.var` columns** to `dcast` for `data.table`s

 ```{r}
 ## new 'cast' functionality - multiple value.vars
-DT.c2 = dcast(DT.m2, family_id + age_mother ~ variable, value.var = c("dob", "gender"))
+DT.c2 = dcast(DT.m2, family_id + age_mother ~ variable, value.var = c("name", "gender"))
 DT.c2
 ```
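The reworded comments in this change make two type claims: the old melt/split/cast route turns `gender` into a character column, while melting the `name` and `gender` groups separately preserves the column types. The sketch below is not part of the commit. It is a minimal way to check both claims, assuming the `data.table` package is installed; the object names `old_long`, `old_wide`, and `new_long` are hypothetical and chosen only for this illustration.

```r
library(data.table)

# Example data copied from the updated vignette chunk
s2 <- "family_id age_mother name_child1 name_child2 name_child3 gender_child1 gender_child2 gender_child3
1 30 Ben Anna NA 1 2 NA
2 27 Tom NA NA 2 NA NA
3 26 Lia Sam Amy 2 2 1
4 32 Max Zoe Joe 1 1 1
5 29 Dan Eva NA 2 1 NA"
DT <- fread(text = s2)

# Old approach: melt every measure column into one value column,
# split the former column names, then cast back to wide form.
# Mixing character (name) and integer (gender) columns coerces the
# molten value column to character; melt() warns about this, so the
# warning is suppressed here.
old_long <- suppressWarnings(melt(DT, id.vars = c("family_id", "age_mother")))
old_long[, c("variable", "child") := tstrsplit(variable, "_", fixed = TRUE)]
old_wide <- dcast(old_long, family_id + age_mother + child ~ variable,
                  value.var = "value")

# New approach: melt the two column groups into separate value columns
new_long <- melt(DT, measure.vars = patterns("^name", "^gender"),
                 value.name = c("name", "gender"))

# Compare the resulting type of the gender column in each approach
class(old_wide$gender)
class(new_long$gender)
```

Comparing the two `class()` results side by side shows why the example needed a character-valued column such as `name` in place of the date-valued `dob` columns for the updated comment to hold.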
