Commit 98cf24e

add list column example in intro vignette (#6558)
1 parent 6be4cdd commit 98cf24e

1 file changed: +20 -0 lines changed

vignettes/datatable-intro.Rmd

Lines changed: 20 additions & 0 deletions
@@ -643,6 +643,26 @@ DT[, print(list(c(a,b))), by = ID] # (2)

In (1), for each group, a vector is returned, with length = 6,4,2 here. However, (2) returns a list of length 1 for each group, with its first element holding vectors of length 6,4,2. Therefore, (1) results in a length of ` 6+4+2 = `r 6+4+2``, whereas (2) returns `1+1+1=`r 1+1+1``.
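
The difference between (1) and (2) can be checked on a small table. The sketch below is illustrative only: the `DT` used here is an assumption (a six-row table with groups of 3, 2 and 1 rows, chosen to match the lengths 6, 4, 2 quoted above), and `res1`, `res2` and `val` are hypothetical names.

```r
library(data.table)

# Assumed example table: group "b" has 3 rows, "a" has 2, "c" has 1.
DT <- data.table(ID = c("b", "b", "b", "a", "a", "c"), a = 1:6, b = 7:12)

# (1) c(a, b) returns a plain vector per group, so each group
#     contributes one result row per vector element.
res1 <- DT[, .(val = c(a, b)), by = ID]

# (2) list(c(a, b)) returns a length-1 list per group, so each group
#     contributes a single row whose cell holds the whole vector.
res2 <- DT[, .(val = list(c(a, b))), by = ID]

nrow(res1)         # total rows across groups for (1)
nrow(res2)         # one row per group for (2)
lengths(res2$val)  # length of the vector stored in each list cell
```

Wrapping the per-group result in `list()` is what turns `val` into a list column rather than expanding the group into several rows.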

The flexibility of `j` allows us to store any list object as an element of a `data.table`. For example, when statistical models are fit to groups, the fitted models can be stored in a list column of a `data.table`. The code is concise and easy to understand.

```{r}
## Do long-distance flights make up more of their departure delay than short-distance flights?
## Does the amount made up vary by month?
flights[, `:=`(makeup = dep_delay - arr_delay)]

## Fit one linear model per month and store each fit in the list column `fit`
makeup.models <- flights[, .(fit = list(lm(makeup ~ distance))), by = .(month)]
## Extract the distance coefficient and R-squared from each stored model
makeup.models[, .(coefdist = coef(fit[[1]])[2], rsq = summary(fit[[1]])$r.squared), by = .(month)]
```
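
The list-column pattern above does not depend on the `flights` data. As a minimal sketch, assuming only the `mtcars` dataset that ships with R, the same steps can be run on a table that needs no download. The names `cars`, `models`, `slope` and `rsq` are hypothetical and chosen for illustration.

```r
library(data.table)

# Copy the built-in mtcars data.frame into a data.table.
cars <- as.data.table(mtcars)

# Fit mpg ~ wt separately for each cylinder count and keep every
# fitted model as one cell of the list column `fit`.
models <- cars[, .(fit = list(lm(mpg ~ wt))), by = cyl]

# Pull the wt slope and R-squared out of each stored model.
# Within a group, `fit` is a length-1 list, hence fit[[1]].
models[, .(slope = coef(fit[[1]])[["wt"]],
           rsq   = summary(fit[[1]])$r.squared),
       by = cyl]
```

Because each cell of `fit` holds a complete `lm` object, other functions that accept such objects can be applied to the cells later without refitting.
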
Using `data.frame`s, we need more complicated code to obtain the same per-month results for `flights`.
```{r}
## Convert to a data.frame by reference
setDF(flights)
## One data.frame per month
flights.split <- split(flights, f = flights$month)
## Pair each month with its fitted model
makeup.models.list <- lapply(flights.split, function(df) c(month = df$month[1], fit = list(lm(makeup ~ distance, data = df))))
makeup.models.df <- do.call(rbind, makeup.models.list)
## Extract the distance coefficient and R-squared from each model
sapply(makeup.models.df[, "fit"], function(model) c(coefdist = coef(model)[2], rsq = summary(model)$r.squared)) |> t() |> data.frame()
## Convert back to a data.table by reference
setDT(flights)
```

## Summary

The general form of `data.table` syntax is:
