Commit f1d8ca4

added information
1 parent f5ef6b8 commit f1d8ca4

2 files changed: 41 additions, 2 deletions

man/data.table.Rd

Lines changed: 2 additions & 2 deletions
@@ -100,9 +100,9 @@ data.table(\dots, keep.rownames=FALSE, check.names=FALSE, key=NULL, stringsAsFac
 \item{by}{ Column names are seen as if they are variables (as in \code{j} when \code{with=TRUE}). The \code{data.table} is then grouped by the \code{by} and \code{j} is evaluated within each group. The order of the rows within each group is preserved, as is the order of the groups. \code{by} accepts:

 \itemize{
-\item A single unquoted column name: e.g., \code{DT[, .(sa=sum(a)), by=x]}
+\item A single unquoted column name or expression, e.g., \code{DT[, .(sa=sum(a)), by=x]} or \code{by=x\%\%2}. This is a convenience; for multiple expressions a \code{list()} is required.

-\item a \code{list()} of expressions of column names: e.g., \code{DT[, .(sa=sum(a)), by=.(x=x>0, y)]}
+\item a \code{list()} of expressions of column names, e.g., \code{DT[, .(sa=sum(a)), by=.(x>0, y)]}. Use a named list to set the names of the resulting grouping columns, e.g., \code{by=.(x_is_positive=x>0, y)}. As a concise shortcut for a \emph{single} expression, you can also use parentheses to name the output column, e.g., \code{by=(grp = x \%\% 2)}.

 \item a single character string containing comma separated column names (where spaces are significant since column names may contain spaces even at the start or end): e.g., \code{DT[, sum(a), by="x,y,z"]}

vignettes/datatable-programming.Rmd

Lines changed: 39 additions & 0 deletions
@@ -46,6 +46,45 @@ subset(iris, Species == "setosa")

 Here, `subset` takes the second argument and evaluates it within the scope of the `data.frame` given as its first argument. This removes the need for variable repetition, making it less prone to errors, and makes the code more readable.

+### Dynamic Grouping and Naming Syntax
+
+Besides the programmatic use with `env`, `data.table` offers some powerful and concise syntax for interactive use, especially in the `by` argument.
+
+```{r by_syntax_setup_concise}
+d = data.table(x = 1:4, y = 2:5)
+```
+
+#### Grouping by Expressions in `by`
+
+For convenience, `data.table` allows you to group by a single expression directly without `list()` or `.()`. To name the resulting grouping column, you have two options:
+
+```{r by_syntax_naming_concise}
+# 1. The canonical way: a named list (required for multiple expressions)
+d[, sum(y), by = .(grp = x %% 2)]
+
+# 2. A concise shortcut: parentheses (for a single expression)
+d[, sum(y), by = (grp = x %% 2)]
+```
+
+The `(grp = ...)` syntax is a base R feature that `data.table` leverages to see the intended column name.
+
+#### Important Contrast: Naming in `j` vs. `by`
+
+This parentheses shortcut for naming does **not** work in `j`. In `j`, you must use the canonical `.(new_name = ...)` syntax to create a named column.
+
+```{r by_syntax_j_concise}
+# Correct way to name a new column in `j`
+d[, .(sum_y = sum(y)), by = .(grp = x %% 2)]
+
+# This will not create a column named 'sum_y'
+d[, (sum_y = sum(y)), by = .(grp = x %% 2)]
+```
+In the second case, the parentheses cause base R to evaluate the expression, returning only the final value. `data.table` then gives this unnamed result a default column name (`V1`).
+
+**Takeaway:**
+* In `by`, `(name = expr)` is a valid shortcut for `.(name = expr)`.
+* In `j`, you must always use `.(name = expr)` to create a named column.
+
 ## Problem description

 The problem with this kind of interface is that we cannot easily parameterize the code that uses it. This is because the expressions passed to those functions are substituted before being evaluated.
