added information

venom1204 · venom1204 · commit f1d8ca40b347 · 2025-06-19T19:46:16.000Z
diff --git a/man/data.table.Rd b/man/data.table.Rd
@@ -100,9 +100,9 @@ data.table(\dots, keep.rownames=FALSE, check.names=FALSE, key=NULL, stringsAsFac
     \item{by}{ Column names are seen as if they are variables (as in \code{j} when \code{with=TRUE}). The \code{data.table} is then grouped by the \code{by} and \code{j} is evaluated within each group. The order of the rows within each group is preserved, as is the order of the groups. \code{by} accepts:
 
     \itemize{
-        \item A single unquoted column name: e.g., \code{DT[, .(sa=sum(a)), by=x]}
+        \item A single unquoted column name or expression, e.g., \code{DT[, .(sa=sum(a)), by=x]} or \code{by=x\%\%2}. This is a convenience; for multiple expressions a \code{list()} is required.
 
-        \item a \code{list()} of expressions of column names: e.g., \code{DT[, .(sa=sum(a)), by=.(x=x>0, y)]}
+        \item a \code{list()} of expressions of column names, e.g., \code{DT[, .(sa=sum(a)), by=.(x>0, y)]}. Use a named list to set the names of the resulting grouping columns, e.g., \code{by=.(x_is_positive=x>0, y)}. As a concise shortcut for a \emph{single} expression, you can also use parentheses to name the output column, e.g., \code{by=(grp = x \%\% 2)}.
 
         \item a single character string containing comma separated column names (where spaces are significant since column names may contain spaces even at the start or end): e.g., \code{DT[, sum(a), by="x,y,z"]}
 
diff --git a/vignettes/datatable-programming.Rmd b/vignettes/datatable-programming.Rmd
@@ -46,6 +46,45 @@ subset(iris, Species == "setosa")
 
 Here, `subset` takes the second argument and evaluates it within the scope of the `data.frame` given as its first argument. This removes the need for variable repetition, making it less prone to errors, and makes the code more readable.
 
+### Dynamic Grouping and Naming Syntax
+
+Besides programmatic use with `env`, `data.table` offers concise syntax for interactive use, especially in the `by` argument.
+
+```{r by_syntax_setup_concise}
+d = data.table(x = 1:4, y = 2:5)
+```
+
+#### Grouping by Expressions in `by`
+
+For convenience, `data.table` allows you to group by a single expression directly without `list()` or `.()`. To name the resulting grouping column, you have two options:
+
+```{r by_syntax_naming_concise}
+# 1. The canonical way: a named list (required for multiple expressions)
+d[, sum(y), by = .(grp = x %% 2)]
+
+# 2. A concise shortcut: parentheses (for a single expression)
+d[, sum(y), by = (grp = x %% 2)]
+```
+
+The `(grp = ...)` form is ordinary base R, a parenthesized assignment, and `data.table` reads the intended column name from it.
+
+#### Important Contrast: Naming in `j` vs. `by`
+
+This parentheses shortcut for naming does **not** work in `j`. In `j`, you must use the canonical `.(new_name = ...)` syntax to create a named column.
+
+```{r by_syntax_j_concise}
+# Correct way to name a new column in `j`
+d[, .(sum_y = sum(y)), by = .(grp = x %% 2)]
+
+# This will not create a column named 'sum_y'
+d[, (sum_y = sum(y)), by = .(grp = x %% 2)]
+```
+In the second case, base R evaluates the parenthesized assignment and returns only its value, so the name `sum_y` is lost. `data.table` then gives the unnamed result the default column name `V1`.
+
+**Takeaway:** the same `(name = expr)` form behaves differently in the two arguments.
+In `by`, `(name = expr)` is a valid shortcut for `.(name = expr)`.
+In `j`, you must always use `.(name = expr)` to create a named column.
+
 ## Problem description
 
 The problem with this kind of interface is that we cannot easily parameterize the code that uses it. This is because the expressions passed to those functions are substituted before being evaluated.