15 changes: 15 additions & 0 deletions NEWS.md
@@ -107,6 +107,21 @@

18. `fwrite` now allows `dec` to be the same as `sep` for edge cases where only one will be written, e.g. 0-row or 1-column tables. [#7227](https://github.com/Rdatatable/data.table/issues/7227). Thanks @MichaelChirico for the report and @venom1204 for the fix.

22. Using `by=` or `keyby=` with a simple numeric or character vector in `j` (e.g. `DT[, 1:2, by=grp]`) used to silently ignore the grouping argument. This now issues a warning to alert the user that grouping is not applied in this syntax and guides them to use the `.SD` idiom instead. [#5397](https://github.com/Rdatatable/data.table/issues/5397). Thanks to @mcol for the report and @venom1204 for the fix.

```r
DT = data.table(a=1:4, grp=c(1,1,2,2))
DT[, 1, by = grp]
# a
# <int>
# 1: 1
# 2: 2
# 3: 3
# 4: 4
# Warning message:
# `by` or `keyby` is ignored when `j` is a numeric vector...
```

### NOTES

1. The following in-progress deprecations have proceeded:
10 changes: 10 additions & 0 deletions R/data.table.R
@@ -739,6 +739,12 @@
if (!length(j) && !notj) return( null.data.table() )
if (is.factor(j)) j = as.character(j) # fix for FR: #358
if (is.character(j)) {
if (!missingby && (missing(with) || isTRUE(with))) {
warning(

Check warning on line 743 in R/data.table.R (GitHub Actions / lint-r, col 13): [condition_call_linter] Use warning(., call. = FALSE) not to display the call in an error message.
"`by` or `keyby` is ignored when `j` is a character vector used for column selection. ",
"Perhaps you intended to use `.SD`? For example: DT[, .SD[, ", deparse(jsub), "], by = ...]"
)
}
if (notj) {
if (anyNA(idx <- chmatch(j, names_x)))
warningf(ngettext(sum(is.na(idx)), "column not removed because not found: %s", "columns not removed because not found: %s"),
@@ -762,6 +768,10 @@
# else the NA in ansvals are for join inherited scope (test 1973), and NA could be in irows from join and data in i should be returned (test 1977)
# in both cases leave to the R-level subsetting of i and x together further below
} else if (is.numeric(j)) {
if (!missingby && (missing(with) || isTRUE(with))) {
warning(

Check warning on line 772 in R/data.table.R (GitHub Actions / lint-r, col 13): [condition_call_linter] Use warning(., call. = FALSE) not to display the call in an error message.
"`by` or `keyby` is ignored when `j` is a numeric vector used for column selection. ", "Perhaps you intended to use `.SD`? For example: DT[, .SD[, ", deparse(jsub), "], by = ...]")
}
j = as.integer(j)
if (any(w <- (j>ncol(x)))) stopf("Item %d of j is %d which is outside the column number range [1,ncol=%d]", idx <- which.first(w), j[idx], ncol(x))
j = j[j!=0L]
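
The two lint-r annotations above both ask for `call. = FALSE`. As a minimal base-R sketch of what that argument changes (the two helper functions are hypothetical, written only for illustration, and are not part of the PR), the captured condition carries the calling expression by default and carries no call when `call. = FALSE` is passed:

```r
# Hypothetical helpers, for illustration only; not code from this PR.
warn_with_call <- function() {
  warning("`by` or `keyby` is ignored")                  # default: call. = TRUE
}
warn_without_call <- function() {
  warning("`by` or `keyby` is ignored", call. = FALSE)   # what the linter asks for
}

# Capture the condition objects instead of letting them print.
w1 <- tryCatch(warn_with_call(), warning = function(w) w)
w2 <- tryCatch(warn_without_call(), warning = function(w) w)

conditionCall(w1)   # the call that raised the warning
conditionCall(w2)   # NULL, so no "In <call> :" prefix is printed
```

For a warning raised deep inside `[.data.table`, the recorded call would be the internal frame rather than anything the user typed, which is presumably why the linter prefers to suppress it.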
9 changes: 9 additions & 0 deletions inst/tests/tests.Rraw
@@ -21620,3 +21620,12 @@ local({
test(2338.9, {fwrite(dd, f, forceDecimal=FALSE); fread(f)}, di)
})

# 5397 - keyby/key ignored if numeric indices used in j
DT = data.table(a=1:4, b=5:8, g=c(1,1,2,2))
test(2339.1, DT[, 1:2, by=g], DT[, 1:2], warning="`by` or `keyby` is ignored")
test(2339.2, DT[, 2:1, keyby=g], DT[, 2:1], warning="`by` or `keyby` is ignored")
test(2339.3, DT[, c("b", "a"), by=g, with=FALSE], DT[, c("b", "a")])
expected_sd = data.table(g=c(1,1,2,2), a=1:4, b=5:8)
test(2339.4, DT[, .SD[, 1:2], by=g], expected_sd)
expected_single_int = data.table(g=c(1,2), V1=c(1,1))
test(2339.5, DT[, 1, by=g], expected_single_int)
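
Test 2339.4 exercises the `.SD` idiom that the new warning points users toward. A small sketch of that idiom outside the test harness, assuming the `data.table` package is installed and reusing the same toy table as the tests; the `.SDcols` variant is an addition for illustration and does not appear in the PR:

```r
library(data.table)

# Same toy data as in the tests above.
DT <- data.table(a = 1:4, b = 5:8, g = c(1, 1, 2, 2))

# Select columns by position within each group via .SD;
# the grouping column g is prepended to the result.
res_sd <- DT[, .SD[, 1:2], by = g]

# Illustrative alternative (not from the PR): restrict .SD with .SDcols
# and aggregate per group.
res_sum <- DT[, lapply(.SD, sum), by = g, .SDcols = c("a", "b")]

print(res_sd)
print(res_sum)
```

Unlike `DT[, 1:2, by = g]`, both forms evaluate `j` once per group, so the grouping column appears in the output.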
4 changes: 2 additions & 2 deletions man/data.table.Rd
@@ -97,8 +97,8 @@ data.table(\dots, keep.rownames=FALSE, check.names=FALSE, key=NULL, stringsAsFac

See \href{../doc/datatable-intro.html}{\code{vignette("datatable-intro")}} and \code{example(data.table)}.}

\item{by}{ Column names are seen as if they are variables (as in \code{j} when \code{with=TRUE}). The \code{data.table} is then grouped by the \code{by} and \code{j} is evaluated within each group. The order of the rows within each group is preserved, as is the order of the groups. \code{by} accepts:

\item{by}{ Column names are seen as if they are variables (as in \code{j} when \code{with=TRUE}). \emph{Note that `by` and `keyby` are ignored when `j` is a character or numeric vector used for selecting columns (i.e., when the internal `with=FALSE` is triggered).} The \code{data.table} is then grouped by the \code{by} and \code{j} is evaluated within each group. The order of the rows within each group is preserved, as is the order of the groups. \code{by} accepts:
Member


Do not use backticks (`) in Rd files like this; use \code{} instead.
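
One way the added sentence might look with that markup applied is sketched below. Only the backticks are swapped for `\code{}`; the wording is taken unchanged from the diff, and this is not necessarily the revision the author ended up committing:

```
\emph{Note that \code{by} and \code{keyby} are ignored when \code{j} is a character or numeric vector used for selecting columns (i.e., when the internal \code{with=FALSE} is triggered).}
```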

\itemize{
\item A single unquoted column name: e.g., \code{DT[, .(sa=sum(a)), by=x]}
