Commit a014e38

dogroups: resize missing groups to length-1 (#7447)
Fixes: #7442
1 parent 2c68e5d commit a014e38

3 files changed: +8 -0 lines changed

NEWS.md

Lines changed: 2 additions & 0 deletions
@@ -348,6 +348,8 @@ See [#2611](https://github.com/Rdatatable/data.table/issues/2611) for details. T
 
 24. Rolling functions now ensure there is no nested parallelism. It could have happened for vectorized input and `adaptive=TRUE`, [#7352](https://github.com/Rdatatable/data.table/issues/7352). Thanks @jangorecki for the fix.
 
+25. By-group operations on missing rows (e.g. `foo[c(i, NA), bar, by=grp]`) now avoid leaving in data from the previous groups, [#7442](https://github.com/Rdatatable/data.table/issues/7442). Thanks @aitap for the report and the fix.
+
 ### NOTES
 
 1. The following in-progress deprecations have proceeded:

inst/tests/tests.Rraw

Lines changed: 5 additions & 0 deletions
@@ -21865,3 +21865,8 @@ text = paste0(
 strrep("a", 500), "\n", "a"
 )
 test(2346, data.table::fread(text = text), data.table(mary = rep("mary", 99), had = "had", a = "a", little = "little", lamb = "lamb"), warning = "First discarded non-empty line")
+
+# With 'i' containing NA and no GForce, columns for missing groups weren't fully missing, #7442
+DT = data.table(a="a", grp=1L)
+i = c(1, 1, 1, NA, NA)
+test(2347, DT[i, .(result = all(is.na(grp) == is.na(a))), by = grp][,all(result)], options = list(datatable.optimize = 0))

src/dogroups.c

Lines changed: 1 addition & 0 deletions
@@ -217,6 +217,7 @@ SEXP dogroups(SEXP dt, SEXP dtcols, SEXP groups, SEXP grpcols, SEXP jiscols, SEX
 }
 if (istarts[i] == NA_INTEGER || (LENGTH(order) && iorder[ istarts[i]-1 ]==NA_INTEGER)) {
   for (int j=0; j<length(SDall); ++j) {
+    SETLENGTH(VECTOR_ELT(SDall, j), 1);
     writeNA(VECTOR_ELT(SDall, j), 0, 1, false);
     // writeNA uses SET_ for STR and VEC, and we always use SET_ to assign to SDall always too. Otherwise,
     // this writeNA could decrement the reference for the old value which wasn't incremented in the first place.
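
Illustrative note (not part of the commit): the one-line change above can be modelled with a small standalone C sketch. It assumes a reusable per-group buffer with a logical length, which is only a simplified stand-in for the `SDall` columns; `GroupBuffer`, `load_group`, `load_missing_group`, `count_non_na`, `stale_after_missing`, and `NA_INT` are hypothetical names invented here and do not exist in data.table or the R API. The sketch shows why writing a single NA without first shrinking the length to 1 can leave values from the previous group visible.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NA_INT INT_MIN  /* stand-in for an integer NA (assumption of this sketch) */
#define CAPACITY 8      /* fixed allocation reused across groups */

/* Hypothetical model of one reusable per-group column:
   a fixed allocation plus a logical length that changes per group. */
typedef struct {
    int data[CAPACITY];
    size_t length;
} GroupBuffer;

/* Copy one group's rows into the buffer and set its logical length. */
static void load_group(GroupBuffer *buf, const int *rows, size_t n) {
    assert(n <= CAPACITY);
    for (size_t k = 0; k < n; ++k) buf->data[k] = rows[k];
    buf->length = n;
}

/* Represent a missing group by writing one NA at position 0.
   With resize == 0 the previous logical length is kept (modelling the old behaviour);
   with resize != 0 the length is first set to 1 (modelling the added SETLENGTH call). */
static void load_missing_group(GroupBuffer *buf, int resize) {
    if (resize) buf->length = 1;
    buf->data[0] = NA_INT;
}

/* Count elements inside the logical length that are not NA. */
static size_t count_non_na(const GroupBuffer *buf) {
    size_t n = 0;
    for (size_t k = 0; k < buf->length; ++k)
        if (buf->data[k] != NA_INT) ++n;
    return n;
}

/* Process a 3-row group followed by a missing group and report how many
   stale (non-NA) values the missing group can still see. */
static size_t stale_after_missing(int resize) {
    GroupBuffer buf;
    const int rows[3] = {1, 1, 1};
    load_group(&buf, rows, 3);
    load_missing_group(&buf, resize);
    return count_non_na(&buf);
}
```

In this model, `stale_after_missing(0)` returns 2, because two values from the preceding 3-row group remain inside the logical length, while `stale_after_missing(1)` returns 0. The group sizes loosely mirror test 2347, where `i = c(1, 1, 1, NA, NA)` yields a 3-row group and a missing group; the real code operates on R vectors rather than a plain struct.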
