Commit 4c5f1e7

Gforce grouping var class (#5568)
* copy classes to grouping vars
* add tests
* add different optimization levels to test
* add news
* add output
* fix news
* fix typo
* add NEWS info and tests about attributes
* hone NEWS
* hone comment
* Reframe test annotation
* tweak test
* Second call site

---------

Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>
1 parent 40e2d74 commit 4c5f1e7

3 files changed: 13 additions, 2 deletions

NEWS.md

Lines changed: 3 additions & 1 deletion
@@ -40,7 +40,7 @@
 
 12. `setDT` is faster for data with many columns, thanks @MichaelChirico for reporting and fixing the issue, [#5426](https://github.com/Rdatatable/data.table/issues/5426).
 
-13. `dcast`gains `value.var.in.dots`, `value.var.in.LHSdots` and `value.var.in.RHSdots` arguments, [#5824](https://github.com/Rdatatable/data.table/issues/5824). This allows the `value.var` variable(s) in `dcast` to be represented by `...` in the formula (if not otherwise mentioned). Thanks to @iago-pssjd for the report and PR.
+13. `dcast` gains `value.var.in.dots`, `value.var.in.LHSdots` and `value.var.in.RHSdots` arguments, [#5824](https://github.com/Rdatatable/data.table/issues/5824). This allows the `value.var` variable(s) in `dcast` to be represented by `...` in the formula (if not otherwise mentioned). Thanks to @iago-pssjd for the report and PR.
 
 14. `fread` loads `.bgz` files directly, [#5461](https://github.com/Rdatatable/data.table/issues/5461). Thanks to @TMRHarrison for the request with proposed fix, and Benjamin Schwendinger for the PR.
 
@@ -62,6 +62,8 @@
 
 8. Adding a list column to an empty `data.table` works consistently with other column types, [#5738](https://github.com/Rdatatable/data.table/issues/5738). Thanks to Benjamin Schwendinger for the report and the fix.
 
+9. In `DT[,j,by]`, `by` retains its attributes (e.g. class) when `j` is GForce optimized, [#5567](https://github.com/Rdatatable/data.table/issues/5567). Thanks to @danwwilson for the report, and @ben-schwen for the PR.
+
 ## NOTES
 
 1. `transform` method for data.table sped up substantially when creating new columns on large tables. Thanks to @OfekShilon for the report and PR. The implemented solution was proposed by @ColeMiller1.

R/data.table.R

Lines changed: 1 addition & 1 deletion
@@ -1842,7 +1842,7 @@ replace_dot_alias = function(e) {
 if (use.I) assign(".I", seq_len(nrow(x)), thisEnv)
 ans = gforce(thisEnv, jsub, o__, f__, len__, irows) # irows needed for #971.
 gi = if (length(o__)) o__[f__] else f__
-g = lapply(grpcols, function(i) groups[[i]][gi])
+g = lapply(grpcols, function(i) .Call(CsubsetVector, groups[[i]], gi)) # use CsubsetVector instead of [ to preserve attributes #5567
 
 # returns all rows instead of one per group
 nrow_funs = c("gshift")
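The one-line change above matters because base R's default `[` drops most attributes (including `class`) when no `[` method exists for the object's class. A minimal base-R sketch of that effect follows. The helper `subset_keep_attrs` is hypothetical and only mimics the intent stated in the diff comment; it is not data.table's internal `CsubsetVector`, and it assumes the vector carries no length-dependent attributes such as `names`.

```r
# A classed integer vector like the grouping column `b` used in the tests.
x <- structure(1:4, class = c("class_b", "integer"), att = 1)
gi <- c(1L, 3L)  # illustrative group-start indices (assumption)

# Default `[`: no `[.class_b` method exists, so class and `att` are dropped.
plain <- x[gi]
is.null(attributes(plain))

# Hypothetical helper: subset the bare values, then copy the attributes back.
# Assumes no length-dependent attributes (e.g. names) are present.
subset_keep_attrs <- function(v, idx) {
  out <- unclass(v)[idx]
  attributes(out) <- attributes(v)
  out
}

kept <- subset_keep_attrs(x, gi)
inherits(kept, "class_b")
names(attributes(kept))
```

With the old code path, the grouping columns in the GForce result went through the equivalent of `plain`; the new call is intended to behave like `kept`.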

inst/tests/tests.Rraw

Lines changed: 9 additions & 0 deletions
@@ -18585,3 +18585,12 @@ test(2262.5, null.data.table()[, c("a","b") := list(1:2, 3:4)], dt3)
 test(2262.6, set(null.data.table(), j=c("a","b"), value=list(1:2, 3:4)), dt3)
 test(2262.7, data.table(a=1, b=2)[, c("a", "b") := list(NULL, NULL)], null.data.table())
 test(2262.8, data.table(a=1, b=2)[, c("a", "b") := list(NULL)], null.data.table())
+
+# GForce retains attributes in by arguments #5567
+dt = data.table(a=letters[1:4], b=structure(1:4, class = c("class_b", "integer"), att=1), c=structure(c(1L,2L,1L,2L), class = c("class_c", "integer")))
+test(2263.1, options=list(datatable.verbose=TRUE, datatable.optimize=0L), dt[, .N, b], data.table(b=dt$b, N=1L), output="GForce FALSE")
+test(2263.2, options=list(datatable.verbose=TRUE, datatable.optimize=0L), dt[, .N, .(b,c)], data.table(b=dt$b, c=dt$c, N=1L), output="GForce FALSE")
+test(2263.3, options=list(datatable.verbose=TRUE, datatable.optimize=0L), names(attributes(dt[, .N, b]$b)), c("class", "att"), output="GForce FALSE")
+test(2263.4, options=list(datatable.verbose=TRUE, datatable.optimize=Inf), dt[, .N, b], data.table(b=dt$b, N=1L), output="GForce optimized j to")
+test(2263.5, options=list(datatable.verbose=TRUE, datatable.optimize=Inf), dt[, .N, .(b,c)], data.table(b=dt$b, c=dt$c, N=1L), output="GForce optimized j to")
+test(2263.6, options=list(datatable.verbose=TRUE, datatable.optimize=Inf), names(attributes(dt[, .N, b]$b)), c("class", "att"), output="GForce optimized j to")
