Commit cff1d33

augment NEWS
1 parent 91c9835 commit cff1d33

1 file changed: +22 -14 lines changed

NEWS.md

Lines changed: 22 additions & 14 deletions
@@ -18,6 +18,28 @@
 
 6. `between()` gains the argument `ignore_tzone=FALSE`. Normally, a difference in time zone between `lower` and `upper` will produce an error, and a difference in time zone between `x` and either of the others will produce a message. Setting `ignore_tzone=TRUE` bypasses the checks, allowing both comparisons to proceed without error or message about time zones.
 
+7. New helper function `fctr`, an extended version of `factor()`, [#4837](https://github.com/Rdatatable/data.table/issues/4837). Most notably, it supports (1) retaining input level ordering by default, i.e. `levels=unique(x)` as opposed to `levels=sort(unique(x))`; (2) `rev=` to reverse the levels; and (3) `sort=` to allow more feature parity with `factor()`. The choice of default is motivated by convenience in the common case where the order of elements needs to be preserved, for example when using `dcast` or adding a legend to a plot.
+
+```r
+d = data.table(id1=rep(1:2, each=3L), id2=letters[c(4:3,5L,3:5)], v1=1:6)
+dcast(d, id1 ~ factor(id2))
+#    id1 c d e
+# 1:   1 2 1 3
+# 2:   2 4 5 6
+dcast(d, id1 ~ fctr(id2))
+#    id1 d c e
+# 1:   1 1 2 3
+# 2:   2 5 4 6
+dcast(d, id1 ~ fctr(id2, sort=TRUE)) # same as factor()
+#    id1 c d e
+# 1:   1 2 1 3
+# 2:   2 4 5 6
+dcast(d, id1 ~ fctr(id2, rev=TRUE))
+#    id1 e c d
+# 1:   1 3 2 1
+# 2:   2 6 4 5
+```
+
 ### BUG FIXES
 
 1. Custom binary operators from the `lubridate` package now work with objects of class `IDate` as with a `Date` subclass, [#6839](https://github.com/Rdatatable/data.table/issues/6839). Thanks @emallickhossain for the report and @aitap for the fix.
@@ -500,20 +522,6 @@ rowwiseDT(
 
 ### NEW FEATURES
 
-0. New helper function `fctr` has been added, [#4837](https://github.com/Rdatatable/data.table/issues/4837). It is a wrapper around base R `factor` using default arguments adjusted to retain original order. It has been added for convenience in the case where the order of elements needs to be preserved, for example when using `dcast` or adding a legend to a plot.
-
-```r
-d = data.table(id1=1:2, id2=letters[c(4:3,3:4)], v1=1:4)
-dcast(d, id1 ~ id2)
-#   id1 c d
-#1:   1 3 1
-#2:   2 2 4
-dcast(d, id1 ~ fctr(id2))
-#   id1 d c
-#1:   1 1 3
-#2:   2 4 2
-```
-
 1. `nafill()` now applies `fill=` to the front/back of the vector when `type="locf|nocb"`, [#3594](https://github.com/Rdatatable/data.table/issues/3594). Thanks to @ben519 for the feature request. It also now returns a named object based on the input names. Note that if you are considering joining and then using `nafill(...,type='locf|nocb')` afterwards, please review `roll=`/`rollends=` which should achieve the same result in one step more efficiently. `nafill()` is for when filling-while-joining (i.e. `roll=`/`rollends=`/`nomatch=`) cannot be applied.
 
 2. `mean(na.rm=TRUE)` by group is now GForce optimized, [#4849](https://github.com/Rdatatable/data.table/issues/4849). Thanks to the [h2oai/db-benchmark](https://github.com/h2oai/db-benchmark) project for spotting this issue. The 1 billion row example in the issue shows 48s reduced to 14s. The optimization also applies to type `integer64` resulting in a difference to the `bit64::mean.integer64` method: `data.table` returns a `double` result whereas `bit64` rounds the mean to the nearest integer.
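As a rough illustration of the level ordering that the `fctr` entry in the first hunk describes, the base R sketch below reproduces the three behaviors with a hypothetical stand-in named `fctr_sketch`. The name, the body, and the choice to apply `sort=` before `rev=` are assumptions made for this note and may not match data.table's implementation; only `factor()`, `unique()`, `sort()`, and `rev()` from base R are used.

```r
# Hypothetical stand-in for illustration; NOT data.table's fctr() implementation.
# Assumption: when both flags are TRUE, sorting is applied before reversing.
fctr_sketch = function(x, levels = unique(x), ..., sort = FALSE, rev = FALSE) {
  if (isTRUE(sort)) levels = base::sort(levels)  # mimic factor()'s sorted levels
  if (isTRUE(rev)) levels = base::rev(levels)    # reverse the level order
  factor(x, levels = levels, ...)
}

x = c("d", "c", "e", "c", "d", "e")  # same id2 values as the NEWS example

levels(factor(x))                    # sorted: "c" "d" "e"
levels(fctr_sketch(x))               # order of first appearance: "d" "c" "e"
levels(fctr_sketch(x, sort = TRUE))  # matches factor(): "c" "d" "e"
levels(fctr_sketch(x, rev = TRUE))   # reversed first appearance: "e" "c" "d"
```

Since `dcast` lays out the cast columns in factor-level order, these level vectors line up with the column orders shown in the example output of the diff.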
