Skip to content

Commit 3cfaee7

Browse files
authored
Merge branch 'master' into issue_4920
2 parents c921efa + 8dff81f commit 3cfaee7

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

42 files changed

+543
-334
lines changed

.dev/CRAN_Release.cmd

Lines changed: 16 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -584,26 +584,29 @@ bunzip2 inst/tests/*.Rraw.bz2 # decompress *.Rraw again so as not to commit com
584584
# Many thanks!
585585
# Best, Tyson
586586
# ------------------------------------------------------------
587-
# DO NOT commit or push to GitHub. Leave 4 files (.dev/CRAN_Release.cmd, DESCRIPTION, NEWS and init.c) edited and not committed. Include these in a single and final bump commit below.
588-
# DO NOT even use a PR. Because PRs build binaries and we don't want any binary versions of even release numbers available from anywhere other than CRAN.
587+
589588
# Leave milestone open with a 'release checks' issue open. Keep updating status there.
590589
# ** If on EC2, shutdown instance. Otherwise get charged for potentially many days/weeks idle time with no alerts **
591590
# If it's evening, SLEEP.
592591
# It can take a few days for CRAN's checks to run. If any issues arise, backport locally. Resubmit the same even version to CRAN.
593592
# CRAN's first check is automatic and usually received within an hour. WAIT FOR THAT EMAIL.
594593
# When CRAN's email contains "Pretest results OK pending a manual inspection" (or similar), or if not and it is known why not and ok, then bump dev.
595594

596-
###### Bump dev for NON-PATCH RELEASE
597-
# 0. Close milestone to prevent new issues being tagged with it. The final 'release checks' issue can be left open in a closed milestone.
598-
# 1. Check that 'git status' shows 4 files in modified and uncommitted state: DESCRIPTION, NEWS.md, init.c and this .dev/CRAN_Release.cmd
599-
# 2. Bump minor version in DESCRIPTION to next odd number. Note that DESCRIPTION was in edited and uncommitted state so even number never appears in git.
600-
# 3. Add new heading in NEWS for the next dev version. Add "(submitted to CRAN on <today>)" on the released heading.
601-
# 4. Bump minor version in dllVersion() in init.c
602-
# 5. Bump 3 minor version numbers in Makefile
603-
# 6. Search and replace this .dev/CRAN_Release.cmd to update 1.16.99 to 1.16.99 inc below, 1.16.0 to 1.17.0 above, 1.15.0 to 1.16.0 below
604-
# 7. Another final gd to view all diffs using meld. (I have `alias gd='git difftool &> /dev/null'` and difftool meld: http://meldmerge.org/)
605-
# 8. Push to master with this consistent commit message: "1.17.0 on CRAN. Bump to 1.17.99"
606-
# 9. Take sha from the previous step and run `git tag 1.17.0 96c..sha..d77` then `git push origin 1.16.0` (not `git push --tags` according to https://stackoverflow.com/a/5195913/403310)
595+
###### After submission for NON-PATCH RELEASE
596+
# 0. Start a new branch `cran-x.y.0` with the code as submitted to CRAN
597+
# - Check that 'git status' shows 4 files in modified and uncommitted state: DESCRIPTION, NEWS.md, init.c and this .dev/CRAN_Release.cmd
598+
# - The branch should have one commit with precisely these 4 files being edited
599+
# 1. Follow up with a commit with this consistent commit message like: "1.17.0 on CRAN. Bump to 1.17.99" to this branch bumping to the next dev version
600+
# - Bump minor version in DESCRIPTION to next odd number. Note that DESCRIPTION was in edited and uncommitted state so even number never appears in git.
601+
# - Add new heading in NEWS for the next dev version. Add "(submitted to CRAN on <today>)" on the released heading.
602+
# - Bump minor version in dllVersion() in init.c
603+
# - Bump 3 minor version numbers in Makefile
604+
# - Search and replace this .dev/CRAN_Release.cmd to update 1.16.99 to 1.16.99 inc below, 1.16.0 to 1.17.0 above, 1.15.0 to 1.16.0 below
605+
# - Another final gd to view all diffs using meld. (I have `alias gd='git difftool &> /dev/null'` and difftool meld: http://meldmerge.org/)
606+
# 2. Ideally, no PRs are reviewed while a CRAN submission is pending. Any reviews that do happen MUST target this branch, NOT master!
607+
# 3. Once the submission lands on CRAN, merge this branch WITHOUT SQUASHING!
608+
# 4. Close milestone to prevent new issues being tagged with it. The final 'release checks' issue can be left open in a closed milestone.
609+
# 5. Take SHA from the "...on CRAN. Bump to ..." commit and run `git tag 1.17.0 96c..sha..d77` then `git push origin 1.17.0` (not `git push --tags` according to https://stackoverflow.com/a/5195913/403310)
607610
######
608611

609612
###### Branching policy for PATCH RELEASE

.github/workflows/R-CMD-check.yaml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,7 @@ jobs:
3636
R_REMOTES_NO_ERRORS_FROM_WARNINGS: true
3737
RSPM: ${{ matrix.config.rspm }}
3838
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
39+
_R_CHECK_RD_CHECKRD_MINLEVEL_: -Inf
3940

4041
steps:
4142
- uses: actions/checkout@v4

.gitlab-ci.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -114,6 +114,7 @@ test-lin-rel:
114114
variables:
115115
_R_CHECK_FORCE_SUGGESTS_: "TRUE"
116116
OPENBLAS_MAIN_FREE: "1"
117+
_R_CHECK_RD_CHECKRD_MINLEVEL_: "-Inf"
117118
script:
118119
- *install-deps
119120
- echo 'CFLAGS=-g -O3 -flto=auto -fno-common -fopenmp -Wall -Wvla -pedantic -fstack-protector-strong -D_FORTIFY_SOURCE=2' > ~/.R/Makevars

GOVERNANCE.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -108,7 +108,7 @@ Please also make a note in the change log under [`# Governance history`](#govern
108108

109109
# Finances and Funding
110110

111-
data.table is a [NumFOCUS](https://numfocus.org/) project. Donations to the data.table can be made at [https://numfocus.org/project/data-table]([https://numfocus.org/donate-to-data-table](https://app.hubspot.com/payments/FFWKWTTvKFdzqH?referrer=PAYMENT_LINK))
111+
data.table is a [NumFOCUS](https://numfocus.org/) project. Donations to data.table can be made at [https://numfocus.org/project/data-table](https://app.hubspot.com/payments/FFWKWTTvKFdzqH?referrer=PAYMENT_LINK).
112112

113113
*NumFOCUS is a 501(c)(3) non-profit charity in the United States; as such, donations to NumFOCUS are tax-deductible as allowed by law. As with any donation, you should consult with your personal tax adviser or the IRS about your particular tax situation.*
114114

NAMESPACE

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -153,8 +153,9 @@ if (getRversion() >= "3.6.0") {
153153

154154
# IDateTime support:
155155
export(as.IDate,as.ITime,IDateTime)
156-
export(second,minute,hour,yday,wday,mday,week,isoweek,month,quarter,year,yearmon,yearqtr)
156+
export(second,minute,hour,yday,wday,mday,week,isoweek,isoyear,month,quarter,year,yearmon,yearqtr)
157157

158+
if (getRversion() >= "4.3.0") S3method(chooseOpsMethod, IDate)
158159
S3method("[", ITime)
159160
S3method("+", IDate)
160161
S3method("-", IDate)

NEWS.md

Lines changed: 29 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,16 @@
1010

1111
### NEW FEATURES
1212

13-
1. New `sort_by()` method for data.tables, [#6662](https://github.com/Rdatatable/data.table/issues/6662). It uses `forder()` to improve upon the data.frame method and also match `DT[order(...)]` behavior with respect to locale. Thanks @rikivillalba for the suggestion and PR.
13+
1. New `sort_by()` method for data.tables, [#6662](https://github.com/Rdatatable/data.table/issues/6662). It uses `forder()` to improve upon the data.frame method and also matches `DT[order(...)]` behavior with respect to locale. Thanks @rikivillalba for the suggestion and PR.
14+
15+
```r
16+
DT = data.table(a=c(1L, 2L, 1L), b=c(3L, 1L, 2L))
17+
sort_by(DT, ~a + b)
18+
# a b
19+
# 1: 1 2
20+
# 2: 1 3
21+
# 3: 2 1
22+
```
1423

1524
2. `melt()` now supports using `patterns()` with `id.vars`, [#6867](https://github.com/Rdatatable/data.table/issues/6867). Thanks to Toby Dylan Hocking for the suggestion and PR.
1625

@@ -56,6 +65,10 @@
5665

5766
13. New `mergelist()` and `setmergelist()` similarly work _a la_ `Reduce()` to recursively merge a `list` of data.tables, [#599](https://github.com/Rdatatable/data.table/issues/599). Different join modes (_left_, _inner_, _full_, _right_, _semi_, _anti_, and _cross_) are supported through the `how` argument; duplicate handling goes through the `mult` argument. `setmergelist()` carefully avoids copies where one is not needed, e.g. in a 1:1 left join. Thanks Patrick Nicholson for the FR (in 2013!), @jangorecki for the PR, and @MichaelChirico for extensive reviews and fine-tuning.
5867

68+
14. `fcoalesce()` and `setcoalesce()` gain `nan` argument to control whether `NaN` values should be treated as missing (`nan=NA`, the default) or non-missing (`nan=NaN`), [#4567](https://github.com/Rdatatable/data.table/issues/4567). This provides full compatibility with `nafill()` behavior. Thanks to @ethanbsmith for the feature request and @Mukulyadav2004 for the implementation.
69+
70+
15. New function `isoyear()` has been implemented as a complement to `isoweek()`, returning the ISO 8601 year corresponding to a given date, [#7154](https://github.com/Rdatatable/data.table/issues/7154). Thanks to @ben-schwen and @MichaelChirico for the suggestion and @venom1204 for the implementation.
71+
5972
### BUG FIXES
6073

6174
1. `fread()` no longer warns on certain systems on R 4.5.0+ where the file owner can't be resolved, [#6918](https://github.com/Rdatatable/data.table/issues/6918). Thanks @ProfFancyPants for the report and PR.
@@ -74,7 +87,7 @@
7487
7588
8. A data.table with a column of class `vctrs_list_of` (from package {vctrs}) prints as expected, [#5948](https://github.com/Rdatatable/data.table/issues/5948). Before, they could be printed messily, e.g. printing every entry in a nested data.frame. Thanks @jesse-smith for the report, @DavisVaughan and @r2evans for contributing, and @MichaelChirico for the PR.
7689
77-
9. Fixed incorrect sorting of merges where the first column of a key is a factor with non-`sort()`-ed levels (e.g. `factor(1:2, 2:1)` and it is joined to a character column, [#5361](https://github.com/Rdatatable/data.table/issues/5361). Thanks to @gbrunick for the report and Benjamin Schwendinger for the fix.
90+
9. Fixed incorrect sorting of merges where the first column of a key is a factor with non-`sort()`-ed levels (e.g. `factor(1:2, 2:1)` and it is joined to a character column, [#5361](https://github.com/Rdatatable/data.table/issues/5361). Thanks to @gbrunick for the report, Benjamin Schwendinger for the fix, and @MichaelChirico for a follow-up fix caught by revdep testing.
7891
7992
10. Spurious warnings from internal code in `cube()`, `rollup()`, and `groupingsets()` are no longer surfaced to the caller, [#6964](https://github.com/Rdatatable/data.table/issues/6964). Thanks @ferenci-tamas for the report and @venom1204 for the fix.
8093
@@ -86,6 +99,12 @@
8699
87100
14. Filling columns of class Date with POSIXct (and vice versa) using `shift()` now yields a clear, informative error message specifying the class mismatch, [#5218](https://github.com/Rdatatable/data.table/issues/5218). Thanks @ashbaldry for the report and @ben-schwen for the fix.
88101
102+
15. `split.data.table()` output list elements retain the S3 class of the generating data.table, e.g. in `l=split(x, ...)` if `x` has class `my_class`, so will `l[[1]]` and so on, [#7105](https://github.com/Rdatatable/data.table/issues/7105). Thanks @m-muecke for the bug report and @MichaelChirico for the fix.
103+
104+
16. `between()` is now more robust with `integer64` arguments. Combining small integer `x` with certain large `integer64` bounds no longer misinterprets the bounds as `double`; if a `double` bound cannot be losslessly converted into `integer64` for comparison with `integer64` `x`, an error is signalled instead of returning a wrong answer with a warning; [#7164](https://github.com/Rdatatable/data.table/issues/7164). Thanks @aitap for the bug report and the fix.
105+
106+
17. `t1 - t2`, where one is an `IDate` and the other is a `Date`, are now consistent with the case where both are `IDate` or both are `Date`, [#4749](https://github.com/Rdatatable/data.table/issues/4749). Thanks @George9000 for the report and @MichaelChirico for the fix.
107+
89108
### NOTES
90109
91110
1. The following in-progress deprecations have proceeded:
@@ -107,21 +126,23 @@
107126
108127
5. A GitHub Actions workflow is now in place to warn the entire maintainer team, as well as any contributor following the GitHub repository, when the package is at risk of archival on CRAN [#7008](https://github.com/Rdatatable/data.table/issues/7008). Thanks @tdhock for the original report and @Bisaloo and @TysonStanley for the fix.
109128
110-
# data.table [v1.17.8](https://github.com/Rdatatable/data.table/milestone/41) (6 July 2025)
129+
6. Using a double vector in `set()`'s `i=` and/or `j=` no longer throws a warning about preferring integer, [#6594](https://github.com/Rdatatable/data.table/issues/6594). While it may improve efficiency to use integer, there's no guarantee it's an improvement and the difference is likely to be minimal. The coercion will still be reported under `datatable.verbose=TRUE`. For package/production use cases, static analyzers such as `lintr::implicit_integer_linter()` can also report when numeric literals should be rewritten as integer literals.
130+
131+
## data.table [v1.17.8](https://github.com/Rdatatable/data.table/milestone/41) (6 July 2025)
111132

112133
1. Internal functions used to signal errors are now marked as non-returning, silencing a compiler warning about potentially unchecked allocation failure. Thanks to Prof. Brian D. Ripley for the report and @aitap for the fix, [#7070](https://github.com/Rdatatable/data.table/pull/7070).
113134

114-
# data.table [v1.17.6](https://github.com/Rdatatable/data.table/milestone/40) (15 June 2025)
135+
## data.table [v1.17.6](https://github.com/Rdatatable/data.table/milestone/40) (15 June 2025)
115136

116137
1. On a heavily loaded machine, a `forder` thread could try to perform a zero-length copy from a null pointer, which was de-facto harmless but is against the C standard and was caught by additional CRAN checks, [#7051](https://github.com/Rdatatable/data.table/issues/7051). Thanks to @helske for the report and @aitap for the PR.
117138

118-
# data.table [v1.17.4](https://github.com/Rdatatable/data.table/milestone/39) (25 May 2025)
139+
## data.table [v1.17.4](https://github.com/Rdatatable/data.table/milestone/39) (25 May 2025)
119140

120141
1. The C code now avoids passing invalid data pointers from 0-length vectors to `memcpy()`, which previously caused undefined behaviour. Thanks to Prof. Brian D. Ripley for the report and Michael Chirico for the fix, [#6911](https://github.com/Rdatatable/data.table/pull/6911).
121142

122-
# data.table [v1.17.2](https://github.com/Rdatatable/data.table/milestone/38) (7 May 2025)
143+
## data.table [v1.17.2](https://github.com/Rdatatable/data.table/milestone/38) (7 May 2025)
123144

124-
## BUG FIXES
145+
### BUG FIXES
125146

126147
1. `fwrite(compress="gzip")` once again produces a gzip header when the column names are missing or disabled, [@6852](https://github.com/Rdatatable/data.table/issues/6852). Thanks @maxscheiber for the report and @aitap for the fix.
127148

@@ -137,7 +158,7 @@
137158

138159
7. `as.data.table()` now properly handles keys: specifying keys sets them, omitting keys preserves existing ones, and setting `key=NULL` clears them, [#6859](https://github.com/Rdatatable/data.table/issues/6859). Thanks @brookslogan for the report and @Mukulyadav2004 for the fix.
139160

140-
## NOTES
161+
### NOTES
141162

142163
1. Continued work to remove non-API C functions, [#6180](https://github.com/Rdatatable/data.table/issues/6180). Thanks Ivan Krylov for the PRs and for writing a clear and concise guide about the R API: https://aitap.codeberg.page/R-api/.
143164

R/IDateTime.R

Lines changed: 10 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -91,6 +91,8 @@ round.IDate = function(x, digits=c("weeks", "months", "quarters", "years"), ...)
9191
years = ISOdate(year(x), 1L, 1L)))
9292
}
9393

94+
chooseOpsMethod.IDate = function(x, y, mx, my, cl, reverse) inherits(y, "Date")
95+
9496
#Adapted from `+.Date`
9597
`+.IDate` = function(e1, e2) {
9698
if (nargs() == 1L)
@@ -115,7 +117,7 @@ round.IDate = function(x, digits=c("weeks", "months", "quarters", "years"), ...)
115117
if (storage.mode(e1) != "integer")
116118
internal_error("storage mode of IDate is somehow no longer integer") # nocov
117119
if (nargs() == 1L)
118-
stopf("unary - is not defined for \"IDate\" objects")
120+
stopf('unary - is not defined for "IDate" objects')
119121
if (inherits(e2, "difftime"))
120122
internal_error("difftime objects may not be subtracted from IDate, but Ops dispatch should have intervened to prevent this") # nocov
121123

@@ -127,7 +129,12 @@ round.IDate = function(x, digits=c("weeks", "months", "quarters", "years"), ...)
127129
# ii) .Date was newly exposed in R some time after 3.4.4
128130
}
129131
ans = as.integer(unclass(e1) - unclass(e2))
130-
if (!inherits(e2, "Date")) setattr(ans, "class", c("IDate", "Date"))
132+
if (inherits(e2, "Date")) {
133+
setattr(ans, "class", "difftime")
134+
setattr(ans, "units", "days")
135+
} else {
136+
setattr(ans, "class", c("IDate", "Date"))
137+
}
131138
ans
132139
}
133140

@@ -355,7 +362,7 @@ isoweek = function(x) as.integer(format(as.IDate(x), "%V"))
355362
# nearest_thurs = as.IDate(7L * (as.integer(x + 3L) %/% 7L))
356363
# year_start = as.IDate(format(nearest_thurs, '%Y-01-01'))
357364
# 1L + (nearest_thurs - year_start) %/% 7L
358-
365+
isoyear = function(x) as.integer(format(as.IDate(x), "%G"))
359366

360367
month = function(x) convertDate(as.IDate(x), "month")
361368
quarter = function(x) convertDate(as.IDate(x), "quarter")

R/between.R

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -30,8 +30,8 @@ between = function(x, lower, upper, incbounds=TRUE, NAbounds=TRUE, check=FALSE,
3030
}
3131
if (is.i64(x)) {
3232
if (!requireNamespace("bit64", quietly=TRUE)) stopf("trying to use integer64 class when 'bit64' package is not installed") # nocov
33-
if (!is.i64(lower) && is.numeric(lower)) lower = bit64::as.integer64(lower)
34-
if (!is.i64(upper) && is.numeric(upper)) upper = bit64::as.integer64(upper)
33+
if (!is.i64(lower) && (is.integer(lower) || fitsInInt64(lower))) lower = bit64::as.integer64(lower)
34+
if (!is.i64(upper) && (is.integer(upper) || fitsInInt64(upper))) upper = bit64::as.integer64(upper)
3535
}
3636
is.supported = function(x) is.numeric(x) || is.character(x) || is.px(x)
3737
if (is.supported(x) && is.supported(lower) && is.supported(upper)) {

R/data.table.R

Lines changed: 9 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2047,8 +2047,9 @@ replace_dot_alias = function(e) {
20472047
if (!.Call(CisOrderedSubset, irows, nrow(x)))
20482048
return(NULL)
20492049

2050-
# see #1010. don't set key when i has no key, but irows is ordered and !roll
2051-
if (roll && length(irows) != 1L)
2050+
# see #1010. don't set key when i has no key, but irows is ordered and isFALSE(roll)
2051+
# NB: roll could still be a string like 'nearest', #7146
2052+
if (!is.character(roll) && roll && length(irows) != 1L)
20522053
return(NULL)
20532054

20542055
new_key <- head(x_key, key_length)
@@ -2491,7 +2492,7 @@ Ops.data.table = function(e1, e2 = NULL)
24912492
}
24922493

24932494
split.data.table = function(x, f, drop = FALSE, by, sorted = FALSE, keep.by = TRUE, flatten = TRUE, ..., verbose = getOption("datatable.verbose")) {
2494-
if (!is.data.table(x)) stopf("x argument must be a data.table")
2495+
if (!is.data.table(x)) internal_error("x argument to split.data.table must be a data.table") # nocov
24952496
stopifnot(is.logical(drop), is.logical(sorted), is.logical(keep.by), is.logical(flatten))
24962497
# split data.frame way, using `f` and not `by` argument
24972498
if (!missing(f)) {
@@ -2566,8 +2567,11 @@ split.data.table = function(x, f, drop = FALSE, by, sorted = FALSE, keep.by = TR
25662567
setattr(ll, "names", nm)
25672568
# handle nested split
25682569
if (flatten || length(by) == 1L) {
2569-
for (x in ll) .Call(C_unlock, x)
2570-
lapply(ll, setDT)
2570+
for (xi in ll) .Call(C_unlock, xi)
2571+
out = lapply(ll, setDT)
2572+
# TODO(#2000): just let setDT handle this
2573+
if (!identical(old_class <- class(x), c("data.table", "data.frame"))) for (xi in out) setattr(xi, "class", old_class)
2574+
out
25712575
# alloc.col could handle DT in list as done in: c9c4ff80bdd4c600b0c4eff23b207d53677176bd
25722576
} else if (length(by) > 1L) {
25732577
lapply(ll, split.data.table, drop=drop, by=by[-1L], sorted=sorted, keep.by=keep.by, flatten=flatten)

R/setops.R

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -160,8 +160,8 @@ all.equal.data.table = function(target, current, trim.levels=TRUE, check.attribu
160160
return(sprintf(
161161
"%s. 'target': %s. 'current': %s.",
162162
gettext("Datasets have different keys"),
163-
if(length(k1)) brackify(k1) else gettextf("has no key"),
164-
if(length(k2)) brackify(k2) else gettextf("has no key")
163+
if(length(k1)) brackify(k1) else gettext("has no key"),
164+
if(length(k2)) brackify(k2) else gettext("has no key")
165165
))
166166
}
167167
# check index
@@ -171,8 +171,8 @@ all.equal.data.table = function(target, current, trim.levels=TRUE, check.attribu
171171
return(sprintf(
172172
"%s. 'target': %s. 'current': %s.",
173173
gettext("Datasets have different indices"),
174-
if(length(i1)) brackify(i1) else gettextf("has no index"),
175-
if(length(i2)) brackify(i2) else gettextf("has no index")
174+
if(length(i1)) brackify(i1) else gettext("has no index"),
175+
if(length(i2)) brackify(i2) else gettext("has no index")
176176
))
177177
}
178178

0 commit comments

Comments
 (0)