Commit 873cee7

Merge branch 'master' into fread_quote_sep
2 parents 11e0c48 + 6de436c commit 873cee7

40 files changed: +1105 -180 lines changed

.github/workflows/R-CMD-check-occasional.yaml

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 on:
   schedule:
     - cron: '17 13 23 * *' # 23rd of month at 13:17 UTC
+  workflow_dispatch:
 
 # A more complete suite of checks to run monthly; each PR/merge need not pass all these, but they should pass before CRAN release
 name: R-CMD-check-occasional

.github/workflows/R-CMD-check.yaml

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ on:
     branches:
       - master
   pull_request:
+  workflow_dispatch:
 
 name: R-CMD-check

.github/workflows/code-quality.yaml

Lines changed: 41 additions & 0 deletions
@@ -2,11 +2,44 @@ on:
   push:
     branches: [master]
   pull_request:
+  workflow_dispatch:
 
 name: code-quality
 
 jobs:
+  changes:
+    runs-on: ubuntu-latest
+    outputs:
+      r: ${{ steps.filter.outputs.r }}
+      c: ${{ steps.filter.outputs.c }}
+      po: ${{ steps.filter.outputs.po }}
+      md: ${{ steps.filter.outputs.md }}
+      rd: ${{ steps.filter.outputs.rd }}
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0 # ensure diff against base is available for PRs
+      - uses: dorny/paths-filter@v3
+        id: filter
+        with:
+          filters: |
+            r:
+              - '**/*.R'
+              - '**/*.Rmd'
+            c:
+              - '**/*.c'
+              - '**/*.h'
+            po:
+              - 'po/**/*.po'
+            md:
+              - '**/*.md'
+              - '**/*.Rmd'
+            rd:
+              - 'man/**/*.Rd'
+
   lint-r:
+    needs: changes
+    if: needs.changes.outputs.r == 'true' || github.event_name == 'workflow_dispatch'
     runs-on: ubuntu-latest
     env:
       GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
@@ -31,6 +64,8 @@ jobs:
           LINTR_ERROR_ON_LINT: true
           R_LINTR_LINTER_FILE: .ci/.lintr
   lint-c:
+    needs: changes
+    if: needs.changes.outputs.c == 'true' || github.event_name == 'workflow_dispatch'
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
@@ -42,6 +77,8 @@ jobs:
         run: |
           Rscript .ci/lint.R .ci/linters/c src '[.][ch]$'
   lint-po:
+    needs: changes
+    if: needs.changes.outputs.po == 'true' || github.event_name == 'workflow_dispatch'
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
@@ -55,13 +92,17 @@ jobs:
         run: |
           Rscript .ci/lint.R .ci/linters/po po '[.]po$'
   lint-md:
+    needs: changes
+    if: needs.changes.outputs.md == 'true' || github.event_name == 'workflow_dispatch'
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
       - uses: r-lib/actions/setup-r@v2
       - name: Lint
         run: Rscript .ci/lint.R .ci/linters/md . '[.]R?md$'
   lint-rd:
+    needs: changes
+    if: needs.changes.outputs.rd == 'true' || github.event_name == 'workflow_dispatch'
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4

.github/workflows/performance-tests.yml

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ on:
       - 'R/**'
       - 'src/**'
       - '.ci/atime/**'
+  workflow_dispatch:
 
 jobs:
   comment:

.github/workflows/rchk.yaml

Lines changed: 7 additions & 0 deletions
@@ -18,7 +18,14 @@
 on:
   push:
     branches: [master]
+    paths:
+      - '.github/workflows/rchk.yaml'
+      - 'src/**'
   pull_request:
+    paths:
+      - '.github/workflows/rchk.yaml'
+      - 'src/**'
+  workflow_dispatch:
 
 name: 'rchk'

.github/workflows/test-coverage.yaml

Lines changed: 13 additions & 0 deletions
@@ -3,7 +3,20 @@
 on:
   push:
     branches: [master]
+    paths:
+      - '.github/workflows/test-coverage.yaml'
+      - 'inst/tests/**'
+      - 'R/**'
+      - 'src/**'
+      - 'tests/**'
   pull_request:
+    paths:
+      - '.github/workflows/test-coverage.yaml'
+      - 'inst/tests/**'
+      - 'R/**'
+      - 'src/**'
+      - 'tests/**'
+  workflow_dispatch:
 
 name: test-coverage.yaml

DESCRIPTION

Lines changed: 2 additions & 1 deletion
@@ -104,5 +104,6 @@ Authors@R: c(
   person("Reino", "Bruner", role="ctb"),
   person(given="@badasahog", role="ctb", comment="GitHub user"),
   person("Vinit", "Thakur", role="ctb"),
-  person("Mukul", "Kumar", role="ctb")
+  person("Mukul", "Kumar", role="ctb"),
+  person("Ildikó", "Czeller", role="ctb")
   )

NAMESPACE

Lines changed: 2 additions & 0 deletions
@@ -58,6 +58,8 @@ export(frollmax)
 export(frollmin)
 export(frollprod)
 export(frollmedian)
+export(frollvar)
+export(frollsd)
 export(frollapply)
 export(frolladapt)
 export(nafill)

NEWS.md

Lines changed: 16 additions & 3 deletions
@@ -38,6 +38,11 @@
 
 1. `data.table(x=1, <expr>)`, where `<expr>` is an expression resulting in a 1-column matrix without column names, will eventually have names `x` and `V2`, not `x` and `V1`, consistent with `data.table(x=1, <expr>)` where `<expr>` results in an atomic vector, for example `data.table(x=1, cbind(1))` and `data.table(x=1, 1)` will both have columns named `x` and `V2`. In this release, the matrix case continues to be named `V1`, but the new behavior can be activated by setting `options(datatable.old.matrix.autoname)` to `FALSE`. See point 5 under Bug Fixes for more context; this change will provide more internal consistency as well as more consistency with `data.frame()`.
 
+2. The behavior of `week()` will be changed in a future release to calculate weeks sequentially (days 1-7 as week 1), which is a potential breaking change. For now, the current "legacy" behavior, where week numbers advance every 7th day of the year (e.g., day 7 starts week 2), remains the default, and a deprecation warning will be issued when the old and new behaviors differ. Users can control this behavior with the temporary option `options(datatable.week = "...")`:
+* `"sequential"`: Opt-in to the new, sequential behavior (no warning).
+* `"legacy"`: Continue using the legacy behavior but suppress the deprecation warning.
+See [#2611](https://github.com/Rdatatable/data.table/issues/2611) for details. Thanks @MichaelChirico for the report and @venom1204 for the implementation.
+
 ### NEW FEATURES
 
 1. New `sort_by()` method for data.tables, [#6662](https://github.com/Rdatatable/data.table/issues/6662). It uses `forder()` to improve upon the data.frame method and also matches `DT[order(...)]` behavior with respect to locale. Thanks @rikivillalba for the suggestion and PR.
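
Note on the `week()` entry in the hunk above: the two numbering schemes can be compared in base R. This is only a sketch. The formulas are an assumption inferred from the NEWS wording (legacy advances on every 7th day of the year, sequential treats days 1-7 as week 1) and are not copied from the package source; `yday1`, `week_legacy` and `week_sequential` are hypothetical names.

```r
# Hypothetical helpers illustrating the two schemes described in the NEWS entry.
# Assumption: formulas are inferred from the wording, not taken from data.table.
yday1 = function(d) as.POSIXlt(d)$yday + 1L                 # POSIXlt$yday is 0-based
week_legacy = function(d) yday1(d) %/% 7L + 1L              # day 7 already starts week 2
week_sequential = function(d) (yday1(d) - 1L) %/% 7L + 1L   # days 1-7 are week 1

d = as.Date("2025-01-01") + 0:13                            # first 14 days of 2025
cmp = data.frame(date = d, legacy = week_legacy(d), sequential = week_sequential(d))
cmp[cmp$legacy != cmp$sequential, ]                         # days on which the two schemes disagree
```

Under these assumed formulas the schemes disagree only on every 7th day of the year, which would be the dates where the deprecation warning applies. Per the entry, `options(datatable.week = "sequential")` opts in to the new numbering and `options(datatable.week = "legacy")` keeps the old one without the warning.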
@@ -246,7 +251,7 @@
 #9: 2025-09-22 9 8 9.0
 ```
 
-19. New rolling functions: `frollmin`, `frollprod` and `frollmedian`, have been implemented, towards [#2778](https://github.com/Rdatatable/data.table/issues/2778). Thanks to @jangorecki for implementation. Implementation of rolling median is based on a novel algorithm "sort-median" described by [@suomela](https://github.com/suomela) in his 2014 paper [Median Filtering is Equivalent to Sorting](https://arxiv.org/abs/1406.1717). "sort-median" scales very well, not only for size of input vector but also for size of rolling window.
+19. Other new rolling functions: `frollmin`, `frollprod`, `frollmedian`, `frollvar` and `frollsd`, have been implemented, resolving long standing issue [#2778](https://github.com/Rdatatable/data.table/issues/2778). Thanks to @jangorecki for implementation. Implementation of rolling median is based on a novel algorithm "sort-median" described by [@suomela](https://github.com/suomela) in his 2014 paper [Median Filtering is Equivalent to Sorting](https://arxiv.org/abs/1406.1717). "sort-median" scales very well, not only for size of input vector but also for size of rolling window.
 ```r
 rollmedian = function(x, n) {
 ans = rep(NA_real_, nx<-length(x))
@@ -291,6 +296,7 @@
 # user system elapsed
 # 0.028 0.000 0.005
 ```
+20. `fread()` now supports the `comment.char` argument to skip trailing comments or comment-only lines, consistent with `read.table()`, [#856](https://github.com/Rdatatable/data.table/issues/856). The default remains `comment.char = ""` (no comment parsing) for backward compatibility and performance, in contrast to `read.table(comment.char = "#")`. Thanks to @arunsrinivasan and many others for the suggestion and @ben-schwen for the implementation.
 
 ### BUG FIXES
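
Note on the `comment.char` entry in the hunk above: the semantics it refers to (trailing comments and comment-only lines are skipped) can be seen with base `read.table()`, which the entry cites as the model. The sketch below uses only base R; the `fread()` call is shown as a comment and assumes a data.table build that includes this change.

```r
# Input with one trailing comment and one comment-only line.
txt = "a,b\n1,2# trailing comment\n# comment-only line\n3,4\n"

# Base R reference behavior: both kinds of comment are dropped.
df = read.table(text = txt, sep = ",", header = TRUE, comment.char = "#")
df

# Assumed equivalent once this change is available (not run here):
# data.table::fread(text = txt, comment.char = "#")
```

With `fread()` the argument has to be passed explicitly, because the entry states that its default stays `comment.char = ""`.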
@@ -332,8 +338,13 @@
 
 19. Ellipsis elements like `..1` are correctly excluded when searching for variables in "up-a-level" syntax inside `[`, [#5460](https://github.com/Rdatatable/data.table/issues/5460). Thanks @ggrothendieck for the report and @MichaelChirico for the fix.
 
-20. `fread()` auto-detects separators for single-column files consisting solely of quoted values (e.g. `"this_that"\n"2025-01-01 00:00:01"`), [#7366](https://github.com/Rdatatable/data.table/issues/7366). Thanks @arunsrinivasan
-for the report and @ben-schwen for the fix.
+20. `forderv` could segfault on keys with long runs of identical bytes (e.g., many duplicate columns) because the single-group branch tail-recursed radix-by-radix until the C stack ran out, [#4300](https://github.com/Rdatatable/data.table/issues/4300). This is a major problem since sorting is extensively used in `data.table`. Thanks @quantitative-technologies for the report and @ben-schwen for the fix.
+
+21. `[` now preserves existing key(s) when new columns are added before them, instead of incorrectly setting a new column as key, [#7364](https://github.com/Rdatatable/data.table/issues/7364). Thanks @czeildi for the bug report and the fix.
+
+22. `setDTthreads(percent=)` and `setDTthreads(threads=)` now respect `OMP_NUM_THREADS` and `omp_get_max_threads()`, ensuring consistency with `setDTthreads()` (no arguments) when OpenMP environment variables are set, [#7165](https://github.com/Rdatatable/data.table/issues/7165). Previously, explicitly setting a thread count or percentage would ignore these OpenMP limits, potentially exceeding the user's intended thread cap. Thanks to @bastistician for the report and @ben-schwen for the fix.
+
+20. `fread()` auto-detects separators for single-column files consisting solely of quoted values (e.g. `"this_that"\n"2025-01-01 00:00:01"`), [#7366](https://github.com/Rdatatable/data.table/issues/7366). Thanks @arunsrinivasan for the report and @ben-schwen for the fix.
 
 ### NOTES
 
@@ -531,6 +542,8 @@ rowwiseDT(
 
 21. `setDT(get0('var'))` now correctly modifies `var` by reference, consistent with the long-standing behavior of `setDT(get('var'))`, [#6864](https://github.com/Rdatatable/data.table/issues/6864). Thanks to @rikivillalba for the report and @venom1204 for the fix.
 
+22. `fread()` could fail to read Mac CSV files (with `\r` line endings) if the file contained any `\n` character, such as a final `\r\n`. This was fixed by detecting the predominant line ending in a sample of the file, [#4186](https://github.com/Rdatatable/data.table/issues/4186). Thanks to @MPagel for the report and @ben-schwen for the fix.
+
 ### NOTES
 
 1. There is a new vignette on joins! See `vignette("datatable-joins")`. Thanks to Angel Feliz for authoring it! Feedback welcome. This vignette has been highly requested since 2017: [#2181](https://github.com/Rdatatable/data.table/issues/2181).
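
Note on the Mac CSV entry (item 22 in the hunk above): "detecting the predominant line ending in a sample" can be illustrated in plain R. This is a sketch of the idea only, not `fread()`'s C implementation. `count_fixed` and `guess_eol` are hypothetical helpers, and the counting rule (a `\r\n` pair counts once, not as a lone `\r` plus a lone `\n`) is an assumption.

```r
# Hypothetical helper: number of fixed-string matches of `pattern` in `x`.
count_fixed = function(pattern, x) {
  m = gregexpr(pattern, x, fixed = TRUE)[[1L]]
  if (m[1L] == -1L) 0L else length(m)
}

# Hypothetical helper: guess the predominant line ending in a text sample.
# Assumption: each "\r\n" pair is counted once and excluded from the lone counts.
guess_eol = function(sample_txt) {
  n_crlf = count_fixed("\r\n", sample_txt)
  counts = c(
    "\r"   = count_fixed("\r", sample_txt) - n_crlf,
    "\n"   = count_fixed("\n", sample_txt) - n_crlf,
    "\r\n" = n_crlf
  )
  names(counts)[which.max(counts)]
}

# Mac-style file whose only "\n" is in the final "\r\n", as in the report.
guess_eol("a,b\r1,2\r3,4\r\n")
```

The point of a majority vote is that a single stray `\n` no longer decides how the whole file is split into lines.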

R/data.table.R

Lines changed: 1 addition & 1 deletion
@@ -1448,7 +1448,7 @@ replace_dot_alias = function(e) {
   if (SD_only)
     jvnames = jnames = sdvars
   else
-    jnames = as.character(Filter(is.name, jsub)[-1L])
+    jnames = vapply_1c(jsub, function(x) if (is.name(x)) as.character(x) else NA_character_)[-1L]
   key_idx = chmatch(key, jnames)
   missing_keys = which(is.na(key_idx))
   if (length(missing_keys) && missing_keys[1L] == 1L) return(NULL)
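
Note on the hunk above: the one-line change appears to relate to the key-preservation fix listed in NEWS ([#7364](https://github.com/Rdatatable/data.table/issues/7364)), though the commit does not say so explicitly. The difference between the two expressions can be reproduced in base R. In this sketch base `vapply()` stands in for data.table's internal `vapply_1c()`, base `match()` stands in for `chmatch()`, and the `j` expression is a made-up example.

```r
# Made-up j expression: a computed column placed before the key column `a`.
jsub = quote(list(newcol = b + 1, a))
key = "a"

# Old expression: non-name elements are dropped, so later names shift left.
old_jnames = as.character(Filter(is.name, jsub)[-1L])

# New expression: one entry per element, NA where the element is not a bare name.
# Base vapply() is an assumed stand-in for the internal vapply_1c() helper.
new_jnames = unname(vapply(
  jsub,
  function(x) if (is.name(x)) as.character(x) else NA_character_,
  character(1L)
)[-1L])

match(key, old_jnames)  # index into the shortened vector, not the output columns
match(key, new_jnames)  # index aligned with the position of `a` in j
```

Keeping an `NA` placeholder for each non-name element keeps the indices aligned with the output columns, so the key lookup can no longer land on a newly computed column.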
