.ci/atime/tests.R
+28 -2 (28 additions & 2 deletions)
@@ -1,3 +1,5 @@
+pval.thresh <- 0.001 # to reduce false positives.
+
 # Test case adapted from https://github.com/Rdatatable/data.table/issues/6105#issue-2268691745 which is where the issue was reported.
 # https://github.com/Rdatatable/data.table/pull/6107 fixed performance across 3 ways to specify a column as Date, and we test each individually.
 extra.args.6107 <- c(
@@ -13,6 +15,7 @@ for (extra.arg in extra.args.6107){
 tmp_csv = tempfile()
 fwrite(DT, tmp_csv)
 },
+FasterIO = "60a01fa65191c44d7997de1843e9a1dfe5be9f72", # First commit of the PR (https://github.com/Rdatatable/data.table/pull/6925/commits) that reduced time usage
 Slow = "e9087ce9860bac77c51467b19e92cf4b72ca78c7", # Parent of the merge commit (https://github.com/Rdatatable/data.table/commit/a77e8c22e44e904835d7b34b047df2eff069d1f2) of the PR (https://github.com/Rdatatable/data.table/pull/6107) that fixes the issue
 Fast = "a77e8c22e44e904835d7b34b047df2eff069d1f2") # Merge commit of the PR (https://github.com/Rdatatable/data.table/pull/6107) that fixes the issue
# Test case adapted from https://github.com/Rdatatable/data.table/pull/7022#discussion_r2107900643
+"fread disk overhead improved in #6925" = atime::atime_test(
+  N = 2^seq(0, 20), # smaller N because we are doing multiple fread calls.
+  setup = {
+    fwrite(iris[1], iris.csv <- tempfile())
+  },
+  expr = replicate(N, data.table::fread(iris.csv)),
+  Fast = "60a01fa65191c44d7997de1843e9a1dfe5be9f72", # First commit of the PR (https://github.com/Rdatatable/data.table/pull/6925/commits) that reduced time usage
+  Slow = "e25ea80b793165094cea87d946d2bab5628f70a6" # Parent of the first commit (https://github.com/Rdatatable/data.table/commit/60a01fa65191c44d7997de1843e9a1dfe5be9f72)
+),
+
 # Performance regression discussed in https://github.com/Rdatatable/data.table/issues/4311
 # Test case adapted from https://github.com/Rdatatable/data.table/pull/4440#issuecomment-632842980 which is the fix PR.
 "shallow regression fixed in #4440" = atime::atime_test(
 Before = "f339aa64c426a9cd7cf2fcb13d91fc4ed353cd31", # Parent of the first commit https://github.com/Rdatatable/data.table/commit/fcc10d73a20837d0f1ad3278ee9168473afa5ff1 in the PR https://github.com/Rdatatable/data.table/pull/6393/commits with major change to fwrite with gzip.
 PR = "3630413ae493a5a61b06c50e80d166924d2ef89a"), # Close-to-last merge commit in the PR.
 
-tests=extra.test.list)
+# Test case created directly using the atime code below (not adapted from any other benchmark), based on the PR, Removes unnecessary data.table call from as.data.table.array https://github.com/Rdatatable/data.table/pull/7010
+"as.data.table.array improved in #7010" = atime::atime_test(
+Slow = "73d79edf8ff8c55163e90631072192301056e336", # Parent of the first commit in the PR (https://github.com/Rdatatable/data.table/commit/8397dc3c993b61a07a81c786ca68c22bc589befc)
+Fast = "8397dc3c993b61a07a81c786ca68c22bc589befc"), # Commit in the PR (https://github.com/Rdatatable/data.table/pull/7019/commits) that removes inefficiency
-As contributors and maintainers of this project, and in the interest of fostering an open and welcoming community, we pledge to respect all people who contribute through reporting issues, posting feature requests, updating documentation, submitting pull requests or patches, and other activities.
+The R data.table project adheres to NumFOCUS's Code of Conduct.
 
-We are committed to making participation in this project a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, religion, or nationality.
+# The NumFOCUS Code of Conduct
 
-Examples of unacceptable behavior by participants include:
+## The Short Version
 
-* The use of sexualized language or imagery
-* Personal attacks
-* Trolling or insulting/derogatory comments
-* Public or private harassment
-* Publishing other's private information, such as physical or electronic addresses, without explicit permission
-* Other unethical or unprofessional conduct
+Be kind to others. Do not insult or put down others. Behave professionally. Remember that harassment and sexist, racist, or exclusionary jokes are not appropriate for NumFOCUS.
 
-Project members with the Committer role have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
+All communication should be appropriate for a professional audience including people of many different backgrounds. Sexual language and imagery is not appropriate.
 
-By adopting this Code of Conduct, project members commit themselves to fairly and consistently apply these principles to every aspect of managing this project. Project maintainers who do not follow or enforce the Code of Conduct may be permanently removed from the project team.
+NumFOCUS is dedicated to providing a harassment-free community for everyone, regardless of gender, sexual orientation, gender identity and expression, disability, physical appearance, body size, race, or religion. We do not tolerate harassment of community members in any form.
+Thank you for helping make this a welcoming, friendly community for all.
 
-This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community.
+[Code of Conduct Reporting Form](https://numfocus.typeform.com/to/ynjGdT)
 
-
-## Reporting
-
-Project members with the Committer role or the CRAN Maintainer role are pledged to promptly address any reported issues. Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to any individual with this role.
-
-Those who prefer to report in a way that is independent of the current Committers and Maintainer may instead contact the Community Engagement Coordinator by e-mailing [r.data.table\@gmail.com](mailto:r.data.table@gmail.com). Messages sent to this e-mail address will be visible only to the current Community Engagement Coordinator, a position always held by an individual who is not a Committer or CRAN Maintainer of the package.
-
-The current Committers are Toby Dylan Hocking (@tdhock), Matt Dowle (@mattdowle), Arun Srinivasan (@arunsrinivasan), Jan Gorecki (@jangorecki), Michael Chirico (@MichaelChirico), Benjamin Schwendinger (@ben-schwen), and Ivan Krylov (@aitap).
-
-The current CRAN Maintainer is Tyson Barrett (@tysonstanley).
-
-The current Community Engagement Coordinator is Kelly Bodwin (@kbodwin).
-
-All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. Complaint respondents are obligated to maintain confidentiality with regard to the reporter of an incident.
-
-This Code of Conduct is adapted from the [Contributor Covenant, version 1.3.0](https://www.contributor-covenant.org/version/1/3/0/code-of-conduct/), available at [https://www.contributor-covenant.org/version/1/3/0/](https://www.contributor-covenant.org/version/1/3/0/), and the Swift Code of Conduct.
+For the full version of the Code of Conduct, please visit: [https://numfocus.org/code-of-conduct](https://numfocus.org/code-of-conduct).
GOVERNANCE.md
+6 -13 (6 additions & 13 deletions)
@@ -108,23 +108,14 @@ Please also make a note in the change log under [`# Governance history`](#govern
 
 # Finances and Funding
 
-There is currently no mechanism for the data.table project to receive funding as an entity.
+data.table is a [NumFOCUS](https://numfocus.org/) project. Donations to the data.table can be made at [https://numfocus.org/project/data-table]([https://numfocus.org/donate-to-data-table](https://app.hubspot.com/payments/FFWKWTTvKFdzqH?referrer=PAYMENT_LINK))
 
-Funding support for this project therefore may come in two forms:
+*NumFOCUS is a 501(c)(3) non-profit charity in the United States; as such, donations to NumFOCUS are tax-deductible as allowed by law. As with any donation, you should consult with your personal tax adviser or the IRS about your particular tax situation.*
 
-## Individual external funding
 
-Any individual developer or community member of data.table may apply for and receive funding for their work on the project. Individuals or groups seeking funding support are strongly encouraged to consult directly with the data.table Project Members (by initiating an Issue on GitHub) to ensure funds are used meaningfully. Formally, however, decisions about use of funds are governed by the individual grantee(s) and their contract with the funding agency.
+## Decision-making for funding use
 
-There is no guarantee that funded work will be incorporated into the data.table package; any contributions, whether funded or unfunded, are subject to the same review process as outlined above.
-
-## Direct donations
-
-Direct donations to the project may be made via GitHub Sponsorships, which allow individuals to fund a specific developer. If the current CRAN Maintainer offers a personal sponsorship option, donations may be made to them to support the project in general.
-
-## Decision-making for future opportunities
-
-We here outline a procedure for disbursing funds, should this project in the future become a directly fundable entity (e.g. an LLC or a subsidiary of an umbrella LLC).
+We here outline a procedure for disbursing funds acquired through direct donations via NumFOCUS or grant-style research funding.
 
 Funds acquired by the data.table project will be disbursed at the discretion of the **Committers**, defined as above. The **CRAN Maintainer** will have authority to make final decisions in the event that no consensus is reached among committers prior to deadlines for use of funds, and will be responsible for disbursement logistics.
 
@@ -148,6 +139,8 @@ data.table Version line in DESCRIPTION typically has the following meanings
 
 # Governance history
 
+May 2025: update Finance and CoC language for NumFOCUS incorporation.
+
 Feb 2025: add Finances and Funding section, update Code of Conduct section to be a brief summary and reference the broader CoC document.
 
 Jan 2025: clarify that edits to governance should notify all committers, and that role names are proper nouns (i.e., upper-case) throughout.
NEWS.md
+13 -1 (13 additions & 1 deletion)
@@ -27,7 +27,11 @@ frollsum(c(1,2,3,Inf,5,6), 2)
 
 4. `as.Date()` method for `IDate` no longer coerces to `double` [#6922](https://github.com/Rdatatable/data.table/issues/6922). Thanks @MichaelChirico for the report and PR. The only effect should be on overly-strict tests that assert `Date` objects have `double` storage, which is not in general true, especially from R 4.5.0.
 
-5. Multiple improvements has been added to rolling functions. Request came from @gpierard who needed left aligned, adaptive, rolling max, [#5438](https://github.com/Rdatatable/data.table/issues/5438). There was no `frollmax` function yet. Adaptive rolling functions did not have support for `align="left"`. `frollapply` did not support `adaptive=TRUE`. Available alternatives were base R `mapply` or self-join using `max` and grouping `by=.EACHI`. As a follow up of his request, following features has been added:
+5. `as.data.table()` is slightly more efficient at converting arrays to data.tables, [#7019](https://github.com/Rdatatable/data.table/pull/7019). Thanks @eliocamp.
+
+6. `between()` gains the argument `ignore_tzone=FALSE`. Normally, a difference in time zone between `lower` and `upper` will produce an error, and a difference in time zone between `x` and either of the others will produce a message. Setting `ignore_tzone=TRUE` bypasses the checks, allowing both comparisons to proceed without error or message about time zones.
+
+7. Multiple improvements has been added to rolling functions. Request came from @gpierard who needed left aligned, adaptive, rolling max, [#5438](https://github.com/Rdatatable/data.table/issues/5438). There was no `frollmax` function yet. Adaptive rolling functions did not have support for `align="left"`. `frollapply` did not support `adaptive=TRUE`. Available alternatives were base R `mapply` or self-join using `max` and grouping `by=.EACHI`. As a follow up of his request, following features has been added:
     - new function `frollmax`, applies `max` over a rolling window.
     - support for `align="left"` for adaptive rolling function.
     - support for `adaptive=TRUE` in `frollapply`.
@@ -85,6 +89,8 @@ As of now, adaptive rolling max has no _on-line_ implemention (`algo="fast"`), i
 
 8. `fread()` no longer warns on certain systems on R 4.5.0+ where the file owner can't be resolved, [#6918](https://github.com/Rdatatable/data.table/issues/6918). Thanks @ProfFancyPants for the report and PR.
 
+9. Joins to extended data.frames, e.g. `x[i, col := x.col1 + i.col2]` where `i` is a `tbl`, can use the `x.` and `i.` prefix forms, [#6998](https://github.com/Rdatatable/data.table/issues/6998). Thanks @MichaelChirico for the bug and PR.
+
 ### NOTES
 
 1. Continued work to remove non-API C functions, [#6180](https://github.com/Rdatatable/data.table/issues/6180). Thanks Ivan Krylov for the PRs and for writing a clear and concise guide about the R API: https://aitap.codeberg.page/R-api/.
@@ -98,6 +104,12 @@ As of now, adaptive rolling max has no _on-line_ implemention (`algo="fast"`), i
 
 3. {data.table} now depends on R 3.4.0 (2017).
 
+4. Changes to `fread()` output and errors:
+
+    + When the size of the file exceeds the size of the address space, `fread()` now signals an informative error instead of trying to map its size modulo the address space.
+    + On non-Windows systems, `fread()` now prints the reason why the file couldn't be opened, which could also be due to it being too large to map.
+    + With `verbose=TRUE`, file sizes are now printed using correct binary SI prefixes (the sizes have always been reported as bytes denominated in powers of `2^10`, so e.g. `1024*1024` bytes was reported as `1 MB` where `1 MiB` or `1.05 MB` is correct).
+
 ## data.table [v1.17.0](https://github.com/Rdatatable/data.table/milestone/34) (20 Feb 2025)
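The arithmetic behind the `verbose=TRUE` note in the NEWS hunk above can be checked with a small base R sketch. The helper `format_bytes` is hypothetical and written only for illustration; it is not part of data.table and does not reproduce the formatting code `fread()` actually uses.

```r
# Hypothetical helper (not a data.table function): express a byte count
# with a binary prefix (MiB, powers of 1024) and a decimal prefix (MB, powers of 1000).
format_bytes <- function(bytes) {
  c(
    binary  = sprintf("%.2f MiB", bytes / 2^20), # 1 MiB = 1024 * 1024 bytes
    decimal = sprintf("%.2f MB",  bytes / 1e6)   # 1 MB  = 1000 * 1000 bytes
  )
}

# The example from the NEWS entry: 1024 * 1024 bytes.
sizes <- format_bytes(1024 * 1024)
print(sizes)
```

The same byte count is exactly 1 MiB but slightly more than 1 MB, which is why labelling it `1 MB` was inaccurate.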
 # lower/upper should be more tightly linked than x/lower, so error
 # if the former don't match but only inform if they latter don't
 if (tzs[2L] != tzs[3L]) {
   stopf("'between' lower= and upper= are both POSIXct but have different tzone attributes: %s. Please align their time zones.", brackify(tzs[2:3], quote=TRUE))
-  # otherwise the check in between.c that lower<=upper can (correctly) fail for this reason
 }
 if (tzs[1L] != tzs[2L]) {
   messagef("'between' arguments are all POSIXct but have mismatched tzone attributes: %s. The UTC times will be compared.", brackify(tzs, quote=TRUE))
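The message in the hunk above says that the UTC times are compared when `tzone` attributes differ. As a minimal base R sketch of why that is well defined (this is an illustration, not the code path inside `between()`; the dates and time zones are made up), `POSIXct` values store seconds since the epoch in UTC, and the `tzone` attribute only affects how they are displayed:

```r
# Three POSIXct values with different tzone attributes (illustrative values only).
x     <- as.POSIXct("2025-01-01 12:00:00", tz = "UTC")
lower <- as.POSIXct("2025-01-01 06:00:00", tz = "America/New_York")
upper <- as.POSIXct("2025-01-01 14:00:00", tz = "Europe/Paris")

# Collect the tzone attributes, analogous to the `tzs` vector in the diff.
tzs <- vapply(list(x, lower, upper), function(v) attr(v, "tzone"), character(1))

# Compare the underlying numeric values (seconds since the epoch, UTC),
# which sidesteps any dependence on the display time zone.
in_range <- as.numeric(x) >= as.numeric(lower) & as.numeric(x) <= as.numeric(upper)

print(tzs)
print(in_range)
```

According to the NEWS entry in this PR, a data.table build that includes the new argument would let `between(x, lower, upper, ignore_tzone=TRUE)` skip both the error and the message for inputs like these.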