GOVERNANCE.md
Lines changed: 5 additions & 2 deletions
@@ -94,8 +94,9 @@ A pull request can be merged by any committer, if there is one approving review,
94
94
95
95
## Changing this GOVERNANCE.md document
96
96
97
-
There is no special process for changing this document (submit a PR
98
-
and ask for review).
97
+
There is no special process for changing this document. Submit a PR and ask for review; the group `@Rdatatable/committers` will automatically be assigned to ensure all current Committers are aware of the change.
98
+
99
+
Please also make a note in the change log under [`# Governance history`](#governance-history)
99
100
100
101
# Code of conduct
101
102
@@ -123,6 +124,8 @@ data.table Version line in DESCRIPTION typically has the following meanings
123
124
124
125
# Governance history
125
126
127
+
Jan 2025: clarify that edits to governance should notify all committers.
128
+
126
129
Feb 2024: change team name/link maintainers to committers, to be consistent with role defined in governance.
127
130
128
131
Nov-Dec 2023: initial version drafted by Toby Dylan Hocking and
NEWS.md
Lines changed: 19 additions & 18 deletions
@@ -69,7 +69,7 @@ rowwiseDT(
69
69
70
70
6. `fread()` gains `logicalYN` argument to read columns consisting only of strings `Y`, `N` as `logical` (as opposed to character), [#4563](https://github.com/Rdatatable/data.table/issues/4563). The default is controlled by option `datatable.logicalYN`, itself defaulting to `FALSE`, for back-compatibility -- some smaller tables (especially sharded tables) might inadvertently read a "true" string column as `logical` and cause bugs. This is particularly important for tables with a column named `y` or `n` -- automatic header detection under `logicalYN=TRUE` will see these values in the first row as being "data" as opposed to column names. A parallel option was not included for `fwrite()` at this time -- users looking for a compact representation of logical columns can still use `fwrite(logical01=TRUE)`. We also opted for now to check only `Y`, `N` and not `Yes`/`No`/`YES`/`NO`.
71
71
72
-
7. `fwrite()` with `compress="gzip"` produces compatible gz files when composed of multiple independent chunks owing to parallelization, [#6356](https://github.com/Rdatatable/data.table/issues/6356). Earlier `fwrite()` versions could have issues with HTTP upload using `Content-Encoding: gzip` and `Transfer-Encoding: chunked`. Thanks to @oliverfoster for report and @philippechataignon for the fix.
72
+
7. `fwrite()` with `compress="gzip"` produces compatible gz files when composed of multiple independent chunks owing to parallelization, [#6356](https://github.com/Rdatatable/data.table/issues/6356). Earlier `fwrite()` versions could have issues with HTTP upload using `Content-Encoding: gzip` and `Transfer-Encoding: chunked`. Thanks to @oliverfoster for report and @philippechataignon for the fix. Thanks also @aitap for pre-release testing that found some possible memory leaks in the initial fix.
73
73
74
74
8. `fwrite()` gains a new parameter `compressLevel` to control compression level for gzip, [#5506](https://github.com/Rdatatable/data.table/issues/5506). This parameter balances compression speed and total compression, and corresponds directly to the analogous command-line parameter, e.g. `compressLevel=4` corresponds to passing `-4`; the default, `6`, matches the command-line default, i.e. equivalent to passing `-6`. Thanks @mgarbuzov for the request and @philippechataignon for implementing.
75
75
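As an illustration of items 6 to 8 above, the following R sketch round-trips a small table through gzip-compressed `fwrite()`. It is only a sketch: the table contents and variable names are made up, and because `compressLevel` and `logicalYN` may not exist in the installed data.table build, both are checked with `formals()` before being used.

```r
library(data.table)

# Hypothetical example data; the flag column holds only "Y"/"N" strings.
DT = data.table(id = 1:3, flag = c("Y", "N", "Y"))
tmp = tempfile(fileext = ".csv.gz")

# Pass compressLevel only if this build of fwrite() has the parameter.
if ("compressLevel" %in% names(formals(fwrite))) {
  fwrite(DT, tmp, compress = "gzip", compressLevel = 4L)
} else {
  fwrite(DT, tmp, compress = "gzip")
}

# Decompress with base R so the example does not depend on extra packages.
lines = readLines(gzfile(tmp))

# Likewise, request Y/N -> logical conversion only where it is available.
if ("logicalYN" %in% names(formals(fread))) {
  back = fread(text = lines, logicalYN = TRUE)
} else {
  back = fread(text = lines)
}
str(back)
unlink(tmp)
```

On a build without `logicalYN`, `flag` simply comes back as a character column, which is also the default behavior described in item 6.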
@@ -109,31 +109,29 @@ rowwiseDT(
109
109
110
110
8. Fixed possible segfault in `setDT(df); attr(df, key) <- value; set(df, ...)`, i.e. adding columns to an object with `set()` that was converted to data.table with `setDT()` and later had attributes added with `attr<-`, [#6410](https://github.com/Rdatatable/data.table/issues/6410). Thanks to @hongyuanjia for the report and @ben-schwen for the PR. Note that `setattr()` should be preferred for adding attributes to a data.table.
111
111
112
-
9. `setDT()` no longer modifies the class of other names bound to the origin data.frame, e.g., in `DF1 <- data.frame(a=1); DF2 <- DF1; setDT(DF2)`, `DF1`'s class will not change. [#4784](https://github.com/Rdatatable/data.table/issues/4784). Thanks @OfekShilon for the report and fix.
112
+
9. `DT[1, on=NULL]` now works for returning the first row, [#6579](https://github.com/Rdatatable/data.table/issues/6579). Thanks to @Kodiologist for the report and @tdhock for the PR.
113
113
114
-
10. `DT[1, on=NULL]` now works for returning the first row, [#6579](https://github.com/Rdatatable/data.table/issues/6579). Thanks to @Kodiologist for the report and @tdhock for the PR.
114
+
10. `tables()` now returns the correct size for data.tables over 2GiB, [#6607](https://github.com/Rdatatable/data.table/issues/6607). Thanks to @vlulla for the report and the PR.
115
115
116
-
11. `tables()` now returns the correct size for data.tables over 2GiB, [#6607](https://github.com/Rdatatable/data.table/issues/6607). Thanks to @vlulla for the report and the PR.
116
+
11. `rbindlist(l, use.names=TRUE)` can now handle different encodings for the column names in different entries of `l`, [#5452](https://github.com/Rdatatable/data.table/issues/5452). Thanks to @MEO265 for the report, and Benjamin Schwendinger for the fix.
117
117
118
-
12. `rbindlist(l, use.names=TRUE)` can now handle different encodings for the column names in different entries of `l`, [#5452](https://github.com/Rdatatable/data.table/issues/5452). Thanks to @MEO265 for the report, and Benjamin Schwendinger for the fix.
118
+
12. Added a `data.frame` method for `format_list_item()` to fix error printing data.tables with columns containing 1-column data.frames, [#6592](https://github.com/Rdatatable/data.table/issues/6592). Thanks to @r2evans for the bug report and fix.
119
119
120
-
13. Added a `data.frame` method for `format_list_item()` to fix error printing data.tables with columns containing 1-column data.frames, [#6592](https://github.com/Rdatatable/data.table/issues/6592). Thanks to @r2evans for the bug report and fix.
121
-
122
-
14. Auto-printing gets some substantial improvements
120
+
13. Auto-printing gets some substantial improvements
123
121
- Suppression in `knitr` documents is now done by implementing a method for `knit_print` instead of looking up the call stack, [#6589](https://github.com/Rdatatable/data.table/pull/6589). The old way was fragile and wound up broken by some implementation changes in {knitr}. Thanks to @jangorecki for the report [#6509](https://github.com/Rdatatable/data.table/issues/6509) and @aitap for the fix.
124
122
- `print()` methods for S3 subclasses of data.table (e.g. an object of class `c("my.table", "data.table", "data.frame")`) no longer print where plain data.tables wouldn't, e.g. `myDT[, y := 2]`, [#3029](https://github.com/Rdatatable/data.table/issues/3029). The improved detection of auto-printing scenarios has the added benefit of _allowing_ print in highly explicit statements like `print(DT[, y := 2])`, obviating our recommendation since v1.9.6 to append `[]` to signal "please print me".
125
123
126
-
15. Joins of `integer64` and `double` columns succeed when the `double` column has lossless `integer64` representation, [#4167](https://github.com/Rdatatable/data.table/issues/4167) and [#6625](https://github.com/Rdatatable/data.table/issues/6625). Previously, this only worked when the double column had lossless _32-bit_ integer representation. Thanks @MichaelChirico for the reports and fix.
124
+
14. Joins of `integer64` and `double` columns succeed when the `double` column has lossless `integer64` representation, [#4167](https://github.com/Rdatatable/data.table/issues/4167) and [#6625](https://github.com/Rdatatable/data.table/issues/6625). Previously, this only worked when the double column had lossless _32-bit_ integer representation. Thanks @MichaelChirico for the reports and fix.
127
125
128
-
16. `DT[order(...)]` better matches `base::order()` behavior by (1) recognizing the `method=` argument (and erroring since this is not supported) and (2) accepting a vector of `TRUE`/`FALSE` in `decreasing=` as an alternative to using `-a` to convey "sort `a` decreasing", [#4456](https://github.com/Rdatatable/data.table/issues/4456). Thanks @jangorecki for the FR and @MichaelChirico for the PR.
126
+
15. `DT[order(...)]` better matches `base::order()` behavior by (1) recognizing the `method=` argument (and erroring since this is not supported) and (2) accepting a vector of `TRUE`/`FALSE` in `decreasing=` as an alternative to using `-a` to convey "sort `a` decreasing", [#4456](https://github.com/Rdatatable/data.table/issues/4456). Thanks @jangorecki for the FR and @MichaelChirico for the PR.
129
127
130
-
17. Assignment with `:=` to an S4 slot of an under-allocated data.table now works, [#6704](https://github.com/Rdatatable/data.table/issues/6704). Thanks @MichaelChirico for the report and fix.
128
+
16. Assignment with `:=` to an S4 slot of an under-allocated data.table now works, [#6704](https://github.com/Rdatatable/data.table/issues/6704). Thanks @MichaelChirico for the report and fix.
131
129
132
-
18. `as.data.table()` method for `data.frame`s (especially those with extended classes) is more consistent with `as.data.frame()` with respect to retention of attributes, [#5699](https://github.com/Rdatatable/data.table/issues/5699). Thanks @jangorecki for the report and fix.
130
+
17. `as.data.table()` method for `data.frame`s (especially those with extended classes) is more consistent with `as.data.frame()` with respect to retention of attributes, [#5699](https://github.com/Rdatatable/data.table/issues/5699). Thanks @jangorecki for the report and fix.
133
131
134
-
19. Grouped queries on keyed tables no longer return an incorrectly keyed result if the _ad hoc_ `by=` list has some function call (in particular, a function which happens to return a strictly decreasing function of the keys), e.g. `by=.(a = rev(a))`, [#5583](https://github.com/Rdatatable/data.table/issues/5583). Thanks @AbrJA for the report and @MichaelChirico for the fix.
132
+
18. Grouped queries on keyed tables no longer return an incorrectly keyed result if the _ad hoc_ `by=` list has some function call (in particular, a function which happens to return a strictly decreasing function of the keys), e.g. `by=.(a = rev(a))`, [#5583](https://github.com/Rdatatable/data.table/issues/5583). Thanks @AbrJA for the report and @MichaelChirico for the fix.
135
133
136
-
20. Assigning `list(NULL)` to a list column now replaces the column with `list(NULL)`, instead of deleting the column [#5558](https://github.com/Rdatatable/data.table/issues/5558). This behavior is now consistent with base `data.frame`. Thanks @tdhock for the report and @joshhwuu for the fix. This is due to a fundamental ambiguity from both allowing list columns _and_ making the use of `list()` to wrap `j=` arguments optional. We think that the code behaves as expected in all cases now. See the below for some illustration:
134
+
19. Assigning `list(NULL)` to a list column now replaces the column with `list(NULL)`, instead of deleting the column [#5558](https://github.com/Rdatatable/data.table/issues/5558). This behavior is now consistent with base `data.frame`. Thanks @tdhock for the report and @joshhwuu for the fix. This is due to a fundamental ambiguity from both allowing list columns _and_ making the use of `list()` to wrap `j=` arguments optional. We think that the code behaves as expected in all cases now. See the below for some illustration:
137
135
138
136
```r
139
137
DT = data.table(L=list(1L), i=2L, c='a')
@@ -154,7 +152,7 @@ rowwiseDT(
154
152
DT[, c('L', 'i') := list(NULL, 3L)] # delete L, assign to i
155
153
DT[, c('L', 'i') := list(list(NULL), NULL)] # assign to L, delete i
156
154
```
157
-
21. An integer overflow in `fread()` with lines longer than `2^(31/2)` bytes is prevented, [#6729](https://github.com/Rdatatable/data.table/issues/6729). The typical impact was no worse than a wrong initial allocation size, corrected later. Thanks to @TaikiSan21 for the report and @aitap for the fix.
155
+
20. An integer overflow in `fread()` with lines longer than `2^(31/2)` bytes is prevented, [#6729](https://github.com/Rdatatable/data.table/issues/6729). The typical impact was no worse than a wrong initial allocation size, corrected later. Thanks to @TaikiSan21 for the report and @aitap for the fix.
10. Deprecation of `logicalAsInt` argument to `fwrite()` has been upgraded from a warning (since v1.15.0) to an error. It will be removed in the next release.
11. Deprecation of `fread(autostart=)` has been upgraded to an error. It has been warning since v1.11.0 (6 years ago). The argument will be removed in the next release.
12. Deprecation of `droplevels(in.place=TRUE)` (warning since v1.16.0) has been upgraded from warning to error. The argument will be removed in the next release.
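
For code that still passes the deprecated `logicalAsInt` argument, a minimal sketch of the `logical01` form of `fwrite()` might look like the following. The table and file name are made up for illustration; this is not taken from the package's own migration notes.

```r
library(data.table)

# Hypothetical two-row table with one logical column.
DT = data.table(id = 1:2, ok = c(TRUE, FALSE))
tmp = tempfile(fileext = ".csv")

# logical01=TRUE writes logical values as 1/0 rather than TRUE/FALSE.
fwrite(DT, tmp, logical01 = TRUE)
out = readLines(tmp)
print(out)
unlink(tmp)
```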
184
+
11. Better handling of multibyte characters in `print()`, added in 1.16.0, has the side effect of possibly ignoring invisible characters like `\n` or `\t` for the purposes of counting width for `datatable.prettyprint.char`. That's because we switched to using `strtrim()` over `substring()`, the latter of which is explicitly discouraged for the purposes of truncating strings, whereas the former of which has platform-dependent behavior for whether invisible characters count towards string width.
184
185
185
186
# data.table [v1.16.4](https://github.com/Rdatatable/data.table/milestone/36) 4 December 2024
(setattr(secs, "class", "ITime")) # the first line that creates sec will create a local copy so we can use setattr() to avoid potential copy of class()<-
if (is.data.table(x)) return(as.data.table.data.table(x)) # S3 is weird, #6739. Also # nocov; this is tested in 2302.{2,3}, not sure why it doesn't show up in coverage.
217
218
if (!identical(class(x), "data.frame")) return(as.data.table(as.data.frame(x)))
218
219
if (!isFALSE(keep.rownames)) {
219
220
# can specify col name to keep.rownames, #575; if it's the same as key,