* fwrite with correct file length
* gzip length and crc are manually computed in each thread and then
added/combined
* gzip header is minimal
* remove some old debug code
* Escape with NOZLIB so compilation succeeds without zlib
* Move zlib check to start to avoid outfile deletion
* Indent and add comments
* Buffers unification
* Restore schedule(dynamic) more efficient and progress
* Use alloc_size to see allocation when verbose
* Test if stream init succeeded
* Add cast to avoid warnings on Windows
* More explicit timing messages
* Free stream structs
* Add option to control compression level for fwrite with gzip
* Rework namings and default value
* Rename gzipLevel to compressLevel
* compressLevel param documentation
* Put zlib initialization together
* Refactor buffSize, numBatchs and numBatches
* Add missing NOZLIB
* Increase outputs in last message when verbose
* No real init for stream_thread when is_gzip false
* Minor corrections
* Uses %zu format for size_t
* Last verbose msg was not printed when not is_gzip
* minor operator ws change
* Add test for compressLevel=1
* Add url link in compressLevel documentation
* Add 2 lines in NEWS for fwrite fix and compressLevel
* tidy-up, expand NEWS for compressLevel
* Use match.arg() for arg validation
* add a test for the other extreme compressLevel=9
* partial test fix
* fix updated test errors
* confirmed NEWS wording, fix typo
* fix order
* weak ordering
* place in 1.17.0 NEWS
* Add parenthesis to be more explicit
* Add comment for DeflateInit2
* typo
* Add parenthesis to be more explicit (2)
Co-authored-by: Michael Chirico <[email protected]>
* Try to emphasize that '-' is "command flag hyphen", not "negative"
* Convert Toby's comment to atime_test()
* Remove INTERNAL_STOP
* Increase coverage
* add // # nocov for STOP, like previous version
* add a test when naLen > width
* remove test of buffMB done in fwrite.R
* Try to fix nocov error
* Another attempt to increase coverage
* Add more nocov
* More judicious #nocov, keep INTERNAL_STOP
* eol='' coverage
* buffMB<line width
* Similar for buffMB vs. header width
* 0-row table verbose output
---------
Co-authored-by: Benjamin Schwendinger <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>
Co-authored-by: Michael Chirico <[email protected]>
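
The commit message above says the gzip length and CRC are computed per thread and then added/combined. The sketch below is not the code from this commit. It only illustrates, in plain C without zlib, why that is possible: CRC-32 can be continued from a previous value, and lengths simply add, so totals accumulated chunk by chunk match a single pass over the whole buffer. The names `crc32_update`, `gz_totals`, `gz_totals_add`, `chunked_matches_whole` and `gz_trailer` are hypothetical. The 8-byte trailer layout (CRC-32, then length modulo 2^32, both little-endian) follows the gzip file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 with the gzip polynomial, continuing from a previous value.
   Hypothetical helper; start with crc = 0 and feed successive buffers. */
static uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
  crc = ~crc;
  for (size_t i = 0; i < len; i++) {
    crc ^= buf[i];
    for (int bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  }
  return ~crc;
}

/* Running totals over the uncompressed data: the two values that end up in
   the gzip trailer. */
typedef struct {
  uint32_t crc;
  size_t len;
} gz_totals;

/* Fold one chunk into the totals: the CRC is continued, the length is added. */
static void gz_totals_add(gz_totals *t, const unsigned char *chunk, size_t n)
{
  assert(t != NULL);
  t->crc = crc32_update(t->crc, chunk, n);
  t->len += n;
}

/* Returns 1 if feeding `data` in pieces of `chunk_size` bytes yields the same
   CRC and length as one pass over the whole buffer, 0 otherwise. */
static int chunked_matches_whole(const unsigned char *data, size_t n,
                                 size_t chunk_size)
{
  if (chunk_size == 0) return 0; /* avoid an endless loop */
  gz_totals t = {0, 0};
  for (size_t off = 0; off < n; off += chunk_size) {
    size_t this_len = (n - off < chunk_size) ? n - off : chunk_size;
    gz_totals_add(&t, data + off, this_len);
  }
  return t.crc == crc32_update(0, data, n) && t.len == n;
}

/* Serialize the gzip trailer: CRC-32 then length modulo 2^32, little-endian. */
static void gz_trailer(const gz_totals *t, unsigned char out[8])
{
  uint32_t isize = (uint32_t)(t->len & 0xFFFFFFFFu);
  for (int i = 0; i < 4; i++) {
    out[i]     = (unsigned char)((t->crc >> (8 * i)) & 0xFFu);
    out[4 + i] = (unsigned char)((isize >> (8 * i)) & 0xFFu);
  }
}
```

This sketch chains the CRC sequentially, which needs the chunks in order. When each thread computes the CRC of its own chunk independently, the per-chunk values have to be merged with a dedicated combine step, such as zlib's `crc32_combine()`, which is not reimplemented here.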
Slow="fd24a3105953f7785ea7414678ed8e04524e6955", # Parent of the merge commit (https://github.com/Rdatatable/data.table/commit/ed72e398df76a0fcfd134a4ad92356690e4210ea) of the PR (https://github.com/Rdatatable/data.table/pull/5054) that fixes the issue
- Fast="ed72e398df76a0fcfd134a4ad92356690e4210ea"), # Merge commit of the PR (https://github.com/Rdatatable/data.table/pull/5054) that fixes the issue
+ Fast="ed72e398df76a0fcfd134a4ad92356690e4210ea"), # Merge commit of the PR (https://github.com/Rdatatable/data.table/pull/5054) that fixes the issue # Test case created directly using the atime code below (not adapted from any other benchmark), based on the issue/fix PR https://github.com/Rdatatable/data.table/pull/5054#issue-930603663 "melt should be more efficient when there are missing input columns."
+
+ # Test case created from @tdhock's comment https://github.com/Rdatatable/data.table/pull/6393#issuecomment-2327396833, in turn adapted from @philippechataignon's comment https://github.com/Rdatatable/data.table/pull/6393#issuecomment-2326714012
Before="f339aa64c426a9cd7cf2fcb13d91fc4ed353cd31", # Parent of the first commit https://github.com/Rdatatable/data.table/commit/fcc10d73a20837d0f1ad3278ee9168473afa5ff1 in the PR https://github.com/Rdatatable/data.table/pull/6393/commits with major change to fwrite with gzip.
+ PR="3630413ae493a5a61b06c50e80d166924d2ef89a"), # Close-to-last merge commit in the PR.
NEWS.md (6 additions, 2 deletions)
@@ -69,6 +69,10 @@ rowwiseDT(
6. `fread()` gains `logicalYN` argument to read columns consisting only of strings `Y`, `N` as `logical` (as opposed to character), [#4563](https://github.com/Rdatatable/data.table/issues/4563). The default is controlled by option `datatable.logicalYN`, itself defaulting to `FALSE`, for back-compatibility -- some smaller tables (especially sharded tables) might inadvertently read a "true" string column as `logical` and cause bugs. This is particularly important for tables with a column named `y` or `n` -- automatic header detection under `logicalYN=TRUE` will see these values in the first row as being "data" as opposed to column names. A parallel option was not included for `fwrite()` at this time -- users looking for a compact representation of logical columns can still use `fwrite(logical01=TRUE)`. We also opted for now to check only `Y`, `N` and not `Yes`/`No`/`YES`/`NO`.

+ 7. `fwrite()` with `compress="gzip"` produces compatible gz files when composed of multiple independent chunks owing to parallelization, [#6356](https://github.com/Rdatatable/data.table/issues/6356). Earlier `fwrite()` versions could have issues with HTTP upload using `Content-Encoding: gzip` and `Transfer-Encoding: chunked`. Thanks to @oliverfoster for report and @philippechataignon for the fix.
+
+ 8. `fwrite()` gains a new parameter `compressLevel` to control compression level for gzip, [#5506](https://github.com/Rdatatable/data.table/issues/5506). This parameter balances compression speed and total compression, and corresponds directly to the analogous command-line parameter, e.g. `compressLevel=4` corresponds to passing `-4`; the default, `6`, matches the command-line default, i.e. equivalent to passing `-6`. Thanks @mgarbuzov for the request and @philippechataignon for implementing.
+
## BUG FIXES

1. `fwrite()` respects `dec=','` for timestamp columns (`POSIXct` or `nanotime`) with sub-second accuracy, [#6446](https://github.com/Rdatatable/data.table/issues/6446). Thanks @kav2k for pointing out the inconsistency and @MichaelChirico for the PR.
@@ -304,7 +308,7 @@ rowwiseDT(
5. Input files are now kept open during `mmap()` when running under Emscripten, [emscripten-core/emscripten#20459](https://github.com/emscripten-core/emscripten/issues/20459). This avoids an error in `fread()` when running in WebAssembly, [#5969](https://github.com/Rdatatable/data.table/issues/5969). Thanks to @maek-ies for the report and @georgestagg for the PR.

a. This now triggers a warning, not a message, since relying on this default often signals unexpected duplicates in the data, [#5386](https://github.com/Rdatatable/data.table/issues/5386). The warning is classed as `dt_missing_fun_aggregate_warning`, allowing for more targeted handling in user code. Thanks @MichaelChirico for the suggestion and @Nj221102 for the fix.
@@ -1019,7 +1023,7 @@ rowwiseDT(
14. The options `datatable.print.class` and `datatable.print.keys` are now `TRUE` by default. They have been available since v1.9.8 (Nov 2016) and v1.11.0 (May 2018) respectively.

- 15. Thanks to @ssh352, Václav Tlapák, Cole Miller, András Svraka and Toby Dylan Hocking for reporting and bisecting a significant performance regression in dev. This was fixed before release thanks to a PR by Jan Gorecki, [#5463](https://github.com/Rdatatable/data.table/pull/5463).
+ 15. Thanks to @ssh352, Václav Tlapák, Cole Miller, András Svraka and Toby Dylan Hocking for reporting and bisecting a significant performance regression in dev. This was fixed before release thanks to a PR by Jan Gorecki, [#5463](https://github.com/Rdatatable/data.table/pull/5463).

16. `key(x) <- value` is now fully deprecated (from warning to error). Use `setkey()` to set a table's key. We started warning not to use this approach in 2012, with a stronger warning starting in 2019 (1.12.2). This function will be removed in the next release.