Skip to content

Commit d45546b

Browse files
fwrite(): when col.names=FALSE, produce a gzip header, avoid a leak (#6853)
* fwrite(): emit gzip header even if !col.names Previously, fwrite() emitted a raw DEFLATE stream if col.names was FALSE (and thus headerLen was zero). Decompressing such files is possible, but not trivial. * Avoid a memory leak in fwrite(compress="gzip") Previously, the zlib object zstrm was initialized whenever compression was enabled, at least in order to calculate the buffer size for the compressed data stream, but was only used for compression and subsequently deallocated only when there was a header with column names. Make sure to deflateEnd(&strm) even if there was no header. * NEWS entry * comment the source of this #endif for clarity * Clean up the test file Co-Authored-By: Michael Chirico <[email protected]> * Report write() errno instead of just plain -1 Co-Authored-By: Michael Chirico <[email protected]> --------- Co-authored-by: Michael Chirico <[email protected]>
1 parent 3a80920 commit d45546b

File tree

3 files changed

+36
-12
lines changed

3 files changed

+36
-12
lines changed

NEWS.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,8 @@
1010

1111
1. Custom binary operators from the `lubridate` package now work with objects of class `IDate` as with a `Date` subclass, [#6839](https://github.com/Rdatatable/data.table/issues/6839). Thanks @emallickhossain for the report and @aitap for the fix.
1212

13+
2. `fwrite(compress="gzip")` once again produces a gzip header when the column names are missing or disabled, [@6852](https://github.com/Rdatatable/data.table/issues/6852). Thanks @maxscheiber for the report and @aitap for the fix.
14+
1315
## NOTES
1416

1517
1. Continued work to remove non-API C functions, [#6180](https://github.com/Rdatatable/data.table/issues/6180). Thanks Ivan Krylov for the PRs and for writing a clear and concise guide about the R API: https://aitap.codeberg.page/R-api/.

inst/tests/tests.Rraw

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -10024,10 +10024,13 @@ if (!haszlib()) {
1002410024
fwrite(DT, file=f3<-tempfile(), compress="gzip") # compress to filename not ending .gz
1002510025
fwrite(DT, file=f4<-tempfile(), compress="gzip", compressLevel=1) # test compressLevel
1002610026
fwrite(DT, file=f5<-tempfile(), compress="gzip", compressLevel=9)
10027+
# col.names=FALSE must not disable gzip header, #6852
10028+
fwrite(DT, file=f6<-tempfile(), compress="gzip", col.names=FALSE)
1002710029
test(1658.441, file.info(f3)$size, file.info(f1)$size)
1002810030
test(1658.442, file.info(f4)$size >= file.info(f1)$size)
1002910031
test(1658.443, file.info(f1)$size >= file.info(f5)$size)
10030-
unlink(c(f1,f2,f3,f4,f5))
10032+
test(1658.444, fread(f6, col.names = c("a", "b")), DT)
10033+
unlink(c(f1,f2,f3,f4,f5,f6))
1003110034
}
1003210035
DT = data.table(a=1:3, b=list(1:4, c(3.14, 100e10), c("foo", "bar", "baz")))
1003310036
test(1658.45, fwrite(DT), output=c("a,b","1,1|2|3|4","2,3.14|1e+12","3,foo|bar|baz"))

src/fwrite.c

Lines changed: 30 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -829,9 +829,28 @@ void fwriteMain(fwriteMainArgs args)
829829
zbuffSize / MEGA, nth, errno, strerror(errno));
830830
// # nocov end
831831
}
832-
}
832+
len = 0;
833+
crc = crc32(0L, Z_NULL, 0);
833834

834-
#endif
835+
if (f != -1) {
836+
// write minimal gzip header, but not on the console
837+
static const char header[] = "\037\213\10\0\0\0\0\0\0\3";
838+
int ret0 = WRITE(f, header, (sizeof header) - 1);
839+
compress_len += (sizeof header) - 1;
840+
841+
if (ret0 == -1) {
842+
// # nocov start
843+
int errwrite = errno; // capture write errno now in case close fails with a different errno
844+
CLOSE(f);
845+
free(buffPool);
846+
free(zbuffPool);
847+
deflateEnd(&strm);
848+
STOP(_("Failed to write gzip header. Write returned %d"), errwrite);
849+
// # nocov end
850+
}
851+
}
852+
}
853+
#endif // #NOZLIB
835854

836855
// write header
837856

@@ -873,15 +892,10 @@ void fwriteMain(fwriteMainArgs args)
873892
*ch = '\0';
874893
DTPRINT("%s", buff); // # notranslate
875894
} else {
876-
int ret0=0, ret1=0, ret2=0;
895+
int ret1=0, ret2=0;
877896
#ifndef NOZLIB
878897
if (args.is_gzip) {
879898
char* zbuff = zbuffPool;
880-
// write minimal gzip header
881-
char* header = "\037\213\10\0\0\0\0\0\0\3";
882-
ret0 = WRITE(f, header, 10);
883-
compress_len += 10;
884-
crc = crc32(0L, Z_NULL, 0);
885899

886900
size_t zbuffUsed = zbuffSize;
887901
len = (size_t)(ch - buff);
@@ -898,21 +912,26 @@ void fwriteMain(fwriteMainArgs args)
898912
#ifndef NOZLIB
899913
}
900914
#endif
901-
if (ret0 == -1 || ret1 || ret2 == -1) {
915+
if (ret1 || ret2 == -1) {
902916
// # nocov start
903917
int errwrite = errno; // capture write errno now in case close fails with a different errno
904918
CLOSE(f);
905919
free(buffPool);
906920
#ifndef NOZLIB
907921
free(zbuffPool);
908922
#endif
909-
if (ret0 == -1) STOP(_("Failed to write gzip header. Write returned %d"), ret0);
910-
else if (ret1) STOP(_("Failed to compress gzip. compressbuff() returned %d"), ret1);
923+
if (ret1) STOP(_("Failed to compress gzip. compressbuff() returned %d"), ret1);
911924
else STOP(_("%s: '%s'"), strerror(errwrite), args.filename);
912925
// # nocov end
913926
}
914927
}
915928
}
929+
#ifndef NOZLIB
930+
else {
931+
// was unconditionally initialized for zbuffSize, not used for header
932+
deflateEnd(&strm);
933+
}
934+
#endif
916935
if (verbose)
917936
DTPRINT(_("Initialization done in %.3fs\n"), 1.0*(wallclock()-t0));
918937

0 commit comments

Comments
 (0)