Skip to content

Conversation

@aitap
Copy link
Member

@aitap aitap commented Mar 6, 2025

gzip header generation was previously tied to header printing, together with some initialization (like crc, len) and de-initialization (like deflateEnd(&strm)). Moved it outside the headerLen branch.

Fixes: #6852.

aitap added 3 commits March 6, 2025 14:57
Previously, fwrite() emitted a raw DEFLATE stream if col.names was FALSE
(and thus headerLen was zero). Decompressing such files is possible, but
not trivial.
Previously, the zlib object zstrm was initialized whenever compression
was enabled, at least in order to calculate the buffer size for the
compressed data stream, but was only used for compression and
subsequently deallocated only when there was a header with column names.

Make sure to deflateEnd(&strm) even if there was no header.
@aitap aitap requested a review from MichaelChirico as a code owner March 6, 2025 12:06
@codecov
Copy link

codecov bot commented Mar 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.59%. Comparing base (6a8634e) to head (133878a).
Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6853   +/-   ##
=======================================
  Coverage   98.59%   98.59%           
=======================================
  Files          79       79           
  Lines       14661    14664    +3     
=======================================
+ Hits        14455    14458    +3     
  Misses        206      206           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions
Copy link

github-actions bot commented Mar 6, 2025

Comparison Plot

Generated via commit 133878a

Download link for the artifact containing the test results: ↓ atime-results.zip

Task Duration
R setup and installing dependencies 4 minutes and 37 seconds
Installing different package versions 8 minutes and 15 seconds
Running and plotting the test cases 2 minutes and 19 seconds

@MichaelChirico
Copy link
Member

NB: this will be patched into 1.17.2:

###### Branching policy for PATCH RELEASE
# Patch releases handle regressions and CRAN-required off-cycle fixes (e.g. when changes to r-devel break data.table).
# * Start a branch `patch-x.y.z` corresponding to the target release version. It should be branched from the previous CRAN release,
# e.g. branch `patch-1.15.4` from tag `v1.15.2`, `patch-1.15.6` from tag `v1.15.4`, etc.
# * Add fixes for issues tagged to the corresponding milestone as PRs targeting the `master` branch. These PRs should be reviewed as usual
# and include a NEWS item under the ongoing development release.
# * After merging a patch PR, `cherry-pick` the corresponding commit into the target patch branch. Edit the NEWS item to be under the heading
# for the patch release.
# * Delete the patch branch after the corresponding tag is pushed.

aitap and others added 2 commits March 6, 2025 22:48
Copy link
Member

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this done!

@MichaelChirico MichaelChirico merged commit d45546b into master Mar 6, 2025
10 of 11 checks passed
@MichaelChirico MichaelChirico deleted the fix6852 branch March 6, 2025 22:04
}
}
}
#ifndef NOZLIB
Copy link
Member

@ben-schwen ben-schwen Mar 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly #6857 is not an issue for me when deleting this block 😕

@aitap

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if args.is_gzip is not true then we never call init_stream so this is actually not surprising anymore.

MichaelChirico added a commit that referenced this pull request Mar 10, 2025
…leak (#6853)

* fwrite(): emit gzip header even if !col.names

Previously, fwrite() emitted a raw DEFLATE stream if col.names was FALSE
(and thus headerLen was zero). Decompressing such files is possible, but
not trivial.

* Avoid a memory leak in fwrite(compress="gzip")

Previously, the zlib object zstrm was initialized whenever compression
was enabled, at least in order to calculate the buffer size for the
compressed data stream, but was only used for compression and
subsequently deallocated only when there was a header with column names.

Make sure to deflateEnd(&strm) even if there was no header.

* NEWS entry

* comment the source of this #endif for clarity

* Clean up the test file

Co-Authored-By: Michael Chirico <[email protected]>

* Report write() errno instead of just plain -1

Co-Authored-By: Michael Chirico <[email protected]>

---------

Co-authored-by: Michael Chirico <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fwrite() emits invalid gzips when col.names = FALSE as of 1.17.0

3 participants