Skip to content

feat: Add support for Zlib compression/decompression #417

Open
oskardanielsson-yubico wants to merge 4 commits intodevelopfrom
oskar/zlib
Open

feat: Add support for Zlib compression/decompression #417
oskardanielsson-yubico wants to merge 4 commits intodevelopfrom
oskar/zlib

Conversation

@oskardanielsson-yubico
Copy link
Contributor

Description

Certificates compressed with zlib can now be decompressed in the Piv application

Fixes: # <link to issue(s)>

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How has this been tested?

Unit tests that test the compression and decompression have been added and run.

Test configuration:

  • OS version: MacOS 15.6

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@github-actions
Copy link
Contributor

github-actions bot commented Feb 24, 2026

Test Results: Windows

    2 files      2 suites   16s ⏱️
4 051 tests 4 032 ✅ 19 💤 0 ❌
4 053 runs  4 034 ✅ 19 💤 0 ❌

Results for commit ee2a19f.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 24, 2026

Test Results: Ubuntu

    2 files      2 suites   48s ⏱️
4 043 tests 4 024 ✅ 19 💤 0 ❌
4 045 runs  4 026 ✅ 19 💤 0 ❌

Results for commit ee2a19f.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 24, 2026

Test Results: MacOS

    4 files      4 suites   40s ⏱️
4 025 tests 4 025 ✅ 0 💤 0 ❌
4 027 runs  4 027 ✅ 0 💤 0 ❌

Results for commit ee2a19f.

♻️ This comment has been updated with latest results.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds support for Zlib compression/decompression to enable the PIV application to handle certificates compressed with zlib format. The implementation includes a new internal ZLibStream class that provides RFC 1950-compliant zlib compression/decompression, along with format auto-detection in the certificate retrieval logic to support both the standard GZip format and a custom Zlib format used by some third-party PIV middleware.

Changes:

  • Added a new ZLibStream class implementing RFC 1950 zlib compression/decompression with Adler-32 checksum validation
  • Enhanced certificate decompression in PivSession to auto-detect and support both GZip and Zlib formats
  • Added comprehensive unit tests for the ZLibStream class covering compression, decompression, edge cases, and error conditions

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.

File Description
Yubico.YubiKey/src/Yubico/YubiKey/Cryptography/ZLibStream.cs New internal class implementing zlib compression/decompression with header validation and Adler-32 checksum
Yubico.YubiKey/src/Yubico/YubiKey/Piv/PivSession.KeyPairs.cs Updated certificate decompression to detect and handle both GZip and Zlib formats, improved exception handling with inner exception
Yubico.YubiKey/tests/unit/Yubico/YubiKey/Cryptography/ZLibStreamTests.cs Comprehensive unit tests for ZLibStream including compression levels, error conditions, Adler-32 validation, and format compatibility

Comment on lines +297 to +301
if (!buffer.IsEmpty)
{
byte[] temp = buffer.ToArray();
UpdateAdler32(temp, 0, temp.Length);
}
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The async WriteAsync method creates a temporary array copy (line 299) to update the Adler-32 checksum. This allocation and copy operation happens on every async write, which could impact performance. For ReadOnlyMemory<byte>, consider using MemoryMarshal.TryGetArray to access the underlying array when possible, or use the Span property to avoid the allocation. Alternatively, if the buffer is backed by an array, you can avoid the copy by checking if it's contiguous memory.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI vs AI difference of opinion

Comment on lines +504 to +513
private void UpdateAdler32(byte[] buffer, int offset, int count)
{
const uint modAdler = 65521;

for (int i = offset; i < offset + count; i++)
{
_adlerA = (_adlerA + buffer[i]) % modAdler;
_adlerB = (_adlerB + _adlerA) % modAdler;
}
}
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Adler-32 implementation performs modulo operation on every byte, which can be inefficient for large data. The standard optimization is to defer the modulo operation and only apply it after processing a certain number of bytes (typically 5550 bytes, which is the largest value where overflow won't occur). This optimization is commonly used in production zlib implementations and can significantly improve performance when compressing large data. Consider optimizing UpdateAdler32 to batch the modulo operations.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI vs AI difference of opinion

@github-actions
Copy link
Contributor

Code Coverage

Package Line Rate Branch Rate Complexity Health
Yubico.Core 45% 36% 3038
Yubico.YubiKey 50% 46% 14352
Summary 49% (24903 / 50858) 44% (6005 / 13700) 17390

Minimum allowed line rate is 40%

@DennisDyallo
Copy link
Collaborator

@claude can you take a good look at this? Code review, correctness, quality, maintainability, codebase alignment, security, etc.

@claude
Copy link

claude bot commented Feb 26, 2026

Claude finished @DennisDyallo's task in 4m 19s —— View job


Code Review

Tasks

  • Gather context and understand the request
  • Read changed files
  • Analyze ZLibStream implementation (correctness, security, quality)
  • Analyze PivSession.KeyPairs.cs changes
  • Analyze ExceptionMessages changes
  • Analyze unit tests
  • Post comprehensive review

Overall Assessment

The implementation is solid — well-structured, properly documented, and follows codebase conventions. The core logic (RFC 1950 header parsing, Adler-32 computation, GIDS format detection) is correct. However there are several issues ranging from a meaningful correctness gap to minor style nits worth addressing before merge.


🔴 High: Missing Adler-32 Validation During Decompression

ZLibStream.cs — decompression path

The class advertises RFC 1950 compliance (// This implementation is based on RFC 1950) and the docs say it handles "a 4-byte Adler-32 checksum trailer." However, during decompression the trailer bytes are never consumed or verified. After DeflateStream finishes reading the compressed payload, the last 4 bytes of the underlying stream (the Adler-32) remain unread. Crucially, the decompressed data is never checksummed.

This means:

  • A zlib stream where the deflate data is intact but the Adler-32 is wrong will silently succeed.
  • A stream where the deflate data is corrupted might be caught by DeflateStream's own internal CRC, but the Adler-32 integrity layer provides no extra protection.

For the specific use case (certificate decompression off a YubiKey), this is low-risk in practice. But the class's own documentation sets the expectation of RFC 1950 compliance, and callers may use it beyond this one callsite.

Either:

  1. Document the limitation explicitly — update the class summary to state that Adler-32 is computed during compression but is not verified during decompression.
  2. Implement the validation — intercept reads to track the rolling checksum, then after DeflateStream is exhausted, read the 4 trailing bytes from BaseStream and compare. This is non-trivial but makes the class genuinely RFC-compliant.

Fix this →


🟠 Medium: GIDS Length Guard Is Too Loose

PivSession.KeyPairs.cs line 713

if (data[0] == 0x01 && data[1] == 0x00 && data.Length > 4)

data.Length > 4 permits a 5-byte buffer. With a 4-byte GIDS header consumed, DecompressZlib is passed a 1-byte stream. ZLibStream.ReadAndValidateZLibHeader() will then fail on the second ReadByte() (returns -1) and throw InvalidDataException with an internal "Unexpected end of zlib header" message, which gets wrapped by the outer catch in GetCertificate.

The minimum usable size is 4 (GIDS header) + 2 (zlib CMF/FLG) = 6 bytes. Change to:

if (data[0] == 0x01 && data[1] == 0x00 && data.Length >= 6)

Fix this →


🟠 Medium: Hardcoded Exception Message in GenerateKeyPair

PivSession.KeyPairs.cs line 179

throw new InvalidOperationException("Error generating key pair: " + response);

Every other exception in this file (and across the SDK) uses ExceptionMessages. This string doesn't belong in source code. It should be added to ExceptionMessages.resx and accessed via ExceptionMessages.YourNewKey.

Fix this →


🟡 Low: Adler-32 Trailer Correctness Is Not Tested End-to-End

ZLibStreamTests.cs

ComputeAdler32_KnownInput_ReturnsExpectedChecksum correctly validates the standalone checksum method. But there is no test that:

  1. Compresses data with ZLibStream
  2. Reads back the final 4 bytes of the compressed output
  3. Asserts those bytes equal the expected Adler-32 of the original data

Without this, a regression in WriteAdler32Trailer (e.g. wrong byte order) would go undetected. A test using the "Wikipedia" constant (0x11E60398) would serve this well.

Fix this →


🟡 Low: Missing Test for GIDS Decompressed Length Mismatch

ZLibStreamTests.cs

DecompressGids validates the decompressed length against the GIDS header value and throws with a formatted message if they differ. This error path has no test coverage. Add a test that constructs a GIDS blob with an incorrect expected length and asserts that InvalidOperationException is thrown.


🟡 Low: GIDS Test Does Not Exercise the Production Code Path

ZLibStreamTests.csDecompress_GidsFormat_StripsHeaderAndDecompresses (line 53)

This test manually strips the 4-byte header and passes the rest to ZLibStream directly. It validates the mechanism but doesn't call DecompressWithFormatDetection or DecompressGids. Consider adding a test that passes a complete GIDS-formatted byte array through the same entry point used by GetCertificate (which calls DecompressWithFormatDetection internally). This would give confidence that the format detection and dispatch logic is correct.


🟡 Low: GIDS Expected Length Limited to 16 Bits (Worth a Comment)

PivSession.KeyPairs.cs line 741

int expectedLength = data[2] | (data[3] << 8);

This reads 2 bytes, giving a max expected decompressed length of 65535 bytes. This is fine for X.509 certificates (well under 64 KB), but a comment noting this limitation would prevent future confusion if this code is reused. The GIDS standard may support larger lengths with a 4-byte field.


✅ What's Done Well

  • Class design: ZLibStream follows the standard .NET Stream pattern cleanly — proper sealed class, CanRead/CanWrite properties, Dispose lifecycle, leaveOpen support. The existing .NET GZipStream API is mirrored well.
  • RFC 1950 header handling: CMF/FLG validation (compression method check, FDICT flag check, % 31 checksum) is correct and well-commented.
  • Adler-32 computation: The algorithm is correct — verified against the well-known "Wikipedia" → 0x11E60398 test vector. The UpdateAdler32 instance method and ComputeAdler32 static method are consistent.
  • Exception messages: All exception strings (except one, noted above) correctly use ExceptionMessages.resx. The new resource entries follow the existing naming and formatting conventions.
  • Format detection logic: The DecompressWithFormatDetection approach is clean — GZip magic bytes first, then GIDS prefix, then throw. The GIDS decompressed-length validation is a good defensive check.
  • Test coverage: The test suite is comprehensive for ZLibStream itself — round-trips, all compression levels, edge cases (empty input, large data, truncated header, wrong compression method, disposed stream, leaveOpen). The known-good test vector for decompression (789c...) is valuable for interoperability.
  • Multi-target compatibility: The #if NETSTANDARD2_1_OR_GREATER guards for ReadAsync(Memory<byte>), WriteAsync(ReadOnlyMemory<byte>), and CopyTo(Stream, int) are correct.
  • Codebase alignment: The internal sealed modifier, namespace placement (Yubico.YubiKey.Cryptography), XML doc coverage, and resource file usage all align with existing conventions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants