Skip to content

Conversation

@RenjiSann
Copy link
Collaborator

The latest GNU release added tests to cksum-c which made our implementation fail.

Namely, the testcase that was problematic was the following:

echo 'SHA1+++++++++++++++++++++++=  /dev/null' | ${CKSUM} -c -a sha1

This command used to produce an "improperly formatted line" error, but is now required to fail at the string digest comparison.

The error here is that the digest is an invalid BASE64 string: although it is the right length, the last + breaks the padding, so our code previously failed to decode it.

This MR makes the difference between Base64 strings that are known to be invalid before decoding, which should be considered as improperly formatted lines, and those that are only recognized to be invalid after decoding, for padding issues, which are to be regarded as properly formatted compared to the obtained checksum.

@codspeed-hq
Copy link

codspeed-hq bot commented Nov 28, 2025

CodSpeed Performance Report

Merging #9511 will not alter performance

Comparing RenjiSann:cksum-fix-cksum-c (ba5ded0) with main (003f21a)

Summary

✅ 126 untouched
⏩ 6 skipped1

Footnotes

  1. 6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@RenjiSann RenjiSann force-pushed the cksum-fix-cksum-c branch 2 times, most recently from cd5feaa to e522a31 Compare November 28, 2025 12:42
@github-actions
Copy link

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/cksum/cksum-c is no longer failing!

@RenjiSann RenjiSann requested a review from sylvestre November 28, 2025 14:09
@RenjiSann RenjiSann marked this pull request as ready for review November 28, 2025 14:09
@RenjiSann RenjiSann changed the title cksum: Fix CNU test cksum-c.sh after bump to 9.9 cksum: Fix GNU test cksum-c.sh after bump to 9.9 Nov 28, 2025
@github-actions
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/cksum/cksum-c is no longer failing!

@github-actions
Copy link

GNU testsuite comparison:

Congrats! The gnu test tests/cksum/cksum-c is no longer failing!

line_info: &LineInfo,
len_hint: Option<usize>,
checksum: &String,
bytelen_hint: Option<usize>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use byte_len_hint because "byte" and "length" are words on their own.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

// checksum with it.
let digest_char_length_hint = match (algo_kind, algo_byte_len) {
(AlgoKind::Blake2b, Some(bytelen)) => Some(bytelen * 2),
(AlgoKind::Blake2b, Some(bytelen)) => Some(bytelen),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here (plus for consistency reasons, as one line above algo_byte_len is used).

Suggested change
(AlgoKind::Blake2b, Some(bytelen)) => Some(bytelen),
(AlgoKind::Blake2b, Some(byte_len)) => Some(byte_len),

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

This commit differentiates Base64 strings that are known to be invalid
before decoding (because their length is not a multiple of 4), from
Base64 strings that are invalid at decoding (padding is invalid).
@github-actions
Copy link

github-actions bot commented Dec 1, 2025

GNU testsuite comparison:

Congrats! The gnu test tests/cksum/cksum-c is no longer failing!

Comment on lines +502 to +507
match base64_simd::forgiving_decode_to_vec(checksum.as_bytes()) {
Ok(buffer) if checks_hint(buffer.len()) => Some(hex::encode(buffer).into()),
// The resulting length is not as expected
Ok(_) => None,
Err(_) => None,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A functional approach might be cleaner, though I'm aware it's subjective. Feel free to ignore this comment ;-)

    base64_simd::forgiving_decode_to_vec(checksum.as_bytes())
        .ok()
        .and_then(|buffer| {
            if checks_hint(buffer.len()) {
                Some(hex::encode(buffer).into())
            } else {
                None
            }
        })

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree with you. However, I'm changing this in the following MR #9532, so I will keep it like so on this MR and avoid a funny rebase :)

@cakebaker cakebaker merged commit b2feb82 into uutils:main Dec 1, 2025
128 of 129 checks passed
@cakebaker
Copy link
Contributor

Congrats! The gnu test tests/cksum/cksum-c is no longer failing!

Kudos :)

@RenjiSann RenjiSann deleted the cksum-fix-cksum-c branch December 1, 2025 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants