@haiqi96 haiqi96 commented Jul 25, 2025

Description

AWS requires that the timestamp in a presigned URL strictly follow the ISO 8601 long format "yyyyMMdd'T'HHmmss'Z'".
The current implementation formats the timestamp with the `%S` specifier, which appends fractional seconds and generates timestamps such as `2023-03-22T03-45-46.234232`, violating AWS's requirement.

This PR resolves the issue by adding an explicit `std::chrono::time_point_cast` that converts the timestamp to second precision before it is formatted.
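
A minimal sketch of the truncation step, for illustration only. It is not the code in this PR: the PR formats with fmt, while this sketch uses `std::strftime` so that it depends only on the standard library, and the function name `format_aws_timestamp` is hypothetical.

```cpp
#include <chrono>
#include <ctime>
#include <string>

// Hypothetical helper (not from the PR): formats a time point as
// yyyyMMdd'T'HHmmss'Z' in UTC with no fractional seconds.
inline auto format_aws_timestamp(std::chrono::system_clock::time_point timestamp)
        -> std::string {
    // Truncate to whole seconds first, mirroring the explicit cast added by the PR.
    auto const timestamp_secs
            = std::chrono::time_point_cast<std::chrono::seconds>(timestamp);
    std::time_t const epoch_secs = std::chrono::system_clock::to_time_t(timestamp_secs);

    // std::gmtime returns a pointer to shared static storage, so this sketch is
    // not thread-safe.
    std::tm const* const utc = std::gmtime(&epoch_secs);
    if (nullptr == utc) {
        return {};
    }

    char buf[32];
    std::size_t const len = std::strftime(buf, sizeof(buf), "%Y%m%dT%H%M%SZ", utc);
    return std::string(buf, len);
}
```

With this sketch, a time point 1.5 seconds after the Unix epoch formats as `19700101T000001Z`: the half second is dropped before formatting, so the formatter never sees it.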

Note: I am not sure why this was not an issue in previous commits. Perhaps it was undefined behavior that was revealed when we recently updated either the fmt version or the C++ version.
I have verified that commit 037cf10 doesn't have this issue, so if we want to understand the root cause, we can do a binary search to find the first commit that reveals it.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  1. Exported access_key_id and secret_access_key in my terminal (please contact me if you want to reproduce with the same set of keys).
  2. Built clp-s.
  3. Ran the compression cmd ./clp-s c output --print-archive-stats --auth s3 --timestamp-key 't.$date' https://yscope-log-compression-dataset-us-west-1.s3.us-west-1.amazonaws.com/mongodb-8gb/mongod.log.2023-03-22T03-45-46
  4. Verified that
  • clp-s failed without my change
  • clp-s successfully compressed the log from s3 with my change.
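
The end-to-end check above needs S3 credentials. As a lighter complement, the layout AWS expects can be checked directly on a formatted string. The helper below is a hypothetical sketch (the name `is_aws_long_timestamp` is not from the clp codebase), and it only checks the layout, not whether the date itself is valid.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns true if `ts` has the layout yyyyMMdd'T'HHmmss'Z',
// i.e. 16 characters with 'T' at index 8, 'Z' at index 15, and digits elsewhere.
inline auto is_aws_long_timestamp(std::string_view ts) -> bool {
    constexpr std::size_t cLength{16};
    constexpr std::size_t cDateTimeSeparatorIdx{8};
    constexpr std::size_t cZoneSuffixIdx{15};

    if (ts.size() != cLength || 'T' != ts[cDateTimeSeparatorIdx]
        || 'Z' != ts[cZoneSuffixIdx])
    {
        return false;
    }
    for (std::size_t i{0}; i < cLength; ++i) {
        if (cDateTimeSeparatorIdx == i || cZoneSuffixIdx == i) {
            continue;
        }
        // Any non-digit (including '.' or '-') breaks the required layout.
        if (0 == std::isdigit(static_cast<unsigned char>(ts[i]))) {
            return false;
        }
    }
    return true;
}
```

A string with fractional seconds, such as `20230322T034546.234232Z`, fails the length check, while `20230322T034546Z` passes.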

Summary by CodeRabbit

  • Bug Fixes
    • Improved timestamp formatting to ensure times are displayed with second-level precision, avoiding display of sub-second values.

@haiqi96 haiqi96 requested a review from a team as a code owner July 25, 2025 20:56

coderabbitai bot commented Jul 25, 2025

Walkthrough

The implementation of the get_formatted_timestamp_string function in the AWS authentication signer component was updated to ensure that timestamps are truncated to second-level precision before formatting. No changes were made to the function's signature or any public interfaces.

Changes

File(s) | Change Summary
components/core/src/clp/aws/AwsAuthenticationSigner.cpp | Modified internal logic of get_formatted_timestamp_string to use second-level precision.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Suggested reviewers

  • wraymo

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ebb637b and 3beb62e.

📒 Files selected for processing (1)
  • components/core/src/clp/aws/AwsAuthenticationSigner.cpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}

⚙️ CodeRabbit Configuration File

  • Prefer false == <expression> rather than !<expression>.

Files:

  • components/core/src/clp/aws/AwsAuthenticationSigner.cpp
🧠 Learnings (1)
📓 Common learnings
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: gibber9809
PR: y-scope/clp#504
File: components/core/src/clp_s/search/kql/CMakeLists.txt:29-29
Timestamp: 2024-10-22T15:36:04.655Z
Learning: When reviewing pull requests, focus on the changes within the PR and avoid commenting on issues outside the scope of the PR.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: build-macos (macos-14, false)
  • GitHub Check: build-macos (macos-15, false)
  • GitHub Check: build-macos (macos-15, true)
  • GitHub Check: build-macos (macos-13, false)
  • GitHub Check: build-macos (macos-14, true)
🔇 Additional comments (1)
components/core/src/clp/aws/AwsAuthenticationSigner.cpp (1)

94-95: Excellent fix for AWS timestamp format compliance!

The explicit casting to second precision correctly removes fractional seconds that would violate AWS S3 authentication requirements. The implementation ensures the timestamp strictly follows the required ISO8601 Long Format "yyyyMMdd'T'HHmmss'Z'" without affecting the function's public interface.

@haiqi96 haiqi96 changed the title from "fix(core): Remove fractional seconds from formatted timestamps used in S3 authentication" to "fix(clp-s): Remove fractional seconds from formatted timestamps used in S3 authentication" Jul 25, 2025

@gibber9809 gibber9809 left a comment

LGTM. I'm also curious why this hasn't been an issue before.

For title since this is under src/clp/aws maybe it should be fix(clp) or fix(core)? Also the PR title should end with a period.

@haiqi96 haiqi96 changed the title from "fix(clp-s): Remove fractional seconds from formatted timestamps used in S3 authentication" to "fix(core): Remove fractional seconds from formatted timestamps used in S3 authentication." Jul 25, 2025

haiqi96 commented Jul 25, 2025

LGTM. I'm also curious why this hasn't been an issue before.

For title since this is under src/clp/aws maybe it should be fix(clp) or fix(core)? Also the PR title should end with a period.

Added another note in the description about a commit that is known to work (with this issue hidden)

@kirkrodrigues

Caused by #1054 which upgraded fmtlib.

Comment on lines +94 to +95
auto const timestamp_secs = std::chrono::time_point_cast<std::chrono::seconds>(timestamp);
return fmt::format("{:%Y%m%dT%H%M%SZ}", timestamp_secs);

@coderabbitai Can you open a bug report corresponding to this fix?

🧩 Analysis chain

@kirkrodrigues I'll create a bug report for this timestamp formatting issue.


🏁 Script executed:

#!/bin/bash
# Create a GitHub issue for the timestamp formatting bug
gh issue create \
  --title "AWS S3 authentication fails due to fractional seconds in timestamps" \
  --body "## Description

AWS S3 presigned URL generation was failing due to incorrect timestamp formatting that included fractional seconds. AWS requires timestamps to follow the ISO8601 Long Format \`yyyyMMdd'T'HHmmss'Z'\` without fractional seconds, but the current implementation was producing timestamps like \`2023-03-22T03-45-46.234232\`.

## Root Cause

The \`get_formatted_timestamp_string\` function in \`components/core/src/clp/aws/AwsAuthenticationSigner.cpp\` was formatting timestamps directly from \`std::chrono::system_clock::time_point\` without truncating to second precision, which could include sub-second precision depending on the fmt library version or C++ version used.

## Impact

- S3 authentication fails when using clp-s tool with timestamp keys
- Prevents successful compression operations with S3 storage

## Solution

The issue was resolved by explicitly casting the timepoint to second precision using \`std::chrono::time_point_cast<std::chrono::seconds>\` before formatting.

## References

- Fixed in PR: https://github.com/y-scope/clp/pull/1128
- Comment: https://github.com/y-scope/clp/pull/1128#discussion_r2233169218

## Validation

The fix was validated by:
1. Exporting AWS credentials
2. Building the clp-s tool  
3. Running compression command with S3 authentication and timestamp key
4. Confirming the tool failed without the fix but succeeded after applying the change" \
  --assignee kirkrodrigues

Length of output: 1575


@kirkrodrigues I've successfully created a bug report for this timestamp formatting issue. The issue has been opened at #1131 and assigned to you.

The bug report documents:

  • The root cause of fractional seconds in AWS S3 authentication timestamps
  • The impact on S3 authentication and compression operations
  • The technical solution implemented in this PR
  • References to both the PR and this comment thread
  • The validation steps that confirmed the fix

🧬 Code Graph Analysis Results

Relevant snippets from components/core/src/clp/aws/AwsAuthenticationSigner.hpp (lines 24-31, 43, 46-52, 91-92, 102-103, 113-117, 128-133):

// Exception class for S3Url operation failures with error code, filename, line number, and optional message.
OperationFailed(
        ErrorCode error_code,
        char const* const filename,
        int line_number,
        std::string message = "S3Url operation failed"
)
        : TraceableException{error_code, filename, line_number},
          m_message{std::move(message)} {}

// Constructor for S3Url that parses an S3 HTTP URL string.
S3Url(std::string const& url);

// Accessors for S3Url components.
[[nodiscard]] auto get_region() const -> std::string_view { return m_region; }
[[nodiscard]] auto get_bucket() const -> std::string_view { return m_bucket; }
[[nodiscard]] auto get_key() const -> std::string_view { return m_key; }
[[nodiscard]] auto get_host() const -> std::string_view { return m_host; }

// Generates a presigned URL for the given S3Url, outputs to presigned_url, returns ErrorCode.
generate_presigned_url(S3Url const& s3_url, std::string& presigned_url) const -> ErrorCode;

// Returns the canonical query string for signing, given scope and timestamp.
get_canonical_query_string(std::string_view scope, std::string_view timestamp) const -> std::string;

// Computes the signing key for given region and date, outputs to signing_key vector, returns ErrorCode.
get_signing_key(
        std::string_view region,
        std::string_view date,
        std::vector<unsigned char>& signing_key
) const -> ErrorCode;

// Computes the signature for given region, date, and string_to_sign, outputs to signature vector, returns ErrorCode.
get_signature(
        std::string_view region,
        std::string_view date,
        std::string_view string_to_sign,
        std::vector<unsigned char>& signature
) const -> ErrorCode;

Relevant snippets from components/core/src/clp/hash_utils.hpp (lines 45, 57-61, 73-74):

// Converts a span of unsigned char to a hex string.
convert_to_hex_string(std::span<unsigned char> input) -> std::string;

// Computes HMAC-SHA256 hash of input using key, outputs to hash vector, returns ErrorCode.
get_hmac_sha256_hash(
        std::span<unsigned char const> input,
        std::span<unsigned char const> key,
        std::vector<unsigned char>& hash
) -> ErrorCode;

// Computes SHA256 hash of input, outputs to hash vector, returns ErrorCode.
get_sha256_hash(std::span<unsigned char const> input, std::vector<unsigned char>& hash)
        -> ErrorCode;

Relevant snippets from components/core/src/clp_s/OperationFailed.hpp (multiple files):

// Exception class OperationFailed derived from TraceableException, constructed with error code, filename, and line number.
OperationFailed(ErrorCode error_code, char const* const filename, int line_number)
        : TraceableException(error_code, filename, line_number) {}

This exception is used in AwsAuthenticationSigner.cpp to throw errors on invalid URL format or endpoint.


These snippets provide context on:

  • The AwsAuthenticationSigner class methods for generating presigned URLs, signing keys, and signatures.
  • The S3Url class constructor parsing S3 URLs and throwing OperationFailed exceptions on errors.
  • Utility functions for hashing and encoding used in signing.
  • The exception class OperationFailed used for error handling.

This context is directly relevant to understanding the code in AwsAuthenticationSigner.cpp and to creating a bug report related to fixes in this file.


@kirkrodrigues kirkrodrigues left a comment

Deferring to @gibber9809's review.

For the PR title, how about:

fix(core): Drop fractional seconds from formatted timestamps used for S3 signed-URL generation (fixes #1131).

@haiqi96 haiqi96 changed the title from "fix(core): Remove fractional seconds from formatted timestamps used in S3 authentication." to "fix(core): Drop fractional seconds from formatted timestamps used for S3 signed-URL generation (fixes #1131)." Jul 26, 2025
@haiqi96 haiqi96 merged commit b0a607f into y-scope:main Jul 26, 2025
29 of 31 checks passed