Signed-off-by: Kanthi Subramanian <subkanthi@gmail.com>
b5bde22 to 2908d6d
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9bcbbe60be
// Use key (parsed without URI decoding) so that percent-encoded
// characters in object keys (e.g. %2F in Iceberg partition paths) are preserved.
std::string key_to_use = target_decomposed.key;
Parse object key with S3::URI for HTTP URLs
Deriving key_to_use from target_decomposed.key regresses path-style S3 locations like https://s3.amazonaws.com/<bucket>/<key> (and MinIO-style http(s)://endpoint/<bucket>/<key>): SchemeAuthorityKey leaves the bucket in the key, while bucket selection is already taken from s3_uri.bucket. That makes reads use a bucket-prefixed key against a storage already scoped to that bucket, so objects are looked up under the wrong path and can return 404 for valid files.
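To make the concern concrete, here is a minimal sketch of a scheme/authority/key split that does no percent-decoding. `Decomposed` and `decompose` are hypothetical stand-ins written for this illustration; they are not the real `SchemeAuthorityKey` class, and the bucket and object names are made up.

```cpp
#include <string>

// Hypothetical stand-in for a scheme/authority/key split.
// Not the real SchemeAuthorityKey from the PR.
struct Decomposed
{
    std::string scheme;
    std::string authority;
    std::string key;
};

// Splits "scheme://authority/key" and leaves percent-encoded bytes untouched.
Decomposed decompose(const std::string & url)
{
    Decomposed out;
    const auto scheme_end = url.find("://");
    if (scheme_end == std::string::npos)
    {
        // No scheme: treat the whole input as the key.
        out.key = url;
        return out;
    }
    out.scheme = url.substr(0, scheme_end);

    const auto authority_begin = scheme_end + 3;
    const auto slash = url.find('/', authority_begin);
    if (slash == std::string::npos)
    {
        // Authority only, no key part.
        out.authority = url.substr(authority_begin);
        return out;
    }
    out.authority = url.substr(authority_begin, slash - authority_begin);
    out.key = url.substr(slash + 1);
    return out;
}
```

With this split, `s3://mybucket/data/a%2Fb.parquet` yields the key `data/a%2Fb.parquet`, while the path-style `https://s3.amazonaws.com/mybucket/data/a%2Fb.parquet` yields `mybucket/data/a%2Fb.parquet`. The second key still carries the bucket name, which is the prefix the review comment warns about.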
AI audit note: This review comment was generated by AI (gpt-5.3-codex).
Audit update for PR #1516 (S3 key decoding in
Confirmed defects
Medium: Path-style HTTP(S) S3 URLs build an incorrect object key (
Signed-off-by: Kanthi Subramanian <subkanthi@gmail.com>
29c2450 to e8d4d20
AI audit note: This review comment was generated by AI (gpt-5.3-codex).
Audit update for PR #1516 (Fixed s3 decoding):
Confirmed defects: No confirmed defects in reviewed scope.
Coverage summary: Scope reviewed: src/Storages/ObjectStorage/Utils.cpp (resolveObjectStorageForPath S3 branch), src/Storages/ObjectStorage/tests/gtest_scheme_authority_key.cpp, and tests/integration/test_database_iceberg/test.py additions in PR diff.
{
    struct ClientFake : DB::S3::Client
    {
        explicit ClientFake()
Similar to gtest_readbuffer_s3.cpp
// because the bucket lives in the URL path, not the hostname.
// Strip the bucket prefix so the key is relative to the bucket.
if (!s3_uri.is_virtual_hosted_style
    && key_to_use.starts_with(s3_uri.bucket + "/"))
Does this work correctly when the path really starts with the bucket name, e.g. s3://bucket/bucket/...?
For s3://bucket/bucket/..., S3::URI should set is_virtual_hosted_style to true, so this if block should be skipped.
Added gtest https://github.com/Altinity/ClickHouse/pull/1516/changes#diff-a4536876f73836ceeb9f561d2a1ac710fc377262dfbfb9840cd395f948162c10R135
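The guard discussed in this thread can be sketched in isolation as follows. `ParsedUri` is a hypothetical, simplified view of the two `S3::URI` fields the snippet reads, and `keyRelativeToBucket` is a made-up helper name. The sketch takes `is_virtual_hosted_style` as an input and does not model how `S3::URI` computes it.

```cpp
#include <string>

// Hypothetical, simplified view of the S3::URI fields used by the check.
struct ParsedUri
{
    std::string bucket;
    bool is_virtual_hosted_style;
};

// Strips a leading "<bucket>/" from the key only for path-style URIs,
// where the bucket is part of the URL path rather than the hostname.
std::string keyRelativeToBucket(const ParsedUri & uri, std::string key)
{
    const std::string prefix = uri.bucket + "/";
    // compare(...) == 0 is equivalent to key.starts_with(prefix) in C++20.
    if (!uri.is_virtual_hosted_style && key.compare(0, prefix.size(), prefix) == 0)
        key.erase(0, prefix.size());
    return key;
}
```

Assuming the flag is true for s3://bucket/bucket/..., as the reply above states, the key `bucket/dir/file` is returned unchanged. For a path-style URI with the flag false, the same key becomes `dir/file`.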
PR #1516 CI Verification Report
CI Results Overview
PR's New Test Validation
Unit Tests (gtest): All 5 Passed
All 5 new
Integration Test: Passed in 2 Configurations
The new
The existing
CI Failures
1. Stateless tests (arm_asan, targeted): Known Flaky, Not PR-Related
Job: Stateless tests (arm_asan, targeted)
Two tests failed:
Neither test is related to S3 key decoding or Iceberg partition handling. The PR does not modify any stateless test files.
Related to PR: No. Pre-existing flaky tests in targeted runner
2. GrypeScan (-alpine): CVE in Base Image
CVE-2026-2673 (High) in Alpine base image OpenSSL packages. Same failure across all PRs on
Related to PR: No. Base image vulnerability
Conclusion
Verdict: Ready to merge. No PR-related failures.
closes: #1348
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Use key (parsed without URI decoding) so that percent-encoded characters in object keys are preserved. Fixes an issue where retrieving data returns a 404 when the S3 path contains percent-encoded slashes (%2F).
Documentation entry for user-facing changes
Use key (parsed without URI decoding) so that percent-encoded characters in object keys are preserved. Fixes an issue where retrieving data returns a 404 when the S3 path contains percent-encoded slashes (%2F).
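As a rough illustration of why decoding matters here, the sketch below uses a hypothetical minimal percent-decoder. It is not the decoder ClickHouse uses, and the partition path is invented for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical minimal percent-decoder, for illustration only.
// Malformed escapes (e.g. a trailing "%2") are copied through unchanged.
std::string percentDecode(const std::string & in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        const bool has_two_hex_digits = in[i] == '%' && i + 2 < in.size()
            && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(in[i + 2]));
        if (has_two_hex_digits)
        {
            // Convert the two hex digits after '%' into one byte.
            out.push_back(static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16)));
            i += 2;
        }
        else
        {
            out.push_back(in[i]);
        }
    }
    return out;
}
```

Decoding `year=2024/city=a%2Fb/f.parquet` produces `year=2024/city=a/b/f.parquet`, which has an extra path segment and therefore names a different object. If the stored key contains the literal `%2F`, a lookup with the decoded key would miss it, which is consistent with the 404 described in the changelog entry.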