*: move redo s3 check from cli to server #4281
ti-chi-bot[bot] merged 13 commits into pingcap:master
📝 Walkthrough
CLI changefeed creation now disables redo storage I/O checks when completing replica configs. ReplicaConfig/Consistent gained options to toggle redo I/O validation, redo storage validation gained an options-aware entrypoint, tests were added, and the redo writer's batching loop was reworked.
Sequence Diagram(s)
sequenceDiagram
participant CLI as "CLI"
participant Server as "TiCDC Server"
participant Storage as "External Storage (S3 / FS)"
CLI->>Server: create changefeed + replica config (EnableRedoIOCheck=false)
Server->>Storage: validate/init external storage (I/O checks)
alt storage reachable and valid
Storage-->>Server: success
Server-->>CLI: changefeed created (ack)
else storage unreachable or invalid
Storage-->>Server: error
Server-->>CLI: create failed (error)
end
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 2 passed | ❌ 3 failed (3 warnings)
Summary of Changes (Gemini Code Assist)
This pull request refactors the validation process for redo log external storage configurations, moving the check from the client-side command-line interface to the TiCDC server. This change addresses potential issues in deployment environments where the CLI might not have direct network access to external storage services like S3, ensuring that such validations occur in the appropriate server context during changefeed initialization.
Code Review
This pull request moves the redo log S3 connectivity check from the CLI to the TiCDC server. This is achieved by removing the call to ReplicaConfig.ValidateAndAdjust in the CLI's changefeed creation logic. While this correctly prevents the CLI from performing network checks it might not be able to, it also disables all other client-side configuration validations, which could degrade user experience by delaying error feedback. I've added a comment suggesting a more targeted refactoring to address this.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/cdc/cli/cli_changefeed_create.go`:
- Around line 154-156: The url.Parse error returned from the check on
o.commonChangefeedOptions.sinkURI should be wrapped before returning to preserve
stack context; replace `if _, err :=
url.Parse(o.commonChangefeedOptions.sinkURI); err != nil { return err }` with a
wrapped error using Go error wrapping (e.g. `return fmt.Errorf("invalid sink URI
%q: %w", o.commonChangefeedOptions.sinkURI, err)`), and add the fmt import if
missing so the function (in cli_changefeed_create.go) returns the wrapped error.
@zier-one: adding LGTM is restricted to approvers and reviewers in OWNERS files.
/test all
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/redo/writer/file/file.go (1)
316-319: Consider wrapping error for consistency.
Other error returns in this function (lines 313 and 324) use errors.Trace(err), but this one returns the error directly. While write() internally wraps some errors (so double-wrapping should be avoided per guidelines), wrapping here would maintain uniform traceability for any paths in write() that don't wrap.
♻️ Optional fix for consistency
  err := w.write(e)
  if err != nil {
-     return err
+     return errors.Trace(err)
  }
As per coding guidelines, "upstream callers should propagate wrapped errors without wrapping again"; if write() consistently wraps all its errors internally, this change is unnecessary.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/redo/writer/file/file.go` around lines 316 - 319, The return from w.write(e) is unwrapped but other error returns in this function use errors.Trace; update the return to return errors.Trace(err) to maintain consistent error tracing (callers: the call site that does err := w.write(e) inside the same function), i.e., replace the direct return of err with return errors.Trace(err) while keeping the existing behavior of write() unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/redo/writer/file/file.go`:
- Around line 321-326: The flushing condition uses `if num >
redo.DefaultFlushBatchSize` which causes an off-by-one (flushes at size+1);
change the check to `if num >= redo.DefaultFlushBatchSize` so `flush()` (and
subsequent `e.PostFlush()`) runs when the batch reaches
`redo.DefaultFlushBatchSize` exactly; update the conditional around `num`,
`flush()`, and `e.PostFlush()` accordingly.
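The off-by-one described above can be reproduced with a small counter loop. This is a simplified model, not the writer's real code: `flushBatchSize` is a hypothetical stand-in for `redo.DefaultFlushBatchSize`, and `batchSizes` only records how many events each flush would cover.

```go
package main

import "fmt"

// flushBatchSize is a hypothetical stand-in for redo.DefaultFlushBatchSize.
const flushBatchSize = 4

// batchSizes feeds n events through a counter and records the size of each
// flushed batch. shouldFlush decides, after an event is written, whether
// the batch is considered full.
func batchSizes(n int, shouldFlush func(num int) bool) []int {
	var sizes []int
	num := 0
	for i := 0; i < n; i++ {
		num++ // one event written
		if shouldFlush(num) {
			sizes = append(sizes, num)
			num = 0
		}
	}
	if num > 0 {
		sizes = append(sizes, num) // final partial batch
	}
	return sizes
}

func main() {
	strict := func(num int) bool { return num > flushBatchSize }
	inclusive := func(num int) bool { return num >= flushBatchSize }

	fmt.Println("num >  size:", batchSizes(10, strict))
	fmt.Println("num >= size:", batchSizes(10, inclusive))
}
```

With a batch size of 4 and 10 events, the `>` check produces batches of 5 and 5, while the `>=` check produces batches of 4, 4, and 2.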
/retest
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/redo/writer/file/file.go`:
- Around line 308-309: In the select branch that handles <-ctx.Done() (where the
code currently returns ctx.Err()), ensure you flush any pending batch work and
run queued cacheEventPostFlush callbacks before returning: call the existing
batch-flush/drain routine (the function that processes the in-flight/write batch
and invokes cacheEventPostFlush callbacks) to drain and execute all pending
callbacks, wait for it to complete or time out appropriately, and only then
return ctx.Err(); update the <-ctx.Done() case to perform that flush/drain step
instead of returning immediately.
  case <-ctx.Done():
      return ctx.Err()
Flush pending callbacks before exiting on context cancellation.
Returning immediately on ctx.Done() drops queued cacheEventPostFlush callbacks for already-written events. Please flush pending batch work before exit.
🔧 Proposed fix
-   case <-ctx.Done():
-       return ctx.Err()
+   case <-ctx.Done():
+       if num > 0 {
+           if err := flush(); err != nil {
+               return errors.Trace(err)
+           }
+       }
+       return ctx.Err()
/test all
/retest
Please add a test for the redo file Writer.
/retest
/retest
/retest
/retest
/retest
/retest
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: flowbehappy, lidezhu, wk989898, zier-one
/retest-required
What problem does this PR solve?
Issue Number: close #4122
What is changed and how it works?
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note
Summary by CodeRabbit
New Features
Bug Fixes
Improvements
Tests