Conversation
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly enhances the performance and efficiency of the redo log writer by introducing an asynchronous, multi-threaded encoding pipeline. It decouples the CPU-intensive event encoding from the I/O-bound file writing, allowing these operations to run in parallel. Additionally, it incorporates batch processing improvements across the Kafka, Pulsar, and redo sinks, leading to more efficient event handling and reduced overhead.
Highlights
Activity
Note: Reviews paused
It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the
Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough
Batch per-row events into preallocated slices for the Kafka, Pulsar, and Redo sinks; reset batch buffers after consumption; add an encoding worker group that produces framed polymorphic redo events; and pipe encoding output into memory file writers for concurrent encoding and file writing.
Changes
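The "preallocate, fill, consume, reset" batching pattern the walkthrough describes can be sketched as below. This is a minimal illustration, not the PR's code; `batchInts` and its types are invented stand-ins for the sinks' per-row event batching.

```go
package main

import "fmt"

// batchInts splits items into batches of size capSize, reusing one
// preallocated buffer and copying each full batch out on consumption.
// Illustrative stand-in for the sinks' per-row event batching.
func batchInts(items []int, capSize int) [][]int {
	buf := make([]int, 0, capSize) // preallocated once
	var out [][]int
	flush := func() {
		if len(buf) == 0 {
			return
		}
		out = append(out, append([]int(nil), buf...)) // copy before reuse
		buf = buf[:0]                                 // reset length, keep capacity
	}
	for _, it := range items {
		buf = append(buf, it)
		if len(buf) == cap(buf) {
			flush()
		}
	}
	flush() // final partial batch
	return out
}

func main() {
	batches := batchInts([]int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, 4)
	fmt.Println(len(batches)) // prints 3 (batches of 4, 4, and 2)
}
```

Because `buf = buf[:0]` only resets the length, the backing array is reused across batches, which is the allocation saving the walkthrough refers to.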
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Producer as Client/Producer
    participant EncWg as EncodingWorkerGroup
    participant FileWg as FileWorkerGroup
    participant Storage as FileCache/Disk
    Producer->>EncWg: AddEvent(redoEvent)
    EncWg->>EncWg: toPolymorphicRedoEvent (marshal, frame)
    EncWg->>FileWg: Send *polymorphicRedoEvent
    FileWg->>Storage: Write(event.data, commitTs)
    Storage->>Storage: Buffer & flush to disk
```
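The sequence above, encoding workers feeding a file writer over a channel so CPU-bound encoding overlaps with I/O-bound writing, can be sketched with plain goroutines. This is a hedged illustration only: it uses `sync.WaitGroup` where the PR uses an errgroup, and the event types are invented.

```go
package main

import (
	"fmt"
	"sync"
)

type rawEvent struct{ payload string }
type encodedEvent struct{ data []byte }

// encode is a stand-in for toPolymorphicRedoEvent (marshal + frame).
func encode(e rawEvent) encodedEvent {
	return encodedEvent{data: []byte(e.payload)}
}

// runPipeline fans raw events through nEnc encoding workers into a
// single writer goroutine and returns the number of events written.
func runPipeline(events []rawEvent, nEnc int) int {
	inputCh := make(chan rawEvent)
	outputCh := make(chan encodedEvent, nEnc)

	var encWg sync.WaitGroup
	for i := 0; i < nEnc; i++ {
		encWg.Add(1)
		go func() {
			defer encWg.Done()
			for e := range inputCh {
				outputCh <- encode(e)
			}
		}()
	}
	// Close outputCh once every encoder has drained inputCh.
	go func() { encWg.Wait(); close(outputCh) }()

	written := 0
	var fileWg sync.WaitGroup
	fileWg.Add(1)
	go func() { // the "file worker": consumes encoded frames
		defer fileWg.Done()
		for range outputCh {
			written++ // a real writer would buffer and flush to disk
		}
	}()

	for _, e := range events {
		inputCh <- e
	}
	close(inputCh)
	fileWg.Wait()
	return written
}

func main() {
	fmt.Println(runPipeline(make([]rawEvent, 100), 4)) // prints 100
}
```

Closing `inputCh` propagates shutdown downstream: encoders exit their `range` loops, `outputCh` is closed, and the writer drains and returns.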
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks
❌ Failed checks: 3 (2 warnings, 1 inconclusive)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
🧪 Generate unit tests (beta)
/test all
Code Review
The pull request refactors the redo log writer by separating event encoding into a dedicated encodingWorkerGroup to improve modularity and enable parallel processing. It also updates Kafka and Pulsar sinks to use batching for better throughput. However, a critical reliability issue was identified in the fileWorkerGroup where unchecked nil values from newFileCache can cause process panics, requiring proper error handling at all call sites of newFileCache.
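The nil-dereference failure mode flagged here can be reproduced in miniature. The types below (`fileCache`, `newFileCache`, `syncWriteFile`) are illustrative stand-ins for the PR's, not its actual code; the point is the guard at the call site.

```go
package main

import (
	"errors"
	"fmt"
)

// fileCache / newFileCache mirror a constructor that signals failure
// by returning nil (e.g. when the initial compressed write fails).
type fileCache struct {
	maxCommitTs uint64
}

func newFileCache(data []byte, commitTs uint64) *fileCache {
	if len(data) == 0 { // stand-in for "initial LZ4 write failed"
		return nil
	}
	return &fileCache{maxCommitTs: commitTs}
}

// syncWriteFile dereferences its argument, so passing nil would panic.
func syncWriteFile(file *fileCache) error {
	_ = file.maxCommitTs
	return nil
}

// writeChecked guards the call site, as the review requests.
func writeChecked(data []byte, commitTs uint64) error {
	file := newFileCache(data, commitTs)
	if file == nil {
		return errors.New("failed to create file cache")
	}
	return syncWriteFile(file)
}

func main() {
	fmt.Println(writeChecked(nil, 1))         // prints failed to create file cache
	fmt.Println(writeChecked([]byte("x"), 2)) // prints <nil>
}
```

Without the `file == nil` check, the failing path would panic inside `syncWriteFile` instead of returning an error the caller can handle.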
```go
file := f.newFileCache(data, event.commitTs)
if err := f.syncWriteFile(egCtx, file); err != nil {
```

The newFileCache function can return nil if an error occurs during the initial write to the buffer (e.g., if LZ4 compression fails). The current implementation does not check for this nil return value before passing it to syncWriteFile, which will result in a nil pointer dereference and a process panic when accessing file.maxCommitTs or other fields. This can be used to cause a Denial of Service (DoS) by crashing the TiCDC process.

🔧 Proposed fix

```go
file := f.newFileCache(data, event.commitTs)
if file == nil {
	return errors.ErrUnexpected.FastGenByArgs("failed to create file cache")
}
if err := f.syncWriteFile(egCtx, file); err != nil {
```

```go
file := f.newFileCache(data, commitTs)
f.files = append(f.files, file)
```

The newFileCache function can return nil if an error occurs. Appending a nil value to f.files will cause a panic in subsequent calls to writeToCache when it attempts to access the last element of the slice (e.g., file.fileSize). This leads to a process crash and Denial of Service.

🔧 Proposed fix

```go
file := f.newFileCache(data, commitTs)
if file == nil {
	return errors.ErrUnexpected.FastGenByArgs("failed to create file cache")
}
f.files = append(f.files, file)
```

```go
file := f.newFileCache(data, commitTs)
f.files = append(f.files, file)
```

Similar to the previous finding, newFileCache is called and its result is appended to f.files without a nil check. This will cause a panic in later operations that assume all elements in f.files are non-nil.

🔧 Proposed fix

```go
file := f.newFileCache(data, commitTs)
if file == nil {
	return errors.ErrUnexpected.FastGenByArgs("failed to create file cache")
}
f.files = append(f.files, file)
```
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@downstreamadapter/sink/kafka/sink.go`:
- Around line 234-237: The capacity passed to make() must be an int; change the
rowsCount declaration from uint64 to an int by using int(event.Len()) (or
introduce an int variable rowsCountInt := int(event.Len())) and then use that
int for events := make([]*commonEvent.MQRowEvent, 0, rowsCountInt); if you still
need a uint64 version for toRowCallback or elsewhere, derive it with
uint64(rowsCountInt) when calling toRowCallback or other functions (update
references to rowsCount accordingly). Ensure the symbols involved are rowsCount
(replace type), event.Len(), toRowCallback, and events := make(...).
In `@pkg/redo/writer/memory/file_worker.go`:
- Around line 213-214: The bgWriteLogs function (signature with egCtx
context.Context, inputCh <-chan *polymorphicRedoEvent) treats a closed inputCh
as an unexpected error; change the receive logic to detect a closed channel (use
the comma-ok form when reading from inputCh) and treat that as a normal shutdown
path by returning context.Canceled (or nil if the surrounding contract expects
no error) instead of producing an error; ensure any loop or single-receive code
that currently assumes a value exists is updated to check ok and cleanly exit
using egCtx for cancellation propagation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b8ad85f1-e740-4119-9439-dc7da2e0824b
📒 Files selected for processing (8)
downstreamadapter/sink/kafka/sink.go
downstreamadapter/sink/pulsar/sink.go
downstreamadapter/sink/redo/sink.go
downstreamadapter/sink/redo/sink_test.go
pkg/redo/writer/memory/encoding_worker.go
pkg/redo/writer/memory/encoding_worker_test.go
pkg/redo/writer/memory/file_worker.go
pkg/redo/writer/memory/mem_log_writer.go
```go
	egCtx context.Context, inputCh <-chan *polymorphicRedoEvent,
) (err error) {
```
Handle closed inputCh as a normal shutdown path.
bgWriteLogs now consumes an upstream channel that may be intentionally closed (from the encoding worker group). The current single-value receive path turns this into an unexpected error, which can surface on normal cancellation instead of returning context.Canceled.
🔧 Proposed fix

```diff
-		case event := <-inputCh:
-			if event == nil {
-				log.Error("inputCh of redo file worker is closed unexpectedly")
-				return errors.ErrUnexpected.FastGenByArgs("inputCh of redo file worker is closed unexpectedly")
-			}
+		case event, ok := <-inputCh:
+			if !ok {
+				if err := egCtx.Err(); err != nil {
+					return errors.Trace(err)
+				}
+				log.Info("redo file worker input channel closed",
+					zap.String("keyspace", f.cfg.ChangeFeedID.Keyspace()),
+					zap.String("changefeed", f.cfg.ChangeFeedID.Name()))
+				return nil
+			}
+			if event == nil {
+				log.Warn("received nil redo event in file worker",
+					zap.String("keyspace", f.cfg.ChangeFeedID.Keyspace()),
+					zap.String("changefeed", f.cfg.ChangeFeedID.Name()))
+				continue
+			}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/redo/writer/memory/file_worker.go` around lines 213 - 214, The
bgWriteLogs function (signature with egCtx context.Context, inputCh <-chan
*polymorphicRedoEvent) treats a closed inputCh as an unexpected error; change
the receive logic to detect a closed channel (use the comma-ok form when reading
from inputCh) and treat that as a normal shutdown path by returning
context.Canceled (or nil if the surrounding contract expects no error) instead
of producing an error; ensure any loop or single-receive code that currently
assumes a value exists is updated to check ok and cleanly exit using egCtx for
cancellation propagation.
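The comma-ok receive the fix relies on can be demonstrated in isolation. The `drain` function and its channel type below are illustrative, not the PR's code; they show how `ok == false` (channel closed) is distinguished from receiving a nil value that was actually sent.

```go
package main

import "fmt"

// drain consumes events until the channel is closed, treating
// closure as normal shutdown rather than an error. The comma-ok
// receive distinguishes "closed" from "received a zero value".
func drain(inputCh <-chan *int) (consumed int, err error) {
	for {
		event, ok := <-inputCh
		if !ok {
			return consumed, nil // upstream closed: clean shutdown
		}
		if event == nil {
			continue // a nil value was actually sent; skip it
		}
		consumed++
	}
}

func main() {
	ch := make(chan *int, 3)
	v := 42
	ch <- &v
	ch <- nil // a plain receive would conflate this with closure
	ch <- &v
	close(ch)
	n, err := drain(ch)
	fmt.Println(n, err) // prints 2 <nil>
}
```

With a plain `event := <-inputCh`, a closed channel yields the zero value (`nil`) forever, which is exactly the ambiguity the review comment points at.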
/test all

/test all
Actionable comments posted: 2
🧹 Nitpick comments (1)
pkg/redo/writer/memory/encoding_worker.go (1)
104-117: Use `Warn` only on error shutdowns; use `Info` for normal stops. Line 106 currently logs a warning even when `err == nil`, which can create noisy warning signals during expected lifecycle shutdown.
🔧 Proposed refactor

```diff
 	defer func() {
-		log.Warn("redo encoding workers closed",
-			zap.String("keyspace", e.changefeed.Keyspace()),
-			zap.String("changefeed", e.changefeed.Name()),
-			zap.Error(err))
+		if err != nil {
+			log.Warn("redo encoding workers closed",
+				zap.String("keyspace", e.changefeed.Keyspace()),
+				zap.String("changefeed", e.changefeed.Name()),
+				zap.Error(err))
+		} else {
+			log.Info("redo encoding workers closed",
+				zap.String("keyspace", e.changefeed.Keyspace()),
+				zap.String("changefeed", e.changefeed.Name()))
+		}
 		if err != nil {
 			select {
 			case e.closed <- err:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/redo/writer/memory/encoding_worker.go` around lines 104 - 117, The Run method in encodingWorkerGroup logs a warning unconditionally on shutdown; change the deferred log in encodingWorkerGroup.Run to log at Info level when err == nil and at Warn (or keep Warn) only when err != nil — i.e., inspect the named return error variable err inside the defer and call log.Info(...) with the same fields for normal shutdown, and log.Warn(...) (including zap.Error(err)) when err is non-nil so expected stops aren’t noisy.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/redo/writer/memory/encoding_worker.go`:
- Around line 58-60: The marshal error from codec.MarshalRedoLog in
encoding_worker.go should be wrapped with the standardized ErrMarshalFailed
before returning; replace the current direct return of err after the call to
codec.MarshalRedoLog(rl, nil) with returning
errors.WrapError(errors.ErrMarshalFailed, err) (use the same pattern as in
pkg/redo/writer/file/file.go) so callers can consistently classify marshal
failures.
- Around line 163-173: The select in input() and output() allows sends to
e.inputChs[idx] or outputCh to succeed concurrently with reading from e.closed,
causing events to be accepted after shutdown; add an atomic boolean (e.g.,
e.stopped or e.shuttingDown using sync/atomic) that's set when closing e.closed,
and check that flag before attempting any send in both input() and output(): if
the flag indicates shutdown, return the appropriate errors.ErrRedoWriterStopped
(or wrap) instead of sending; only perform the channel send when the atomic
check shows not-shutdown to ensure sends cannot succeed once shutdown begins.
---
Nitpick comments:
In `@pkg/redo/writer/memory/encoding_worker.go`:
- Around line 104-117: The Run method in encodingWorkerGroup logs a warning
unconditionally on shutdown; change the deferred log in encodingWorkerGroup.Run
to log at Info level when err == nil and at Warn (or keep Warn) only when err !=
nil — i.e., inspect the named return error variable err inside the defer and
call log.Info(...) with the same fields for normal shutdown, and log.Warn(...)
(including zap.Error(err)) when err is non-nil so expected stops aren’t noisy.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 677165b2-1907-4076-9e9a-7fa46447d256
📒 Files selected for processing (2)
pkg/redo/writer/memory/encoding_worker.go
pkg/redo/writer/memory/mem_log_writer.go
```go
select {
case <-ctx.Done():
	return ctx.Err()
case err := <-e.closed:
	if err != nil {
		return errors.WrapError(errors.ErrRedoWriterStopped, err)
	}
	return errors.ErrRedoWriterStopped
case e.inputChs[idx] <- event:
	return nil
}
```
🧩 Analysis chain
🏁 Script executed:

```bash
# First, let's see the file structure and get a sense of the file size
wc -l pkg/redo/writer/memory/encoding_worker.go
```

Repository: pingcap/ticdc
Length of output: 102

🏁 Script executed:

```bash
# Now let's examine the specific lines mentioned: 163-173 and 182-188
cat -n pkg/redo/writer/memory/encoding_worker.go | sed -n '160,195p'
```

Repository: pingcap/ticdc
Length of output: 958

🏁 Script executed:

```bash
# Let's also find where e.closed is closed to understand the shutdown flow
rg -n 'close\(' pkg/redo/writer/memory/encoding_worker.go | head -20
```

Repository: pingcap/ticdc
Length of output: 78

🏁 Script executed:

```bash
# Let's see the full context of the encoding_worker.go struct and initialization
cat -n pkg/redo/writer/memory/encoding_worker.go | head -100
```

Repository: pingcap/ticdc
Length of output: 3595

🏁 Script executed:

```bash
# Find the shutdown logic around line 116
cat -n pkg/redo/writer/memory/encoding_worker.go | sed -n '100,130p'
```

Repository: pingcap/ticdc
Length of output: 1015

🏁 Script executed:

```bash
# Look for Stop, Close, or Shutdown methods
rg -n 'func \(e \*encodingWorkerGroup\) (Stop|Close|Shutdown)' pkg/redo/writer/memory/encoding_worker.go
```

Repository: pingcap/ticdc
Length of output: 39

🏁 Script executed:

```bash
# Get the full context of the Stop/Close method and surrounding code
cat -n pkg/redo/writer/memory/encoding_worker.go | sed -n '100,150p'
```

Repository: pingcap/ticdc
Length of output: 1833

🏁 Script executed:

```bash
# Look for where inputChs or outputCh are closed
rg -n 'close\(e\.(inputChs|outputCh)' pkg/redo/writer/memory/encoding_worker.go
```

Repository: pingcap/ticdc
Length of output: 39

🏁 Script executed:

```bash
# Check the runWorker method more completely to see the full logic
cat -n pkg/redo/writer/memory/encoding_worker.go | sed -n '137,160p'
```

Repository: pingcap/ticdc
Length of output: 885

🏁 Script executed:

```bash
# Look for any Cleanup/Close method or other shutdown logic
rg -n 'func \(e \*encodingWorkerGroup\)' pkg/redo/writer/memory/encoding_worker.go
```

Repository: pingcap/ticdc
Length of output: 392
select statement allows channel sends to succeed after shutdown, risking silent event loss.
In the input() (lines 163-173) and output() (lines 179-189) functions, the <-e.closed and channel send cases coexist in the same select statement. When e.closed is closed (line 116), both branches become simultaneously ready—reading from closed e.closed returns immediately, and sends to inputChs[idx] or outputCh can still succeed if channels have buffer space. This allows callers to return success even after shutdown has begun. Once the worker goroutines exit, events sent after shutdown will accumulate in the channels and be silently discarded.
A safer approach is to guard send paths with an atomic flag or similar mechanism that prevents any sends from succeeding post-shutdown.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/redo/writer/memory/encoding_worker.go` around lines 163 - 173, The select
in input() and output() allows sends to e.inputChs[idx] or outputCh to succeed
concurrently with reading from e.closed, causing events to be accepted after
shutdown; add an atomic boolean (e.g., e.stopped or e.shuttingDown using
sync/atomic) that's set when closing e.closed, and check that flag before
attempting any send in both input() and output(): if the flag indicates
shutdown, return the appropriate errors.ErrRedoWriterStopped (or wrap) instead
of sending; only perform the channel send when the atomic check shows
not-shutdown to ensure sends cannot succeed once shutdown begins.
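The atomic-flag guard described above can be sketched as follows. This is a hedged illustration with invented names (`workerGroup`, `errStopped`), not the PR's code: the flag makes sends fail deterministically once `shutdown()` has returned, instead of letting the send branch of a `select` race against the closed-channel branch.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

var errStopped = errors.New("writer stopped")

type workerGroup struct {
	closed  chan struct{}
	inputCh chan int
	stopped atomic.Bool // set before closed is closed
}

// input uses the two-step pattern: a flag check first, then the send.
// A single select mixing <-g.closed with the send could let the send
// win even after shutdown, silently enqueueing a doomed event.
func (g *workerGroup) input(event int) error {
	if g.stopped.Load() {
		return errStopped
	}
	select {
	case <-g.closed:
		return errStopped
	case g.inputCh <- event:
		return nil
	}
}

func (g *workerGroup) shutdown() {
	g.stopped.Store(true)
	close(g.closed)
}

func main() {
	g := &workerGroup{
		closed:  make(chan struct{}),
		inputCh: make(chan int, 8), // buffered: bare sends would succeed
	}
	fmt.Println(g.input(1)) // prints <nil>
	g.shutdown()
	fmt.Println(g.input(2)) // prints writer stopped
}
```

Note the buffered `inputCh`: without the flag, the post-shutdown send would still succeed because the channel has free capacity, which is exactly the silent-loss scenario the review describes.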
Actionable comments posted: 1
♻️ Duplicate comments (1)
pkg/redo/writer/memory/encoding_worker.go (1)
163-170: ⚠️ Potential issue | 🔴 Critical
Prevent post-shutdown event acceptance in `input`/`output`. After `e.closed` is closed in `Run`, both `<-e.closed` and the send branch can be ready in the same `select`. A send can win and return success after shutdown starts, which risks silently dropping accepted events.

```bash
#!/bin/bash
set -euo pipefail
# Verify shutdown and send are in the same select blocks.
rg -n 'case err := <-e\.closed|case e\.inputChs\[idx\] <- event|case e\.outputCh <- event' pkg/redo/writer/memory/encoding_worker.go
# Show full context around Run/input/output.
cat -n pkg/redo/writer/memory/encoding_worker.go | sed -n '104,190p'
```

Also applies to: 176-183
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/redo/writer/memory/encoding_worker.go` around lines 163 - 170, The select in the input and output methods can race with shutdown because a send to e.inputChs[idx] or e.outputCh can win even after e.closed is closed; fix by first non-blocking checking e.closed (select { case <-e.closed: return errors.ErrRedoWriterStopped.FastGenByArgs(...); default: }) and only then performing a blocking send that selects between ctx.Done() and the target channel (but no longer includes <-e.closed in that send-select). Update the input and output functions to use this two-step pattern and reference e.closed, e.inputChs (in input), and e.outputCh (in output).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/redo/writer/memory/encoding_worker.go`:
- Around line 164-165: Change the bare return of ctx.Err() in the select case
(the "case <-ctx.Done()" branch in encoding_worker.go) to return
errors.Trace(ctx.Err()) so the context error is wrapped and stack traces are
preserved; add the appropriate import for the errors package if missing and
ensure the change compiles (replace `return ctx.Err()` with `return
errors.Trace(ctx.Err())` in that branch).
---
Duplicate comments:
In `@pkg/redo/writer/memory/encoding_worker.go`:
- Around line 163-170: The select in the input and output methods can race with
shutdown because a send to e.inputChs[idx] or e.outputCh can win even after
e.closed is closed; fix by first non-blocking checking e.closed (select { case
<-e.closed: return errors.ErrRedoWriterStopped.FastGenByArgs(...); default: })
and only then performing a blocking send that selects between ctx.Done() and the
target channel (but no longer includes <-e.closed in that send-select). Update
the input and output functions to use this two-step pattern and reference
e.closed, e.inputChs (in input), and e.outputCh (in output).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6c214174-6636-4b9f-8f6b-4846f02f86f7
📒 Files selected for processing (1)
pkg/redo/writer/memory/encoding_worker.go
/retest
```go
	require.ErrorIs(b, eg.Wait(), context.Canceled)
}

type mockBatchWriter struct {
```
use gomock to generate
/test all
@wk989898: The following test failed, say
Full PR test history. Your PR dashboard.
Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Signed-off-by: wk989898 <nhsmwk@gmail.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/redo/writer/memory/file_worker.go (2)
377-380: ⚠️ Potential issue | 🟡 Minor
Wrap the write error with `errors.Trace(err)`. `file.writer.Write(data)` may delegate to the lz4 writer (third-party library). Per coding guidelines, errors from library calls should be wrapped to attach stack traces.
🔧 Proposed fix

```diff
 	_, err = file.writer.Write(data)
 	if err != nil {
-		return err
+		return errors.Trace(err)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/redo/writer/memory/file_worker.go` around lines 377 - 380, The error returned from file.writer.Write(data) should be wrapped with errors.Trace before returning; in the block that checks "if err != nil { return err }" replace the bare return with "return errors.Trace(err)" and ensure the errors package used by the project (e.g., errors.Trace) is imported in file_worker.go. This change applies to the code around file.writer.Write in the file_worker.go writer routine.
282-293: ⚠️ Potential issue | 🟡 Minor
Wrap errors from library calls with `errors.Trace(err)`. Both `file.writer.Close()` (which may call the lz4 closer) and `extStorage.WriteFile` are library calls. Per coding guidelines, these errors should be wrapped to attach stack traces.
🔧 Proposed fix

```diff
 	if err = file.writer.Close(); err != nil {
-		return err
+		return errors.Trace(err)
 	}
 	if util.GetOrZero(f.cfg.FlushConcurrency) <= 1 {
 		err = f.extStorage.WriteFile(egCtx, file.filename, file.writer.buf.Bytes())
 	} else {
 		err = f.multiPartUpload(egCtx, file)
 	}
 	f.metricFlushAllDuration.Observe(time.Since(start).Seconds())
 	if err != nil {
-		return err
+		return errors.Trace(err)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/redo/writer/memory/file_worker.go` around lines 282 - 293, The error returns from library calls in file_worker.go need to be wrapped with errors.Trace(err) to preserve stack traces: wrap the result of file.writer.Close() and any error returned from f.extStorage.WriteFile (and, if applicable, errors from f.multiPartUpload()) with errors.Trace before returning; update the error handling around the file.writer.Close(), the branch that assigns err = f.extStorage.WriteFile(egCtx, file.filename, file.writer.buf.Bytes()), and the multiPartUpload call so any non-nil err is returned as errors.Trace(err) instead of raw err.
♻️ Duplicate comments (1)
pkg/redo/writer/memory/file_worker.go (1)
241-245: ⚠️ Potential issue | 🟠 Major
Handle closed `inputCh` as a normal shutdown path. The channel receive doesn't use the comma-ok idiom to detect channel closure. When the upstream encoding worker group closes `inputCh` during shutdown, this code treats it as an unexpected error rather than a normal termination signal.
🔧 Proposed fix

```diff
-		case event := <-inputCh:
-			if event == nil {
-				log.Error("inputCh of redo file worker is closed unexpectedly")
-				return errors.ErrUnexpected.FastGenByArgs("inputCh of redo file worker is closed unexpectedly")
-			}
+		case event, ok := <-inputCh:
+			if !ok {
+				if err := egCtx.Err(); err != nil {
+					return errors.Trace(err)
+				}
+				log.Info("redo file worker input channel closed",
+					zap.String("keyspace", f.cfg.ChangeFeedID.Keyspace()),
+					zap.String("changefeed", f.cfg.ChangeFeedID.Name()))
+				return nil
+			}
+			if event == nil {
+				log.Warn("received nil redo event in file worker",
+					zap.String("keyspace", f.cfg.ChangeFeedID.Keyspace()),
+					zap.String("changefeed", f.cfg.ChangeFeedID.Name()))
+				continue
+			}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/redo/writer/memory/file_worker.go` around lines 241 - 245, The receive from inputCh in the redo file worker currently treats a closed channel as an unexpected error; change the receive to use the comma-ok form (e.g., event, ok := <-inputCh) and if ok is false treat it as a normal shutdown (return nil or the function's normal exit) instead of logging/errors via errors.ErrUnexpected.FastGenByArgs; update the branch that currently logs "inputCh of redo file worker is closed unexpectedly" to perform a clean shutdown path so upstream closure is not reported as an error.
🧹 Nitpick comments (3)
downstreamadapter/sink/pulsar/sink.go (1)
448-449: Same ineffective `buffer = buffer[:0]` as in the Kafka sink. This reassignment doesn't affect the caller's `msgsBuf`. Consider removing it for clarity, or document why it's intentionally a local reset.
🧹 Suggested cleanup

```diff
 	msgs, ok := s.rowChan.GetMultipleNoGroup(buffer)
 	if !ok {
 		log.Info("pulsar sink row event channel closed",
 			zap.String("keyspace", s.changefeedID.Keyspace()),
 			zap.String("changefeed", s.changefeedID.Name()))
 		return nil, nil
 	}
-	buffer = buffer[:0]
 	return msgs, nil
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@downstreamadapter/sink/pulsar/sink.go` around lines 448 - 449, The local assignment buffer = buffer[:0] is ineffective because it only resets the local slice header and doesn't clear the caller's msgsBuf; remove that line (or replace it with explicit documentation if the intent was to be a no-op) and simply return msgs, nil from the function; locate and edit the code around the buffer, msgs, and msgsBuf usage in sink.go to delete the redundant reset so behavior and intent are clear.
downstreamadapter/sink/kafka/sink.go (1)
345-346: `buffer = buffer[:0]` has no effect here. Since `buffer` is a function parameter (slice header passed by value), reassigning it doesn't affect the caller's `msgsBuf`. The caller's slice already has `len=0` on each iteration because slice headers are passed by value and `msgsBuf` was initialized once with `len=0`. This line can be safely removed as it's a no-op. The code works correctly, but the statement is misleading about its purpose.
🧹 Suggested cleanup

```diff
 	msgs, ok := s.rowChan.GetMultipleNoGroup(buffer)
 	if !ok {
 		log.Info("kafka sink event channel closed",
 			zap.String("keyspace", s.changefeedID.Keyspace()),
 			zap.String("changefeed", s.changefeedID.Name()))
 		return nil, nil
 	}
-	buffer = buffer[:0]
 	return msgs, nil
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@downstreamadapter/sink/kafka/sink.go` around lines 345 - 346, The statement `buffer = buffer[:0]` inside the function (the parameter named `buffer` which is the slice header passed by value) is a no-op and misleading; remove that line from the end of the function (the same block that returns `msgs, nil`) so the function simply returns `msgs, nil`. Ensure you do not attempt to mutate the caller's `msgsBuf` here — rely on the caller to manage `msgsBuf` length/capacity as before.
pkg/redo/writer/memory/file_worker.go (1)
317-321: Consider propagating the original error from the write failure. Currently, when `wr.Write(data)` fails, the error is logged but the function returns `nil`, causing callers to generate a generic `ErrRedoWriterStopped` error. The original error context is lost in the error chain. This is a suggested improvement for better error diagnostics, not blocking for this PR.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/redo/writer/memory/file_worker.go` around lines 317 - 321, The write error is being logged but discarded; when wr.Write(data) fails you should propagate the original error (or wrap it) instead of returning nil so callers don't lose context and end up with ErrRedoWriterStopped; modify the code path that currently does `_, err := wr.Write(data)` to return the original err (or a wrapped error mentioning the operation) after logging (reference wr.Write(data), log.Error(...) and ErrRedoWriterStopped) so the real failure travels up the call chain.
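The `buffer = buffer[:0]` no-op called out above stems from Go's slice semantics: a slice header (pointer, length, capacity) is passed by value, so reassigning the parameter never changes the caller's variable. A minimal demonstration, with invented names:

```go
package main

import "fmt"

// resetParam reassigns its parameter; only the local header changes.
func resetParam(buffer []int) {
	buffer = buffer[:0]
	_ = buffer
}

// resetPtr mutates through a pointer, which does reach the caller.
func resetPtr(buffer *[]int) {
	*buffer = (*buffer)[:0]
}

func main() {
	msgsBuf := []int{1, 2, 3}
	resetParam(msgsBuf)
	fmt.Println(len(msgsBuf)) // prints 3: the caller is unaffected

	resetPtr(&msgsBuf)
	fmt.Println(len(msgsBuf)) // prints 0
}
```

Note that mutating *elements* through the parameter (e.g. `buffer[0] = 9`) would be visible to the caller, since both headers point at the same backing array; only the length/capacity reassignment is local.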
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@downstreamadapter/sink/kafka/sink.go`:
- Around line 234-237: The variable rowsCount is inferred as int32 from
event.Len(), but make(..., 0, rowsCount) requires an int; change the assignment
to convert the length to int (e.g., set rowsCount using int(event.Len())) so the
capacity passed to make([]*commonEvent.MQRowEvent, 0, rowsCount) is the correct
type and update any dependent call sites like
toRowCallback(event.PostTxnFlushed, uint64(rowsCount)) as needed.
---
Outside diff comments:
In `@pkg/redo/writer/memory/file_worker.go`:
- Around line 377-380: The error returned from file.writer.Write(data) should be
wrapped with errors.Trace before returning; in the block that checks "if err !=
nil { return err }" replace the bare return with "return errors.Trace(err)" and
ensure the errors package used by the project (e.g., errors.Trace) is imported
in file_worker.go. This change applies to the code around file.writer.Write in
the file_worker.go writer routine.
- Around line 282-293: The error returns from library calls in file_worker.go
need to be wrapped with errors.Trace(err) to preserve stack traces: wrap the
result of file.writer.Close() and any error returned from f.extStorage.WriteFile
(and, if applicable, errors from f.multiPartUpload()) with errors.Trace before
returning; update the error handling around the file.writer.Close(), the branch
that assigns err = f.extStorage.WriteFile(egCtx, file.filename,
file.writer.buf.Bytes()), and the multiPartUpload call so any non-nil err is
returned as errors.Trace(err) instead of raw err.
---
Duplicate comments:
In `@pkg/redo/writer/memory/file_worker.go`:
- Around line 241-245: The receive from inputCh in the redo file worker
currently treats a closed channel as an unexpected error; change the receive to
use the comma-ok form (e.g., event, ok := <-inputCh) and if ok is false treat it
as a normal shutdown (return nil or the function's normal exit) instead of
logging/errors via errors.ErrUnexpected.FastGenByArgs; update the branch that
currently logs "inputCh of redo file worker is closed unexpectedly" to perform a
clean shutdown path so upstream closure is not reported as an error.
---
Nitpick comments:
In `@downstreamadapter/sink/kafka/sink.go`:
- Around line 345-346: The statement `buffer = buffer[:0]` inside the function
(the parameter named `buffer` which is the slice header passed by value) is a
no-op and misleading; remove that line from the end of the function (the same
block that returns `msgs, nil`) so the function simply returns `msgs, nil`.
Ensure you do not attempt to mutate the caller's `msgsBuf` here — rely on the
caller to manage `msgsBuf` length/capacity as before.
In `@downstreamadapter/sink/pulsar/sink.go`:
- Around line 448-449: The local assignment buffer = buffer[:0] is ineffective
because it only resets the local slice header and doesn't clear the caller's
msgsBuf; remove that line (or replace it with explicit documentation if the
intent was to be a no-op) and simply return msgs, nil from the function; locate
and edit the code around the buffer, msgs, and msgsBuf usage in sink.go to
delete the redundant reset so behavior and intent are clear.
In `@pkg/redo/writer/memory/file_worker.go`:
- Around line 317-321: The write error is being logged but discarded; when
wr.Write(data) fails you should propagate the original error (or wrap it)
instead of returning nil so callers don't lose context and end up with
ErrRedoWriterStopped; modify the code path that currently does `_, err :=
wr.Write(data)` to return the original err (or a wrapped error mentioning the
operation) after logging (reference wr.Write(data), log.Error(...) and
ErrRedoWriterStopped) so the real failure travels up the call chain.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 38674758-a1a5-419a-a441-50f195e72c99
📒 Files selected for processing (7)
downstreamadapter/sink/kafka/sink.go
downstreamadapter/sink/pulsar/sink.go
downstreamadapter/sink/redo/sink_test.go
pkg/redo/writer/memory/encoding_worker.go
pkg/redo/writer/memory/file_worker.go
pkg/redo/writer/writer_mock.go
scripts/generate-mock.sh
✅ Files skipped from review due to trivial changes (1)
- pkg/redo/writer/writer_mock.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/redo/writer/memory/encoding_worker.go
```go
rowsCount := event.Len()
rowCallback := toRowCallback(event.PostTxnFlushed, uint64(rowsCount))
events := make([]*commonEvent.MQRowEvent, 0, rowsCount)
```
🧩 Analysis chain
🏁 Script executed:

```bash
find . -type f -name "*.go" | xargs grep -l "type DMLEvent struct" | head -5
```

Repository: pingcap/ticdc
Length of output: 88

🏁 Script executed:

```bash
ast-grep --pattern $'func ($_ *DMLEvent) Len() $_ {
  $$$
}'
```

Repository: pingcap/ticdc
Length of output: 210

🏁 Script executed:

```bash
sed -n '230,240p' downstreamadapter/sink/kafka/sink.go
```

Repository: pingcap/ticdc
Length of output: 292
Fix compilation error: make() requires int type for capacity, but event.Len() returns int32.
Line 234 assigns rowsCount := event.Len(), making rowsCount type int32. Line 236 then uses rowsCount as the capacity argument to make([]*commonEvent.MQRowEvent, 0, rowsCount), which requires int. Add an explicit conversion:
```go
rowsCount := int(event.Len())
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@downstreamadapter/sink/kafka/sink.go` around lines 234 - 237, The variable
rowsCount is inferred as int32 from event.Len(), but make(..., 0, rowsCount)
requires an int; change the assignment to convert the length to int (e.g., set
rowsCount using int(event.Len())) so the capacity passed to
make([]*commonEvent.MQRowEvent, 0, rowsCount) is the correct type and update any
dependent call sites like toRowCallback(event.PostTxnFlushed, uint64(rowsCount))
as needed.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: 3AceShowHand, asddongmen. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Details
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
[LGTM Timeline notifier]
Timeline:
What problem does this PR solve?
Issue Number: ref #1061
What is changed and how it works?
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note
Summary by CodeRabbit
Performance
New Features
Bug Fixes
Tests
Refactor