fix: advance read session state for ignored command records by infiniteregrets · Pull Request #259 · s2-streamstore/s2-sdk-go

infiniteregrets · 2026-03-29T09:42:05Z

closes #235

Summary

When IgnoreCommandRecords is set, the streaming read loop skipped handleRecord entirely for command records, failing to advance nextSeq, recordsRead, and bytesRead.
On retry/reconnect, the session would re-request from the stale position, re-processing the same command records (infinite loop if stream is all command records).
Extracted state advancement into advanceState() and call it for ignored command records before continue, so the session position always advances.

Test plan

Verify go build ./... passes
Confirm that a streaming read with IgnoreCommandRecords: true correctly advances NextReadPosition() past command records
Confirm retries after disconnect do not re-fetch already-seen command records

🤖 Generated with Claude Code

When IgnoreCommandRecords is set, the loop skipped handleRecord entirely for command records, failing to advance nextSeq and counters. On retry, the session would re-request from the stale position, re-processing the same command records. Extract state advancement into advanceState and call it for ignored command records before continuing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

greptile-apps · 2026-03-29T09:46:21Z

Greptile Summary

This PR fixes a real bug in the streaming read loop: when IgnoreCommandRecords is set, command records were silently skipped with continue but nextSeq, recordsRead, and bytesRead were never advanced. On a disconnect/retry the session would re-request from the stale position, potentially looping forever if the stream consisted entirely of command records.

The fix extracts the state-advancement block from handleRecord into a new advanceState helper and calls it for ignored command records before the continue. The refactoring is clean, the mutex usage is unchanged, and the ordering (state advanced after sending for data records, immediately for ignored records) is correct.

Key changes

New advanceState(record SequencedRecord) helper encapsulates the stateMu-guarded state update.
handleRecord now delegates its state update to advanceState (no behavioral change for data records).
Ignored command records now call advanceState before continue, ensuring nextSeq always advances.
As a positive side-effect, madeProgress in the retry loop (recordsRead > recordsBefore) is now true even when a batch contains only command records, preventing spurious consecutiveFailures increments.

Observations

recordsRead/bytesRead now include ignored command records, which correctly tracks server-side Count/Bytes budget for reconnect arithmetic, but can cause limitsReached() to stop a session before the caller receives Count data records if the stream has a dense prefix of command records. This is worth a clarifying comment.
The PR description's test-plan checklist is entirely unchecked and no new unit tests cover the reconnect-with-command-records path.

Confidence Score: 5/5

Safe to merge — the core fix is correct and all remaining findings are P2 style/documentation suggestions.

The fix correctly addresses a real infinite-reconnect bug by ensuring nextSeq is advanced for ignored command records. The refactoring is clean, mutex usage is unchanged, and the retry/reconnect arithmetic becomes more correct. The two remaining comments are P2: one is a suggestion to document the Count/Bytes semantics for ignored records, and one requests unit-test coverage for the regression path. Neither blocks merge.

No files require special attention; all concerns are in s2/read.go and are minor.

Important Files Changed

Filename	Overview
s2/read.go	Extracts state-advancement into `advanceState()` and calls it for command records when `IgnoreCommandRecords=true`, fixing an infinite-reconnect loop; core logic is correct with minor semantic implications for `Count`/`Bytes` limits worth documenting.

Sequence Diagram

sequenceDiagram
    participant R as streamReader.run
    participant RO as runOnce
    participant S as Server

    Note over R,S: IgnoreCommandRecords=true

    R->>S: GET /records?seq=0&count=10
    S-->>RO: batch [cmd@0, cmd@1, data@2, ...]
    loop for each record
        alt IsCommandRecord (old)
            Note over RO: continue (state NOT advanced)<br/>nextSeq stays at 0!
        else IsCommandRecord (new)
            RO->>RO: advanceState(record)<br/>nextSeq = seqNum+1
            Note over RO: continue
        else data record
            RO->>RO: handleRecord → advanceState(record)
        end
    end
    RO-->>R: disconnect error

    alt Old (buggy)
        R->>S: GET /records?seq=0&count=10 ♻️ infinite loop
    else New (fixed)
        R->>R: limitsReached() check
        R->>S: GET /records?seq=nextSeq&count=remaining ✅
    end

Prompt To Fix All With AI

This is a comment left during a code review.
Path: s2/read.go
Line: 555-564

Comment:
**`recordsRead`/`bytesRead` count toward `Count`/`Bytes` limits for ignored records**

`advanceState` increments `recordsRead` and `bytesRead` for command records that are never delivered to the caller's `Records()` channel. This has two downstream effects worth verifying are intentional:

1. **`limitsReached()` can fire before the caller receives `Count` data records.** If the user creates a session with `Count=10, IgnoreCommandRecords=true` and the stream has 10 command records followed by 10 data records, `limitsReached()` returns `true` after the first 10 and the session stops — delivering 0 records to the consumer.

2. **`buildAttemptOptions` computes `remaining = baseCount - recordsRead`.** Because `recordsRead` now includes command records, the reconnect request's `Count` parameter is reduced by the number of ignored command records. Since the server counts all records (including command records) toward its own `Count` limit, this is the correct server-side arithmetic — but it means a `Count=10` session may ultimately surface fewer than 10 data records to the application when command records are present.

This is likely the intended trade-off (correct server-side budget tracking beats the previous infinite-loop), but it changes observable behavior for anyone relying on `Count` meaning "N data records". A comment explaining the semantics (or a note in the `ReadOptions.Count` doc) would help future readers.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: s2/read.go
Line: 542-547

Comment:
**No unit test for the reconnect-with-command-records regression**

The PR description lists three test-plan items but none are checked, and the existing test file (`read_session_retry_test.go`) does not include a case that exercises the specific scenario being fixed: a stream reader with `IgnoreCommandRecords=true` that receives command-only (or command-leading) batches and then reconnects.

Without this test the regression could silently return. Consider adding a test along the lines of:
- Simulate a batch of command records followed by a disconnect.
- Assert `NextReadPosition()` advances past all command records.
- Assert a second run (or the retry path) does not re-request from the stale position.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "fix: advance read session state for igno..." | Re-trigger Greptile}

greptile-apps · 2026-03-29T09:46:25Z

s2/read.go

+func (r *streamReader) advanceState(record SequencedRecord) {
+	r.stateMu.Lock()
+	r.recordsRead++
+	r.bytesRead += MeteredSequencedRecordBytes(record)
+	r.lastRecordTime = time.Now()
+	r.nextSeq = record.SeqNum + 1
+	r.hasNextSeq = true
+	r.nextTS = record.Timestamp
+	r.stateMu.Unlock()
+}


recordsRead/bytesRead count toward Count/Bytes limits for ignored records

advanceState increments recordsRead and bytesRead for command records that are never delivered to the caller's Records() channel. This has two downstream effects worth verifying are intentional:

limitsReached() can fire before the caller receives Count data records. If the user creates a session with Count=10, IgnoreCommandRecords=true and the stream has 10 command records followed by 10 data records, limitsReached() returns true after the first 10 and the session stops — delivering 0 records to the consumer.

buildAttemptOptions computes remaining = baseCount - recordsRead. Because recordsRead now includes command records, the reconnect request's Count parameter is reduced by the number of ignored command records. Since the server counts all records (including command records) toward its own Count limit, this is the correct server-side arithmetic — but it means a Count=10 session may ultimately surface fewer than 10 data records to the application when command records are present.

This is likely the intended trade-off (correct server-side budget tracking beats the previous infinite-loop), but it changes observable behavior for anyone relying on Count meaning "N data records". A comment explaining the semantics (or a note in the ReadOptions.Count doc) would help future readers.

Prompt To Fix With AI

This is a comment left during a code review. Path: s2/read.go Line: 555-564 Comment: **`recordsRead`/`bytesRead` count toward `Count`/`Bytes` limits for ignored records** `advanceState` increments `recordsRead` and `bytesRead` for command records that are never delivered to the caller's `Records()` channel. This has two downstream effects worth verifying are intentional: 1. **`limitsReached()` can fire before the caller receives `Count` data records.** If the user creates a session with `Count=10, IgnoreCommandRecords=true` and the stream has 10 command records followed by 10 data records, `limitsReached()` returns `true` after the first 10 and the session stops — delivering 0 records to the consumer. 2. **`buildAttemptOptions` computes `remaining = baseCount - recordsRead`.** Because `recordsRead` now includes command records, the reconnect request's `Count` parameter is reduced by the number of ignored command records. Since the server counts all records (including command records) toward its own `Count` limit, this is the correct server-side arithmetic — but it means a `Count=10` session may ultimately surface fewer than 10 data records to the application when command records are present. This is likely the intended trade-off (correct server-side budget tracking beats the previous infinite-loop), but it changes observable behavior for anyone relying on `Count` meaning "N data records". A comment explaining the semantics (or a note in the `ReadOptions.Count` doc) would help future readers. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-03-29T09:46:25Z

s2/read.go

 		ignoreCommandRecords := r.baseOpts != nil && r.baseOpts.IgnoreCommandRecords
 		for _, record := range batch.Records {
 			if ignoreCommandRecords && record.IsCommandRecord() {
+				r.advanceState(record)
 				continue
 			}


No unit test for the reconnect-with-command-records regression

The PR description lists three test-plan items but none are checked, and the existing test file (read_session_retry_test.go) does not include a case that exercises the specific scenario being fixed: a stream reader with IgnoreCommandRecords=true that receives command-only (or command-leading) batches and then reconnects.

Without this test the regression could silently return. Consider adding a test along the lines of:

Simulate a batch of command records followed by a disconnect.

Assert NextReadPosition() advances past all command records.

Assert a second run (or the retry path) does not re-request from the stale position.

Prompt To Fix With AI

This is a comment left during a code review. Path: s2/read.go Line: 542-547 Comment: **No unit test for the reconnect-with-command-records regression** The PR description lists three test-plan items but none are checked, and the existing test file (`read_session_retry_test.go`) does not include a case that exercises the specific scenario being fixed: a stream reader with `IgnoreCommandRecords=true` that receives command-only (or command-leading) batches and then reconnects. Without this test the regression could silently return. Consider adding a test along the lines of: - Simulate a batch of command records followed by a disconnect. - Assert `NextReadPosition()` advances past all command records. - Assert a second run (or the retry path) does not re-request from the stale position. How can I resolve this? If you propose a fix, please make it concise.

infiniteregrets requested a review from a team as a code owner March 29, 2026 09:42

greptile-apps bot reviewed Mar 29, 2026

View reviewed changes

infiniteregrets merged commit 2a65131 into main Mar 29, 2026
10 checks passed

infiniteregrets deleted the fix/advance-state-for-ignored-command-records branch March 29, 2026 09:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: advance read session state for ignored command records#259

fix: advance read session state for ignored command records#259
infiniteregrets merged 1 commit intomainfrom
fix/advance-state-for-ignored-command-records

infiniteregrets commented Mar 29, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Mar 29, 2026

Uh oh!

greptile-apps bot Mar 29, 2026

Uh oh!

greptile-apps bot Mar 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

infiniteregrets commented Mar 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

greptile-apps bot commented Mar 29, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

infiniteregrets commented Mar 29, 2026 •

edited

Loading