Skip to content

fix(batch): detect item timeout by elapsed time, not context state#123

Merged
adityathebe merged 2 commits into
mainfrom
fix/issue-61-batch-error-id
Jun 20, 2026
Merged

fix(batch): detect item timeout by elapsed time, not context state#123
adityathebe merged 2 commits into
mainfrom
fix/issue-61-batch-error-id

Conversation

@adityathebe

@adityathebe adityathebe commented Jun 20, 2026

Copy link
Copy Markdown
Member

Closes #61.

The post-execution item-timeout verdict in Batch.Run checked itemCtx.Err() == context.DeadlineExceeded. Because item callables do not receive the context, an over-budget item always runs to completion and the timeout was only inferred afterwards from context state. When the parent batchCtx was cancelled by the batch timeout before the item's own deadline expired, itemCtx.Err() returned context.Canceled instead of context.DeadlineExceeded, so an overrun item was wrongly reported as a successful result.

With Timeout left at zero, Run defaults it to len(items)*ItemTimeout, which for a single 50ms item equals the item timeout — so two 50ms timers raced and the outcome flipped per run.

Base the verdict on deterministic wall-clock time: duration > ItemTimeout. This only flags items that exceeded their own per-item budget regardless of which timer fired, and leaves batch-level cancellation to the monitoring goroutine.

Split into two commits: the first re-enables TestBatch_ErrorIdentification (was commented out), the second applies the production fix.

Summary by CodeRabbit

  • Bug Fixes

    • Improved per-item timeout detection reliability by using measured elapsed time instead of context error status, ensuring more consistent timeout handling.
  • Tests

    • Added error identification tests to verify correct handling of batch and item timeout errors.

Re-enable the previously commented-out TestBatch_ErrorIdentification, including the BatchTimeout and ItemTimeout subtests. Adds the withTestGlobal(t) setup used by every other test in this file so the batch runs against an isolated test manager.
The post-execution item-timeout verdict checked itemCtx.Err() == context.DeadlineExceeded. Because item callables do not receive the context, an over-budget item always runs to completion and the timeout was only guessed afterwards from context state. When the parent batchCtx was cancelled by the batch timeout before the item's own deadline expired, itemCtx.Err() returned context.Canceled instead of context.DeadlineExceeded, so an overrun item was wrongly reported as a successful result. With Timeout left at zero, Run() defaults it to len(items)*ItemTimeout, which for a single 50ms item equals the item timeout, so two 50ms timers raced and the outcome flipped per run (TestBatch_ErrorIdentification/ItemTimeout was ~80% failing).

Base the verdict on deterministic wall-clock time: duration > ItemTimeout. This only flags items that exceeded their own per-item budget regardless of which timer fired, and leaves batch-level cancellation to the monitoring goroutine. The test now passes 30/30 and under -race.
@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3c8ae581-5843-4230-8871-6af1008521c3

📥 Commits

Reviewing files that changed from the base of the PR and between fafaaed and facfee6.

📒 Files selected for processing (2)
  • task/batch.go
  • task/batch_test.go

Walkthrough

In Batch.Run, per-item timeout detection is changed from checking itemCtx.Err() == context.DeadlineExceeded to comparing measured elapsed wall-clock duration against b.ItemTimeout. The timeout warning log now includes the elapsed duration. A previously commented-out test TestBatch_ErrorIdentification is activated with subtests for ErrBatchTimeout and ErrItemTimeout.

Changes

Deterministic item timeout detection and error identification tests

Layer / File(s) Summary
Deterministic timeout check and error identification tests
task/batch.go, task/batch_test.go
Batch.Run now determines per-item timeout by comparing duration > b.ItemTimeout instead of itemCtx.Err() == context.DeadlineExceeded, eliminating the race between context cancellation and deadline timer firing. The warning log gains the measured elapsed duration. TestBatch_ErrorIdentification is converted from commented-out code to an active test with two subtests asserting errors.Is correctly identifies ErrBatchTimeout and ErrItemTimeout.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the main change: fixing item timeout detection by using elapsed time instead of context state.
Description check ✅ Passed The description comprehensively covers the problem, root cause, solution approach, and commit structure, aligning well with the provided template sections.
Linked Issues check ✅ Passed The PR addresses issue #61 by re-enabling the previously commented-out TestBatch_ErrorIdentification test and fixing the item timeout detection logic that was causing test flakiness.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing the item timeout detection issue in task/batch.go and enabling its corresponding test in task/batch_test.go.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/issue-61-batch-error-id
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch fix/issue-61-batch-error-id

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

Copy link
Copy Markdown

Gavel summary

Source Pass Fail Skip Duration
ai 9 0 0 554.487µs
api 125 0 12 37ms
cache 12 0 0 15ms
clicky 98 0 0 7.2s
exec 86 0 1 8.3s
flags 20 0 0 612.277µs
formatters 26 0 0 2ms
github.com/flanksource/clicky 48 0 0 17.3s
github.com/flanksource/clicky/ai 10 0 0 40ms
github.com/flanksource/clicky/aichat 43 0 0 -
github.com/flanksource/clicky/api 235 0 0 160ms
github.com/flanksource/clicky/api/tailwind 427 0 0 20ms
github.com/flanksource/clicky/docs 19 0 0 -
github.com/flanksource/clicky/examples/enitity 1 0 0 -
github.com/flanksource/clicky/exec 3 0 0 350ms
github.com/flanksource/clicky/flags 32 0 0 10ms
github.com/flanksource/clicky/formatters 50 0 0 -
github.com/flanksource/clicky/formatters/http 38 0 0 -
github.com/flanksource/clicky/formatters/pdf 0 0 1 -
github.com/flanksource/clicky/formatters/tests 202 0 1 55.6s
github.com/flanksource/clicky/internal/gumchoose 2 0 0 -
github.com/flanksource/clicky/lint 4 0 0 -
github.com/flanksource/clicky/mcp 34 0 0 10ms
github.com/flanksource/clicky/middleware 31 0 0 -
github.com/flanksource/clicky/rpc 188 0 1 10ms
github.com/flanksource/clicky/task 85 0 0 16.9s
lint 6 0 0 6.7s
metrics 7 0 0 4ms
middleware 38 0 0 4ms
task 6 0 0 43ms
tests 11 0 47 17ms
text 78 0 0 303ms
valkey 22 0 0 48ms

Totals: 1996 passed · 0 failed · 63 skipped · 1m53s

View full results

@adityathebe adityathebe merged commit 391e6f5 into main Jun 20, 2026
13 checks passed
@adityathebe adityathebe deleted the fix/issue-61-batch-error-id branch June 20, 2026 12:06
@flankbot

Copy link
Copy Markdown

🎉 This PR is included in version 1.21.25

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: flaky tests

2 participants