Skip to content

fix(matching): return typed ownership errors consistently with SD#7740

Merged
dkrotx merged 1 commit intocadence-workflow:masterfrom
dkrotx:dkrot/fix-errIfShardOwnershipLost
Feb 24, 2026
Merged

fix(matching): return typed ownership errors consistently with SD#7740
dkrotx merged 1 commit intocadence-workflow:masterfrom
dkrotx:dkrot/fix-errIfShardOwnershipLost

Conversation

@dkrotx
Copy link
Member

@dkrotx dkrotx commented Feb 24, 2026

fix(matching): return typed ownership errors consistently with SD

What changed?
Standardized task list ownership-guard behavior in matching so SD-onboarded and non-onboarded paths return consistent typed ownership errors where applicable.
Also added unit-test for errIfShardOwnershipLost.
This is part of production-readiness for SD (#6862)

Why?
Previously, ownership checks could return different error shapes depending on whether shard distributor onboarding was enabled.
In SD mode, one branch returned an untyped wrapped error while non-SD mode returned TaskListNotOwnedByHostError.
This made error handling inconsistent and could cause ownership-related metrics/logs to be missed.
This change aligns behavior and adds dedicated tests to protect expected ownership semantics with and without shard-distributor.

How did you test it?

  • go test -v ./service/matching/handler TestErrIfShardOwnershipLost
  • go test -v ./service/matching/handler/...

Potential risks

  • Matching ownership guard behavior is sensitive and sits in task-list manager acquisition flow.
  • Error typing/shape changes can affect downstream error classification or alerting expectations.
    • at the same time, the behaviour is now aligned with the previous
      (ringpop) case

Release notes
N/A (internal correctness/consistency fix).

Documentation Changes
N/A.


Reviewer Validation

PR Description Quality (check these before reviewing code):

  • "What changed" provides a clear 1-2 line summary
    • Project Issue is linked
  • "Why" explains the full motivation with sufficient context
  • Testing is documented:
    • Unit test commands are included (with exact go test invocation)
    • Integration test setup/commands included (if integration tests were run)
    • Canary testing details included (if canary was mentioned)
  • Potential risks section is thoughtfully filled out (or legitimately N/A)
  • Release notes included if this completes a user-facing feature
  • Documentation needs are addressed (or noted if uncertain)

@dkrotx dkrotx force-pushed the dkrot/fix-errIfShardOwnershipLost branch from aee032d to 185e324 Compare February 24, 2026 13:29
**What changed?**
Standardized task list ownership-guard behavior in matching so SD-onboarded and non-onboarded paths return consistent typed ownership errors where applicable.
Also added unit-test for `errIfShardOwnershipLost`.

**Why?**
Previously, ownership checks could return different error shapes depending on whether shard distributor onboarding was enabled.
In SD mode, one branch returned an untyped wrapped error while non-SD mode returned `TaskListNotOwnedByHostError`.
This made error handling inconsistent and could cause ownership-related metrics/logs to be missed.
This change aligns behavior and adds dedicated tests to protect expected ownership semantics with and without shard-distributor.

**How did you test it?**
- `go test -v ./service/matching/handler TestErrIfShardOwnershipLost`
- `go test -v ./service/matching/handler/...`

**Potential risks**
- Matching ownership guard behavior is sensitive and sits in task-list manager acquisition flow.
- Error typing/shape changes can affect downstream error classification or alerting expectations.
  - at the same time, the behaviour is now aligned with the previous
    (ringpop) case

**Release notes**
N/A (internal correctness/consistency fix).

**Documentation Changes**
N/A.

Signed-off-by: Jan Kisel <dkrot@uber.com>
@dkrotx dkrotx force-pushed the dkrot/fix-errIfShardOwnershipLost branch from 185e324 to dd96aa0 Compare February 24, 2026 13:47
@gitar-bot
Copy link

gitar-bot bot commented Feb 24, 2026

Code Review ✅ Approved 2 resolved / 2 findings

Clean refactor that standardizes ownership error types across SD and ringpop paths. Previous finding about missing SD happy path test has been resolved with the new test case at line 464. No new issues found.

✅ 2 resolved
Quality: Unused currentHostIdentity() method added

📄 service/matching/handler/engine.go:1535
The new currentHostIdentity() method (lines 1535-1542) is not called anywhere in the codebase. While it appears to be future-proofing, adding dead code without an immediate consumer increases maintenance surface. Consider either adding it in the PR where it's actually needed, or adding a comment explaining its intended future use to prevent it from being flagged/removed by linters or future contributors.

Quality: Missing test case for SD happy path (non-nil shard process)

📄 service/matching/handler/engine_test.go:407
The new TestErrIfShardOwnershipLost covers 6 scenarios but is missing the SD happy path: when IsOnboardedToSD() returns true, GetShardProcess returns a non-nil shard process with no error, and the function should return nil. This would exercise line 1511 (return nil) in engine.go.

Without this test case, you don't verify that the SD-onboarded path correctly returns no error when the shard is still owned. While the logic is straightforward, adding this case would complete the branch coverage matrix.

Rules ✅ All requirements met

Repository Rules

PR Description Quality Standards: PR description comprehensively addresses all required sections including intent, implementation details, test coverage, and risk assessment

1 rule not applicable. Show all rules by commenting gitar display:verbose.

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@dkrotx dkrotx merged commit 19e3901 into cadence-workflow:master Feb 24, 2026
42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants