gazi-yestemirova commented Nov 26, 2025

What changed?

  • executorstore.GetState now returns ctx.Err() when etcd calls are aborted by context cancellation, preventing “get executor data” from being logged as an internal failure.
  • runRebalancingLoop, runShardStatsCleanupLoop, and rebalanceShardsImpl now check whether the error is context.Canceled or context.DeadlineExceeded.

Why?
Context cancellations are expected when leadership changes or the service stops, but they were being treated as internal errors and polluting the logs. This change ignores cancellation errors while keeping genuine errors visible.

How did you test it?
unit-tests

Potential risks

Release notes

Documentation Changes

gazi-yestemirova changed the title from "Store err logs" to "refactor: [shard-distributor]Remove error logs from store level" on Nov 28, 2025
gazi-yestemirova and others added 5 commits November 28, 2025 10:44
Signed-off-by: Gaziza Yestemirova <[email protected]>
…#7490)

**What changed?**
Reverting the trimprefix change, since the constants we compare the
values against include the prefix.

**Why?**
Constants that include the prefix are used to

**How did you test it?**
Deployed in staging

**Potential risks**
Corruption of db, which is already the case.

**Release notes**

**Documentation Changes**

---------

Signed-off-by: edigregorio <[email protected]>
Signed-off-by: Gaziza Yestemirova <[email protected]>
…hardOwner (cadence-workflow#7476)

**What changed?**
Changed `GetShardOwner` to return an `ExecutorOwnership` struct
containing both executor ID and metadata map, instead of just the
executor ID string.
Also adds a Spectators group so we can easily pass around all
spectators.
**Why?**
Enables callers to access additional executor information like gRPC
address for peer routing, without requiring separate lookups. This is
needed for implementing canary peer chooser that routes requests to
executors based on their addresses.
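
As a rough illustration of the shape described above (not the actual definition from this commit), the returned value might look like the sketch below. The field names, the `grpc_address` metadata key, and the `peerAddress` helper are all assumptions made for the example.

```go
package main

import "fmt"

// ExecutorOwnership mirrors the struct described above: the executor ID plus
// a metadata map. The field names here are assumed for illustration.
type ExecutorOwnership struct {
	ExecutorID string
	Metadata   map[string]string
}

// peerAddress is a hypothetical helper that reads a gRPC address from the
// metadata. The "grpc_address" key is an assumption, not a documented key.
func peerAddress(owner ExecutorOwnership) (string, bool) {
	addr, ok := owner.Metadata["grpc_address"]
	return addr, ok && addr != ""
}

func main() {
	owner := ExecutorOwnership{
		ExecutorID: "executor-1",
		Metadata:   map[string]string{"grpc_address": "10.0.0.5:7933"},
	}
	// A caller can route by address without a second lookup.
	if addr, ok := peerAddress(owner); ok {
		fmt.Printf("route to %s at %s\n", owner.ExecutorID, addr)
	}
}
```

Callers that only need the executor ID can keep reading `ExecutorID` and ignore the metadata.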

**How did you test it?**
Updated all tests to verify metadata is included in responses. Verified
locally that ownership information includes metadata.

**Potential risks**
Low - this is an API enhancement that maintains backward compatibility
by returning the same executor ID, just with additional metadata.

**Release notes**

**Documentation Changes**
None

---------

Signed-off-by: Jakob Haahr Taankvist <[email protected]>
Signed-off-by: Gaziza Yestemirova <[email protected]>
gazi-yestemirova changed the title from "refactor: [shard-distributor]Remove error logs from store level" to "refactor: [shard-distributor]Handle context.Cancelled errors" on Nov 28, 2025