Skip to content

Comments

Correct self-comparison in ShardCoordinator ResendShardHost handler#8050

Open
MattKotsenas wants to merge 5 commits intoakkadotnet:devfrom
MattKotsenas:refactor/region-equality
Open

Correct self-comparison in ShardCoordinator ResendShardHost handler#8050
MattKotsenas wants to merge 5 commits intoakkadotnet:devfrom
MattKotsenas:refactor/region-equality

Conversation

@MattKotsenas
Copy link
Contributor

@MattKotsenas MattKotsenas commented Feb 23, 2026

Changes

I'm new to contributing to Akka.NET, and as I was reading / reviewing code, I noticed that region.Equals(region) compared the variable to itself, making the condition always true and the else branch (which is empty) unreachable. Changed to region.Equals(m.Region) to compare the current region from state against the original region captured when the ResendShardHost timer was scheduled.

I believe that this is wasteful but has no user visible impact. However, I spent enough time looking at it that I thought it was worth fixing just to prevent future confusion.

If there's anything I did incorrect, I apologize in advance, please let me know. Thanks!

Checklist

region.Equals(region) compared the variable to itself, making the
condition always true and the else branch (shard reallocated to another
region) unreachable dead code. Changed to region.Equals(m.Region) to
compare the current region from state against the original region
captured when the ResendShardHost timer was scheduled.
Copy link
Member

@Aaronontheweb Aaronontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

case ResendShardHost m:
{
if (State.Shards.TryGetValue(m.Shard, out var region) && region.Equals(region))
if (State.Shards.TryGetValue(m.Shard, out var region) && region.Equals(m.Region))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woof - yeah this is a real bug alright. It's probably really rare in practice since the surface area for when this could cause a problem is in the middle of shard rebalancing during the timeout window, but you could potentially end up in a situation like #6973 - we know from affected users that the primary source of that issue was actually https://petabridge.com/blog/worst-dotnet-bug/ (ClusterSingleton split brain) but this is another route by which that same problem could be achieved. Nice catch.

@Aaronontheweb Aaronontheweb enabled auto-merge (squash) February 24, 2026 20:57
@Aaronontheweb Aaronontheweb added this to the 1.5.61 milestone Feb 24, 2026
@Aaronontheweb
Copy link
Member

Akka.Cluster.Sharding.Tests.ShardEntityFailureSpec.Persistent shard must recover from transient failures inside sharding entity constructor and PreStart method

Failed: Timeout 00:00:03 while waiting for a message of type System.String

unrelated racy spec. I'll file an issue for that an attempt a fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants