Fix grains not recycling after cancelling DelayDeactivation#9945

Open
Copilot wants to merge 6 commits into main from copilot/fix-grain-deactivation-issue

Contributor

Copilot AI commented Feb 23, 2026

  • Fix ActivationData.DelayDeactivation: restore Timeout.InfiniteTimeSpan as infinite keep-alive, keep rescheduling fix when cancelling
  • Update Grain.DelayDeactivation docs to describe the 3 typical cases
  • Add ActivationCollector to InternalGrainRuntime
  • Add test grain interface and implementation
  • Fix flaky test ActivationCollectorShouldCollectAfterCancellingKeepAlive by using ForceActivationCollection instead of timing-based Task.Delay for collection verification
Original prompt

This section details on the original issue you should resolve

<issue_title>Grains are not recycled after calling DelayDeactivation(Timespan.FromMilliseconds(-1))</issue_title>
<issue_description>Orleans 9.1.2
Grains are not deactivated after making these calls inside a grain method:

this.DelayDeactivation(TimeSpan.FromSeconds(300));
this.DelayDeactivation(TimeSpan.FromMilliseconds(-1));

After that, the grain is never deactivated.</issue_description>

Comments on the Issue (you are @copilot in this section)


@ReubenBond
Member

@copilot continue working on this

@ReubenBond
Member

@copilot continue working on this

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>
Root cause: Timeout.InfiniteTimeSpan equals TimeSpan.FromMilliseconds(-1),
so calling DelayDeactivation(TimeSpan.FromMilliseconds(-1)) was matched by
the Timeout.InfiniteTimeSpan check and set KeepAliveUntil to DateTime.MaxValue
instead of cancelling the keep-alive.

Fix: Move the negative timespan check before the TimeSpan.MaxValue check and
remove the Timeout.InfiniteTimeSpan comparison. When cancelling an active
keep-alive, reschedule the collection ticket to CollectionAgeLimit so the
grain can be collected normally.

Also adds ActivationCollector to InternalGrainRuntime for access from
ActivationData, test grain interface/implementation, and integration test.

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 23, 2026

@copilot continue working on this

Found the root cause and fixed it in commit 70f9e9f.

The bug was that Timeout.InfiniteTimeSpan equals TimeSpan.FromMilliseconds(-1). In DelayDeactivation, the Timeout.InfiniteTimeSpan check was evaluated before the negative timespan check, so calling DelayDeactivation(TimeSpan.FromMilliseconds(-1)) was treated as "keep alive forever" (KeepAliveUntil = DateTime.MaxValue) instead of cancelling the keep-alive.

Fix: reordered the checks so negative/zero timespans are handled first, removed the Timeout.InfiniteTimeSpan comparison (only TimeSpan.MaxValue now triggers infinite keep-alive), and added rescheduling of the collection ticket when cancelling an active keep-alive. All 10 ActivationCollector tests pass.
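
The equality behind this bug can be checked directly. The snippet below is a minimal standalone sketch that uses only the .NET base library and no Orleans code; it shows why a check for Timeout.InfiniteTimeSpan and a check for "negative or zero" both match the same argument, so whichever is evaluated first decides the outcome.

```csharp
using System;
using System.Threading;

// Timeout.InfiniteTimeSpan is defined as -1 millisecond.
TimeSpan infinite = Timeout.InfiniteTimeSpan;
TimeSpan minusOneMs = TimeSpan.FromMilliseconds(-1);

// Both conditions hold for the same value, so check order matters.
bool matchesInfiniteSentinel = minusOneMs == infinite;
bool matchesNegativeOrZero = minusOneMs <= TimeSpan.Zero;

Console.WriteLine(matchesInfiniteSentinel); // True
Console.WriteLine(matchesNegativeOrZero);   // True
```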

Copilot AI changed the title from "[WIP] Fix grains not recycling after delay deactivation" to "Fix grains not recycling after DelayDeactivation(TimeSpan.FromMilliseconds(-1))" on Feb 23, 2026
@ReubenBond ReubenBond marked this pull request as ready for review February 23, 2026 23:08
Copilot AI review requested due to automatic review settings February 23, 2026 23:08
Contributor

Copilot AI left a comment

Pull request overview

This pull request fixes a critical bug where calling DelayDeactivation(TimeSpan.FromMilliseconds(-1)) to cancel a previous keep-alive would incorrectly set an infinite keep-alive instead, preventing grains from ever being deactivated. The root cause was that Timeout.InfiniteTimeSpan (which equals TimeSpan.FromMilliseconds(-1)) was checked before the timespan <= TimeSpan.Zero condition, causing negative timespans to be misinterpreted.

Changes:

  • Fixed the condition ordering in ActivationData.DelayDeactivation to check negative timespans first, and added rescheduling logic to ensure grains are collected after canceling a keep-alive
  • Added ActivationCollector to InternalGrainRuntime to enable the reschedule call from ActivationData
  • Added test grain interface IKeepAliveActivationGcTestGrain and implementation KeepAliveActivationGcTestGrain to test keep-alive cancellation
  • Added integration test ActivationCollectorShouldCollectAfterCancellingKeepAlive to validate the fix

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/Orleans.Runtime/Catalog/ActivationData.cs | Fixed DelayDeactivation to check negative timespans first and reschedule collection when canceling an active keep-alive |
| src/Orleans.Runtime/Core/InternalGrainRuntime.cs | Added ActivationCollector as a constructor parameter and property to enable rescheduling |
| test/Grains/TestInternalGrainInterfaces/ActivationGCTestGrainInterfaces.cs | Added IKeepAliveActivationGcTestGrain interface for testing keep-alive cancellation |
| test/Grains/TestInternalGrains/ActivationGCTestGrains.cs | Added KeepAliveActivationGcTestGrain implementation |
| test/Orleans.Runtime.Internal.Tests/ActivationsLifeCycleTests/ActivationCollectorTests.cs | Added integration test to verify grains are collected after canceling keep-alive |

@ReubenBond
Member

@copilot In this case, I believe it's the docs that should be updated, since -1ms is interpreted as infinite in many places throughout .NET

If DelayDeactivation(Timeout.InfiniteTimeSpan) caused the runtime to not delay deactivation indefinitely, that would be surprising to most readers. I propose we update the docs to remove the verbiage around negative timespans. So, there would be 3 typical cases:

  • DelayDeactivation(TimeSpan.FromMinutes(5)) - delay for 5 minutes
  • DelayDeactivation(TimeSpan.Zero) - remove any existing delay
  • DelayDeactivation(Timeout.InfiniteTimeSpan) - delay indefinitely

On the other hand, DelayDeactivation(TimeSpan.FromMinutes(-5)) should be an error instead of being equivalent to TimeSpan.Zero. Since that change would break existing code, I think we should hold off on it for now.
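
The three cases can be modelled in a few lines. This is a rough sketch under stated assumptions, not the Orleans implementation: ResolveKeepAliveUntil is a hypothetical helper, and returning null to mean "no keep-alive" is an illustrative choice. It tests the infinite sentinel before the "zero or negative" branch, and it treats TimeSpan.MaxValue as infinite as well, as the earlier discussion in this thread describes.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration only; this is not the Orleans implementation.
// Returns the keep-alive deadline, or null when any existing delay is removed.
static DateTime? ResolveKeepAliveUntil(TimeSpan delay, DateTime utcNow)
{
    // The infinite sentinel is -1 ms, so it must be tested before the "<= Zero" branch.
    if (delay == Timeout.InfiniteTimeSpan || delay == TimeSpan.MaxValue)
    {
        return DateTime.MaxValue;
    }

    // Zero cancels. Other negative values also land here for now; the review
    // comment suggests they should eventually be rejected instead.
    if (delay <= TimeSpan.Zero)
    {
        return null;
    }

    // Clamp so that very large delays cannot overflow DateTime.
    return delay >= DateTime.MaxValue - utcNow ? DateTime.MaxValue : utcNow + delay;
}

var now = new DateTime(2026, 2, 26, 0, 0, 0, DateTimeKind.Utc);
Console.WriteLine(ResolveKeepAliveUntil(TimeSpan.FromMinutes(5), now) == now.AddMinutes(5));  // True
Console.WriteLine(ResolveKeepAliveUntil(TimeSpan.Zero, now) is null);                          // True
Console.WriteLine(ResolveKeepAliveUntil(Timeout.InfiniteTimeSpan, now) == DateTime.MaxValue); // True
```

Swapping the first two branches in this sketch would make Timeout.InfiniteTimeSpan cancel the delay, which is the surprising behaviour the comment above argues against.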

Copilot AI and others added 2 commits February 26, 2026 14:19
Per review feedback, Timeout.InfiniteTimeSpan (-1ms) should delay
deactivation indefinitely, matching .NET convention. TimeSpan.Zero
is the correct way to cancel a previous delay.

- Restore original condition ordering in DelayDeactivation (check
  InfiniteTimeSpan before <= Zero)
- Keep rescheduling fix when cancelling an active keep-alive
- Update Grain.cs docs to describe the 3 typical cases:
  positive duration, Zero (cancel), InfiniteTimeSpan (indefinite)
- Update test grain to use TimeSpan.Zero for cancellation

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>
Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 26, 2026

@copilot In this case, I believe it's the docs that should be updated, since -1ms is interpreted as infinite in many places throughout .NET

If DelayDeactivation(Timeout.InfiniteTimeSpan) caused the...

Updated in 75192fd. Restored Timeout.InfiniteTimeSpan as infinite keep-alive (condition ordering matches original), updated the docs to describe the 3 cases you specified, and changed the test grain to use TimeSpan.Zero for cancellation. The rescheduling fix when cancelling an active keep-alive is preserved.

Copilot AI changed the title from "Fix grains not recycling after DelayDeactivation(TimeSpan.FromMilliseconds(-1))" to "Fix grains not recycling after cancelling DelayDeactivation" on Feb 26, 2026
await grain.SetKeepAlive(TimeSpan.FromMinutes(5));

// Verify the grain is not collected while it has an active keep-alive.
await Task.Delay(WAIT_TIME);
Member

@copilot this test looks like it will be flaky

Contributor Author

Fixed in a7019ff. Replaced the timing-based Task.Delay(WAIT_TIME) with ForceActivationCollection(DEFAULT_IDLE_TIMEOUT) after waiting DEFAULT_IDLE_TIMEOUT + DEFAULT_COLLECTION_QUANTUM for the grain to become idle. This deterministically verifies the grain is collectable instead of relying on the periodic quantum timer.

…ased wait

Use ForceActivationCollection to deterministically verify the grain is
collectable after cancelling a keep-alive, instead of relying on
Task.Delay(WAIT_TIME) and the periodic quantum timer. This avoids
timing-dependent flakiness on slow CI machines.

Co-authored-by: ReubenBond <203839+ReubenBond@users.noreply.github.com>
@ReubenBond
Member

@copilot test

Development

Successfully merging this pull request may close these issues.

Grains are not recycled after calling DelayDeactivation(Timespan.FromMilliseconds(-1))

3 participants