Enable rolling upgrade from LocalGrainDirectory to DistributedGrainDirectory #9881

Draft
ReubenBond wants to merge 5 commits into dotnet:main from ReubenBond:feature/grain-dir-enhancements/1

@ReubenBond (Member) commented Jan 15, 2026

This PR enables basic support for rolling upgrades from the legacy DHT-based LocalGrainDirectory to the DistributedGrainDirectory (enabled by calling ISiloBuilder.AddDistributedGrainDirectory()). When performing an upgrade, users must follow the procedure detailed under the 'Rolling Upgrade Path' header.

There are caveats to this rolling upgrade support. In particular, the directory can be inconsistent during the rollout: duplicate activations can be created for a given grain, since different hosts disagree about which directory is the correct one for that grain. After the rollout is complete and all hosts are running the DistributedGrainDirectory, there is no further chance of inconsistency, and there will be no orphaned (unregistered) or duplicate activations.

A different strategy could include attempts to reduce the chance for inconsistency during the rollout window. For now, that is not something I intend to pursue since it greatly increases complexity for dubious benefit.

Changes

  • Added DelegatingRemoteGrainDirectory - A system target that handles IRemoteGrainDirectory requests from old silos still using LocalGrainDirectory. This allows old silos to communicate with new silos during a rolling upgrade. It implements the same grain types (Constants.DirectoryServiceType, Constants.DirectoryCacheValidatorType) that LocalGrainDirectory uses.
  • Changed service registration - DistributedGrainDirectory and DirectoryMembershipService are now registered on all silos by default (not just when explicitly enabled). This ensures IGrainDirectoryClient is available for recovery queries during rolling upgrades. Calling AddDistributedGrainDirectory() now does the following:
    • Removes LocalGrainDirectory lifecycle participation
    • Swaps IGrainLocator from DhtGrainLocator to CachedGrainLocator
    • Registers the delegating system targets
  • Removed LocalGrainDirectoryPartition from DI - It's now created directly by LocalGrainDirectory instead of being injected.
  • Added UseTestClusterGrainDirectory option to InProcTestClusterOptions for testing.
  • Added GrainDirectoryMigrationTests - Tests covering rolling upgrade scenarios.

Rolling Upgrade Path

To take advantage of this new functionality, you will need to perform two complete rollouts. The first adds support for the new directory on all hosts. The second switches to it by default.

  1. Rollout 1: Upgrade all hosts to Orleans v10.0.0.
  2. Rollout 2: Add a call to AddDistributedGrainDirectory() in your silo configuration.
    • Old silos send IRemoteGrainDirectory messages to new silos, which are handled by DelegatingRemoteGrainDirectory.
    • New silos use DistributedGrainDirectory for all directory operations.
    • You may see duplicate activations during the rollout.
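
For illustration, the silo configuration change in Rollout 2 might look like the sketch below. Only AddDistributedGrainDirectory() comes from this PR. The generic-host setup and UseLocalhostClustering() are placeholder assumptions standing in for whatever hosting and clustering configuration your silos already use.

```csharp
using Microsoft.Extensions.Hosting;
using Orleans.Hosting;

var builder = Host.CreateApplicationBuilder(args);

builder.UseOrleans(silo =>
{
    // Assumption: localhost clustering is only a stand-in for this sketch.
    // Keep your existing clustering provider here.
    silo.UseLocalhostClustering();

    // Rollout 2 only: add this call once every host is already running
    // the version deployed in Rollout 1.
    silo.AddDistributedGrainDirectory();
});

builder.Build().Run();
```

If AddDistributedGrainDirectory() is still marked experimental in the Orleans version you build against, the compiler will report a diagnostic that you need to suppress explicitly.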

Fixes #9356

Copilot AI review requested due to automatic review settings January 15, 2026 01:36
Copilot AI left a comment

Pull request overview

This PR enables rolling upgrades from the legacy DHT-based LocalGrainDirectory to the new DistributedGrainDirectory. The change ensures that old silos using LocalGrainDirectory can communicate with new silos using DistributedGrainDirectory during a rolling upgrade.

Changes:

  • Added DelegatingRemoteGrainDirectory system targets to handle IRemoteGrainDirectory requests from old silos
  • Changed service registration to register DistributedGrainDirectory and DirectoryMembershipService on all silos by default
  • Added test infrastructure option UseTestClusterGrainDirectory and comprehensive migration tests

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/Orleans.Runtime/GrainDirectory/DelegatingRemoteGrainDirectory.cs | New system target that delegates IRemoteGrainDirectory calls from old silos to DistributedGrainDirectory |
| src/Orleans.Runtime/Hosting/CoreHostingExtensions.cs | Updated AddDistributedGrainDirectory to remove LocalGrainDirectory lifecycle participation and register delegating system targets |
| src/Orleans.Runtime/Hosting/DefaultSiloServices.cs | Changed to register DistributedGrainDirectory on all silos and moved IGrainLocator registration |
| src/Orleans.Runtime/GrainDirectory/LocalGrainDirectory.cs | Updated constructor to create LocalGrainDirectoryPartition directly instead of via DI |
| src/Orleans.Runtime/GrainDirectory/LocalGrainDirectoryPartition.cs | Changed constructor to accept ILogger instead of ILoggerFactory |
| src/Orleans.Runtime/GrainDirectory/GrainDirectoryHandoffManager.cs | Removed unused Factory dependency |
| src/Orleans.Runtime/GrainDirectory/GrainLocatorResolver.cs | Changed to use IGrainLocator instead of DhtGrainLocator directly |
| src/Orleans.Runtime/GrainDirectory/GrainDirectoryPartition.cs | Fixed indentation and changed logging level from Error to Warning for transient failures |
| src/Orleans.Runtime/GrainDirectory/DistributedGrainDirectory.cs | Made _localActivations readonly and reordered fields |
| src/Orleans.Runtime/GrainDirectory/CachedGrainLocator.cs | Sealed the class |
| src/Orleans.Runtime/Core/InternalGrainRuntime.cs | Removed ILocalGrainDirectory dependency |
| src/Orleans.Core/Configuration/ServiceCollectionExtensions.cs | Added TaggedServiceDescriptor to track and remove service registrations by implementation type |
| src/Orleans.TestingHost/InProcTestClusterOptions.cs | Added UseTestClusterGrainDirectory property |
| src/Orleans.TestingHost/InProcTestClusterBuilder.cs | Set UseTestClusterGrainDirectory default to true |
| src/Orleans.TestingHost/InProcTestCluster.cs | Added validation and conditional registration of test grain directory |
| test/TesterInternal/LocalGrainDirectoryPartitionTests.cs | Updated test to use NullLogger instead of LoggerFactory and renamed class |
| test/TesterInternal/GrainDirectory/GrainDirectoryMigrationTests.cs | New comprehensive test suite for rolling upgrade scenarios |

@ReubenBond ReubenBond enabled auto-merge January 15, 2026 02:10
@ReubenBond ReubenBond force-pushed the feature/grain-dir-enhancements/1 branch 4 times, most recently from f514819 to ab2801c Compare January 15, 2026 16:13
@ReubenBond ReubenBond force-pushed the feature/grain-dir-enhancements/1 branch from 5d23709 to d9036ea Compare January 15, 2026 17:14
@ReubenBond ReubenBond marked this pull request as draft January 20, 2026 22:01
auto-merge was automatically disabled January 20, 2026 22:01

Pull request was converted to draft

Development

Successfully merging this pull request may close these issues.

Distributed Grain Directory followup work