-
Notifications
You must be signed in to change notification settings - Fork 185
refactor(registry): Replace event-driven GC with a lease-based lifecycle #1476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(registry): Replace event-driven GC with a lease-based lifecycle #1476
Conversation
Reorganizes the methods in `registry.go` to improve readability and group related functionality. Public API methods are placed first, followed by administrative helpers, GC logic, and finally callbacks and statistics propagation. This is to minimize the delta for subsequent changes. No logical changes are included in this commit.
Removes the complex, stateful signaling mechanism from the `managedQueue`. Its sole responsibility is now to decorate a `SafeQueue` with atomic, strictly consistent statistics tracking. This change is the foundational first step in moving the registry from a complex, event-driven lifecycle model to a simpler, lease-based one. The now-unused event and lifecycle types are also removed.
Updates the `registryShard` to work with the new, signal-free `managedQueue`. The shard's responsibility is simplified; it no longer needs to handle or propagate lifecycle signals from its queues. Its draining logic is now based on directly checking its aggregate length, removing the need for the `BecameDrained` signal.
Replaces the registry's core lifecycle management system. The complex, event-driven, actor-based model is removed in favor of a simpler and more performant lease-based lifecycle. This commit introduces: - A new `WithConnection` client API that provides a safe, leak-proof entry point for managing flow leases via reference counting. - A per-flow `RWMutex` to provide fine-grained locking, allowing the garbage collector to operate on one flow without blocking others. - A simplified GC that periodically scans for flows with a zero lease count. Simultaneously, this commit removes the old machinery: - The central event-processing loop (`Run` is now just for GC). - The `RegisterOrUpdateFlow` administrative method. - The entire signaling and event-handling subsystem.
Updates all tests in the registry package to align with the new lease-based API and simplified component responsibilities. - `managedqueue_test` is rewritten to focus on stats propagation and invariants, removing all signal testing. - `shard_test` is updated to reflect its simpler role and new draining logic. - `registry_test` is completely overhauled to test the `WithConnection` API, JIT registration, and the new lease-based GC logic, using a fake clock for deterministic lifecycle testing.
|
Hi @LukeAVanDrie. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
| // ErrInvalidShardCount indicates that an invalid shard count was provided (e.g., zero or negative). | ||
| ErrInvalidShardCount = errors.New("invalid shard count") | ||
|
|
||
| // ErrShardDraining indicates that an operation could not be completed because the target shard is in the process of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This becomes the responsibility of the controller.FlowController which distributes requests across workers.
| // mu protects all mutating operations (writes) on the queue. It ensures that the underlying queue's state and the | ||
| // atomic counters are updated atomically. Read operations (like `Peek`) do not acquire this lock. | ||
| mu sync.Mutex | ||
| // --- Immutable Identity & Dependencies (set at construction) --- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply regrouped to better align with the concurrency model. No major changes here except flattening/renaming onStatsDelta from the now removed managedQueueCallbacks and adding isDraining.
| // It is stored as an `int32` for atomic operations. | ||
| status atomic.Int32 // `componentStatus` | ||
| // onStatsDelta is the callback used to propagate statistics changes up to the parent registry. | ||
| onStatsDelta propagateStatsDeltaFunc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flattened (and renamed) from the now-removed parentCallbacks.
|
|
||
| // mu protects the shard's internal maps (`priorityBands`). | ||
| // mu protects the shard's internal topology (`priorityBands`) and `config`. | ||
| // TODO: This is a priority inversion issue. Administrative operations (e.g., GC) for a low-priority flow block all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will tackle this in a fast-followup PR once the flow control system is wired up and benchmarked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A per band lock makes sense, but it is not clear to me if this should be a "fast" follow; once we have the system working end-to-end, I recommend listing all followup items and then prioritize them.
| // | ||
| // Errors returned by the callback `fn` are propagated up. | ||
| // Returns `ErrFlowIDEmpty` if the provided key has an empty ID. | ||
| WithConnection(key types.FlowKey, fn func(conn ActiveFlowConnection) error) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative here is:
FlowRegistryClient.Connect with a corresponding ActiveFlowConnection.Close. The functional approach is leak proof and more architecturally sound, but I am open to feedback here. Connect/Close also works just fine with the flow control synchronous blocking model.
e.g.,
FlowController.EnqueueAndWait(...) (...) {
conn := fc.registryClient.Connect(req.FlowKey())
defer conn.Close()
// ... rest of logic proceeds as normal
}
Also, this method should eventually accept and respect caller context. I am not doing this in this PR as this will need to be done comprehensively across the flow control module contracts. As of right now, none of our function have unbounded blocking, so this is more for contract hardening and tracing than correctness at the moment.
I will do this in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can simply be in registry.go, but I plan on expanding the set of functionality exposed from ActiveConnection, and I want to keep registry.go lean. I don't have a strong opinion here though. We should do whatever is best for readability / maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dramatically shortened for doc maintainability (plus a lot of it became obsolete with the new lease-based model). Following principle of first disclosure by moving details closer to the relevant types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly modified and rolled into registry.go as it no longer has a role as an eventually consistent cache.
| "sigs.k8s.io/gateway-api-inference-extension/pkg/epp/util/logging" | ||
| ) | ||
|
|
||
| // propagateStatsDeltaFunc defines the callback function used to propagate statistics changes (deltas) up the hierarchy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved here from the now deleted events.go. No changes to this signature (or stats propagation generally).
| // It is the sole source of truth for determining if a flow is Idle. | ||
| leaseCount atomic.Int64 | ||
|
|
||
| // becameIdleAt tracks the time at which the lease count last dropped to zero. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clearer to follow and test than the generation marker we used previously... but the same concept.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
| } | ||
| } | ||
|
|
||
| // RegisterOrUpdateFlow handles the registration of a new flow instance or the update of an existing instance's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed in favor of the JIT registration in the WithConnection path. We don't allow flow specific config overrides yet, so aReconcile/Update method is not necessary.
| return true | ||
| } | ||
|
|
||
| // verifyFlowIsTrulyIdleLocked performs the "stop-the-world" verification step of GC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of this complex "stop the world" logic is needed anymore as our reference counter (leases) is already aggregated at the registry level and not scopes to individual queue instances.
|
|
||
| // UpdateShardCount dynamically adjusts the number of internal state shards. | ||
| func (fr *FlowRegistry) UpdateShardCount(n int) error { | ||
| // updateShardCount dynamically adjusts the number of internal state shards. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Privated for now per feedback on the previous registry PR.
| fr.mu.Unlock() | ||
|
|
||
| // Execute all deferred side effects outside the lock. | ||
| for _, action := range deferredActions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for deferred actions anymore since everything is synchronous and can happen under lock (no side-effects).
| func (fr *FlowRegistry) buildFlowComponents(key types.FlowKey, numInstances int) ([]flowComponents, error) { | ||
| bandConfig, err := fr.config.getBandConfig(key.Priority) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to get configuration for priority %d: %w", priority, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error returned from getBandConfig is already well-formed. I was doing redundant error wrapping.
|
/ok-to-test |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahg-g, LukeAVanDrie The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…cle (kubernetes-sigs#1476) * refactor: Reorder methods for logical grouping Reorganizes the methods in `registry.go` to improve readability and group related functionality. Public API methods are placed first, followed by administrative helpers, GC logic, and finally callbacks and statistics propagation. This is to minimize the delta for subsequent changes. No logical changes are included in this commit. * refactor: Simplify managedQueue to a decorator Removes the complex, stateful signaling mechanism from the `managedQueue`. Its sole responsibility is now to decorate a `SafeQueue` with atomic, strictly consistent statistics tracking. This change is the foundational first step in moving the registry from a complex, event-driven lifecycle model to a simpler, lease-based one. The now-unused event and lifecycle types are also removed. * refactor: Adapt shard to new managedQueue contract Updates the `registryShard` to work with the new, signal-free `managedQueue`. The shard's responsibility is simplified; it no longer needs to handle or propagate lifecycle signals from its queues. Its draining logic is now based on directly checking its aggregate length, removing the need for the `BecameDrained` signal. * feat: Replace events lifecycle with leases Replaces the registry's core lifecycle management system. The complex, event-driven, actor-based model is removed in favor of a simpler and more performant lease-based lifecycle. This commit introduces: - A new `WithConnection` client API that provides a safe, leak-proof entry point for managing flow leases via reference counting. - A per-flow `RWMutex` to provide fine-grained locking, allowing the garbage collector to operate on one flow without blocking others. - A simplified GC that periodically scans for flows with a zero lease count. Simultaneously, this commit removes the old machinery: - The central event-processing loop (`Run` is now just for GC). - The `RegisterOrUpdateFlow` administrative method. - The entire signaling and event-handling subsystem. * test: Overhaul tests for new lease-based design Updates all tests in the registry package to align with the new lease-based API and simplified component responsibilities. - `managedqueue_test` is rewritten to focus on stats propagation and invariants, removing all signal testing. - `shard_test` is updated to reflect its simpler role and new draining logic. - `registry_test` is completely overhauled to test the `WithConnection` API, JIT registration, and the new lease-based GC logic, using a fake clock for deterministic lifecycle testing.
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
This PR introduces a major architectural refactoring of the flow registry. It replaces the complex, event-driven, actor-model-based garbage collection with a simpler and more performant lease-based lifecycle model.
The primary motivation is to enhance correctness, improve performance by removing global lock contention, and provide a safer, leak-proof client API.
Architectural Changes:
This refactoring touches every part of the registry. To help with the review, here is a high-level overview of the "before" and "after" states:
Previous Architecture (Event-Driven Actor Model)
RegisterOrUpdateFlow. Their lifecycle was tracked via an eventually consistent cache (flowState) that was updated by asynchronous signals (BecameEmpty,BecameNonEmpty) sent from eachmanagedQueueover a buffered channel.Runloop acted as a serialized actor, processing all events to prevent race conditions in the control plane.New Architecture (Lease-Based Lifecycle)
WithConnectionAPI is the sole entry point to the data path. It handles Just-In-Time (JIT) registration of flows and manages their lifecycle via a lease. A flow is considered "Active" as long as it has one or more active leases.sync.Mapfor highly-concurrent lookups of flow states.atomic.Int64(leaseCount) on each flow for lock-free reference counting.sync.RWMutex(gcLock) that provides surgical locking, arbitrating between active connections (read lock) and the garbage collector (write lock) for a single flow without impacting any other flows.leaseCountis zero and an idleness timer has expired. The "stop-the-world" pause is completely eliminated.Key Benefits:
WithConnectioncallback pattern makes resource leaks impossible from the client's perspective, as the lease is guaranteed to be released.Suggested Review Plan:
This is a large change, but it has been structured into a series of logical commits to make the review easier. I recommend reviewing this PR commit-by-commit:
refactor: Reorder methods for logical groupingregistry.gofor better readability. You can scan and approve this one quickly.refactor: Simplify managedQueue to a decoratorrefactor: Adapt shard to to new managedQueue contractregistryShard, which no longer needs to handle signals from its queues.feat: Replace event lifecycle with leasesWithConnectionAPI, the newflowStatewith itsgcLock, and replaces the entire GC and administrative logic.test: Overhaul tests for new lease-based designregistry_test.go, which now uses a fake clock for deterministic lifecycle testing.Which issue(s) this PR fixes:
Tracks #674
Does this PR introduce a user-facing change?: