[Ingestion] Fix stale session handles causing permanent SessionClosed errors by muhamadazmy · Pull Request #4433 · restatedev/restate

muhamadazmy · 2026-02-23T18:16:45Z

[Ingestion] Fix stale session handles causing permanent SessionClosed errors

SessionManagerInner caches SessionHandles in a DashMap that is never
cleaned up. If a PartitionSession background task terminates (e.g. due
to ConnectError::Shutdown or cancellation), the cached handle becomes
permanently stale — every subsequent ingest() call for that partition
returns SessionClosed forever.

Fix this by checking if the cached handle is closed (via the underlying
mpsc sender) before returning it from SessionManagerInner::get(). When
a stale handle is detected, it is evicted from the cache and a fresh
session is started transparently.
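
For illustration, the evict-on-lookup idea can be sketched with the standard library only. This is not the code in this PR: the real `SessionManagerInner` uses a `DashMap` and detects closure via the underlying mpsc sender, whereas the sketch below assumes a `Mutex<HashMap>` and an atomic flag standing in for the sender. `SessionManager`, `terminate`, `generation`, and `start_session` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

type PartitionId = u16;

/// Hypothetical stand-in for a session handle. The atomic flag plays the
/// role of "the receiving side of the channel is gone".
#[derive(Clone)]
pub struct SessionHandle {
    closed: Arc<AtomicBool>,
    /// Counts which session start produced this handle (illustration only).
    pub generation: usize,
}

impl SessionHandle {
    pub fn is_closed(&self) -> bool {
        self.closed.load(Ordering::Acquire)
    }

    /// Simulates the background session task terminating.
    pub fn terminate(&self) {
        self.closed.store(true, Ordering::Release);
    }
}

#[derive(Default)]
pub struct SessionManager {
    handles: Mutex<HashMap<PartitionId, SessionHandle>>,
    started: AtomicUsize,
}

impl SessionManager {
    /// Returns the cached handle, replacing it first if it is stale.
    pub fn get(&self, id: PartitionId) -> SessionHandle {
        let mut handles = self.handles.lock().unwrap();
        // Evict the cached handle if its session has terminated.
        if handles.get(&id).is_some_and(|h| h.is_closed()) {
            handles.remove(&id);
        }
        // Start a fresh session only when nothing usable is cached.
        handles
            .entry(id)
            .or_insert_with(|| self.start_session())
            .clone()
    }

    fn start_session(&self) -> SessionHandle {
        let generation = self.started.fetch_add(1, Ordering::Relaxed) + 1;
        SessionHandle {
            closed: Arc::new(AtomicBool::new(false)),
            generation,
        }
    }
}
```

Because eviction and re-insertion happen under one lock in this sketch, callers never observe a closed handle coming out of `get()`, and at most one session is started per partition at a time.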

AhmedSoliman · 2026-02-23T19:07:37Z

crates/ingestion-client/src/session.rs

    /// It guarantees that only one session is started per partition id.
    pub fn get(&self, id: PartitionId) -> SessionHandle {
+        // Evict stale handle if the underlying session has terminated.
+        if let Some(handle) = self.handles.get(&id)


Would a Drop implementation that automatically removes the handle from the map be a reliable alternative here?

I pushed a change that just uses self.handles.remove_if().

I was also thinking of automatically removing the handle when the session task is dropped or exits, but doing the check proactively felt easier.

I am reconsidering now after your comment :)

AhmedSoliman · 2026-02-23T19:43:35Z

crates/ingestion-client/src/session.rs


+impl<T> Drop for PartitionSession<T> {
+    fn drop(&mut self) {
+        self.manager.handles.remove(&self.partition);


Is it possible for multiple sessions to exist for the same key?

No, that should not happen. The ingestion client opens sessions reactively, when they are first needed. A session should never terminate on its own (it can reconnect internally); it should keep running until the client is closed via the .close() function.
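
To make the concern behind this question concrete, here is a minimal sketch of Drop-based eviction, assuming a `Mutex<HashMap>` in place of the `DashMap` and a plain `u64` session id in place of a real handle. The diff line above removes the entry unconditionally; the identity check below is an addition for illustration only, showing how a dropped session could avoid evicting a newer session's entry if two sessions ever did exist for the same key. `ManagerInner`, `session_id`, `start`, and `is_cached` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type PartitionId = u16;

/// Hypothetical manager: maps a partition to the id of its current session.
#[derive(Default)]
pub struct ManagerInner {
    handles: Mutex<HashMap<PartitionId, u64>>,
}

pub struct PartitionSession {
    manager: Arc<ManagerInner>,
    partition: PartitionId,
    session_id: u64,
}

impl PartitionSession {
    /// Registers the session in the manager's map, replacing any older entry.
    pub fn start(manager: Arc<ManagerInner>, partition: PartitionId, session_id: u64) -> Self {
        manager
            .handles
            .lock()
            .unwrap()
            .insert(partition, session_id);
        Self { manager, partition, session_id }
    }
}

impl Drop for PartitionSession {
    fn drop(&mut self) {
        // Avoid panicking inside drop if the lock is poisoned.
        if let Ok(mut handles) = self.manager.handles.lock() {
            // Remove the entry only if it still belongs to this session.
            if handles.get(&self.partition) == Some(&self.session_id) {
                handles.remove(&self.partition);
            }
        }
    }
}

/// Helper for inspecting the cache in examples and tests.
pub fn is_cached(manager: &ManagerInner, partition: PartitionId) -> bool {
    manager.handles.lock().unwrap().contains_key(&partition)
}
```

If sessions are guaranteed unique per partition, as the reply above states, the check is purely defensive and an unconditional remove behaves the same.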

AhmedSoliman reviewed Feb 23, 2026

muhamadazmy force-pushed the pr4433 branch 2 times, most recently from 5b9b7f8 to 2c0a1f4 Compare February 23, 2026 19:35

muhamadazmy force-pushed the pr4433 branch from 2c0a1f4 to 0831823 Compare February 23, 2026 19:38

muhamadazmy requested a review from AhmedSoliman February 23, 2026 19:41

AhmedSoliman reviewed Feb 23, 2026

AhmedSoliman approved these changes Feb 23, 2026

muhamadazmy wants to merge 1 commit into restatedev:main from muhamadazmy:pr4433
