Invoker memory management: Prevent memory ballooning during node restart/leader failover

## Problem Statement

When a Restate node restarts (or gains new partition leaders during failover), all currently invoked invocations on those partitions will be re-invoked. This can cause significant memory spikes ("memory ballooning") due to several factors:

1. **Journal stream materialization** - Journals are fully materialized in memory when reading from storage, causing memory spikes proportional to journal sizes
2. **No global memory budget for the invoker** - The invoker has no mechanism to limit overall memory usage; it only has a concurrency limit which doesn't account for actual memory consumption
3. **HTTP/2 connection buffer memory** - The invoker does not respect the per stream window size and blindly pushes records into a bounded channel
4. **No service endpoint backpressure** - Service endpoints cannot report their available capacity back to Restate, potentially causing both Restate and the endpoint to become overwhelmed

## Current State

The invoker currently has these memory-related controls:
- `concurrent_invocations_limit` (default: 1000) - Limits concurrent invocations but doesn't account for memory
- `in_memory_queue_length_limit` (default: 66,049) - Per-partition limit before spilling to disk
- `message_size_limit` - Limits individual journal message sizes
- Token bucket throttling - Rate limiting for invocations/actions (optional)

However, these controls are insufficient:
- Concurrency limits don't correlate with memory usage (some invocations use MBs, others use KBs)
- Per-partition queue limits can multiply across many partitions (#partitions × limit × size ≈ potentially GBs)
- HTTP/2 adaptive window is enabled, but there's no per-connection memory cap

## Proposed Solution

This umbrella issue tracks the work needed to implement proper memory management for the invoker. The goal is to allow configuring a memory budget that the invoker respects when scheduling invocations.

### Sub-issues / Tasks

#### 1. Avoid journal stream materialization (#275)
Keep the storage transaction open inside `InvocationTask` while reading the journal stream, avoiding full materialization in memory.

**Related:** https://github.com/restatedev/restate/issues/275

#### 2. Implement memory-aware invocation scheduling
- Track estimated memory usage per invocation task
- Implement a memory budget/quota that controls when new invocations can be started
- Consider memory consumption from: journal data, HTTP buffers, response data
- Integrate with potential global node memory limit (#2402)

#### 3. HTTP/2 connection memory configuration
- Configure proper connection and stream window sizes for the `HttpClient` if possible
- If not possible, consider alternative approaches (e.g., limiting concurrent streams per connection more aggressively)
- Let `InvocationTask` respect the limits

#### 4. Service endpoint capacity reporting (protocol extension)
- Extend the service protocol to allow endpoints to report their available capacity
- Restate can use this information to pace invocations to endpoints
- This helps prevent overwhelming service endpoints during bursts

## Success Criteria

- [ ] A restarting node with 100k+ pending invocations with significant state and journals should not OOM
- [ ] Memory usage during restart should be bounded and configurable
- [ ] Service endpoints should not be overwhelmed during restart scenarios
- [ ] Observable metrics for memory usage and throttling behavior

## Related Issues

- #275 - Avoid journal stream materialization by keeping storage transaction open
- #2402 - Make InvokerOptions.in_memory_queue_length_limit respect global memory limit

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Invoker memory management: Prevent memory ballooning during node restart/leader failover #4311

Problem Statement

Current State

Proposed Solution

Sub-issues / Tasks

1. Avoid journal stream materialization (#275)

2. Implement memory-aware invocation scheduling

3. HTTP/2 connection memory configuration

4. Service endpoint capacity reporting (protocol extension)

Success Criteria

Related Issues

Sub-issues

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Invoker memory management: Prevent memory ballooning during node restart/leader failover #4311

Description

Problem Statement

Current State

Proposed Solution

Sub-issues / Tasks

1. Avoid journal stream materialization (#275)

2. Implement memory-aware invocation scheduling

3. HTTP/2 connection memory configuration

4. Service endpoint capacity reporting (protocol extension)

Success Criteria

Related Issues

Sub-issues

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions