Prune idle sessions before starting new ones #701

Open · wants to merge 5 commits into main

Conversation

halter73 (Contributor) commented Aug 12, 2025

This PR replaces #677. The key difference from the old PR is that extra idle sessions are now closed immediately before new ones are started, instead of waiting for the IdleTrackingBackgroundService to clean them up the next time its periodic timer fires. To achieve this, I made the ConcurrentDictionary that tracks the stateful StreamableHttpSessions a singleton service that is also responsible for pruning idle sessions. StreamableHttpSession was previously named HttpMcpSession<TTransport>, but it is no longer shared with the SseHandler.
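As a rough sketch of that shape (the type and member names below are simplified stand-ins, not the SDK's actual code), the singleton store prunes the most idle sessions inline before admitting a new one:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

internal sealed class Session : IAsyncDisposable
{
    public required string Id { get; init; }
    public long LastActivityTicks;  // updated on every request
    public volatile bool IsActive;  // true while a request is in flight

    public ValueTask DisposeAsync() => ValueTask.CompletedTask;
}

internal sealed class SessionStore(int maxIdleSessionCount)
{
    private readonly ConcurrentDictionary<string, Session> _sessions = new();

    public async ValueTask AddAsync(Session session)
    {
        // Prune before adding: dispose the most idle sessions right away
        // instead of leaving them for a periodic timer that may be starved.
        // (The real code caps idle sessions specifically; using the total
        // count here keeps the sketch short.)
        while (_sessions.Count >= maxIdleSessionCount && TryTakeMostIdle(out var idle))
        {
            await idle!.DisposeAsync();
        }

        _sessions[session.Id] = session;
    }

    private bool TryTakeMostIdle(out Session? idle)
    {
        // Naive linear scan over all sessions; the PR avoids this cost with
        // a sorted snapshot, shown in the next sketch.
        idle = null;
        long oldest = long.MaxValue;
        foreach (var session in _sessions.Values)
        {
            long ticks = Volatile.Read(ref session.LastActivityTicks);
            if (!session.IsActive && ticks < oldest)
            {
                (idle, oldest) = (session, ticks);
            }
        }
        return idle is not null && _sessions.TryRemove(idle.Id, out _);
    }
}
```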

You can look at the description of #677 to see the consequences of creating too many new sessions without first closing and unrooting a corresponding number of idle sessions. The tl;dr is that overallocating could lead to thread pool starvation as hundreds of threads had to wait on the GC to allocate heap space. This thread pool starvation created a vicious cycle, because it prevented the IdleTrackingBackgroundService from unrooting the idle sessions, causing more of them to get promoted and creating even more work for the GC.

To reduce contention, I reuse the sorted _idleSessionIds list to find the most idle session to remove next. This list only gets repopulated every 5 seconds on a background loop, or when we run out of known idle sessions to close to make room for new ones. This isn't perfect, because a session may briefly become active while sitting in the _idleSessionIds list without getting resorted. However, that is only a problem if the server is at the MaxIdleSessionCount and every idle session that was less recently active at the time of the last sort has already been closed. Considering a sort should happen at least every 5 seconds while sessions are being pruned, I think this is a fair tradeoff to avoid global synchronization on every session creation (at least while under the MaxIdleSessionCount) and every time a session becomes idle.
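A sketch of that optimization, continuing the hypothetical SessionStore above (add `using System.Collections.Generic;` and `using System.Linq;`; the synchronization the real code needs around the snapshot is elided here):

```csharp
private List<string> _idleSessionIds = new();
private int _nextIdleIndex;
private long _lastSortTicks;
private static readonly long SortIntervalTicks = TimeSpan.FromSeconds(5).Ticks;

// Replaces the linear-scan TryTakeMostIdle from the previous sketch.
private bool TryTakeMostIdle(out Session? idle)
{
    long now = DateTime.UtcNow.Ticks;
    if (_nextIdleIndex >= _idleSessionIds.Count || now - _lastSortTicks >= SortIntervalTicks)
    {
        // Re-sort at most every 5 seconds, or when the previous snapshot has
        // been fully consumed.
        _idleSessionIds = _sessions.Values
            .Where(s => !s.IsActive)
            .OrderBy(s => Volatile.Read(ref s.LastActivityTicks))
            .Select(s => s.Id)
            .ToList();
        _nextIdleIndex = 0;
        _lastSortTicks = now;
    }

    while (_nextIdleIndex < _idleSessionIds.Count)
    {
        string id = _idleSessionIds[_nextIdleIndex++];

        // The session may have become active again or been removed since the
        // sort; skip it if so. This is the "briefly active while sitting in
        // the list" tradeoff described above.
        if (_sessions.TryGetValue(id, out var candidate) && !candidate.IsActive
            && _sessions.TryRemove(id, out var removed))
        {
            idle = removed;
            return true;
        }
    }

    idle = null;
    return false;
}
```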

In my testing on a 16-core/64GB VM, a 100,000 idle session limit (the old default) caused the server process to consume 2-3 GB of memory according to Task Manager and limited the new session creation rate to about 60 sessions/second after reaching the MaxIdleSessionCount. At a 10,000 idle session limit (the new default), process memory usage dropped to about 300MB and the session creation rate increased to about 900 sessions/second. And at an even lower 1,000 idle session limit, process memory usage dropped further to about 180MB while the session creation rate increased again to about 5,000 sessions/second. All of these numbers are stable over repeated runs after reaching the MaxIdleSessionCount.
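For reference, the cap is configured through the HTTP transport options when registering the server. A minimal host might look like the following (assuming the SDK's existing WithHttpTransport/HttpServerTransportOptions surface; this PR only changes the MaxIdleSessionCount default):

```csharp
var builder = WebApplication.CreateBuilder(args);

builder.Services
    .AddMcpServer()
    .WithHttpTransport(options =>
    {
        // Lower caps trade idle-session capacity for less memory pressure
        // and a higher new-session rate, per the wrk runs below.
        options.MaxIdleSessionCount = 10_000;
    });

var app = builder.Build();
app.MapMcp();
app.Run();
```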

MaxIdleSessionCount = 10,000 (New default)

```
$ ./wrk -t32 -c256 -d15s http://172.20.240.1:3001/ -s scripts/mcp.lua
Running 15s test @ http://172.20.240.1:3001/
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   260.42ms   24.53ms 342.05ms   77.48%
    Req/Sec    31.61     11.99   110.00     83.07%
  14737 requests in 15.07s, 5.04MB read
Requests/sec:    977.96
Transfer/sec:    342.43KB
```

MaxIdleSessionCount = 100,000 (Old default)

```
$ ./wrk -t32 -c256 -d15s http://172.20.240.1:3001/ -s scripts/mcp.lua --timeout 15s
Running 15s test @ http://172.20.240.1:3001/
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.76s     1.68s    7.47s    56.71%
    Req/Sec     9.08     13.85    79.00     89.39%
  917 requests in 15.05s, 321.01KB read
Requests/sec:     60.92
Transfer/sec:     21.33KB
```

MaxIdleSessionCount = 1,000 (Lower than default)

```
$ ./wrk -t32 -c256 -d15s http://172.20.240.1:3001/ -s scripts/mcp.lua
Running 15s test @ http://172.20.240.1:3001/
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    46.52ms    9.40ms 127.71ms   85.46%
    Req/Sec   172.64     31.54   574.00     76.85%
  82981 requests in 15.08s, 28.38MB read
Requests/sec:   5501.70
Transfer/sec:      1.88MB
```

Single Session Tool Call

The MaxIdleSessionCount has no apparent effect on this test, and I wouldn't expect it to, since we still look up existing sessions the same way we did previously.

```
$ ./wrk -t32 -c256 -d15s http://172.20.240.1:3001/ --timeout=15s -s scripts/mcp.lua <session-id>
Running 15s test @ http://172.20.240.1:3001/
  32 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     9.31ms   14.79ms 370.81ms   96.89%
    Req/Sec     1.05k   179.55     3.23k    78.45%
  503104 requests in 15.10s, 172.05MB read
Requests/sec:  33319.09
Transfer/sec:     11.39MB
```
