sjmonson
Collaborator

@sjmonson sjmonson commented Jun 13, 2025

By default, httpx keeps the last few connections open in a pool for reuse. Since we create the httpx connection inside an asyncio task, it is bound to the lifetime of the task rather than the httpx client. Thus, when the client attempts to reuse the connection in a new task, it fails. There may be performance implications for this change, but since the default is only to reuse the last 5 connections, it's likely minor. (ref)
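
A minimal stdlib-only sketch of the failure mode described above, assuming each task runs under its own event loop (for example via separate `asyncio.run` calls). `LoopBoundConnection` and `pool` are hypothetical stand-ins for a pooled connection and the client's pool; they are not httpx or GuideLLM code.

```python
import asyncio


class LoopBoundConnection:
    """Hypothetical pooled connection that remembers the loop it was created on."""

    def __init__(self):
        self._loop = asyncio.get_running_loop()

    def send(self, data):
        # Transports schedule their I/O on the loop that created them.
        self._loop.call_soon(lambda: None)


pool = []  # stand-in for a client-level connection pool that outlives the tasks


async def first_task():
    pool.append(LoopBoundConnection())
    pool[0].send(b"first request")  # works: the creating loop is still running


async def second_task():
    try:
        pool[0].send(b"reused connection")  # the creating loop is closed by now
    except RuntimeError as exc:
        return str(exc)
    return None


asyncio.run(first_task())
error = asyncio.run(second_task())
print(error)  # Event loop is closed
```

Turning off keep-alive sidesteps this because no connection outlives the loop that opened it.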

Fixes: #147

@sjmonson sjmonson requested review from dagrayvid and markurtz June 13, 2025 14:51

📦 Build Artifacts Available
The build artifacts (.whl and .tar.gz) have been successfully generated and are available for download: https://github.com/neuralmagic/guidellm/actions/runs/15637451220/artifacts/3323489992.
They will be retained for up to 30 days.

Collaborator

@markurtz markurtz left a comment

@sjmonson good find on the issue! With the default for it set to 20, though, I am worried about the potential performance implications. It looks like we can call .aclose / .close on shutdown and that would release the connection pools. Could we see if there's a simple way to add that in?

@ashishkamra

Some other suggestions including Mark's:

  1. Keep reuse, close pools on shutdown

Call await self._async_client.aclose() inside a __del__ or an explicit shutdown() hook
(httpx docs recommend this)
python-httpx.org
github.com

  2. Reuse a singleton client per event loop

Instantiate the client in the backend constructor or in a FastAPI lifespan hook and share it across tasks; close it when the loop ends
medium.com
community.temporal.io

  3. Make keep-alive configurable

Expose a --httpx-max-keepalive CLI flag or a GUIDELLM_HTTPX_KEEPALIVE env var; default to 20, and let users drop to 0 if they see the bug.

  4. Hybrid limits

httpx.Limits(max_keepalive_connections=1, keepalive_expiry=1) keeps a single hot connection per host but is immune to cross-loop reuse in most cases.

@sjmonson sjmonson requested a review from markurtz June 16, 2025 17:35
@sjmonson
Collaborator Author

I just realized that prepare_multiprocessing should handle this, but it does not because it is called after the workers are forked. I will work on a new patch that reuses prepare_multiprocessing.

@sjmonson
Collaborator Author

The new patch did not work for some reason. Reverting to the last patch because it's known to work and has minimal performance impact. Since this is a high-priority bug, we just need to get this in and can investigate why prepare_multiprocessing does not work later.

@sjmonson sjmonson merged commit 023c8dd into main Jun 20, 2025
15 checks passed
@sjmonson sjmonson deleted the fix/event_loop_closed branch June 20, 2025 19:26

Development

Successfully merging this pull request may close these issues.

Event loop is closed error.

3 participants