[Bug]: Implement Concurrent Health Checks for gateways instead of sequential. #1522

@kevalmahajan

🐞 Bug Summary

The current implementation of check_health_of_gateways() performs health checks sequentially, causing total execution time to scale linearly with the number of gateways (O(n·t)). This results in significant delays when many gateways are registered.

The proposed optimization runs the health checks concurrently with asyncio.gather(), delegating each check to a new helper method, _check_single_gateway_health(). This is not implemented yet and needs to be added.
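A minimal sketch of what the concurrent version might look like. The Gateway dataclass and the probe() coroutine are hypothetical stand-ins (the real service would use its own gateway model and an actual network request); only check_health_of_gateways(), _check_single_gateway_health(), and asyncio.gather() come from this issue.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Gateway:
    # Hypothetical stand-in for the project's gateway record.
    name: str
    url: str
    reachable: bool = True


async def probe(gateway: Gateway) -> None:
    """Hypothetical probe; a real one would issue a network request."""
    await asyncio.sleep(0.01)  # simulated network latency
    if not gateway.reachable:
        raise ConnectionError(f"{gateway.name} did not respond")


async def _check_single_gateway_health(gateway: Gateway) -> bool:
    """Check one gateway; a failed probe is reported as False, not raised."""
    try:
        await probe(gateway)
        return True
    except Exception:
        return False


async def check_health_of_gateways(gateways: list[Gateway]) -> dict[str, bool]:
    """Run all checks concurrently and map gateway name -> healthy."""
    results = await asyncio.gather(
        *(_check_single_gateway_health(g) for g in gateways),
        # Any unexpected error is returned as a value instead of being raised.
        return_exceptions=True,
    )
    # gather() preserves input order, so results line up with gateways.
    return {g.name: (r is True) for g, r in zip(gateways, results)}
```

Keeping the per-gateway logic in one helper means a single unreachable gateway only affects its own result. A production version would likely also want a per-check timeout and possibly a cap on concurrent checks; neither is shown here.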


🧩 Affected Component

Select the area of the project impacted:

  • mcpgateway - API
  • mcpgateway - UI (admin panel)
  • mcpgateway.wrapper - stdio wrapper
  • Federation or Transports
  • CLI, Makefiles, or shell scripts
  • Container setup (Docker/Podman/Compose)
  • Other (explain below)

🔁 Steps to Reproduce

  1. Register multiple gateways (e.g., 20–50).
  2. Trigger a full health check via check_health_of_gateways().
  3. Observe that each gateway health check runs sequentially.
  4. Total time increases proportionally with the number of gateways.
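The linear growth in the steps above can be reproduced without a running gateway service. The sketch below is a simulation, not the project's code: fake_health_check() is a hypothetical stand-in that just sleeps for a fixed delay, and check_sequentially() awaits the checks one after another the way the current implementation is described.

```python
import asyncio
import time


async def fake_health_check(delay: float) -> bool:
    """Simulated check: waits `delay` seconds, then reports healthy."""
    await asyncio.sleep(delay)
    return True


async def check_sequentially(n: int, delay: float) -> float:
    """Await n simulated checks one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        await fake_health_check(delay)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Elapsed time should grow roughly in step with n.
    for n in (5, 10, 20):
        elapsed = asyncio.run(check_sequentially(n, delay=0.05))
        print(f"{n:>2} gateways: {elapsed:.2f}s")
```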

🤔 Expected Behavior

Health checks should run concurrently, not sequentially.
check_health_of_gateways() should use asyncio.gather() to check all gateways at once.
Total execution time should drop from O(n·t) to roughly O(t), bounded by the slowest single check, where:

  • n = number of gateways
  • t = time to check one gateway
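To illustrate the expected change in total time, here is a small simulated comparison. It assumes the checks are I/O-bound and that no concurrency limit is applied; fake_health_check() and the helper names are hypothetical and only model per-gateway latency with asyncio.sleep().

```python
import asyncio
import time


async def fake_health_check(delay: float) -> bool:
    """Simulated check: waits `delay` seconds, then reports healthy."""
    await asyncio.sleep(delay)
    return True


async def run_sequential(delays: list[float]) -> None:
    # One check at a time: total time is about sum(delays).
    for d in delays:
        await fake_health_check(d)


async def run_concurrent(delays: list[float]) -> None:
    # All checks in flight together: total time is about max(delays).
    await asyncio.gather(*(fake_health_check(d) for d in delays))


async def timed(coro) -> float:
    """Await a coroutine and return how long it took in seconds."""
    start = time.perf_counter()
    await coro
    return time.perf_counter() - start


async def compare(delays: list[float]) -> tuple[float, float]:
    """Return (sequential_seconds, concurrent_seconds) for the same delays."""
    sequential = await timed(run_sequential(delays))
    concurrent = await timed(run_concurrent(delays))
    return sequential, concurrent


if __name__ == "__main__":
    seq, conc = asyncio.run(compare([0.05] * 20))
    print(f"sequential: {seq:.2f}s, concurrent: {conc:.2f}s")
```

With unequal delays, the concurrent run tracks the slowest check rather than the sum, which is why O(t) is best read as "about as long as the slowest gateway".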

🧠 Environment Info

You can retrieve most of this from the /version endpoint.

| Key | Value |
| --- | --- |
| Version or commit | e.g. v0.9.0 or main@a1b2c3d |
| Runtime | e.g. Python 3.11, Gunicorn |
| Platform / OS | e.g. Ubuntu 22.04, macOS |
| Container | e.g. Docker, Podman, none |

Labels

bug (Something isn't working), triage (Issues / Features awaiting triage)
