feat: Add connection resilience features for better reliability #135
steiner385 wants to merge 4 commits into jenkinsci:main
@Restricted(NoExternalUse.class)
@Extension
@Slf4j
public class HealthEndpoint implements UnprotectedRootAction {
I'm a bit sceptical of such an endpoint in the MCP plugin. This is more of a generic Jenkins endpoint.
Good point, thanks for the feedback. I've updated the health endpoint to be MCP-specific:
Changes:
- Renamed status → mcpServerStatus
- Added activeConnections (current MCP connection count)
- Removed jenkinsVersion (generic Jenkins info)
The endpoint now returns MCP server status and connection metrics rather than generic Jenkins health information:
{
  "mcpServerStatus": "ok",
  "activeConnections": 5,
  "shuttingDown": false,
  "timestamp": "..."
}
This makes it clearly specific to the MCP plugin — useful for MCP clients to check server availability and capacity before establishing connections.
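As a rough illustration of the response described above, here is a minimal Java sketch of how the four fields could be rendered together with the HTTP status (200 when healthy, 503 during shutdown, as stated in the commit message). The class and method names are hypothetical, the "shutting_down" status string is an assumption (the PR only shows "ok"), and none of this is the plugin's actual HealthEndpoint code.

```java
import java.time.Instant;

/** Hypothetical sketch only; not the plugin's actual HealthEndpoint. */
class McpHealthSketch {

    /** HTTP status for the health response: 200 when healthy, 503 while shutting down. */
    static int httpStatus(boolean shuttingDown) {
        return shuttingDown ? 503 : 200;
    }

    /** Renders the MCP-specific payload with the four fields listed in the comment. */
    static String toJson(boolean shuttingDown, int activeConnections, Instant timestamp) {
        // Assumption: the status value used during shutdown is not shown in the PR.
        String status = shuttingDown ? "shutting_down" : "ok";
        return "{"
                + "\"mcpServerStatus\":\"" + status + "\","
                + "\"activeConnections\":" + activeConnections + ","
                + "\"shuttingDown\":" + shuttingDown + ","
                + "\"timestamp\":\"" + timestamp + "\""
                + "}";
    }

    public static void main(String[] args) {
        // Prints the status code followed by the JSON body for a healthy server.
        System.out.println(httpStatus(false) + " " + toJson(false, 5, Instant.now()));
    }
}
```

Keeping the rendering in a pure function like this makes the payload easy to unit-test without a running Jenkins instance; a real implementation would more likely use a JSON library than string concatenation.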
 * @param rsp the Stapler response
 * @throws IOException if writing the response fails
 */
public void doIndex(StaplerRequest2 req, StaplerResponse2 rsp) throws IOException {
Check warning: Code scanning / Jenkins Security Scan
Stapler: Missing permission check (Warning)
 * @param rsp the Stapler response
 * @throws IOException if writing the response fails
 */
public void doIndex(StaplerRequest2 req, StaplerResponse2 rsp) throws IOException {
Check warning: Code scanning / Jenkins Security Scan
Stapler: Missing POST/RequirePOST annotation (Warning)
This commit adds several features to improve MCP connection stability:
- Enable keep-alive by default (30s interval) to detect broken connections faster
- Add lightweight health endpoint at /mcp-server/health (no auth required)
  - Returns HTTP 200 when healthy, HTTP 503 during shutdown
  - Includes Retry-After header during shutdown for client reconnection
- Add metrics endpoint at /mcp-server/health/metrics (auth required)
  - Tracks SSE connections (total/active), Streamable requests, errors
- Add graceful shutdown notification with 5-second grace period
  - Allows clients to detect shutdown state before connections are closed
- Add enhanced connection logging with client identification
  - Logs IP, X-Forwarded-For, and User-Agent for debugging
- Document all resilience features in README
Related to jenkinsci#15 (SSE connection breaking)
Related to jenkinsci#22 (Gateway timeout issues)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
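To illustrate how a client might use the 200/503 status and the Retry-After header described in this commit message, here is a minimal Java sketch of a reconnect policy. The class name, the method, and the 5-second fallback delay are hypothetical choices for illustration; the PR does not specify client behaviour.

```java
import java.time.Duration;
import java.util.Optional;

/** Hypothetical client-side sketch; not part of the plugin. */
class ReconnectPolicySketch {

    // Assumption: fallback delay when the server gives no usable Retry-After value.
    static final Duration DEFAULT_DELAY = Duration.ofSeconds(5);

    /**
     * Returns how long to wait before (re)connecting, or empty if the
     * health check reported the server as healthy (HTTP 200).
     */
    static Optional<Duration> reconnectDelay(int statusCode, String retryAfterHeader) {
        if (statusCode == 200) {
            return Optional.empty(); // healthy: connect immediately
        }
        if (statusCode == 503 && retryAfterHeader != null) {
            try {
                long seconds = Long.parseLong(retryAfterHeader.trim());
                if (seconds >= 0) {
                    return Optional.of(Duration.ofSeconds(seconds));
                }
            } catch (NumberFormatException e) {
                // Retry-After may also carry an HTTP-date; this sketch
                // does not parse that form and falls back to the default.
            }
        }
        return Optional.of(DEFAULT_DELAY);
    }
}
```

A caller would poll the health endpoint, pass the response status and header value to `reconnectDelay`, and sleep for the returned duration before retrying.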
- Change health endpoint to use UnprotectedRootAction at /mcp-health for proper unauthenticated access
- Move metrics endpoint to /mcp-server/metrics (simpler path)
- Handle metrics in process() method since handle() isn't reached
- Update README with correct endpoint paths
The original /mcp-server/health path didn't work because Jenkins's HttpServletFilter.process() runs AFTER security filtering, not before. UnprotectedRootAction is the correct way to expose an unauthenticated endpoint in Jenkins.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add blank lines after constant declarations in Endpoint.java
- Fix import order in McpConnectionMetrics.java (alphabetical)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Address reviewer feedback from olamy about health endpoint being too generic.
Changes:
- Rename response field 'status' to 'mcpServerStatus' for clarity
- Add 'activeConnections' field showing current MCP connection count
- Remove 'jenkinsVersion' field (generic Jenkins info not MCP-specific)
- Update Javadoc to emphasize MCP-specific purpose
- Update README documentation with new response format
The health endpoint now returns MCP server status and connection metrics,
making it clearly specific to the MCP plugin rather than a generic
Jenkins health check.
Response format:
{
  "mcpServerStatus": "ok",
  "activeConnections": 5,
  "shuttingDown": false,
  "timestamp": "..."
}
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
043d4fe to c58d9fc
Summary
This PR adds several features to improve MCP connection stability and help clients detect and recover from connection issues:
- Health endpoint (/mcp-server/health) - Lightweight endpoint for connection monitoring (no auth required)
- Metrics endpoint (/mcp-server/health/metrics) - Connection statistics for debugging (auth required)
Health Endpoint Response
{
  "status": "ok",
  "timestamp": "2025-01-28T10:30:00Z",
  "jenkinsVersion": "2.533",
  "shuttingDown": false
}
Metrics Endpoint Response
{
  "sseConnectionsTotal": 42,
  "sseConnectionsActive": 3,
  "streamableRequestsTotal": 150,
  "connectionErrorsTotal": 2,
  "uptimeSeconds": 3600,
  "startTime": "2025-01-28T10:00:00Z"
}
Related Issues
Test plan
Notes
/mcp-server/mcp) is more reliable than SSE
🤖 Generated with Claude Code
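For readers curious how counters like those in the Metrics Endpoint Response might be maintained, here is a minimal thread-safe Java sketch using AtomicLong. The class and method names are hypothetical and only three of the listed fields are modelled; the PR's actual McpConnectionMetrics class may be structured quite differently.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of connection counters; not the plugin's McpConnectionMetrics. */
class McpMetricsSketch {

    private final AtomicLong sseConnectionsTotal = new AtomicLong();
    private final AtomicLong sseConnectionsActive = new AtomicLong();
    private final AtomicLong connectionErrorsTotal = new AtomicLong();

    /** Called when an SSE connection is opened: bumps both total and active. */
    void onSseOpen() {
        sseConnectionsTotal.incrementAndGet();
        sseConnectionsActive.incrementAndGet();
    }

    /** Called when an SSE connection closes; the active count never drops below zero. */
    void onSseClose() {
        sseConnectionsActive.updateAndGet(n -> Math.max(0, n - 1));
    }

    /** Called when a connection error is observed. */
    void onConnectionError() {
        connectionErrorsTotal.incrementAndGet();
    }

    long getSseConnectionsTotal() {
        return sseConnectionsTotal.get();
    }

    long getSseConnectionsActive() {
        return sseConnectionsActive.get();
    }

    long getConnectionErrorsTotal() {
        return connectionErrorsTotal.get();
    }
}
```

Atomic counters avoid locking on the request path, which matters if every SSE connect and disconnect updates them.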