Skip to content

Support standbyok query parameter in Vault health checks to avoid HttpClient retry logging in HA deployments #960

@ptirador

Description

@ptirador

Problem

When using Spring Vault 3.2.0 with Vault in HA mode and Apache HttpClient 5.5.1+, health checks generate excessive log noise:

  1. Spring Vault calls /v1/sys/health without query parameters
  2. If Standby Vault nodes are hit, they return HTTP 429
  3. Apache HttpClient 5.5.1+ logs retry attempts at INFO level (HTTPCLIENT-2371). Log example: ex-0000000043 https://<vault-url>:8200 responded with status 429; request will be automatically re-executed in 1 SECONDS (exec count 2)
  4. This creates log noise on every health check cycle (typically every 5-10 seconds)

Context

The log noise was introduced when I upgraded my Spring Boot application from:

  • Spring Boot 3.4.11 (HttpClient 5.4.4) → Spring Boot 3.5.8 (HttpClient 5.5.1+)

HttpClient 5.5.1 added explicit retry logging:

While the health check correctly reports status as "UP", the 429 response triggers HttpClient's retry mechanism, logging each attempt.

Health check response

{
    "status": "UP",
    "components": {
        [...]
        "vault": {
            "status": "UP",
            "details": {
                "state": "Vault in standby",
                "version": "1.15.6"
            }
        }
    }
}

Example Logs

INFO o.a.h.c.h.i.c.HttpRequestRetryExec : ex-0000000042 https://<vault-url>:8200 responded with status 429; request will be automatically re-executed in 1 SECONDS (exec count 2)
INFO o.a.h.c.h.i.c.HttpRequestRetryExec : ex-0000000043 https://<vault-url>:8200 responded with status 429; request will be automatically re-executed in 1 SECONDS (exec count 2)

Proposed Solution
Add configuration properties to support Vault's HA query parameters:

# Option 1: Feature flags
spring.vault.health.standby-ok=true
spring.vault.health.perfstandby-ok=true

# Option 2: Generic query parameters
spring.vault.health.query-params=standbyok=true&perfstandbyok=true

This would make the health check call:

GET /v1/sys/health?standbyok=true&perfstandbyok=true

Benefits

  • Eliminates log noise - Standby nodes return 200, no retries needed
  • More semantically correct - 200 response better represents healthy standby state
  • Improves observability - Real issues won't be hidden in retry noise

Metadata

Metadata

Assignees

No one assigned

    Labels

    status: declinedA suggestion or change that we don't feel we should currently apply

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions