[Locket] Add configurable Locket DB Health Check

### Proposed Change

**As a** Platform Operator,
**I want** a configurable Locket DB health check,
**So that** I can proactively detect connectivity or availability issues and react before they impact platform stability.

### Problem Details

Currently the Locket service has no mechanism to actively verify its database connectivity at runtime.

During past operational incidents we observed that Locket could enter a degraded or silently broken state when its database became unresponsive or unreachable. Because no internal health verification exists, the failure could only be detected indirectly, through platform symptoms or external monitoring. By that point, platform behavior had already been impacted and recovery required manual intervention.

If Locket had been able to detect its own loss of database connectivity, the process could have been restarted automatically by BOSH, significantly reducing impact and recovery time.

The same operational gap was recently addressed for the BBS component by introducing a DB health check. This mechanism has proven valuable and could be extended to Locket to provide consistent resilience across Diego control plane components.

### Solution Proposal

Implement a Locket DB health check following the same model introduced for the BBS DB health check and adapting it for the Locket database connection.
The following merged PRs serve as the reference implementation:
- BBS DB health check runner: https://github.com/cloudfoundry/bbs/pull/134
- Diego Release configuration: https://github.com/cloudfoundry/diego-release/pull/1088

The Locket equivalent should expose analogous BOSH properties (e.g. `diego.locket.enable_db_health_check`, along with timeout, interval, and failure threshold settings) and implement the same internal runner pattern within the Locket process:
- Periodically verify DB connectivity using a simple write/read operation
- Exit the process after a configurable number of consecutive failures
- Allow BOSH to restart the process for recovery
- Be disabled by default
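
To make the runner behavior above concrete, here is a minimal Go sketch of the consecutive-failure loop. It is not taken from the BBS PR: `Config`, `CheckFunc`, `Run`, `ErrUnhealthy`, and `simulate` are hypothetical names chosen for illustration, and the real implementation may be structured differently (for example, as a runner inside the existing process group). The DB round trip is abstracted behind a function so the sketch runs without a database.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Config holds the health check settings. Field names are illustrative,
// not the actual Locket or BOSH property names.
type Config struct {
	Interval         time.Duration // time between checks
	Timeout          time.Duration // per-check deadline
	FailureThreshold int           // consecutive failures before giving up
}

// CheckFunc stands in for one write/read round trip against the Locket DB.
type CheckFunc func(ctx context.Context) error

// ErrUnhealthy is returned once the failure threshold is reached.
var ErrUnhealthy = errors.New("db health check: failure threshold reached")

// Run calls check every cfg.Interval. A success resets the failure counter;
// FailureThreshold consecutive failures make Run return ErrUnhealthy.
// Cancelling ctx stops the loop and returns nil.
func Run(ctx context.Context, cfg Config, check CheckFunc) error {
	ticker := time.NewTicker(cfg.Interval)
	defer ticker.Stop()

	failures := 0
	for {
		select {
		case <-ctx.Done():
			return nil
		case <-ticker.C:
			checkCtx, cancel := context.WithTimeout(ctx, cfg.Timeout)
			err := check(checkCtx)
			cancel()
			if err == nil {
				failures = 0
				continue
			}
			failures++
			if failures >= cfg.FailureThreshold {
				return fmt.Errorf("%w: %d consecutive failures, last error: %v",
					ErrUnhealthy, failures, err)
			}
		}
	}
}

// simulate feeds a scripted sequence of check outcomes (true = healthy) to Run
// and reports whether the runner gave up with ErrUnhealthy.
func simulate(threshold int, outcomes []bool) bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	i := 0
	check := func(context.Context) error {
		if i >= len(outcomes) {
			cancel() // script exhausted: stop the runner cleanly
			return nil
		}
		ok := outcomes[i]
		i++
		if ok {
			return nil
		}
		return errors.New("simulated DB error")
	}

	cfg := Config{Interval: time.Millisecond, Timeout: 50 * time.Millisecond, FailureThreshold: threshold}
	return errors.Is(Run(ctx, cfg, check), ErrUnhealthy)
}

func main() {
	// A success in the middle resets the counter, so the threshold is never reached.
	fmt.Println(simulate(3, []bool{false, false, true, false, false})) // false
	// Three failures in a row reach the threshold.
	fmt.Println(simulate(3, []bool{true, false, false, false})) // true
}
```

In the Locket process, the caller would treat `ErrUnhealthy` as fatal and exit with a non-zero status so that BOSH restarts the job. When the feature is disabled, `Run` would simply never be started.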

### Acceptance criteria

Scenario: Health check detects a healthy database 
**Given** the Locket DB health check is enabled via configuration 
**When** Locket successfully performs a DB insert and retrieve within the configured timeout 
**Then** Locket continues operating normally 
**And** no restart is triggered 

Scenario: Health check detects consecutive database failures and triggers a restart 
**Given** the Locket DB health check is enabled 
**And** configured with a failure threshold of N consecutive failures 
**When** Locket fails to complete a DB insert and retrieve within the configured timeout for N consecutive attempts 
**Then** the Locket process exits so that BOSH can restart it and restore database connectivity 

Scenario: Health check is disabled by default 
**Given** a Locket deployment with no explicit health check configuration 
**When** the Locket process starts 
**Then** the DB health check is not active 
**And** Locket behaves as it did prior to this feature 

Scenario: Operator can configure health check parameters 
**Given** the Locket DB health check is enabled 
**When** an operator sets custom values for the check interval, per-check timeout, and consecutive failure threshold 
**Then** the health check runs using those values
**And** database connectivity is evaluated according to the operator-specified parameters
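
The "disabled by default" and "operator can configure" scenarios can be illustrated with a small Go sketch of config loading. Everything here is an assumption made for illustration: the JSON keys, the `HealthCheckConfig` type, the `parseHealthCheckConfig` helper, and the default values (10s interval, 5s timeout, threshold of 3) are placeholders, not the names or defaults used by Locket or diego-release.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HealthCheckConfig mirrors the settings named in the acceptance criteria.
// The JSON keys are hypothetical.
type HealthCheckConfig struct {
	Enabled          bool `json:"enabled"`
	IntervalSeconds  int  `json:"interval_seconds"`
	TimeoutSeconds   int  `json:"timeout_seconds"`
	FailureThreshold int  `json:"failure_threshold"`
}

// parseHealthCheckConfig starts from defaults (check disabled) and overlays
// whatever the operator supplied. Keys absent from raw keep their defaults.
func parseHealthCheckConfig(raw string) (HealthCheckConfig, error) {
	cfg := HealthCheckConfig{
		Enabled:          false, // disabled unless explicitly turned on
		IntervalSeconds:  10,    // placeholder default
		TimeoutSeconds:   5,     // placeholder default
		FailureThreshold: 3,     // placeholder default
	}
	if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
		return cfg, fmt.Errorf("parsing health check config: %w", err)
	}
	// Only validate the tunables when the check will actually run.
	if cfg.Enabled && (cfg.IntervalSeconds <= 0 || cfg.TimeoutSeconds <= 0 || cfg.FailureThreshold <= 0) {
		return cfg, fmt.Errorf("interval, timeout, and failure threshold must be positive when enabled")
	}
	return cfg, nil
}

func main() {
	// No explicit configuration: the check stays off.
	def, _ := parseHealthCheckConfig(`{}`)
	fmt.Println(def.Enabled) // false

	// Operator enables the check and overrides only the threshold.
	custom, _ := parseHealthCheckConfig(`{"enabled": true, "failure_threshold": 5}`)
	fmt.Println(custom.Enabled, custom.IntervalSeconds, custom.FailureThreshold) // true 10 5
}
```

Rejecting non-positive values only when the check is enabled keeps existing deployments, which set none of these properties, unaffected by the new validation.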

### Related links

- https://github.com/cloudfoundry/diego-release/pull/1088
- https://github.com/cloudfoundry/bbs/pull/134
- https://github.com/cloudfoundry/diego-release/issues/380
- https://github.com/cloudfoundry/diego-release/issues/406
