feat: heartbeat for relays#29

Merged
x032205 merged 7 commits into main from ENG-3802
Oct 10, 2025

Conversation

@x032205
Member

@x032205 x032205 commented Oct 9, 2025

heartbeat for relays

Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Summary

This PR adds a heartbeat mechanism for relay components in the Infisical CLI. Relays periodically report their operational status to the Infisical backend, which gives the backend a basis for health monitoring.

The changes span three key areas:

  1. API Layer Enhancement: Two new API functions (CallOrgRelayHeartBeat and CallInstanceRelayHeartBeat) are added to handle heartbeat requests for different relay types. These functions follow the established codebase patterns for API communication, making POST requests to specific endpoints (/v1/relays/heartbeat-org-relay and /v1/relays/heartbeat-instance-relay) with proper error handling and operation constants for tracing.

  2. Data Model Extension: A new RelayHeartbeatRequest struct is introduced with a simple Name field to identify which relay is sending the heartbeat. This follows the existing request/response model patterns in the codebase.

  3. Relay Service Integration: The core relay service (packages/relay/relay.go) implements a two-phase heartbeat. Phase 1 retries every 10 seconds until the first heartbeat succeeds, so connectivity is verified quickly. Phase 2 then sends a maintenance heartbeat every 30 minutes to keep the relay's health status current.
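The first two items (the request model and the heartbeat API calls) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the endpoint paths and the `RelayHeartbeatRequest` name come from the summary above, while the JSON tag, the `postHeartbeat` and `demoHeartbeat` helpers, and the use of plain `net/http` with a throwaway test server are assumptions made so the example runs on its own.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// RelayHeartbeatRequest mirrors the struct described in the review: a single
// Name field identifying the relay. The JSON tag is an assumption.
type RelayHeartbeatRequest struct {
	Name string `json:"name"`
}

// Endpoint paths as quoted in the review summary.
const (
	orgHeartbeatPath      = "/v1/relays/heartbeat-org-relay"
	instanceHeartbeatPath = "/v1/relays/heartbeat-instance-relay"
)

// postHeartbeat is a hypothetical stand-in for CallOrgRelayHeartBeat and
// CallInstanceRelayHeartBeat: it POSTs the request as JSON and treats any
// non-2xx status as an error.
func postHeartbeat(client *http.Client, baseURL, path string, req RelayHeartbeatRequest) error {
	body, err := json.Marshal(req)
	if err != nil {
		return fmt.Errorf("marshal heartbeat: %w", err)
	}
	resp, err := client.Post(baseURL+path, "application/json", bytes.NewReader(body))
	if err != nil {
		return fmt.Errorf("send heartbeat: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("heartbeat rejected: %s", resp.Status)
	}
	return nil
}

// demoHeartbeat starts a local test server that accepts heartbeats on path,
// sends one heartbeat to it, and returns the relay name the server decoded.
func demoHeartbeat(path, name string) (string, error) {
	names := make(chan string, 1) // buffered: exactly one request is expected
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != path {
			http.Error(w, "unexpected request", http.StatusNotFound)
			return
		}
		var in RelayHeartbeatRequest
		if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
			http.Error(w, "bad body", http.StatusBadRequest)
			return
		}
		names <- in.Name
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	if err := postHeartbeat(srv.Client(), srv.URL, path, RelayHeartbeatRequest{Name: name}); err != nil {
		return "", err
	}
	return <-names, nil
}

func main() {
	seen, err := demoHeartbeat(instanceHeartbeatPath, "relay-1")
	fmt.Println(seen, err)
}
```

Keeping one helper parameterized by path is only a convenience for the sketch; the PR itself exposes two separate functions, one per relay type.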

The implementation also adds a check to the TLS client connection logic: connections that use the reserved heartbeat gateway ID (00000000-0000-0000-0000-000000000000) are recognized as heartbeat checks and handled immediately, without interfering with normal client-gateway routing.
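A minimal sketch of that check, assuming the gateway ID has already been extracted from the incoming connection as a string (how the relay obtains it is not shown in this review). The names `heartbeatGatewayID`, `isHeartbeatConn`, and `route` are hypothetical; only the all-zero ID itself is taken from the summary.

```go
package main

import "fmt"

// heartbeatGatewayID is the reserved all-zero ID quoted in the review.
// The constant name is an assumption.
const heartbeatGatewayID = "00000000-0000-0000-0000-000000000000"

// isHeartbeatConn reports whether a connection's gateway ID marks it as a
// heartbeat check rather than real client traffic.
func isHeartbeatConn(gatewayID string) bool {
	return gatewayID == heartbeatGatewayID
}

// route is a hypothetical dispatcher: heartbeat checks are answered right
// away, everything else continues to normal client-gateway routing.
func route(gatewayID string) string {
	if isHeartbeatConn(gatewayID) {
		return "heartbeat: acknowledge and close"
	}
	return "forward to gateway " + gatewayID
}

func main() {
	fmt.Println(route(heartbeatGatewayID))
	fmt.Println(route("3f2b9c1e-5d4a-4c7b-9e0f-1a2b3c4d5e6f"))
}
```

Pulling the ID into a named constant is one way to address the review's note about the hardcoded gateway ID, since the reserved value is then defined in a single place.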

This heartbeat system matters for a distributed relay architecture: it lets the main Infisical service track which relays are operational and reachable, and it could support failover mechanisms and infrastructure monitoring.

PR Description Notes:

  • The PR description is very brief ("heartbeat for relays") and could say more about the implementation approach and use cases.

Important Files Changed

Changed Files
| Filename | Score | Overview |
|---|---|---|
| packages/api/api.go | 4/5 | Adds two new heartbeat API functions for organization and instance relays with proper error handling |
| packages/api/model.go | 5/5 | Introduces RelayHeartbeatRequest struct with name field for relay identification |
| packages/relay/relay.go | 4/5 | Implements comprehensive two-phase heartbeat system with special TLS connection handling |

Confidence score: 4/5

  • This PR is safe to merge with minimal production risk as it adds new functionality without modifying existing core logic
  • Score reflects well-structured implementation following established patterns, though the special heartbeat gateway ID handling needs attention
  • Pay close attention to packages/relay/relay.go for the hardcoded gateway ID and ensure proper testing of the heartbeat phases

Sequence Diagram

sequenceDiagram
    participant User
    participant Relay
    participant API
    participant HTTPClient
    
    User->>Relay: "Start relay"
    Relay->>Relay: "registerRelay()"
    Relay->>API: "CallRegisterInstanceRelay() or CallRegisterRelay()"
    API-->>Relay: "RegisterRelayResponse with certificates"
    
    Relay->>Relay: "registerHeartBeat(ctx, errCh)"
    
    Note over Relay: Phase 1: Retry every 10 seconds until first success
    loop Every 10 seconds until success
        Relay->>HTTPClient: "Create heartbeat request"
        HTTPClient->>API: "CallInstanceRelayHeartBeat() or CallOrgRelayHeartBeat()"
        alt Heartbeat successful
            API-->>HTTPClient: "Success response"
            HTTPClient-->>Relay: "No error"
            Note over Relay: "Relay is reachable by Infisical"
        else Heartbeat failed
            API-->>HTTPClient: "Error response"
            HTTPClient-->>Relay: "Error"
            Relay->>Relay: "Send error to errCh"
            Note over Relay: "Heartbeat failed"
        end
    end
    
    Note over Relay: Phase 2: Regular heartbeat every 30 minutes
    loop Every 30 minutes
        Relay->>HTTPClient: "Create heartbeat request"
        HTTPClient->>API: "CallInstanceRelayHeartBeat() or CallOrgRelayHeartBeat()"
        alt Heartbeat successful
            API-->>HTTPClient: "Success response"
            HTTPClient-->>Relay: "No error"
            Note over Relay: "Relay is reachable by Infisical"
        else Heartbeat failed
            API-->>HTTPClient: "Error response"
            HTTPClient-->>Relay: "Error"
            Note over Relay: "Heartbeat failed (logged)"
        end
    end
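
The two loops in the diagram can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not the code in packages/relay/relay.go: `runHeartbeat`, its parameters, and the `simulate` helper are hypothetical, the first attempt is assumed to happen immediately, and phase 1 errors are sent to `errCh` without blocking. The intervals are parameters so the example can run with millisecond delays.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runHeartbeat is a hypothetical two-phase loop.
// Phase 1 calls send every retryEvery until it first succeeds, reporting each
// failure on errCh. Phase 2 then calls send every steadyEvery until ctx is
// cancelled, and only logs failures.
func runHeartbeat(ctx context.Context, send func() error, retryEvery, steadyEvery time.Duration, errCh chan<- error) {
	// Phase 1: retry until the first successful heartbeat.
	for {
		err := send()
		if err == nil {
			break
		}
		select {
		case errCh <- err:
		default: // nobody is listening or the buffer is full: drop the error
		}
		select {
		case <-ctx.Done():
			return
		case <-time.After(retryEvery):
		}
	}

	// Phase 2: regular maintenance heartbeats.
	ticker := time.NewTicker(steadyEvery)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if err := send(); err != nil {
				fmt.Println("heartbeat failed:", err)
			}
		}
	}
}

// simulate runs phase 1 against a fake send that fails `failures` times and
// then succeeds. It returns the number of attempts made and the number of
// errors reported on errCh by the time the first heartbeat succeeded.
func simulate(failures int) (attempts, reported int) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	errCh := make(chan error, failures)
	firstOK := make(chan struct{})
	n := 0
	send := func() error {
		n++
		if n <= failures {
			return errors.New("relay unreachable")
		}
		if n == failures+1 {
			close(firstOK) // signal the first success exactly once
		}
		return nil
	}

	// Phase 2 uses a one-hour interval so it never fires during the demo.
	go runHeartbeat(ctx, send, time.Millisecond, time.Hour, errCh)
	<-firstOK
	return n, len(errCh)
}

func main() {
	attempts, reported := simulate(2)
	fmt.Println(attempts, reported) // 3 attempts, 2 errors reported
}
```

With the intervals from the diagram, the call would look like `runHeartbeat(ctx, send, 10*time.Second, 30*time.Minute, errCh)`.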

Context used:

Rule from dashboard - # Greptile Code Review Prompt: OR Query Safety Check (knex.js)

Objective

Flag database queries t... (source)

3 files reviewed, 1 comment

@x032205 x032205 merged commit ac9bcb4 into main Oct 10, 2025
2 checks passed