Skip to content

Conversation

VietND96
Copy link
Member

@VietND96 VietND96 commented Oct 15, 2025

User description

🔗 Related Issues

💥 What does this PR do?

Fixes SeleniumHQ/docker-selenium#2991 - Distributor Requires Restart to Register New Nodes After Hours of Inactivity.

Implementation adds configurable ZeroMQ heartbeat support to prevent stale connection issues in Selenium Grid's EventBus, particularly for Node-to-Hub registration after long idle periods.

After extended idle periods (e.g., 8 hours), ZeroMQ connections can become stale. Nodes send registration requests, but the Distributor doesn't recognize them because the ZMQ socket layer is broken, even though network connections appear fine.

Prevents Stale Connections: ZMQ detects and recovers from stale connections automatically
Configurable: Can be tuned based on network conditions and requirements
Backward Compatible: Defaults to 60 seconds if not specified
Automatic Recovery: ZMQ heartbeat mechanism handles connection health monitoring

Refer to zeromq/jeromq#364

Heartbeat Settings

With a 60-second heartbeat period:

  • Heartbeat Interval: 60,000 ms (60 seconds)
  • Heartbeat Timeout: 180,000 ms (180 seconds = 3x interval)
  • Heartbeat TTL: 360,000 ms (360 seconds = 6x interval)

These settings ensure:

  1. ZMQ sends heartbeats every 60 seconds
  2. Connection is considered dead if no response within 180 seconds
  3. Heartbeat messages are valid for 360 seconds

🔧 Implementation Notes

💡 Additional Considerations

🔄 Types of changes

  • Cleanup (formatting, renaming)
  • Bug fix (backwards compatible)
  • New feature (non-breaking change which adds functionality and tests!)
  • Breaking change (fix or feature that would cause existing functionality to change)

PR Type

Bug fix, Enhancement


Description

  • Adds configurable ZeroMQ heartbeat mechanism to EventBus sockets to prevent stale connections after idle periods

  • Implements heartbeat configuration with interval, timeout (3x interval), and TTL (6x interval) settings

  • Introduces --eventbus-heartbeat-period CLI flag with 60-second default value

  • Applies heartbeat settings to all ZMQ socket types (XPUB, XSUB, PUB, SUB)


Diagram Walkthrough

flowchart LR
  Config["EventBusOptions reads heartbeat config"] --> Duration["Duration heartbeatPeriod"]
  Duration --> Factory["ZeroMqEventBus.create()"]
  Factory --> Bound["BoundZmqEventBus (XPUB/XSUB)"]
  Factory --> Unbound["UnboundZmqEventBus (PUB/SUB)"]
  Bound --> Configure1["configureHeartbeat()"]
  Unbound --> Configure2["configureHeartbeat()"]
  Configure1 --> Sockets["ZMQ Socket Settings"]
  Configure2 --> Sockets
Loading

File Walkthrough

Relevant files
Enhancement
BoundZmqEventBus.java
Add heartbeat configuration to bound EventBus sockets       

java/src/org/openqa/selenium/events/zeromq/BoundZmqEventBus.java

  • Adds Duration heartbeatPeriod parameter to constructor
  • Implements configureHeartbeat() method for XPUB and XSUB sockets
  • Sets heartbeat interval, timeout (3x), and TTL (6x) on sockets
  • Passes heartbeat period to delegate UnboundZmqEventBus
+26/-2   
UnboundZmqEventBus.java
Add heartbeat configuration to unbound EventBus sockets   

java/src/org/openqa/selenium/events/zeromq/UnboundZmqEventBus.java

  • Adds Duration heartbeatPeriod parameter to constructor
  • Implements configureHeartbeat() method for PUB and SUB sockets
  • Configures heartbeat with interval, timeout, and TTL settings
  • Adds logging for heartbeat configuration
+23/-1   
ZeroMqEventBus.java
Add heartbeat period parameter to EventBus factory methods

java/src/org/openqa/selenium/events/zeromq/ZeroMqEventBus.java

  • Adds Duration heartbeatPeriod parameter to create() methods
  • Creates overloaded method with default 60-second heartbeat
  • Reads eventbus-heartbeat-period config value with 60-second default
  • Passes heartbeat period to bound and unbound EventBus constructors
+23/-3   
EventBusFlags.java
Add CLI flag for EventBus heartbeat period configuration 

java/src/org/openqa/selenium/grid/server/EventBusFlags.java

  • Adds --eventbus-heartbeat-period CLI parameter
  • Configures parameter with config section and example value
+6/-0     
EventBusOptions.java
Add heartbeat period configuration option to EventBusOptions

java/src/org/openqa/selenium/grid/server/EventBusOptions.java

  • Adds DEFAULT_HEARTBEAT_PERIOD constant set to 60 seconds
  • Implements getHeartbeatPeriod() method to read config value
  • Returns Duration object from configured heartbeat period
+8/-0     

@selenium-ci selenium-ci added B-grid Everything grid and server related C-java Java Bindings labels Oct 15, 2025
Copy link
Contributor

qodo-merge-pro bot commented Oct 15, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Weak liveness configuration

Description: Heartbeat interval, timeout, and TTL are derived from an externally configurable period
without validation of upper bounds, which could allow a very large value that effectively
disables liveness detection and delays failure detection across the grid.
BoundZmqEventBus.java [85-96]

Referred Code
if (heartbeatPeriod != null && !heartbeatPeriod.isZero() && !heartbeatPeriod.isNegative()) {
  int heartbeatIvl = (int) heartbeatPeriod.toMillis();
  socket.setHeartbeatIvl(heartbeatIvl);
  // Set heartbeat timeout to 3x the interval
  socket.setHeartbeatTimeout(heartbeatIvl * 3);
  // Set heartbeat TTL to 6x the interval
  socket.setHeartbeatTtl(heartbeatIvl * 6);
  LOG.info(
      String.format(
          "Event bus %s socket heartbeat configured: interval=%dms, timeout=%dms, ttl=%dms",
          socketType, heartbeatIvl, heartbeatIvl * 3, heartbeatIvl * 6));
}
Weak liveness configuration

Description: The heartbeat settings accept unbounded durations which, if set excessively high, can
prevent timely detection of dead connections for PUB/SUB sockets and lead to prolonged
unavailability or message loss.
UnboundZmqEventBus.java [182-195]

Referred Code
private void configureHeartbeat(ZMQ.Socket socket, Duration heartbeatPeriod, String socketType) {
  if (heartbeatPeriod != null && !heartbeatPeriod.isZero() && !heartbeatPeriod.isNegative()) {
    int heartbeatIvl = (int) heartbeatPeriod.toMillis();
    socket.setHeartbeatIvl(heartbeatIvl);
    // Set heartbeat timeout to 3x the interval
    socket.setHeartbeatTimeout(heartbeatIvl * 3);
    // Set heartbeat TTL to 6x the interval
    socket.setHeartbeatTtl(heartbeatIvl * 6);
    LOG.info(
        String.format(
            "Event bus %s socket heartbeat configured: interval=%dms, timeout=%dms, ttl=%dms",
            socketType, heartbeatIvl, heartbeatIvl * 3, heartbeatIvl * 6));
  }
}
Ticket Compliance
🟡
🎫 #1234
🔴 Investigate and fix regression where clicking links with javascript in href no longer
triggers in Selenium 2.48.x on Firefox 42 compared to 2.47.1.
Provide compatibility or workaround so alerts triggered via href javascript execute
correctly.
🟡
🎫 #5678
🔴 Diagnose and resolve recurring "Error: ConnectFailure (Connection refused)" when
instantiating multiple ChromeDriver instances on Ubuntu 16.04 with Selenium 3.9.0 and
Chrome 65/ChromeDriver 2.35.
Ensure subsequent ChromeDriver instantiations connect reliably without connection refusal
after the first instance.
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
No custom compliance provided

Follow the guide to enable custom compliance check.

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Copy link
Contributor

qodo-merge-pro bot commented Oct 15, 2025

PR Code Suggestions ✨

Latest suggestions up to cc71cd0

CategorySuggestion                                                                                                                                    Impact
Incremental [*]
Prevent heartbeat interval overflow
Suggestion Impact:The commit implemented clamping of the heartbeat interval with defined min/max bounds, safe casting, computed timeout/ttl without overflow, and updated logging—addressing the overflow concern from the suggestion.

code diff:

+  // Minimum heartbeat interval: 1 second
+  private static final long MIN_HEARTBEAT_MS = 1_000L;
+  // Maximum heartbeat interval: ~24 days (to prevent overflow when multiplied by 6)
+  private static final long MAX_HEARTBEAT_MS = Integer.MAX_VALUE / 6;
+
+  private ZmqUtils() {}
 
   /**
    * Configures ZeroMQ heartbeat settings on a socket to prevent stale connections.
+   *
+   * <p>The heartbeat interval is clamped between 1 second and ~24 days to prevent integer overflow
+   * and ensure reasonable values. If the provided duration is outside this range, it will be
+   * adjusted and a warning will be logged.
    *
    * @param socket The ZMQ socket to configure
    * @param heartbeatPeriod The heartbeat interval duration
@@ -39,14 +46,58 @@
    */
   static void configureHeartbeat(ZMQ.Socket socket, Duration heartbeatPeriod, String socketType) {
     if (heartbeatPeriod != null && !heartbeatPeriod.isZero() && !heartbeatPeriod.isNegative()) {
-      int heartbeatIvl = (int) heartbeatPeriod.toMillis();
+      long heartbeatMs = heartbeatPeriod.toMillis();
+      long clampedHeartbeatMs = clampHeartbeatInterval(heartbeatMs, socketType);
+
+      // Safe to cast to int now
+      int heartbeatIvl = (int) clampedHeartbeatMs;
+      int heartbeatTimeout = heartbeatIvl * 3;
+      int heartbeatTtl = heartbeatIvl * 6;
+
       socket.setHeartbeatIvl(heartbeatIvl);
-      socket.setHeartbeatTimeout(heartbeatIvl * 3);
-      socket.setHeartbeatTtl(heartbeatIvl * 6);
+      socket.setHeartbeatTimeout(heartbeatTimeout);
+      socket.setHeartbeatTtl(heartbeatTtl);
+
       LOG.info(
           String.format(
-              "ZMQ %s socket heartbeat configured: interval=%dms, timeout=%dms, ttl=%dms",
-              socketType, heartbeatIvl, heartbeatIvl * 3, heartbeatIvl * 6));
+              "ZMQ %s socket heartbeat configured: interval=%ds, timeout=%ds, ttl=%ds",
+              socketType, heartbeatIvl / 1000, heartbeatTimeout / 1000, heartbeatTtl / 1000));
     }
   }
+
+  /**
+   * Clamps the heartbeat interval to safe bounds and logs warnings if adjustments are made.
+   *
+   * @param heartbeatMs The heartbeat interval in milliseconds
+   * @param socketType The socket type for logging
+   * @return The clamped heartbeat interval
+   */
+  private static long clampHeartbeatInterval(long heartbeatMs, String socketType) {
+    if (heartbeatMs < MIN_HEARTBEAT_MS) {
+      logHeartbeatClampWarning(socketType, heartbeatMs, MIN_HEARTBEAT_MS, "below minimum");
+      return MIN_HEARTBEAT_MS;
+    }
+    if (heartbeatMs > MAX_HEARTBEAT_MS) {
+      logHeartbeatClampWarning(socketType, heartbeatMs, MAX_HEARTBEAT_MS, "exceeds maximum");
+      return MAX_HEARTBEAT_MS;
+    }
+    return heartbeatMs;
+  }
+
+  /**
+   * Logs a warning when the heartbeat interval is clamped.
+   *
+   * @param socketType The socket type
+   * @param originalMs The original interval value in milliseconds
+   * @param clampedMs The clamped interval value in milliseconds
+   * @param reason The reason for clamping
+   */
+  private static void logHeartbeatClampWarning(
+      String socketType, long originalMs, long clampedMs, String reason) {
+    LOG.log(
+        Level.WARNING,
+        String.format(
+            "ZMQ %s socket heartbeat interval %ds %s %ds, clamping to %ds",
+            socketType, originalMs / 1000, reason, clampedMs / 1000, clampedMs / 1000));
+  }

In configureHeartbeat, guard against integer overflow by clamping the heartbeat
interval in milliseconds to a safe maximum before casting to int and setting
socket options. Also, enforce a sensible minimum value.

java/src/org/openqa/selenium/events/zeromq/ZmqUtils.java [40-52]

 static void configureHeartbeat(ZMQ.Socket socket, Duration heartbeatPeriod, String socketType) {
-  if (heartbeatPeriod != null && !heartbeatPeriod.isZero() && !heartbeatPeriod.isNegative()) {
-    int heartbeatIvl = (int) heartbeatPeriod.toMillis();
-    socket.setHeartbeatIvl(heartbeatIvl);
-    socket.setHeartbeatTimeout(heartbeatIvl * 3);
-    socket.setHeartbeatTtl(heartbeatIvl * 6);
-    LOG.info(
-        String.format(
-            "ZMQ %s socket heartbeat configured: interval=%dms, timeout=%dms, ttl=%dms",
-            socketType, heartbeatIvl, heartbeatIvl * 3, heartbeatIvl * 6));
+  if (heartbeatPeriod == null || heartbeatPeriod.isZero() || heartbeatPeriod.isNegative()) {
+    return;
   }
+  long ivlMsLong = heartbeatPeriod.toMillis();
+  // Enforce sane bounds: at least 100ms and at most Integer.MAX_VALUE / 6 to avoid overflow below
+  long min = 100L;
+  long max = Integer.MAX_VALUE / 6L;
+  if (ivlMsLong < min) {
+    ivlMsLong = min;
+  } else if (ivlMsLong > max) {
+    ivlMsLong = max;
+  }
+  int heartbeatIvl = (int) ivlMsLong;
+  int heartbeatTimeout = heartbeatIvl * 3;
+  int heartbeatTtl = heartbeatIvl * 6;
+  socket.setHeartbeatIvl(heartbeatIvl);
+  socket.setHeartbeatTimeout(heartbeatTimeout);
+  socket.setHeartbeatTtl(heartbeatTtl);
+  LOG.info(
+      String.format(
+          "ZMQ %s socket heartbeat configured: interval=%dms, timeout=%dms, ttl=%dms",
+          socketType, heartbeatIvl, heartbeatTimeout, heartbeatTtl));
 }

[Suggestion processed]

Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a potential integer overflow when calculating heartbeat intervals from a large Duration, which could lead to invalid socket configuration and runtime errors.

Medium
Close sockets on setup failure

In the Failsafe block for socket creation, use a try-catch block to ensure that
if an exception occurs after creating one socket but before the second is fully
configured, the created socket is properly closed to prevent resource leaks.

java/src/org/openqa/selenium/events/zeromq/UnboundZmqEventBus.java [139-152]

 Failsafe.with(retryPolicy)
     .run(
         () -> {
-          sub = context.createSocket(SocketType.SUB);
-          sub.setIPv6(isSubAddressIPv6(publishConnection));
-          ZmqUtils.configureHeartbeat(sub, heartbeatPeriod, "SUB");
-          sub.connect(publishConnection);
-          sub.subscribe(new byte[0]);
+          ZMQ.Socket localSub = null;
+          ZMQ.Socket localPub = null;
+          try {
+            localSub = context.createSocket(SocketType.SUB);
+            localSub.setIPv6(isSubAddressIPv6(publishConnection));
+            ZmqUtils.configureHeartbeat(localSub, heartbeatPeriod, "SUB");
+            localSub.connect(publishConnection);
+            localSub.subscribe(new byte[0]);
 
-          pub = context.createSocket(SocketType.PUB);
-          pub.setIPv6(isSubAddressIPv6(subscribeConnection));
-          ZmqUtils.configureHeartbeat(pub, heartbeatPeriod, "PUB");
-          pub.connect(subscribeConnection);
+            localPub = context.createSocket(SocketType.PUB);
+            localPub.setIPv6(isSubAddressIPv6(subscribeConnection));
+            ZmqUtils.configureHeartbeat(localPub, heartbeatPeriod, "PUB");
+            localPub.connect(subscribeConnection);
+
+            // Assign only after successful setup
+            sub = localSub;
+            pub = localPub;
+          } catch (RuntimeException e) {
+            if (localSub != null) {
+              localSub.close();
+            }
+            if (localPub != null) {
+              localPub.close();
+            }
+            throw e;
+          }
         });

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a potential resource leak where sockets might not be closed if an exception occurs during their configuration, which is a critical issue for long-running services.

Medium
Validate heartbeat config bounds

In getHeartbeatPeriod, validate the integer periodSeconds read from config.
Ensure it is positive and cap it at a reasonable maximum to prevent potential
overflows later when it's converted to milliseconds.

java/src/org/openqa/selenium/events/zeromq/ZeroMqEventBus.java [119-125]

 private static Duration getHeartbeatPeriod(Config config) {
   int periodSeconds =
       config
           .getInt(EVENTS_SECTION, "eventbus-heartbeat-period")
           .orElse(DEFAULT_HEARTBEAT_PERIOD_SECONDS);
+  // Normalize to sensible bounds: at least 1s, cap at Integer.MAX_VALUE / 1000 to avoid millis overflow later
+  if (periodSeconds <= 0) {
+    periodSeconds = DEFAULT_HEARTBEAT_PERIOD_SECONDS;
+  }
+  int maxSeconds = Integer.MAX_VALUE / 1000;
+  if (periodSeconds > maxSeconds) {
+    periodSeconds = maxSeconds;
+  }
   return Duration.ofSeconds(periodSeconds);
 }
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: This suggestion improves robustness by validating the user-configured heartbeat period at the source, preventing invalid values from propagating and complementing the overflow protection in ZmqUtils.

Medium
Possible issue
Cleanup sockets on setup failure

In UnboundZmqEventBus, ensure ZMQ sockets are closed on initialization failure
within the Failsafe block to prevent resource leaks.

java/src/org/openqa/selenium/events/zeromq/UnboundZmqEventBus.java [139-152]

 Failsafe.with(retryPolicy)
     .run(
         () -> {
-          sub = context.createSocket(SocketType.SUB);
-          sub.setIPv6(isSubAddressIPv6(publishConnection));
-          ZmqUtils.configureHeartbeat(sub, heartbeatPeriod, "SUB");
-          sub.connect(publishConnection);
-          sub.subscribe(new byte[0]);
+          ZMQ.Socket createdSub = null;
+          ZMQ.Socket createdPub = null;
+          try {
+            createdSub = context.createSocket(SocketType.SUB);
+            createdSub.setIPv6(isSubAddressIPv6(publishConnection));
+            ZmqUtils.configureHeartbeat(createdSub, heartbeatPeriod, "SUB");
+            createdSub.connect(publishConnection);
+            createdSub.subscribe(new byte[0]);
 
-          pub = context.createSocket(SocketType.PUB);
-          pub.setIPv6(isSubAddressIPv6(subscribeConnection));
-          ZmqUtils.configureHeartbeat(pub, heartbeatPeriod, "PUB");
-          pub.connect(subscribeConnection);
+            createdPub = context.createSocket(SocketType.PUB);
+            createdPub.setIPv6(isSubAddressIPv6(subscribeConnection));
+            ZmqUtils.configureHeartbeat(createdPub, heartbeatPeriod, "PUB");
+            createdPub.connect(subscribeConnection);
+
+            // Assign only after successful setup
+            sub = createdSub;
+            pub = createdPub;
+          } catch (RuntimeException e) {
+            if (createdSub != null) {
+              try {
+                createdSub.close();
+              } catch (Exception ignore) {
+              }
+            }
+            if (createdPub != null) {
+              try {
+                createdPub.close();
+              } catch (Exception ignore) {
+              }
+            }
+            throw e;
+          }
         });

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 8

__

Why: This suggestion correctly identifies a potential resource leak where ZMQ sockets are not closed if an exception occurs during initialization within the Failsafe retry block, which could lead to resource exhaustion.

Medium
General
Clamp invalid heartbeat configuration

In getHeartbeatPeriod, validate and clamp the configured heartbeat period to a
sensible range to prevent invalid values from being used.

java/src/org/openqa/selenium/events/zeromq/ZeroMqEventBus.java [119-125]

 private static Duration getHeartbeatPeriod(Config config) {
   int periodSeconds =
       config
           .getInt(EVENTS_SECTION, "eventbus-heartbeat-period")
           .orElse(DEFAULT_HEARTBEAT_PERIOD_SECONDS);
-  return Duration.ofSeconds(periodSeconds);
+  // Clamp to [1s, 86400s] to avoid zero/negative and extreme values
+  int clamped = Math.max(1, Math.min(periodSeconds, 86_400));
+  if (clamped != periodSeconds) {
+    // Optionally log that value was adjusted
+  }
+  return Duration.ofSeconds(clamped);
 }
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why: The suggestion improves robustness by validating user-configured heartbeat values at the source, preventing invalid configurations, though a similar check is also proposed for the utility method.

Low
  • Update

Previous suggestions

✅ Suggestions up to commit 7a60828
CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent integer overflow from large duration

Prevent a potential integer overflow by checking if heartbeatPeriod.toMillis()
exceeds Integer.MAX_VALUE before casting it to an int, and cap the value if it
does.

java/src/org/openqa/selenium/events/zeromq/BoundZmqEventBus.java [84-97]

 private void configureHeartbeat(ZMQ.Socket socket, Duration heartbeatPeriod, String socketType) {
   if (heartbeatPeriod != null && !heartbeatPeriod.isZero() && !heartbeatPeriod.isNegative()) {
-    int heartbeatIvl = (int) heartbeatPeriod.toMillis();
+    long periodMillis = heartbeatPeriod.toMillis();
+    if (periodMillis > Integer.MAX_VALUE) {
+      LOG.warning(
+          String.format(
+              "Heartbeat period %dms is too large. Capping at %dms.",
+              periodMillis, Integer.MAX_VALUE));
+      periodMillis = Integer.MAX_VALUE;
+    }
+    int heartbeatIvl = (int) periodMillis;
     socket.setHeartbeatIvl(heartbeatIvl);
     // Set heartbeat timeout to 3x the interval
     socket.setHeartbeatTimeout(heartbeatIvl * 3);
     // Set heartbeat TTL to 6x the interval
     socket.setHeartbeatTtl(heartbeatIvl * 6);
     LOG.info(
         String.format(
             "Event bus %s socket heartbeat configured: interval=%dms, timeout=%dms, ttl=%dms",
             socketType, heartbeatIvl, heartbeatIvl * 3, heartbeatIvl * 6));
   }
 }
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies a potential integer overflow when casting heartbeatPeriod.toMillis() to an int. Although the likelihood is low, adding a check and capping the value makes the code more robust and prevents unexpected behavior from large configuration values.

Medium
High-level
Consolidate redundant configuration logic

Consolidate the duplicated logic for reading the heartbeat period configuration
into the EventBusOptions class and remove the redundant implementation from
ZeroMqEventBus.

Examples:

java/src/org/openqa/selenium/events/zeromq/ZeroMqEventBus.java [99-101]
    int heartbeatPeriodSeconds =
        config.getInt(EVENTS_SECTION, "eventbus-heartbeat-period").orElse(60);
    Duration heartbeatPeriod = Duration.ofSeconds(heartbeatPeriodSeconds);
java/src/org/openqa/selenium/grid/server/EventBusOptions.java [52-56]
  public Duration getHeartbeatPeriod() {
    int period =
        config.getInt(EVENTS_SECTION, "eventbus-heartbeat-period").orElse(DEFAULT_HEARTBEAT_PERIOD);
    return Duration.ofSeconds(period);
  }

Solution Walkthrough:

Before:

// In ZeroMqEventBus.java
public class ZeroMqEventBus {
  public static EventBus create(Config config) {
    // ...
    int heartbeatPeriodSeconds =
        config.getInt("events", "eventbus-heartbeat-period").orElse(60);
    Duration heartbeatPeriod = Duration.ofSeconds(heartbeatPeriodSeconds);
    // ...
    return create(..., heartbeatPeriod);
  }
}

// In EventBusOptions.java
public class EventBusOptions {
  // This new method is unused in the PR
  public Duration getHeartbeatPeriod() {
    int period = config.getInt("events", "eventbus-heartbeat-period").orElse(60);
    return Duration.ofSeconds(period);
  }
}

After:

// In ZeroMqEventBus.java
public class ZeroMqEventBus {
  public static EventBus create(Config config) {
    // ...
    EventBusOptions eventBusOptions = new EventBusOptions(config);
    Duration heartbeatPeriod = eventBusOptions.getHeartbeatPeriod();
    SecretOptions secretOptions = new SecretOptions(config);

    return create(
      new ZContext(),
      publish,
      subscribe,
      bind,
      secretOptions.getRegistrationSecret(),
      heartbeatPeriod);
  }
}

// In EventBusOptions.java
public class EventBusOptions {
  public Duration getHeartbeatPeriod() {
    int period = config.getInt("events", "eventbus-heartbeat-period").orElse(60);
    return Duration.ofSeconds(period);
  }
}
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies duplicated configuration logic and an unused method, proposing a good refactoring that improves maintainability by centralizing the logic in the appropriate EventBusOptions class.

Low
General
Refactor duplicated code into a utility method
Suggestion Impact:The commit replaces local configureHeartbeat calls with ZmqUtils.configureHeartbeat and removes the duplicated private method, implementing the refactor to a utility method.

code diff:

-    configureHeartbeat(xpub, heartbeatPeriod, "XPUB");
+    ZmqUtils.configureHeartbeat(xpub, heartbeatPeriod, "XPUB");
     xpub.bind(xpubAddr.bindTo);
 
     xsub = context.createSocket(SocketType.XSUB);
     xsub.setIPv6(xsubAddr.isIPv6);
     xsub.setImmediate(true);
-    configureHeartbeat(xsub, heartbeatPeriod, "XSUB");
+    ZmqUtils.configureHeartbeat(xsub, heartbeatPeriod, "XSUB");
     xsub.bind(xsubAddr.bindTo);
 
     executor =
@@ -75,25 +75,9 @@
               return thread;
             });
     executor.submit(() -> ZMQ.proxy(xsub, xpub, null));
-
     delegate =
         new UnboundZmqEventBus(
             context, xpubAddr.advertise, xsubAddr.advertise, secret, heartbeatPeriod);
-  }
-
-  private void configureHeartbeat(ZMQ.Socket socket, Duration heartbeatPeriod, String socketType) {
-    if (heartbeatPeriod != null && !heartbeatPeriod.isZero() && !heartbeatPeriod.isNegative()) {
-      int heartbeatIvl = (int) heartbeatPeriod.toMillis();
-      socket.setHeartbeatIvl(heartbeatIvl);
-      // Set heartbeat timeout to 3x the interval
-      socket.setHeartbeatTimeout(heartbeatIvl * 3);
-      // Set heartbeat TTL to 6x the interval
-      socket.setHeartbeatTtl(heartbeatIvl * 6);
-      LOG.info(
-          String.format(
-              "Event bus %s socket heartbeat configured: interval=%dms, timeout=%dms, ttl=%dms",
-              socketType, heartbeatIvl, heartbeatIvl * 3, heartbeatIvl * 6));
-    }
   }

Extract the duplicated configureHeartbeat method from BoundZmqEventBus and
UnboundZmqEventBus into a static helper method in a new utility class to avoid
code repetition and improve maintainability.

java/src/org/openqa/selenium/events/zeromq/BoundZmqEventBus.java [84-97]

-private void configureHeartbeat(ZMQ.Socket socket, Duration heartbeatPeriod, String socketType) {
-  if (heartbeatPeriod != null && !heartbeatPeriod.isZero() && !heartbeatPeriod.isNegative()) {
-    int heartbeatIvl = (int) heartbeatPeriod.toMillis();
-    socket.setHeartbeatIvl(heartbeatIvl);
-    // Set heartbeat timeout to 3x the interval
-    socket.setHeartbeatTimeout(heartbeatIvl * 3);
-    // Set heartbeat TTL to 6x the interval
-    socket.setHeartbeatTtl(heartbeatIvl * 6);
-    LOG.info(
-        String.format(
-            "Event bus %s socket heartbeat configured: interval=%dms, timeout=%dms, ttl=%dms",
-            socketType, heartbeatIvl, heartbeatIvl * 3, heartbeatIvl * 6));
+// In a new file: java/src/org/openqa/selenium/events/zeromq/ZmqUtils.java
+package org.openqa.selenium.events.zeromq;
+
+import java.time.Duration;
+import java.util.logging.Logger;
+import org.zeromq.ZMQ;
+
+class ZmqUtils {
+  private static final Logger LOG = Logger.getLogger(ZmqUtils.class.getName());
+
+  private ZmqUtils() {
+    // Utility class
+  }
+
+  static void configureHeartbeat(ZMQ.Socket socket, Duration heartbeatPeriod, String socketType) {
+    if (heartbeatPeriod != null && !heartbeatPeriod.isZero() && !heartbeatPeriod.isNegative()) {
+      long periodMillis = heartbeatPeriod.toMillis();
+      if (periodMillis > Integer.MAX_VALUE) {
+        LOG.warning(
+            String.format(
+                "Heartbeat period %dms is too large. Capping at %dms.",
+                periodMillis, Integer.MAX_VALUE));
+        periodMillis = Integer.MAX_VALUE;
+      }
+      int heartbeatIvl = (int) periodMillis;
+      socket.setHeartbeatIvl(heartbeatIvl);
+      // Set heartbeat timeout to 3x the interval
+      socket.setHeartbeatTimeout(heartbeatIvl * 3);
+      // Set heartbeat TTL to 6x the interval
+      socket.setHeartbeatTtl(heartbeatIvl * 6);
+      LOG.info(
+          String.format(
+              "Event bus %s socket heartbeat configured: interval=%dms, timeout=%dms, ttl=%dms",
+              socketType, heartbeatIvl, heartbeatIvl * 3, heartbeatIvl * 6));
+    }
   }
 }
 
+// In BoundZmqEventBus.java, replace the configureHeartbeat method with:
+private void configureHeartbeat(ZMQ.Socket socket, Duration heartbeatPeriod, String socketType) {
+  ZmqUtils.configureHeartbeat(socket, heartbeatPeriod, socketType);
+}
+
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that the configureHeartbeat method is duplicated in BoundZmqEventBus and UnboundZmqEventBus, and proposes a valid refactoring to a shared utility method, which improves code maintainability by adhering to the DRY principle.

Low

@VietND96 VietND96 enabled auto-merge (squash) October 15, 2025 23:02
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a configurable ZeroMQ heartbeat mechanism to Selenium Grid's EventBus to prevent stale connection issues, particularly after long idle periods (8+ hours) where nodes may fail to register with the distributor.

Key changes:

  • Introduces --eventbus-heartbeat-period CLI flag with 60-second default
  • Implements heartbeat configuration for all ZMQ socket types (XPUB, XSUB, PUB, SUB)
  • Adds heartbeat utilities with interval clamping and validation

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
EventBusOptions.java Adds heartbeat period configuration option with 60-second default
EventBusFlags.java Adds CLI parameter for configuring heartbeat period
ZmqUtils.java New utility class for configuring ZMQ socket heartbeat settings
ZeroMqEventBus.java Updates factory methods to accept heartbeat period parameter
UnboundZmqEventBus.java Configures heartbeat on PUB/SUB sockets and fixes isReady() method
BoundZmqEventBus.java Configures heartbeat on XPUB/XSUB sockets

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Signed-off-by: Viet Nguyen Duc <[email protected]>
@VietND96 VietND96 merged commit d710cda into trunk Oct 16, 2025
47 checks passed
@VietND96 VietND96 deleted the event-bus-heartbeat branch October 16, 2025 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

B-grid Everything grid and server related C-java Java Bindings Review effort 3/5

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[🐛 Bug]: Distributor Requires Restart to Register New Nodes After Hours of Inactivity

3 participants