Description
problem
ISSUE TYPE
- Bug Report
COMPONENT NAME
Cluster Communication / Management Server
CLOUDSTACK VERSION
4.20.1.0
main branch (current)
All versions with cluster communication feature
CONFIGURATION
Advanced networking, VRF/multi-homed management server configuration
OS / ENVIRONMENT
Linux with VRF (Virtual Routing and Forwarding) or multiple network interfaces
SUMMARY
The bind.interface setting in server.properties only affects server-side binding (the Jetty HTTP server); it is completely ignored for the client-side HTTP connections used in cluster communication. This causes cluster communication failures in VRF or multi-homed environments where management traffic should be isolated to a specific interface.
STEPS TO REPRODUCE
1. Configure management server with VRF or multiple interfaces
2. Set bind.interface=192.168.100.10 in server.properties (VRF interface)
3. Set cluster.node.IP=192.168.100.10 in db.properties
4. Start CloudStack management server
5. Observe cluster communication attempts
EXPECTED RESULTS
- Management server binds to 192.168.100.10:8080 (server-side) ✓
- Management server binds to 192.168.100.10:9090 (cluster service) ✓
- Outbound cluster HTTP connections originate from 192.168.100.10 ✓
- Cluster communication works within the same network namespace/VRF ✓
ACTUAL RESULTS
- Management server binds to 192.168.100.10:8080 (server-side) ✓ (edit: this is not actually true; on a clean start the server cannot bind to the VRF address)
- Management server binds to 192.168.100.10:9090 (cluster service) ✓ (edit: this is not actually true; on a clean start the server cannot bind to the VRF address)
- Outbound cluster HTTP connections originate from default interface (10.0.1.2) ❌
- Connection timeout: "Connect to 192.168.100.10:9090 [/192.168.100.10] failed: Connection timed out"
ERROR [c.c.c.ClusterServiceServletImpl] Exception from : https://192.168.100.10:9090/clusterservice
org.apache.http.conn.ConnectTimeoutException: Connect to 192.168.100.10:9090 [/192.168.100.10] failed: Connection timed out
ss output shows asymmetric routing:
SYN-SENT [::ffff:10.0.1.2]:57104 -> [::ffff:192.168.100.10]:9090
Root Cause: ClusterServiceServletImpl.getHttpClient() creates the HttpClient without any source address binding. The client therefore follows the system's default routing instead of respecting the configured bind.interface.
Files Affected:
framework/cluster/src/main/java/com/cloud/cluster/ClusterServiceServletImpl.java:180-183
client/src/main/java/org/apache/cloudstack/ServerDaemon.java:246 (server binding works correctly)
Suggested Fix:
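Add source address binding to the HttpClient configuration: read bind.interface from server.properties and use it as the local address for outbound connections, either through a custom ConnectionSocketFactory that binds the socket before connecting, or through RequestConfig.Builder.setLocalAddress().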
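The underlying mechanism can be shown with plain java.net sockets, independent of HttpClient: a client socket bound to a specific local address before connect() uses that address as its source. This is only a minimal sketch; the class name SourceBindDemo and the helper connectFrom are hypothetical, and loopback stands in for the VRF address 192.168.100.10.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SourceBindDemo {

    /** Opens a TCP connection whose source address is pinned to localAddress. */
    static Socket connectFrom(InetAddress localAddress, InetSocketAddress remote, int timeoutMs)
            throws IOException {
        Socket sock = new Socket();
        try {
            // Port 0 lets the OS pick an ephemeral source port on the chosen address.
            sock.bind(new InetSocketAddress(localAddress, 0));
            sock.connect(remote, timeoutMs);
            return sock;
        } catch (IOException e) {
            sock.close();
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Loopback is used here so the sketch runs anywhere (assumption for illustration).
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = connectFrom(loopback,
                     new InetSocketAddress(loopback, server.getLocalPort()), 2000)) {
            System.out.println("source address: " + client.getLocalAddress().getHostAddress());
        }
    }
}
```

Without the bind() call the kernel chooses the source address from the routing table, which matches the 10.0.1.2 source seen in the ss output. Binding to an address does not by itself place the socket in a VRF, so this sketch covers source address selection only.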
Example Diff:
--- a/framework/cluster/src/main/java/com/cloud/cluster/ClusterServiceServletImpl.java
+++ b/framework/cluster/src/main/java/com/cloud/cluster/ClusterServiceServletImpl.java
@@ -177,8 +177,26 @@ public class ClusterServiceServletImpl implements ClusterService {
                     .setConnectionRequestTimeout(timeout)
                     .setSocketTimeout(timeout).build();
+            // Read bind.interface from server.properties for source binding
+            String bindInterface = getBindInterface();
+            SSLConnectionSocketFactory socketFactory = new SSLConnectionSocketFactory(sslContext);
+
+            if (bindInterface != null) {
+                InetAddress localAddress = InetAddress.getByName(bindInterface);
+                socketFactory = new SSLConnectionSocketFactory(sslContext) {
+                    @Override
+                    public Socket connectSocket(int connectTimeout, Socket sock, HttpHost host,
+                            InetSocketAddress remoteAddress, InetSocketAddress localSocketAddress,
+                            HttpContext context) throws IOException {
+                        // Hand the source address to super, which binds the socket before connecting
+                        return super.connectSocket(connectTimeout, sock, host, remoteAddress,
+                                new InetSocketAddress(localAddress, 0), context);
+                    }
+                };
+            }
+
             s_client = HttpClientBuilder.create()
                     .setDefaultRequestConfig(config)
-                    .setSSLContext(sslContext)
+                    .setSSLSocketFactory(socketFactory)
                     .build();
         }
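The diff calls getBindInterface() without defining it. One way it might look is sketched below, assuming the value is read from server.properties through java.util.Properties; the class name BindInterfaceConfig and the Reader-based signature are hypothetical and are not taken from the CloudStack code base.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

class BindInterfaceConfig {

    /**
     * Returns the trimmed bind.interface value, or null when the key is
     * missing or blank, so the caller can fall back to default routing.
     */
    static String getBindInterface(Reader serverProperties) throws IOException {
        Properties props = new Properties();
        props.load(serverProperties);
        String value = props.getProperty("bind.interface");
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        return value.trim();
    }

    public static void main(String[] args) throws IOException {
        // Inline sample standing in for the real server.properties file.
        String sample = "bind.interface=192.168.100.10\n";
        System.out.println(getBindInterface(new StringReader(sample)));
    }
}
```

Returning null for a missing or blank key matches the `if (bindInterface != null)` guard in the diff, so installations that never set bind.interface keep their current behavior.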
---
P.S.: Example code provided by Claude AI assistant during issue analysis.
P.P.S.: Socket binding for both the cluster service and the main HTTP server is not VRF-aware.
### versions
cloudstack 4.20.1.0
ubuntu 25