Fix cluster mode connecting to same node for all shards #35

Open
ajGingrich wants to merge 2 commits into main from
fix/cluster-node-connection

Conversation

@ajGingrich
Contributor

Summary

  • Bug: In cluster mode, process_node() always connected to the config entry-point host/port (config.get("host") / config.get("port")) instead of the actual discovered node address passed via the node parameter. This meant every node in a Redis cluster was queried from the same instance, producing identical metrics across all masters and replicas.
  • Fix: Use the discovered node's host and port (parsed from the node parameter) for the Redis connection, while preserving authentication and TLS settings from config.
  • Scope: Only cluster mode was affected. In non-cluster (standalone) mode, process_database() constructs the node address from config["host"]:config["port"], so the values already matched.

Problem Detail

When process_database() discovers cluster nodes via CLUSTER NODES, it iterates over each discovered host:port and passes it to process_node(). However, process_node() was ignoring the node address and always creating the Redis client with:

client = get_redis_client(
    host=config.get("host"),     # <-- always the config entry-point
    port=int(config.get("port", 6379)),  # <-- always the config port
    ...
)

This caused all 6 rows in a 3-master/3-replica cluster to show identical values for memory, throughput, and command stats.
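
As a rough illustration of the approach described in the summary, the sketch below splits a discovered "host:port" string and builds connection arguments from it, taking only credentials and TLS settings from config. The helper names (split_node_address, connection_kwargs) and the config keys "password" and "ssl" are assumptions made for this example, not names taken from osstats.py, and the node string is assumed to already be a plain host:port pair.

```python
from typing import Tuple


def split_node_address(node: str, default_port: int = 6379) -> Tuple[str, int]:
    """Split a "host:port" string into (host, port).

    rpartition splits on the last colon, so a host that itself
    contains colons keeps everything before the final one.
    """
    host, sep, port = node.rpartition(":")
    if not sep:
        # No colon at all: treat the whole string as the host.
        return node, default_port
    return host, int(port)


def connection_kwargs(node: str, config: dict) -> dict:
    """Build client arguments for one discovered node.

    Host and port come from the node address; auth and TLS settings
    (hypothetical keys "password" and "ssl") come from config.
    """
    node_host, node_port = split_node_address(node)
    return {
        "host": node_host,
        "port": node_port,
        "password": config.get("password"),
        "ssl": config.get("ssl", False),
    }
```

With a config entry-point of 10.0.0.1:7000 and a discovered node of 10.0.0.2:7001, the resulting arguments point at 10.0.0.2:7001, which is the behaviour the fix aims for.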

Changes

  • osstats.py: Parse node into node_host and node_port, and use those for the get_redis_client() call
  • test_osstats.py: Added test_process_node_connects_to_discovered_node which uses a different node address than the config entry-point, verifying get_redis_client is called with the discovered node's host/port
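
The idea behind the new test can be sketched in a self-contained form. The process_node below is a simplified, hypothetical stand-in that receives the client factory as a parameter (the real function and its signature live in osstats.py and may differ), and the "password" config key is likewise an assumption. The point is only to show the assertion pattern: the factory must be called with the discovered node's address, not the config entry-point.

```python
from unittest.mock import MagicMock


def process_node(config, node, get_client):
    """Hypothetical stand-in: connect to the discovered node and return INFO."""
    node_host, _, node_port = node.rpartition(":")
    client = get_client(
        host=node_host,
        port=int(node_port),
        password=config.get("password"),  # auth still comes from config
    )
    return client.info()


def test_process_node_connects_to_discovered_node():
    # The entry-point deliberately differs from the discovered node.
    config = {"host": "10.0.0.1", "port": "7000", "password": "secret"}
    get_client = MagicMock()
    get_client.return_value.info.return_value = {"used_memory": 1}

    result = process_node(config, "10.0.0.2:7001", get_client)

    # The factory must see the discovered address, not config host/port.
    get_client.assert_called_once_with(
        host="10.0.0.2", port=7001, password="secret"
    )
    assert result == {"used_memory": 1}
```

Using an entry-point that differs from the node address is what makes the test meaningful: with identical values, the old and new behaviour would be indistinguishable.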

Test plan

  • All 20 tests pass (pytest -v)
  • Code formatting passes (black --check .)
  • Re-run against a real Redis cluster to verify each node now reports distinct metrics

Made with Cursor

In cluster mode, process_node() was always connecting to the config
entry-point host/port instead of the discovered node's address. This
caused all cluster nodes to report identical metrics (memory, throughput,
command stats) since they were all querying the same Redis instance.

The fix uses the node address (host:port) from cluster discovery for the
Redis connection while preserving auth and TLS settings from config.

Non-cluster mode was unaffected since the single-node address already
matched the config values.

Made-with: Cursor

Add a ping check after creating the Redis client in process_node().
If a discovered cluster node is unreachable (e.g. behind NAT, load
balancer, or firewall), return None with a warning instead of crashing
the entire collection run for that database.

Made-with: Cursor
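
A minimal sketch of the ping check described in the second commit, assuming a client object with a ping() method. The function name connect_or_none, the make_client callable, and the errors parameter are hypothetical and chosen to keep the example free of external dependencies. With redis-py, the exceptions to catch would more likely be redis.exceptions.ConnectionError and redis.exceptions.TimeoutError, which do not derive from the built-in OSError.

```python
import logging

logger = logging.getLogger(__name__)


def connect_or_none(make_client, node, errors=(OSError,)):
    """Return a client that answered PING, or None if the node is unreachable.

    make_client: callable taking the node address and returning a client.
    errors: exception types treated as "unreachable". The built-in
    ConnectionError and TimeoutError are subclasses of OSError.
    """
    try:
        client = make_client(node)
        client.ping()
    except errors as exc:
        # Warn and skip this node instead of aborting the whole run.
        logger.warning("Skipping unreachable node %s: %s", node, exc)
        return None
    return client
```

A caller iterating over discovered nodes can then skip any node for which this returns None and continue collecting from the rest.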