Skip to content

fix: Reduce number of round trips for redis#172

Merged
sasjo merged 11 commits intomasterfrom
fix/redis-roundtrips
Dec 2, 2025
Merged

fix: Reduce number of round trips for redis#172
sasjo merged 11 commits intomasterfrom
fix/redis-roundtrips

Conversation

@sasjo
Copy link
Contributor

@sasjo sasjo commented Dec 1, 2025

Contribution by @ikalachy.

Problem

The SubscriptionCatalog.removeUnknownSubs() method was performing individual Redis operations for each stale subscription entry, resulting in excessive round trips to Redis. When cleaning up subscriptions from unknown nodes (e.g., after cluster scale-down or node crashes), the method would:

  1. Iterate through all subscription entries
  2. Call subsMap.remove(key, value) individually for each stale entry
  3. This resulted in N round trips where N is the number of stale subscriptions

This approach was inefficient, especially in scenarios with many lingering subscriptions from failed nodes, causing:

  • Increased latency during cleanup operations
  • Higher network overhead
  • Potential performance degradation in high-traffic scenarios

Solution

Optimized the cleanup process to use bulk operations that group removals by subscription key and perform batch deletions:

Key Changes

  1. Bulk Removal by Key (SubscriptionCatalog.java):

    • Group stale subscriptions by their key (address) before removal
    • Use RSet.removeAll() to remove multiple values in a single Redis operation
    • Reduces round trips from N operations to K operations (where K = number of unique keys)
    • Added fallback to individual removal if bulk operation fails
  2. Improved Logging:

    • Added detailed statistics tracking (total removed, failed removals, breakdown by node)
    • Better visibility into cleanup operations
  3. Lock Optimization:

    • Changed ReentrantLock to fair lock in RedissonContext.java for better thread fairness
    • Changed ReadWriteLock to fair lock in SubscriptionCatalog.java
    • Moved lock acquisition inside blocking execution in RedisClusterManager.setNodeInfo() to avoid holding locks during blocking operations

This PR supersedes #170 to add unit tests.

@sasjo sasjo changed the title Fix/redis roundtrips build: Verify PR 170 Dec 1, 2025
@sasjo sasjo force-pushed the fix/redis-roundtrips branch from c9eb9e6 to 854eb71 Compare December 2, 2025 09:06
@sasjo sasjo changed the title build: Verify PR 170 fix: Reduce number of round trips for redis Dec 2, 2025
@sonarqubecloud
Copy link

sonarqubecloud bot commented Dec 2, 2025

@sasjo sasjo requested a review from Copilot December 2, 2025 09:47
@sasjo sasjo merged commit 6e8f766 into master Dec 2, 2025
10 checks passed
@sasjo sasjo deleted the fix/redis-roundtrips branch December 2, 2025 09:49
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes Redis subscription cleanup by replacing individual deletion operations with bulk removals, significantly reducing network round trips. The optimization groups stale subscriptions by key and performs batch deletions, reducing operations from N (number of stale subscriptions) to K (number of unique keys), with fallback to individual removal on failures.

Key Changes:

  • Implemented bulk removal logic in SubscriptionCatalog.bulkRemoveUnknownSubsByKey() with exception handling fallback
  • Enhanced logging to provide detailed cleanup statistics (total removed, per-node breakdown, failures)
  • Changed locks to fair mode (ReentrantLock(true), ReentrantReadWriteLock(true)) for better thread fairness

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/main/java/com/retailsvc/vertx/spi/cluster/redis/impl/SubscriptionCatalog.java Core optimization: refactored removeUnknownSubs() to group stale subscriptions by key and use RSet.removeAll() for bulk deletion; added static helper method with fallback mechanism and comprehensive logging
src/test/java/com/retailsvc/vertx/spi/cluster/redis/impl/ITSubscriptionCatalog.java Added test for bulk removal exception handling to verify fallback behavior when bulk operations fail
src/main/java/com/retailsvc/vertx/spi/cluster/redis/impl/RedissonContext.java Changed ReentrantLock to fair mode for better thread fairness
src/main/java/com/retailsvc/vertx/spi/cluster/redis/RedisClusterManager.java Moved lock acquisition inside blocking execution in setNodeInfo() to avoid holding lock during blocking I/O
.github/workflows/pull_request.yaml Updated SonarCloud scan condition to check repository name instead of secret availability; added MAVEN_INIT environment variable
.github/workflows/pre-commit.yaml Fixed GitHub Actions name from pre-commit/actions to pre-commit/action

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

Comment on lines +138 to +143
assertThat(subsCatalog.get("sub-1"))
.containsOnly(
new RegistrationInfo("node2", 3, false),
new RegistrationInfo("node3", 4, false),
new RegistrationInfo("node4", 5, false));
assertThat(subsCatalog.get("sub-2")).isEmpty();
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test creates a spy of subsMap and mocks subsMap.get("sub-1") to return a mock RSet that throws an exception. However, the test then uses subsCatalog (which has a reference to the original unmocked subsMap) to verify the results. This test is not actually testing what it claims to test.

The test should either:

  1. Create a new SubscriptionCatalog instance using the spied subsMap, or
  2. Call the static method and then verify the results using the spied subsMap directly

The current test is calling the static method with the mocked subsMap, but then asserting against subsCatalog.get() which uses the unmocked map. This means the test is checking the state of the wrong map and may not accurately validate the fallback behavior.

Suggested change
assertThat(subsCatalog.get("sub-1"))
.containsOnly(
new RegistrationInfo("node2", 3, false),
new RegistrationInfo("node3", 4, false),
new RegistrationInfo("node4", 5, false));
assertThat(subsCatalog.get("sub-2")).isEmpty();
SubscriptionCatalog spiedSubsCatalog = new SubscriptionCatalog(subsMap, nodeSelector, keyFactory);
assertThat(spiedSubsCatalog.get("sub-1"))
.containsOnly(
new RegistrationInfo("node2", 3, false),
new RegistrationInfo("node3", 4, false),
new RegistrationInfo("node4", 5, false));
assertThat(spiedSubsCatalog.get("sub-2")).isEmpty();

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants