Add distributed cache for horizontal scaling #877
Open
hpopuri2 wants to merge 16 commits into trinodb:main
kbhatianr reviewed Jan 27, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/router/BaseRoutingManager.java (outdated)
kbhatianr reviewed Jan 27, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/router/BaseRoutingManager.java (outdated)
kbhatianr reviewed Jan 27, 2026
kbhatianr reviewed Jan 27, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/router/DistributedCache.java
kbhatianr reviewed Jan 27, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/router/DistributedCache.java
kbhatianr reviewed Jan 28, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/module/HaGatewayProviderModule.java (outdated)
kbhatianr reviewed Jan 28, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/module/HaGatewayProviderModule.java (outdated)
kbhatianr reviewed Jan 28, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/cache/ValkeyDistributedCache.java (outdated)
kbhatianr reviewed Jan 28, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/cache/QueryCacheManager.java (outdated)
kbhatianr reviewed Jan 28, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/router/BaseRoutingManager.java (outdated)
kbhatianr reviewed Jan 28, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/router/BaseRoutingManager.java (outdated)
kbhatianr reviewed Jan 29, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/cache/ValkeyDistributedCache.java (outdated)
oneonestar (Member) reviewed Feb 4, 2026 and left a comment:
Just a quick skim. Please rebase to main since we migrated to Caffeine cache =)
gateway-ha/src/main/java/io/trino/gateway/ha/config/ValkeyConfiguration.java (outdated)
gateway-ha/src/main/java/io/trino/gateway/ha/cache/ValkeyDistributedCache.java (outdated)
hpopuri2 (Contributor, Author):
@oneonestar Addressed the comments and rebased as well. Please review again.
oneonestar requested changes Feb 6, 2026: gateway-ha/src/main/java/io/trino/gateway/ha/router/BaseRoutingManager.java (outdated)
hpopuri2 (Contributor, Author):
@oneonestar Addressed the comment, moving all logic to QueryCacheManager given the new cache design.
oneonestar reviewed Feb 9, 2026:
gateway-ha/src/main/java/io/trino/gateway/ha/router/BaseRoutingManager.java
gateway-ha/src/main/java/io/trino/gateway/ha/cache/QueryCacheManager.java (outdated)
gateway-ha/src/main/java/io/trino/gateway/proxyserver/ProxyRequestHandler.java (outdated)
gateway-ha/src/main/java/io/trino/gateway/ha/router/BaseRoutingManager.java (outdated)
hpopuri2 (Contributor, Author):
@oneonestar Addressed the comments. One conversation is still open; please let me know your answer there.
hpopuri2 (Contributor, Author):
@oneonestar Addressed the comments. Please review the single cache design.
ebyhr requested changes Feb 11, 2026:
gateway-ha/src/main/java/io/trino/gateway/ha/cache/ValkeyDistributedCache.java (outdated)
gateway-ha/src/main/java/io/trino/gateway/ha/cache/DistributedCache.java (outdated)
gateway-ha/src/main/java/io/trino/gateway/ha/cache/QueryCacheManager.java (outdated)
gateway-ha/src/main/java/io/trino/gateway/ha/cache/ValkeyDistributedCache.java (outdated)
gateway-ha/src/test/java/io/trino/gateway/ha/cache/TestQueryCacheManager.java (outdated)
gateway-ha/src/test/java/io/trino/gateway/ha/router/TestValkeyDistributedCacheIntegration.java (outdated)
gateway-ha/src/test/java/io/trino/gateway/ha/util/TestcontainersUtils.java (outdated)
gateway-ha/src/test/java/io/trino/gateway/ha/util/TestcontainersUtils.java (outdated)
gateway-ha/src/test/java/io/trino/gateway/ha/router/TestValkeyDistributedCacheIntegration.java (outdated)
hpopuri2 (Contributor, Author):
@ebyhr Resolved all comments. Please review the changes.
- Fixed cacheTtlSeconds configuration not being used in ValkeyDistributedCache
- Refactored repetitive distributedCache.isEnabled() checks into helper methods
- Created QueryCacheManager to encapsulate cache management logic
- Moved all cache classes to dedicated io.trino.gateway.ha.cache package
- Renamed DistributedCache interface to Cache for better abstraction

These changes provide better separation of concerns and make the caching infrastructure more maintainable and reusable across the gateway.
Resolved code review comments from @kbhatianr:
1. Applied proper dependency injection pattern in HaGatewayProviderModule
   - Made provider methods static with injected parameters
   - HaGatewayConfiguration is injected (already bound in BaseApp)
2. Simplified ValkeyDistributedCache constructor
   - Accept ValkeyConfiguration object instead of 10 individual parameters
3. Implemented proper DI for QueryCacheManager
   - Added @Provides method in HaGatewayProviderModule
   - Separated concerns: QueryCacheManager handles L2 (distributed cache), BaseRoutingManager owns L1 (LoadingCache)
   - QueryCacheManager is now injected into routing managers
4. Abstracted cache tier orchestration
   - Added getBackend/getRoutingGroup/getExternalUrl methods to QueryCacheManager
   - These methods internally handle L2→L3 fallback and automatic backfilling
   - Eliminated manual cache tier checking from BaseRoutingManager
Use Duration, fix database logging, update documentation
Move all cache logic into QueryCacheManager
…e QueryCacheManager directly
Consolidated three separate caches (backend, routingGroup, externalUrl) into a single cache storing QueryMetadata objects. This reduces cache operations by 3x, ensures atomic updates, and improves consistency across the 3-tier cache architecture (L1: Caffeine, L2: Valkey, L3: Database).

Added @JsonIgnore annotations to prevent Jackson from serializing helper methods (isEmpty, isComplete) as JSON properties, which was causing deserialization failures in distributed cache operations.
Force-pushed from 381dbeb to fdcb77f
hpopuri2 (Contributor, Author):
@ebyhr Resolved all the comments. Please review.
hpopuri2 (Contributor, Author):
@oneonestar, @ebyhr Please review.
Member:
I removed the two unnecessary caches in #923.
## Add Valkey Distributed Cache for Horizontal Scaling
## Summary
This PR implements distributed caching using Valkey to enable horizontal scaling of Trino Gateway. Multiple gateway instances can now share query metadata through a distributed cache layer, ensuring consistent query routing across all instances.
## Motivation
Currently, Trino Gateway uses local in-memory caches (Guava, or Caffeine after the recent migration on main) that are not shared between instances. In multi-instance deployments, this can lead to:
This implementation addresses these limitations while maintaining backward compatibility and graceful degradation.
## Architecture
3-Tier Caching Strategy
Request Flow:
- Check L1
  - Hit: Return immediately
  - Miss: Check L2
    - Hit: Populate L1, return
    - Miss: Check L3
      - Found: Populate L2 + L1, return
      - Not Found: Search backends via HTTP (~200ms)
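
The lookup order above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `TieredLookupSketch`, `findBackend`, and the plain maps standing in for the three tiers are all hypothetical names, and the `backendSearch` function stands in for the HTTP search across backends.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of the L1 -> L2 -> L3 -> backend-search lookup order.
// Plain maps stand in for the local cache (L1), Valkey (L2), and the database (L3).
final class TieredLookupSketch
{
    final Map<String, String> l1 = new ConcurrentHashMap<>();
    final Map<String, String> l2 = new ConcurrentHashMap<>();
    final Map<String, String> l3 = new ConcurrentHashMap<>();
    private final Function<String, Optional<String>> backendSearch;

    TieredLookupSketch(Function<String, Optional<String>> backendSearch)
    {
        this.backendSearch = backendSearch;
    }

    Optional<String> findBackend(String queryId)
    {
        String value = l1.get(queryId);
        if (value != null) {
            return Optional.of(value); // L1 hit: return immediately
        }
        value = l2.get(queryId);
        if (value != null) {
            l1.put(queryId, value); // L2 hit: populate L1, then return
            return Optional.of(value);
        }
        value = l3.get(queryId);
        if (value != null) {
            l2.put(queryId, value); // L3 hit: populate L2 and L1, then return
            l1.put(queryId, value);
            return Optional.of(value);
        }
        // Not found in any tier: fall back to the slow search across backends
        return backendSearch.apply(queryId);
    }
}
```

Whether the result of the backend search is written back into the caches is not stated in the flow, so the sketch leaves the tiers untouched in that case.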
Cache Keys
Three values are cached for each query:
All keys use a configurable TTL (default: 30 minutes, i.e. 1800 seconds).
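
As a rough illustration of the key and TTL handling, the sketch below builds a key and checks expiry against the 30-minute default. The `trino:query:` prefix matches the key pattern used in the migration steps of this description; the rest of the key layout, the class name `CacheKeySketch`, and the `isExpired` helper are assumptions. In a real deployment the expiry would normally be enforced by Valkey itself rather than by client code.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper showing a key layout and the default TTL arithmetic.
final class CacheKeySketch
{
    // Prefix taken from the `trino:query:*` pattern; the suffix layout is an assumption
    static final String KEY_PREFIX = "trino:query:";
    // Default TTL from the description: 30 minutes (1800 seconds)
    static final Duration DEFAULT_TTL = Duration.ofMinutes(30);

    private CacheKeySketch() {}

    static String keyFor(String queryId)
    {
        return KEY_PREFIX + queryId;
    }

    // An entry counts as expired once `now` reaches writtenAt + ttl
    static boolean isExpired(Instant writtenAt, Instant now, Duration ttl)
    {
        return !now.isBefore(writtenAt.plus(ttl));
    }
}
```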
## Implementation Details
Core Components
ValkeyConfiguration (gateway-ha/src/main/java/io/trino/gateway/ha/config/ValkeyConfiguration.java)
Cache Interface (gateway-ha/src/main/java/io/trino/gateway/ha/cache/Cache.java)
ValkeyDistributedCache (gateway-ha/src/main/java/io/trino/gateway/ha/cache/ValkeyDistributedCache.java)
QueryCacheManager (gateway-ha/src/main/java/io/trino/gateway/ha/cache/QueryCacheManager.java) - NEW
NoopDistributedCache (gateway-ha/src/test/java/io/trino/gateway/ha/cache/NoopDistributedCache.java)
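
To make the relationship between the cache abstraction and its no-op variant concrete, here is a minimal sketch. Only `isEnabled()` is named in this PR's commit messages; the interface name `CacheSketch`, the other method names, and the string-valued signatures are assumptions for illustration and may differ from the actual `Cache` interface.

```java
import java.util.Optional;

// Hypothetical shape of the cache abstraction; not the PR's actual interface.
interface CacheSketch
{
    Optional<String> get(String key);

    void set(String key, String value);

    void invalidate(String key);

    boolean isEnabled();
}

// No-op variant: stores nothing and reports itself as disabled,
// so callers can skip the distributed tier without null checks.
final class NoopCacheSketch
        implements CacheSketch
{
    @Override
    public Optional<String> get(String key)
    {
        return Optional.empty();
    }

    @Override
    public void set(String key, String value) {}

    @Override
    public void invalidate(String key) {}

    @Override
    public boolean isEnabled()
    {
        return false;
    }
}
```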
Integration
BaseRoutingManager - Simplified routing logic:
ProxyRequestHandler - Query submission:
HaGatewayProviderModule - Dependency injection:
Configuration
Minimal (Recommended for Getting Started)
With Authentication
Advanced (Production Tuning)
Single Instance (No Changes Required)
## Testing
Unit Tests
TestValkeyConfiguration
TestValkeyDistributedCache (2 tests)
Integration Tests
TestValkeyDistributedCacheIntegration (9 comprehensive tests using Testcontainers)
TestRoutingManagerExternalUrlCache (6 tests)
Testcontainers Setup
Test Results
## Backward Compatibility
✅ Fully backward compatible
Migration Path
From Single to Multi-Gateway:
docker run -d --name valkey -p 6379:6379 valkey/valkey:latest

valkeyConfiguration:
  enabled: true
  host: valkey.internal
  port: 6379
  password: ${VALKEY_PASSWORD}

Check Valkey keys:
docker exec valkey valkey-cli KEYS "trino:query:*"
No data migration needed - cache populates automatically.
## Graceful Degradation
When Valkey is unavailable:
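
A minimal sketch of one way such degradation could work, assuming a failed cache read is simply treated as a miss so the caller falls through to the next tier. `DegradingReadSketch` and `readOrMiss` are hypothetical names, and the `remoteGet` function stands in for a call to the distributed cache; the PR's actual error handling may differ.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper: a failing distributed-cache read degrades to a cache miss.
final class DegradingReadSketch
{
    private DegradingReadSketch() {}

    static Optional<String> readOrMiss(Function<String, Optional<String>> remoteGet, String key)
    {
        try {
            return remoteGet.apply(key);
        }
        catch (RuntimeException e) {
            // Cache outage: report a miss instead of failing the request
            return Optional.empty();
        }
    }
}
```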
Dependencies
Added:
### Code Quality Improvements
New Files (8)
Core Implementation:
Tests:
Modified Files (10)
Configuration:
Core:
Build:
Tests:
Future Enhancements