@drborges
Owner

@drborges drborges commented Nov 13, 2025

Summary

Add support for the Stale-While-Revalidate (SWR) prompt caching strategy.

Motivation

With the current caching strategy, the first request after cache expiry must wait for the Langfuse API call (~100ms). Even with stampede protection preventing 1,200 simultaneous API calls, one user still pays the latency cost.

The goal for SWR is to serve slightly outdated (stale) data immediately while refreshing the cache in the background so users get instant responses (~1ms).
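
To make the intended behavior concrete, here is a minimal in-memory sketch of the SWR read path. It is not the code in this PR: `SwrCache`, its `fetch` and `wait_for_refreshes` methods, and the injectable `clock` are hypothetical names, and it spawns a plain `Thread` per refresh where the PR uses a thread pool.

```ruby
# Hypothetical illustration of stale-while-revalidate; not the SDK's implementation.
class SwrCache
  Entry = Struct.new(:value, :fresh_until, :stale_until)

  def initialize(ttl:, stale_ttl:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl
    @stale_ttl = stale_ttl
    @clock = clock
    @entries = {}
    @refreshing = {} # key => in-flight refresh thread
    @mutex = Mutex.new
  end

  # Returns the cached value for key; the block loads it from the origin (e.g. the API).
  def fetch(key, &loader)
    now = @clock.call
    entry = @mutex.synchronize { @entries[key] }

    return entry.value if entry && now < entry.fresh_until # fresh hit
    if entry && now < entry.stale_until                    # stale hit: serve now, refresh later
      refresh_in_background(key, &loader)
      return entry.value
    end

    store(key, loader.call) # miss or fully expired: the caller waits for the origin
  end

  # Blocks until in-flight refreshes finish; mainly useful in tests.
  def wait_for_refreshes
    @mutex.synchronize { @refreshing.values }.each(&:join)
  end

  private

  def store(key, value)
    now = @clock.call
    @mutex.synchronize do
      @entries[key] = Entry.new(value, now + @ttl, now + @ttl + @stale_ttl)
    end
    value
  end

  def refresh_in_background(key, &loader)
    @mutex.synchronize do
      return if @refreshing.key?(key) # a refresh for this key is already running

      @refreshing[key] = Thread.new do
        begin
          store(key, loader.call)
        rescue StandardError
          nil # origin failed: keep serving the stale value
        ensure
          @mutex.synchronize { @refreshing.delete(key) }
        end
      end
    end
  end
end
```

Only a miss or a fully expired entry makes the caller wait on the origin; a stale hit returns immediately and at most one refresh per key runs at a time.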

Changes

TODO

Testing

TODO

Usage

Configure the client:

Langfuse.configure do |config|
  config.public_key = ENV['LANGFUSE_PUBLIC_KEY']
  config.secret_key = ENV['LANGFUSE_SECRET_KEY']
  
  # Required: Use Rails cache backend
  config.cache_backend = :rails
  config.cache_ttl = 300 # Fresh for 5 minutes
  
  # Enable SWR
  config.cache_stale_while_revalidate = true
  config.cache_stale_ttl = 300 # Grace period: 5 more minutes
  config.cache_refresh_threads = 5 # Background thread pool size
end

Once configured, SWR works transparently:

client = Langfuse.client

# First request - populates cache
prompt = client.get_prompt("greeting") # ~100ms (API call)

# Subsequent requests while fresh
prompt = client.get_prompt("greeting") # ~1ms (cache hit)

# After cache_ttl expires but within grace period
prompt = client.get_prompt("greeting") # ~1ms (stale data + background refresh)

# Background refresh completes, next request gets fresh data
prompt = client.get_prompt("greeting") # ~1ms (fresh cache)
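
The timeline above can be read as three states derived from the entry's age. The helper below is a hypothetical sketch (`cache_state` is not part of the SDK), and it assumes `cache_stale_ttl` is added on top of `cache_ttl`, as the "5 more minutes" comment in the configuration suggests. The exact boundary handling is also an assumption.

```ruby
# Hypothetical helper: classifies a cache entry by its age in seconds.
# Assumes the grace period starts when cache_ttl ends.
def cache_state(age, cache_ttl: 300, cache_stale_ttl: 300)
  if age < cache_ttl
    :fresh   # served from cache, no refresh
  elsif age < cache_ttl + cache_stale_ttl
    :stale   # served from cache, refresh scheduled in the background
  else
    :expired # caller waits for the API call
  end
end
```

With the values from the configuration example, an entry is fresh for the first 5 minutes, stale for the next 5, and expired after 10.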

References

TODO

  • Double check a few assumptions:
    • Is the in-memory cache expected to support SWR? My assumption is yes, given the suggestions around enhancing CacheEntry.
    • It would be interesting to have the SWR module reused across cache implementations if possible (memory vs Rails adapter for instance)
  • Update PR description with architecture / test coverage information

- Add SWR configuration options (cache_stale_while_revalidate, cache_stale_ttl, cache_refresh_threads)
- Enhance RailsCacheAdapter with SWR support and background refresh
- Integrate SWR detection and usage in ApiClient
- Add concurrent-ruby dependency for thread pool management
- Implement comprehensive test suite (53 new tests)
- Add example usage and documentation
- Maintain backward compatibility and 97.78% test coverage

Follows design spec in docs/future-enhancements/STALE_WHILE_REVALIDATE_DESIGN.md

Key benefits:
- Near-instant response times (~1ms vs ~100ms)
- Background refresh prevents user-facing latency
- Graceful degradation with stale data during API issues
- Thread-safe implementation with stampede protection
- Consolidated memoized helpers to comply with rubocop limits
- Fixed RSpec/ContextWording violations with proper context descriptions
- Fixed RSpec/VerifiedDoubleReference by using class constants
- Added rubocop exception for integration test length (complex scenario)
- Converted excessive let blocks to inline variables where appropriate
- All 53 SWR tests still passing with clean rubocop compliance

Copilot AI left a comment

Pull Request Overview

This PR implements Stale-While-Revalidate (SWR) caching for the Langfuse Ruby SDK to provide near-instant response times (~1ms vs ~100ms) for prompt fetching. The implementation adds background refresh capabilities with thread pool management, allowing stale data to be served immediately while fresh data is fetched asynchronously.

Key changes:

  • Added SWR configuration options (cache_stale_while_revalidate, cache_stale_ttl, cache_refresh_threads) with validation
  • Enhanced RailsCacheAdapter with SWR support, metadata storage, and background refresh using concurrent-ruby
  • Integrated SWR detection in ApiClient with graceful fallback to existing caching strategies

Reviewed Changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| lib/langfuse/rails_cache_adapter.rb | Core SWR implementation with fetch_with_stale_while_revalidate, metadata handling, thread pool management, and refresh locks |
| lib/langfuse/config.rb | Added SWR configuration options with validation ensuring Rails backend requirement |
| lib/langfuse/client.rb | Updated cache initialization to conditionally enable SWR based on configuration |
| lib/langfuse/api_client.rb | Added caching strategy detection to use SWR when available with fallback chain |
| langfuse.gemspec | Added concurrent-ruby (~> 1.2) dependency for thread pool management |
| Gemfile.lock | Updated with concurrent-ruby 1.3.5 dependency |
| spec/langfuse/rails_cache_adapter_swr_spec.rb | Comprehensive test suite covering SWR lifecycle, metadata, locks, and error cases |
| spec/langfuse/config_swr_spec.rb | Configuration validation tests for SWR options |
| spec/langfuse/api_client_swr_spec.rb | Integration tests for ApiClient SWR detection and usage |
| examples/swr_cache_example.rb | Demonstration of SWR configuration and usage patterns |
| SWR_FEATURE_README.md | Complete documentation of SWR feature, configuration, and best practices |
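
The "caching strategy detection ... with fallback chain" described for `lib/langfuse/api_client.rb` could be sketched with duck typing as below. Only `fetch_with_stale_while_revalidate` is a name taken from this PR; `fetch_with_best_strategy`, `fetch_with_lock`, `get`, and `set` are hypothetical stand-ins for whatever the existing adapters expose.

```ruby
# Hypothetical sketch of strategy detection; the real ApiClient may differ.
# The block performs the actual API call.
def fetch_with_best_strategy(cache, key, &loader)
  return loader.call if cache.nil? # caching disabled

  if cache.respond_to?(:fetch_with_stale_while_revalidate)
    # Preferred: serve stale data and refresh in the background.
    cache.fetch_with_stale_while_revalidate(key, &loader)
  elsif cache.respond_to?(:fetch_with_lock)
    # Assumed name for the existing stampede-protected fetch.
    cache.fetch_with_lock(key, &loader)
  else
    # Plain get/set fallback (assumed adapter interface).
    cached = cache.get(key)
    return cached unless cached.nil?

    value = loader.call
    cache.set(key, value)
    value
  end
end
```

Checking capabilities with `respond_to?` keeps the caller independent of which adapter is configured, so adapters without SWR support continue to work unchanged.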

@drborges
Owner Author

@copilot review this PR again and let me know whether any logic can be simplified or any test coverage is missing.

Copilot AI commented Nov 26, 2025

@drborges I've opened a new pull request, #2, to work on those changes. Once the pull request is ready, I'll request review from you.

@drborges
Owner Author

Closing in favor of simplepractice#35

@drborges drborges closed this Nov 28, 2025