Skip to content

Commit 9884f21

Browse files
committed
BUG-8180 fixing memory leak with project
1 parent f4967e4 commit 9884f21

File tree

4 files changed

+558
-0
lines changed

4 files changed

+558
-0
lines changed

.github/copilot-instructions.md

Lines changed: 126 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,126 @@
1+
# Optimizely Ruby SDK - AI Coding Agent Instructions
2+
3+
## Architecture Overview
4+
5+
This is the **Optimizely Ruby SDK** for A/B testing and feature flag management. Key architectural patterns:
6+
7+
### Core Components
8+
- **`Optimizely::Project`** (`lib/optimizely.rb`): Main client class, entry point for all SDK operations
9+
- **`DecisionService`** (`lib/optimizely/decision_service.rb`): Core bucketing logic with 7-step decision flow (status→forced→whitelist→sticky→audience→CMAB→hash)
10+
- **`ProjectConfig`** (`lib/optimizely/config/`): Manages datafile parsing and experiment/feature configuration
11+
- **Config Managers** (`lib/optimizely/config_manager/`): Handle datafile fetching - `HTTPProjectConfigManager` for polling, `StaticProjectConfigManager` for static files
12+
13+
### Data Flow Patterns
14+
- **User Context**: `OptimizelyUserContext` wraps user ID + attributes for decision APIs
15+
- **Decision Pipeline**: All feature flags/experiments flow through `DecisionService.get_variation_for_feature()`
16+
- **Event Processing**: `BatchEventProcessor` queues and batches impression/conversion events
17+
- **CMAB Integration**: Contextual Multi-Armed Bandit system in `lib/optimizely/cmab/` for advanced optimization
18+
19+
## Development Workflows
20+
21+
### Testing
22+
```bash
23+
# Run all tests
24+
bundle exec rake spec
25+
26+
# Run specific test file
27+
bundle exec rspec spec/decision_service_spec.rb
28+
29+
# Run linting
30+
bundle exec rubocop
31+
```
32+
33+
### Build & Gem Management
34+
```bash
35+
# Build gem locally
36+
rake build # outputs to /pkg/
37+
38+
# Run benchmarks
39+
rake benchmark
40+
```
41+
42+
## Project-Specific Conventions
43+
44+
### Ruby Patterns
45+
- **Frozen string literals**: All files start with `# frozen_string_literal: true`
46+
- **Module namespacing**: Everything under `Optimizely::` module
47+
- **Struct for data**: Use `Struct.new()` for decision results (e.g., `DecisionService::Decision`)
48+
- **Factory pattern**: `OptimizelyFactory` provides preset SDK configurations
49+
50+
### Naming Conventions
51+
- **Experiment/Feature keys**: String identifiers from Optimizely dashboard
52+
- **User bucketing**: `bucketing_id` vs `user_id` (can be different for sticky bucketing)
53+
- **Decision sources**: `'EXPERIMENT'`, `'FEATURE_TEST'`, `'ROLLOUT'` constants in `DecisionService`
54+
55+
### Error Handling
56+
- **Graceful degradation**: Invalid configs return `nil`/default values, never crash
57+
- **Logging levels**: Use `Logger::ERROR`, `Logger::INFO`, `Logger::DEBUG` consistently
58+
- **User input validation**: `Helpers::Validator` validates all public API inputs
59+
60+
## Critical Integration Points
61+
62+
### Datafile Management
63+
- **HTTPProjectConfigManager**: Polls Optimizely CDN every 5 minutes by default
64+
- **Notification system**: Subscribe to `OPTIMIZELY_CONFIG_UPDATE` for datafile changes
65+
- **ODP integration**: On datafile update, call `update_odp_config_on_datafile_update()`
66+
67+
### Event Architecture
68+
- **BatchEventProcessor**: Default 10 events/batch, 30s flush interval
69+
- **Thread safety**: SDK spawns background threads for polling and event processing
70+
- **Graceful shutdown**: Always call `optimizely.close()` to flush events
71+
72+
### Feature Flag Decision API
73+
- **New pattern**: Use `decide()` API, not legacy `is_feature_enabled()`
74+
- **Decision options**: `OptimizelyDecideOption` constants control behavior (exclude variables, include reasons, etc.)
75+
- **Forced decisions**: Override bucketing via `set_forced_variation()` or user profile service
76+
77+
## Test Patterns
78+
- **Spec location**: One spec file per class in `/spec/` directory
79+
- **WebMock**: Mock HTTP requests in tests (`require 'webmock/rspec'`)
80+
- **Test data**: Use `spec_params.rb` for shared test constants
81+
- **Coverage**: Coveralls integration via GitHub Actions
82+
83+
## Common Gotchas
84+
- **Thread spawning**: Initialize SDK after forking processes in web servers
85+
- **Bucketing consistency**: Use same `bucketing_id` across SDK calls for sticky behavior
86+
- **CMAB caching**: Context-aware ML decisions cache by user - use appropriate cache invalidation options
87+
- **ODP events**: Require at least one identifier, auto-add default event data
88+
89+
## Critical Memory Leak Prevention
90+
91+
### **NEVER** Initialize Project Per Request
92+
```ruby
93+
# ❌ MEMORY LEAK - Creates new threads every request
94+
def get_optimizely_client
95+
Optimizely::Project.new(datafile: fetch_datafile)
96+
end
97+
98+
# ✅ CORRECT - Singleton pattern with proper cleanup
99+
class OptimizelyService
100+
def self.instance
101+
@instance ||= Optimizely::Project.new(datafile: fetch_datafile)
102+
end
103+
104+
def self.reset!
105+
@instance&.close # Critical: stops background threads
106+
@instance = nil
107+
end
108+
end
109+
```
110+
111+
### Thread Management
112+
- **Background threads**: `Project.new()` spawns multiple threads (`BatchEventProcessor`, `OdpEventManager`, config polling)
113+
- **Memory accumulation**: Each initialization creates new threads that persist until explicitly stopped
114+
- **Proper cleanup**: Always call `optimizely_instance.close()` before dereferencing
115+
- **Rails deployment**: Use singleton pattern or application-level initialization, never per-request
116+
117+
### Production Patterns
118+
```ruby
119+
# Application initialization (config/initializers/optimizely.rb)
120+
OPTIMIZELY_CLIENT = Optimizely::Project.new(
121+
sdk_key: ENV['OPTIMIZELY_SDK_KEY']
122+
)
123+
124+
# Graceful shutdown (config/application.rb)
125+
at_exit { OPTIMIZELY_CLIENT&.close }
126+
```

lib/optimizely.rb

Lines changed: 102 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -54,11 +54,88 @@ class Project
5454
DEFAULT_CMAB_CACHE_TIMEOUT = (30 * 60 * 1000)
5555
DEFAULT_CMAB_CACHE_SIZE = 1000
5656

57+
# Class-level instance cache to prevent memory leaks from repeated initialization
58+
@@instance_cache = {}
59+
@@cache_mutex = Mutex.new
60+
5761
attr_reader :notification_center
5862
# @api no-doc
5963
attr_reader :config_manager, :decision_service, :error_handler, :event_dispatcher,
6064
:event_processor, :logger, :odp_manager, :stopped
6165

66+
# Get or create a cached Project instance for static datafile configurations
67+
# This prevents memory leaks when the same datafile is used repeatedly
68+
#
69+
# @param datafile - JSON string representing the project
70+
# @param options - Hash of initialization options (optional)
71+
# @return [Project] Cached or new Project instance
72+
def self.get_or_create_instance(datafile: nil, **options)
73+
# Only cache static datafile configurations (no sdk_key, no custom managers)
74+
return new(datafile: datafile, **options) if should_skip_cache?(datafile, options)
75+
76+
cache_key = generate_cache_key(datafile, options)
77+
78+
@@cache_mutex.synchronize do
79+
# Return existing instance if available and not stopped
80+
if @@instance_cache[cache_key] && !@@instance_cache[cache_key].stopped
81+
return @@instance_cache[cache_key]
82+
end
83+
84+
# Create new instance and cache it
85+
instance = new(datafile: datafile, **options)
86+
@@instance_cache[cache_key] = instance
87+
instance
88+
end
89+
end
90+
91+
# Clear all cached instances and properly close them
92+
def self.clear_instance_cache!
93+
@@cache_mutex.synchronize do
94+
# First stop all instances without removing them from cache to avoid deadlock
95+
@@instance_cache.each_value do |instance|
96+
next if instance.stopped
97+
98+
instance.instance_variable_set(:@stopped, true)
99+
instance.config_manager.stop! if instance.config_manager.respond_to?(:stop!)
100+
instance.event_processor.stop! if instance.event_processor.respond_to?(:stop!)
101+
instance.odp_manager.stop!
102+
end
103+
# Then clear the cache
104+
@@instance_cache.clear
105+
end
106+
end
107+
108+
# Get count of cached instances (for testing/monitoring)
109+
def self.cached_instance_count
110+
@@cache_mutex.synchronize { @@instance_cache.size }
111+
end
112+
113+
private_class_method def self.should_skip_cache?(datafile, options)
114+
# Don't cache if using dynamic features that would make sharing unsafe
115+
return true if options[:sdk_key] ||
116+
options[:config_manager] ||
117+
options[:event_processor] ||
118+
options[:user_profile_service] ||
119+
datafile.nil? || datafile.empty?
120+
121+
# Also don't cache if custom loggers or error handlers that might have state
122+
return true if options[:logger] || options[:error_handler] || options[:event_dispatcher]
123+
124+
false
125+
end
126+
127+
private_class_method def self.generate_cache_key(datafile, options)
128+
# Create cache key from datafile content and relevant options
129+
require 'digest'
130+
content_hash = Digest::SHA256.hexdigest(datafile)
131+
options_hash = {
132+
skip_json_validation: options[:skip_json_validation],
133+
default_decide_options: options[:default_decide_options]&.sort,
134+
event_processor_options: options[:event_processor_options]
135+
}
136+
"#{content_hash}_#{Digest::SHA256.hexdigest(options_hash.to_s)}"
137+
end
138+
62139
# Constructor for Projects.
63140
#
64141
# @param datafile - JSON string representing the project.
@@ -163,6 +240,23 @@ def initialize(
163240
flush_interval: event_processor_options[:flush_interval] || BatchEventProcessor::DEFAULT_BATCH_INTERVAL
164241
)
165242
end
243+
244+
# Set up finalizer to ensure cleanup if close() is not called explicitly
245+
ObjectSpace.define_finalizer(self, self.class.create_finalizer(@config_manager, @event_processor, @odp_manager))
246+
end
247+
248+
# Create finalizer proc to clean up background threads
249+
# This ensures cleanup even if close() is not explicitly called
250+
def self.create_finalizer(config_manager, event_processor, odp_manager)
251+
proc do
252+
begin
253+
config_manager.stop! if config_manager.respond_to?(:stop!)
254+
event_processor.stop! if event_processor.respond_to?(:stop!)
255+
odp_manager.stop! if odp_manager.respond_to?(:stop!)
256+
rescue
257+
# Suppress errors during finalization to avoid issues during GC
258+
end
259+
end
166260
end
167261

168262
# Create a context of the user for which decision APIs will be called.
@@ -936,6 +1030,14 @@ def close
9361030
@config_manager.stop! if @config_manager.respond_to?(:stop!)
9371031
@event_processor.stop! if @event_processor.respond_to?(:stop!)
9381032
@odp_manager.stop!
1033+
1034+
# Remove this instance from the cache if it exists
1035+
# Note: we don't synchronize here to avoid deadlock when called from clear_instance_cache!
1036+
self.class.send(:remove_from_cache_unsafe, self)
1037+
end
1038+
1039+
private_class_method def self.remove_from_cache_unsafe(instance)
1040+
@@instance_cache.delete_if { |_key, cached_instance| cached_instance == instance }
9391041
end
9401042

9411043
def get_optimizely_config

0 commit comments

Comments
 (0)