Commit a180d2e

Support for resolving sticky assignments (#31)
* add basic support for sticky assignments
* more sticky
* fixup! more sticky
* cleanup
* fixup! cleanup
* add resolving sticky
* fail fast on the first missing materialization info
* add documentation about how sticky assignments are resolved and what is the algorithm to discover missing materializations and resolve
* wrap the resolve request in resolve with sticky req
* fixup! wrap the resolve request in resolve with sticky req
* fix clippy
* fixup! fix clippy
* fix clippy
* move to wasm_api
* move collect deps out of resolve flag
* fixup! move collect deps out of resolve flag
* fix comments
* move the materialization types to the response
1 parent 20284b9 commit a180d2e

6 files changed: +727 -61 lines changed


STICKY_ASSIGNMENTS.md

Lines changed: 271 additions & 0 deletions
@@ -0,0 +1,271 @@
# Sticky Assignments in Confidence Flag Resolver

## Overview

Sticky assignments are a feature in the Confidence Flag Resolver that allows flag assignments to persist across multiple resolve requests. This ensures consistent user experiences and enables advanced experimentation workflows by maintaining assignment state over time.

## What are Sticky Assignments?

Sticky assignments work by storing flag assignment information (materializations) that can be referenced in future resolve requests. Instead of randomly assigning users to variants each time a flag is resolved, the system can "stick" to previous assignments when certain conditions are met.

### Key Concepts

- **Materialization**: The persisted record of a flag assignment for a specific unit (user/entity)
- **Unit**: The entity being assigned (typically a user ID or targeting key)
- **Materialization Context**: Information about previous assignments passed to the resolver
- **Read/Write Materialization**: Rules specify whether to read from or write to materializations

## How It Works

### 1. Materialization Specification

Each flag rule can include a `MaterializationSpec` that defines:

```protobuf
message MaterializationSpec {
  // Where to read previous assignments from
  string read_materialization = 2;

  // Where to write new assignments to
  string write_materialization = 1;

  // How materialization reads should be treated
  MaterializationReadMode mode = 3;
}
```

### 2. Materialization Read Mode

The `MaterializationReadMode` controls how materializations interact with normal targeting:

```protobuf
message MaterializationReadMode {
  // If true, only units in the materialization will be considered
  // If false, units match if they're in materialization OR match segment
  bool materialization_must_match = 1;

  // If true, segment targeting is ignored for units in materialization
  // If false, both materialization and segment must match
  bool segment_targeting_can_be_ignored = 2;
}
```

### 3. Resolution Process

When resolving a flag with sticky assignments enabled:

1. **Check Dependencies**: Verify all required materializations are available
2. **Read Materialization**: Check if the unit has a previous assignment for this rule
3. **Apply Logic**: Based on `MaterializationReadMode`, determine if the stored assignment should be used
4. **Write Materialization**: If a new assignment is made and a write materialization is specified, store it

### 4. Materialization Context

The resolver accepts a `MaterializationContext` containing previous assignments:

```protobuf
message MaterializationContext {
  map<string, MaterializationInfo> unit_materialization_info = 1;
}

message MaterializationInfo {
  bool unit_in_info = 1;
  map<string, string> rule_to_variant = 2;
}
```

## Usage Patterns

### Basic Sticky Assignment

A rule with both read and write materialization will:
1. Check if the unit was previously assigned
2. Use the previous assignment if available
3. Store new assignments for future use

### Paused Intake

Setting `materialization_must_match = true` creates "paused intake":
- Only units already in the materialization will match the rule
- New units will skip this rule entirely
- Useful for controlled rollout scenarios

### Override Targeting

Setting `segment_targeting_can_be_ignored = true` allows:
- Units in materialization match the rule regardless of segment targeting
- Segment allocation proportions are ignored for these units
- Useful for maintaining assignments when targeting rules change
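
The two read-mode flags can be folded into a single match decision. The following is a minimal Rust sketch of one reading of the proto comments and usage patterns above, not the resolver's actual code. The struct mirrors the proto message by hand, and `rule_applies` with its parameters is a hypothetical helper introduced only for illustration.

```rust
/// Hand-written mirror of the `MaterializationReadMode` proto message.
#[derive(Clone, Copy, Debug, Default)]
pub struct MaterializationReadMode {
    pub materialization_must_match: bool,
    pub segment_targeting_can_be_ignored: bool,
}

/// Hypothetical helper: decides whether a rule applies to a unit.
///
/// * `unit_in_materialization`: the unit already has a stored assignment.
/// * `segment_matches`: the unit passes the rule's normal segment targeting.
pub fn rule_applies(
    mode: MaterializationReadMode,
    unit_in_materialization: bool,
    segment_matches: bool,
) -> bool {
    if unit_in_materialization {
        // Stored units keep matching when targeting may be ignored;
        // otherwise they must still pass segment targeting.
        mode.segment_targeting_can_be_ignored || segment_matches
    } else {
        // New units are only admitted when intake is not paused
        // and they pass segment targeting.
        !mode.materialization_must_match && segment_matches
    }
}
```

Under this reading, "paused intake" rejects a new unit even if it matches the segment, while "override targeting" keeps a stored unit on the rule even if the segment no longer matches.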

## API Integration

### Enable Sticky Assignments

Set `process_sticky = true` in the resolve request:

```protobuf
message ResolveFlagsRequest {
  // ... other fields ...

  // if the resolver should handle sticky assignments
  bool process_sticky = 6;

  // Context about the materialization required for the resolve
  MaterializationContext materialization_context = 7;

  // if a materialization info is missing, return immediately
  bool fail_fast_on_sticky = 8;
}
```

### Handling Missing Materializations

The resolver may return `MissingMaterializations` when required materialization data is unavailable:

```protobuf
message ResolveFlagResponseResult {
  oneof resolve_result {
    ResolveFlagsResponse response = 1;
    MissingMaterializations missing_materializations = 2;
  }
}

message MissingMaterializationItem {
  string unit = 1;
  string rule = 2;
  string read_materialization = 3;
}
```

## Use Cases

### 1. Consistent User Experience

Ensure users see the same variant across app sessions and devices by storing their assignments in a shared materialization store.

### 2. Experiment Analysis

Maintain assignment consistency during long-running experiments, even when targeting rules or traffic allocation changes.

### 3. Migration Scenarios

Gradually migrate users from one variant to another by updating materializations over time.

### 4. Controlled Rollout

Use "paused intake" mode to limit new user assignments while maintaining existing ones.

## Implementation Details

### Materialization Updates

When assignments are made, the resolver returns `MaterializationUpdate` objects:

```protobuf
message MaterializationUpdate {
  string unit = 1;
  string write_materialization = 2;
  string rule = 3;
  string variant = 4;
}
```

These should be persisted by the client for use in future resolve requests.
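
As an illustration of what client-side persistence could look like, here is a small in-memory sketch in Rust. It is an assumption-laden example rather than part of the resolver: `MaterializationStore`, `apply`, and `variant_for` are hypothetical names, the update struct is a hand-written mirror of the proto message (not generated code), and a real client would presumably back this with durable storage.

```rust
use std::collections::HashMap;

/// Hand-written mirror of the `MaterializationUpdate` proto message.
#[derive(Clone, Debug, PartialEq)]
pub struct MaterializationUpdate {
    pub unit: String,
    pub write_materialization: String,
    pub rule: String,
    pub variant: String,
}

/// Hypothetical in-memory store, keyed by (materialization, unit),
/// holding a rule -> variant map for each key.
#[derive(Debug, Default)]
pub struct MaterializationStore {
    assignments: HashMap<(String, String), HashMap<String, String>>,
}

impl MaterializationStore {
    /// Records every update; a later update for the same rule overwrites the earlier one.
    pub fn apply(&mut self, updates: &[MaterializationUpdate]) {
        for update in updates {
            self.assignments
                .entry((update.write_materialization.clone(), update.unit.clone()))
                .or_default()
                .insert(update.rule.clone(), update.variant.clone());
        }
    }

    /// Looks up the stored variant for a unit and rule, if any.
    pub fn variant_for(&self, materialization: &str, unit: &str, rule: &str) -> Option<&str> {
        self.assignments
            .get(&(materialization.to_string(), unit.to_string()))
            .and_then(|rules| rules.get(rule))
            .map(String::as_str)
    }
}
```

A client could call `apply` with the updates from each successful response and consult `variant_for` when building the materialization context for the next request.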

### Error Handling

- **Missing Materializations**: When required materialization data is unavailable
- **Fail Fast**: `fail_fast_on_sticky` controls whether to return immediately or continue processing
- **Dependency Checking**: The resolver validates all materialization dependencies before evaluation

## Advanced Optimizations

### Fail Fast on First Missing Materialization

The `fail_fast_on_sticky` parameter provides a performance optimization for handling missing materializations:

**Behavior:**
- When `fail_fast_on_sticky = true`: As soon as any flag encounters a missing materialization dependency, the resolver immediately returns all accumulated missing materializations without processing remaining flags
- When `fail_fast_on_sticky = false`: The resolver continues processing all flags and collects all missing materializations before returning

**Use Cases:**
- **Discovery Mode**: Set to `false` when you want to collect all missing materializations across all flags in a single request
- **Production Mode**: Set to `true` when you want immediate feedback about missing dependencies to avoid unnecessary processing

**Example Flow:**
```
Flag A: ✅ Has materialization → Process normally
Flag B: ❌ Missing materialization + fail_fast=true → Return immediately with [Flag B missing item]
Flag C: (Not processed due to fail_fast)
```
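
The flow above can be sketched as a loop with an early exit. This is a simplified Rust illustration, not the resolver's implementation: `FlagCheck` and `collect_missing` are hypothetical names, and each flag is assumed to already know which items it would report as missing.

```rust
/// Hand-written mirror of the `MissingMaterializationItem` proto message.
#[derive(Clone, Debug, PartialEq)]
pub struct MissingMaterializationItem {
    pub unit: String,
    pub rule: String,
    pub read_materialization: String,
}

/// Hypothetical stand-in for a flag: its name plus the missing items
/// its rules would report during dependency checking.
pub struct FlagCheck {
    pub name: String,
    pub missing: Vec<MissingMaterializationItem>,
}

/// Walks the flags in order and accumulates missing items.
/// With `fail_fast_on_sticky`, stops after the first flag that reports any.
pub fn collect_missing(
    flags: &[FlagCheck],
    fail_fast_on_sticky: bool,
) -> Vec<MissingMaterializationItem> {
    let mut missing = Vec::new();
    for flag in flags {
        missing.extend(flag.missing.iter().cloned());
        if fail_fast_on_sticky && !missing.is_empty() {
            // Remaining flags are not processed at all.
            break;
        }
    }
    missing
}
```

With flags A, B, and C as in the example flow, fail-fast mode would return only B's item, while the non-fail-fast mode would also include anything C reports.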

### Rule Evaluation Skipping Optimization

The resolver implements a sophisticated optimization to avoid unnecessary rule evaluation when materialization dependencies are missing:

**The `skip_on_not_missing` Mechanism:**

1. **Dependency Discovery Phase**: When processing multiple flags, if any previous flag had missing materializations, subsequent flags enter "discovery mode"

2. **Two-Pass Evaluation**:
   - **Pass 1**: Check for missing materializations only (skip rule evaluation)
   - **Pass 2**: If all materializations are available, re-evaluate with full rule processing

3. **Optimization Logic**:
   ```rust
   skip_on_not_missing: !missing_materialization_items.is_empty()
   ```

**How It Works:**

```
Processing Flag 1:
├── Rule 1: Missing materialization X → Collect missing item
├── Rule 2: Skip evaluation (skip_on_not_missing=true)
├── Result: Flag 1 has missing materializations

Processing Flag 2:
├── skip_on_not_missing = true (because Flag 1 had missing deps)
├── All rules: Only check for missing materializations, don't evaluate
├── Result: Collect any additional missing items for Flag 2
```

**Benefits:**
- **Performance**: Avoids expensive rule evaluation (segment matching, bucket calculation) when dependencies are missing
- **Consistency**: Ensures all missing materializations are discovered before any rule evaluation begins
- **Atomicity**: Either all flags resolve successfully with their materializations, or all missing dependencies are returned

**Complete Resolution Flow:**

1. **First Pass**: Process all flags in discovery mode to find all missing materializations
2. **Early Return**: If `fail_fast_on_sticky=true` and missing deps found, return immediately
3. **Second Pass**: If all materializations available, re-process all flags with full evaluation
4. **Success**: Return resolved flags with materialization updates

This optimization ensures efficient handling of complex dependency graphs while maintaining correctness and performance.
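
To make the skipping behaviour concrete, here is a deliberately simplified Rust sketch. It is not the resolver's code: `Rule`, `Flag`, `Outcome`, and `resolve_all` are hypothetical, each rule reads at most one materialization, "evaluation" is reduced to recording the flag name, and the set of available materializations is passed in directly. Only the `skip_on_not_missing` expression is taken from the text above.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified rule that reads at most one materialization.
pub struct Rule {
    pub read_materialization: Option<String>,
}

pub struct Flag {
    pub name: String,
    pub rules: Vec<Rule>,
}

/// Either every flag resolved, or the list of missing materializations.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Resolved(Vec<String>),
    Missing(Vec<String>),
}

pub fn resolve_all(flags: &[Flag], available: &HashSet<String>) -> Outcome {
    let mut missing: Vec<String> = Vec::new();
    let mut resolved: Vec<String> = Vec::new();

    for flag in flags {
        // Once anything is missing, later flags are only scanned
        // for further missing items and are not evaluated.
        let skip_on_not_missing = !missing.is_empty();

        let mut flag_has_missing = false;
        for rule in &flag.rules {
            if let Some(name) = &rule.read_materialization {
                if !available.contains(name) {
                    missing.push(name.clone());
                    flag_has_missing = true;
                }
            }
        }

        if !flag_has_missing && !skip_on_not_missing {
            // Stand-in for segment matching and bucket calculation.
            resolved.push(flag.name.clone());
        }
    }

    if missing.is_empty() {
        Outcome::Resolved(resolved)
    } else {
        // Any partial results are discarded: all-or-nothing.
        Outcome::Missing(missing)
    }
}
```

In this sketch a caller that receives `Outcome::Missing` would load the listed materializations and call `resolve_all` again, which corresponds to the second pass described above.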

### Performance Considerations

- **Materialization lookups happen before rule evaluation**: Dependencies are checked first to avoid expensive operations
- **Failed materialization dependencies skip rule evaluation**: No segment matching or bucket calculation when deps missing
- **Two-phase resolution**: Discovery phase finds all missing deps, evaluation phase only runs when all deps available
- **Batch processing**: Multiple flags can share materialization context for efficient processing

## Best Practices

1. **Consistent Storage**: Use reliable storage for materialization data to ensure assignment consistency
2. **Version Management**: Consider materialization versioning for complex migration scenarios
3. **Monitoring**: Track materialization hit rates and assignment consistency
4. **Testing**: Verify sticky behavior with different materialization states
5. **Cleanup**: Implement materialization cleanup for archived flags or expired experiments

## Example Workflow

1. User requests flag resolution without materialization context
2. Resolver assigns variants and returns `MaterializationUpdate`s
3. Client stores materialization data
4. Subsequent requests include `MaterializationContext`
5. Resolver uses stored assignments when available, creating new ones as needed
6. Process continues with updated materialization context

This approach ensures assignment consistency while allowing new users to be assigned according to current targeting rules.

confidence-cloudflare-resolver/src/lib.rs

Lines changed: 2 additions & 4 deletions
@@ -33,7 +33,7 @@ static FLAGS_LOGS_QUEUE: OnceLock<Queue> = OnceLock::new();
 static CONFIDENCE_CLIENT_ID: OnceLock<String> = OnceLock::new();
 static CONFIDENCE_CLIENT_SECRET: OnceLock<String> = OnceLock::new();
 
-static FLAG_LOGGER: Lazy<Logger> = Lazy::new(|| Logger::new());
+static FLAG_LOGGER: Lazy<Logger> = Lazy::new(Logger::new);
 
 static RESOLVER_STATE: Lazy<ResolverState> = Lazy::new(|| {
     ResolverState::from_proto(STATE_JSON.to_owned().try_into().unwrap(), ACCOUNT_ID).unwrap()
@@ -192,9 +192,7 @@ pub async fn main(req: Request, env: Env, _ctx: Context) -> Result<Response> {
             &Bytes::from(STANDARD.decode(ENCRYPTION_KEY_BASE64).unwrap()),
         ) {
             Ok(resolver) => match resolver.apply_flags(&apply_flag_req) {
-                Ok(()) => {
-                    return Response::from_json(&ApplyFlagsResponse::default());
-                }
+                Ok(()) => Response::from_json(&ApplyFlagsResponse::default()),
                 Err(msg) => {
                     Response::error(msg, 500)?.with_cors_headers(&allowed_origin)
                 }

confidence-resolver/build.rs

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ fn main() -> Result<()> {
         root.join("confidence/flags/admin/v1/resolver.proto"),
         root.join("confidence/flags/resolver/v1/api.proto"),
         root.join("confidence/flags/resolver/v1/internal_api.proto"),
+        root.join("confidence/flags/resolver/v1/wasm_api.proto"),
         root.join("confidence/flags/resolver/v1/events/events.proto"),
     ];
 

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
syntax = "proto3";

package confidence.flags.resolver.v1;

import "google/api/resource.proto";
import "google/api/annotations.proto";
import "google/api/field_behavior.proto";
import "google/protobuf/struct.proto";
import "google/protobuf/timestamp.proto";

import "confidence/api/annotations.proto";
import "confidence/flags/types/v1/types.proto";
import "confidence/flags/resolver/v1/types.proto";
import "confidence/flags/resolver/v1/api.proto";

option java_package = "com.spotify.confidence.flags.resolver.v1";
option java_multiple_files = true;
option java_outer_classname = "WasmApiProto";


message ResolveWithStickyRequest {
  ResolveFlagsRequest resolve_request = 1;

  // Context about the materialization required for the resolve
  MaterializationContext materialization_context = 7;

  // if a materialization info is missing, we want to return to the caller immediately
  bool fail_fast_on_sticky = 8;
}

message MaterializationContext {
  map<string, MaterializationInfo> unit_materialization_info = 1;
}

message MaterializationInfo {
  bool unit_in_info = 1;
  map<string, string> rule_to_variant = 2;
}

message ResolveWithStickyResponse {
  oneof resolve_result {
    Success success = 1;
    MissingMaterializations missing_materializations = 2;
  }

  message Success {
    ResolveFlagsResponse response = 1;
    repeated MaterializationUpdate updates = 2;
  }

  message MissingMaterializations {
    repeated MissingMaterializationItem items = 1;
  }

  message MissingMaterializationItem {
    string unit = 1;
    string rule = 2;
    string read_materialization = 3;
  }

  message MaterializationUpdate {
    string unit = 1;
    string write_materialization = 2;
    string rule = 3;
    string variant = 4;
  }
}