Commit 0642ac8

beeme1mr and toddbaert authored
docs: add code default adr (#1639)
## This PR

- ADR to add support for explicitly deferring to the code default.

### Notes

This has been a challenge for users because enabling a feature for targeting requires a default variant to be used, which may not match the default used in code.

---------

Signed-off-by: Michael Beemer <beeme1mr@users.noreply.github.com>
Co-authored-by: Todd Baert <todd.baert@dynatrace.com>
1 parent 23e1554 commit 0642ac8

Lines changed: 342 additions & 0 deletions
@@ -0,0 +1,342 @@
---
# Valid statuses: draft | proposed | rejected | accepted | superseded
status: accepted
author: @beeme1mr
created: 2025-06-06
updated: 2025-06-20
---

# Support Explicit Code Default Values in flagd Configuration

This ADR proposes adding support for explicitly configuring flagd to use code-defined default values by allowing `null` as a valid default variant. This change addresses the current limitation where users cannot differentiate between "use the code's default" and "use this configured default" without resorting to workarounds like misconfigured rulesets.

## Background

Currently, flagd requires a default variant to be specified in flag configurations. This creates a fundamental mismatch with the OpenFeature specification and common feature flag usage patterns where code-defined defaults serve as the ultimate fallback.

The current behavior leads to confusion and operational challenges:

1. **Two Sources of Truth**: Applications have default values defined in code (as per OpenFeature best practices), while flagd configurations require their own default variants. This dual-default pattern violates the principle of single source of truth.

2. **State Transition Issues**: When transitioning a flag from DISABLED to ENABLED state, the behavior changes unexpectedly:
   - DISABLED state: Flag evaluation falls through to code defaults
   - ENABLED state: Flag evaluation uses the configured default variant

3. **Workarounds**: Users resort to misconfiguring rulesets (e.g., returning invalid variants) to force fallback to code defaults, which generates confusing error states and complicates debugging.

4. **OpenFeature Alignment**: The OpenFeature specification emphasizes that code defaults should be the ultimate fallback, but flagd's current design doesn't provide a clean way to express this intent.

Related discussions and context can be found in the [OpenFeature specification](https://openfeature.dev/specification/types) and [flagd flag definitions reference](https://flagd.dev/reference/flag-definitions/).

## Requirements

- **Explicit Code Default Support**: Users must be able to explicitly configure a flag to use the code-defined default value as its resolution
- **Backward Compatibility**: Existing flag configurations must continue to work without modification
- **Clear Semantics**: The configuration must clearly indicate when code defaults are being used versus configured defaults
- **Appropriate Reason Codes**: Resolution details must include appropriate reason codes when code defaults are used (e.g., `DEFAULT` or a new specific reason)
- **Schema Validation**: JSON schema must support and validate the new configuration options
- **Provider Compatibility**: All OpenFeature providers must handle the new behavior correctly
- **Testbed Coverage**: flagd testbed must include test cases for the new functionality

## Considered Options

- **Option 1: Allow `null` as Default Variant** - Modify the schema to accept `null` as a valid value for defaultVariant, signaling "use code default"
- **Option 2: Make Default Variant Optional** - Remove the requirement for defaultVariant entirely, with absence meaning "use code default"
- **Option 3: Special Variant Value** - Define a reserved variant name (e.g., `"__CODE_DEFAULT__"`) that signals code default usage
- **Option 4: New Configuration Property** - Add a new property like `useCodeDefault: true` alongside or instead of defaultVariant
- **Option 5: Status Quo with Documentation** - Keep current behavior but improve documentation about workarounds

## Proposal

We propose implementing **Option 1: Allow `null` as Default Variant**, potentially combined with **Option 2: Make Default Variant Optional** for maximum flexibility.

The implementation leverages field presence in evaluation responses across all protocols (in-process, RPC, and OFREP). When a flag configuration has `defaultVariant: null`, the evaluation response omits the value field entirely, which serves as a programmatic signal to the client to use its code-defined default value.

This approach offers several key advantages:

1. **No Protocol Changes**: RPC and OFREP protocols remain unchanged
2. **Clear Semantics**: Omitted value field = "use your code default"
3. **Backward Compatible**: Existing clients and servers continue to work
4. **Universal Pattern**: Works consistently across all evaluation modes

The absence of a value field provides an unambiguous signal that distinguishes between "the server evaluated to null/false/empty" (value field present) and "the server delegates to your code default" (value field absent).

### Implementation Details

1. **Schema Changes**:

```json
{
  "defaultVariant": {
    "oneOf": [
      { "type": "string" },
      { "type": "null" }
    ],
    "description": "Default variant to use. Set to null to use code-defined default."
  }
}
```
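
The schema change above implies that a configuration loader has to tell a string variant apart from `null`. A minimal sketch of one way to do that in Go, assuming a trimmed-down, hypothetical `Flag` type (not flagd's actual model) and only boolean variants:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Flag is a hypothetical, trimmed-down flag definition used only for this sketch.
type Flag struct {
	State          string          `json:"state"`
	DefaultVariant *string         `json:"defaultVariant"` // nil when the key is null or absent
	Variants       map[string]bool `json:"variants"`
}

// UsesCodeDefault reports whether the flag defers to the code-defined default.
func (f Flag) UsesCodeDefault() bool {
	return f.DefaultVariant == nil
}

// parseFlag decodes a single flag definition from JSON.
func parseFlag(raw string) (Flag, error) {
	var f Flag
	err := json.Unmarshal([]byte(raw), &f)
	return f, err
}

func main() {
	configured, _ := parseFlag(`{"state":"ENABLED","defaultVariant":"on","variants":{"on":true,"off":false}}`)
	deferred, _ := parseFlag(`{"state":"ENABLED","defaultVariant":null,"variants":{"on":true,"off":false}}`)
	fmt.Println(configured.UsesCodeDefault(), deferred.UsesCodeDefault()) // prints: false true
}
```

Because `encoding/json` leaves a pointer field `nil` for both `null` and a missing key, this sketch treats the two cases identically, which matches the "absent is equivalent to `null`" decision recorded under the open questions.
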

2. **Protobuf Considerations**:

**No proto changes required** - The existing proto3 definitions already support this behavior:

```protobuf
message ResolveBooleanResponse {
  bool value = 1;
  string reason = 2;
  string variant = 3;
  google.protobuf.Struct metadata = 4;
}
```

In proto3, no field is required, so a response may leave any field unset. To implement code default delegation:

- Do NOT set the `value` field (not even to false/0/"")
- Do NOT set the `variant` field
- Set `reason` to "DEFAULT"
- Optionally set `metadata`

**Critical implementation notes**:

- Ensure flagd omits fields rather than setting zero values
- Scalar fields declared without the `optional` keyword use implicit presence: a zero value is not written to the wire and cannot be distinguished from an unset field, so presence checks are only reliable for fields with explicit presence
- Different languages detect field presence differently:
  - Go: Check for zero values or use pointers
  - Java: Use `hasValue()` / `hasVariant()` methods (generated only for fields with explicit presence)
  - Python: Use `HasField()` or check field presence
  - JavaScript: Check for `undefined` (not null)
- Consider adding proto comments documenting this behavior
- Test wire format to confirm fields are actually omitted

**Example - Correct vs Incorrect Implementation**:

```go
// INCORRECT - explicitly sets zero values
incorrect := &ResolveBooleanResponse{
	Value:   false, // Explicit zero value rather than an omitted field
	Reason:  "DEFAULT",
	Variant: "", // Explicit empty string rather than an omitted field
}

// CORRECT - omits fields
correct := &ResolveBooleanResponse{
	Reason: "DEFAULT",
	// Value and Variant fields are not set at all
}
```
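
The "use pointers" note for Go can be illustrated without any generated code. The sketch below uses a hand-written stand-in, `boolResponse`, whose pointer fields mimic how Go code generated for explicit-presence (`optional`) scalars is usually shaped; the type and the helper names are assumptions for illustration, not flagd's actual types:

```go
package main

import "fmt"

// boolResponse is a hand-written stand-in for a message whose value and
// variant fields track presence (hypothetical, not flagd's generated type).
type boolResponse struct {
	Value   *bool
	Variant *string
	Reason  string
}

// ptr returns a pointer to v; a non-nil pointer marks a field as present.
func ptr[T any](v T) *T { return &v }

// delegatesToCodeDefault reports whether both value and variant were left unset.
func delegatesToCodeDefault(r boolResponse) bool {
	return r.Value == nil && r.Variant == nil
}

func main() {
	explicitFalse := boolResponse{Value: ptr(false), Variant: ptr("off"), Reason: "DEFAULT"}
	delegated := boolResponse{Reason: "DEFAULT"}
	fmt.Println(delegatesToCodeDefault(explicitFalse), delegatesToCodeDefault(delegated)) // prints: false true
}
```

With pointer fields, an explicit `false` and an unset value stay distinguishable, which is the property the delegation signal depends on.
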

3. **Evaluation Behavior**:
   - When flag has `defaultVariant: null` and targeting returns no match
   - Server responds with both value and variant fields omitted, reason set to "DEFAULT"
   - Client detects the missing value field and uses its code-defined default
   - This same pattern works across all evaluation modes

   **In-Process Mode Special Considerations**:
   - The in-process evaluator must return the same "shape" of response
   - Cannot return null/nil as that's different from "no value"
   - May need a special response type or wrapper to indicate delegation
   - Example approach: Return evaluation result with a "useDefault" flag
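
The "useDefault" wrapper mentioned for in-process mode could look roughly like the following. This is a sketch under stated assumptions: `Result`, `evaluate`, and `resolve` are hypothetical names, and the toy evaluator only models the case where targeting produced no match:

```go
package main

import "fmt"

// Result is a hypothetical in-process evaluation result. UseDefault carries
// the delegation signal that remote protocols express by omitting fields.
type Result[T any] struct {
	Value      T
	Variant    string
	Reason     string
	UseDefault bool
}

// evaluate is a toy stand-in for the evaluator when targeting returns no match:
// a nil defaultVariant means "defer to the code default".
func evaluate(variants map[string]bool, defaultVariant *string) Result[bool] {
	if defaultVariant == nil {
		return Result[bool]{Reason: "DEFAULT", UseDefault: true}
	}
	return Result[bool]{Value: variants[*defaultVariant], Variant: *defaultVariant, Reason: "DEFAULT"}
}

// resolve applies the delegation rule on the provider side.
func resolve[T any](r Result[T], codeDefault T) T {
	if r.UseDefault {
		return codeDefault
	}
	return r.Value
}

func main() {
	variants := map[string]bool{"on": true, "off": false}
	off := "off"
	fmt.Println(resolve(evaluate(variants, &off), true)) // prints: false
	fmt.Println(resolve(evaluate(variants, nil), true))  // prints: true
}
```

An explicit boolean keeps "the flag evaluated to false" separate from "no value", which a bare zero value or `nil` could not express on its own.
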

4. **Remote Evaluation Protocol Responses**:

   **OFREP (OpenFeature Remote Evaluation Protocol)**:
   - Return HTTP 200 with response body that omits both value and variant fields
   - Reason field set to "DEFAULT" to indicate delegation
   - Clear distinction from error cases (which return 4xx/5xx)

   **flagd RPC**:
   - Omit both the type-specific value field and variant field in response messages
   - Use protobuf field presence to signal "no opinion from server"
   - No changes needed to RPC method signatures

5. **Provider Implementation**:
   - Check for presence/absence of value field in responses
   - When value is absent and reason is "DEFAULT", use code-defined default
   - When value is present (even if null/false/empty), use that value
   - Variant field will also be absent in delegation responses, resulting in undefined variant in resolution details
   - Responses with value fields work as before, maintaining backward compatibility

### Design Rationale

**Using "DEFAULT" reason**: We intentionally reuse the existing "DEFAULT" reason code rather than introducing a new one (like "CODE_DEFAULT"). The distinction between a configuration default and code default is clear from the response structure:

- Configuration default: Has both value and variant fields
- Code default: Omits both value and variant fields

**Field Omission Pattern**: Using field presence/absence is a well-established pattern in protocol design:

- Unambiguous: Cannot confuse "null value" with "no opinion"
- Language agnostic: Works across type systems
- Protocol friendly: Natural in both JSON and Protobuf
- Backward compatible: Existing responses always include values
- Spec compliant: OpenFeature allows undefined variants

The omission of both value and variant fields creates a clear, consistent signal that the server is fully delegating the decision to the client's code default.

This approach maintains compatibility with the established OpenFeature terminology while providing clear semantics through response structure.

### API changes

**Flag Configuration**:

```yaml
flags:
  my-feature:
    state: ENABLED
    defaultVariant: null # Explicitly use code default
    variants:
      # Keys are quoted because YAML 1.1 parsers read bare on/off as booleans
      "on": true
      "off": false
    targeting:
      if:
        - "===":
            - var: user-type
            - "beta"
        - "on"
```

**OFREP Response** when code default is indicated:

#### Single flag evaluation response

```json
{
  "key": "my-feature",
  "reason": "DEFAULT",
  "metadata": {}
}
```

#### Bulk flag evaluation response

```json
{
  "flags": [
    {
      "key": "my-feature",
      "reason": "DEFAULT",
      "metadata": {}
    }
  ]
}
```

Note: Both `value` and `variant` fields are intentionally omitted to signal "use code default"

**flagd RPC Response** (ResolveBooleanResponse, shown in its JSON representation):

```json
{
  "reason": "DEFAULT",
  "metadata": {}
}
```

Both the type-specific value field and variant field are omitted

**Provider behavior**:

1. Check response for presence of value field
2. If value field is absent and error code is absent, use code-defined default
3. If value field is present (even if null/false/empty), use that value
4. The variant field will also be absent when delegating to code defaults
5. This logic is consistent across in-process, RPC, and OFREP modes

This approach clearly differentiates between:

- Server returning an actual value with a variant (both fields present)
- Server delegating to code default (both fields absent)
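
The provider steps above can be sketched for an OFREP-style JSON body as follows. `resolveBoolean` is a hypothetical helper, and the `errorCode` key is assumed to mark a failed evaluation; decoding into a map of raw messages is one way to test for key presence rather than for a zero value:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resolveBoolean interprets an OFREP-style JSON body. It returns the resolved
// value, whether the server delegated to the code default, and any error.
func resolveBoolean(body []byte, codeDefault bool) (bool, bool, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(body, &fields); err != nil {
		return codeDefault, false, err
	}
	// Assumption: failed evaluations carry an "errorCode" key.
	if code, failed := fields["errorCode"]; failed {
		return codeDefault, false, fmt.Errorf("evaluation failed: %s", code)
	}
	raw, present := fields["value"]
	if !present {
		// No value and no error: the server delegates to the code default.
		return codeDefault, true, nil
	}
	var value bool
	if err := json.Unmarshal(raw, &value); err != nil {
		return codeDefault, false, err
	}
	return value, false, nil
}

func main() {
	value, delegated, _ := resolveBoolean([]byte(`{"key":"my-feature","reason":"DEFAULT","metadata":{}}`), true)
	fmt.Println(value, delegated) // prints: true true
}
```

The sketch follows the decision under the open questions that any omitted value counts as delegation, so it does not additionally require the reason to be "DEFAULT".
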

**Example evaluation flow**:

```javascript
// Client code
const value = client.getBooleanValue('my-feature', false, context);

// Server evaluates flag with defaultVariant: null
// No targeting match, so server returns:
// {
//   "reason": "DEFAULT",
//   "metadata": {}
//   (both "value" and "variant" fields are omitted)
// }

// Client detects missing value field and uses its default (false)
// Resolution details show reason: "DEFAULT" with undefined variant
```

### Consequences

- Good, because it eliminates the confusion between code and configuration defaults
- Good, because it provides explicit control over default behavior without workarounds
- Good, because it aligns flagd more closely with OpenFeature specification principles
- Good, because it supports gradual flag rollout patterns more naturally
- Good, because it provides the ability to delegate to whatever is defined in code
- Good, because it requires no changes to existing RPC or protocol signatures
- Good, because it uses established patterns (field presence) for clear semantics
- Good, because it maintains full backward compatibility
- Bad, because it requires updates across multiple components (flagd, providers, testbed)
- Bad, because it introduces a new concept that users need to understand
- Neutral, because existing configurations continue to work unchanged

### Remote Evaluation Considerations

This feature works consistently across all flagd evaluation modes through a unified pattern:

1. **In-Process Mode**: Direct evaluation returns responses without value fields when delegating
2. **RPC Mode (gRPC/HTTP)**: Responses omit the type-specific value field to signal delegation
3. **OFREP (OpenFeature Remote Evaluation Protocol)**: HTTP responses omit the value field

The key insight is using field omission to indicate "use code default":

- Present value field (even if null/false/empty) = use this value
- Absent value field = use your code default
- Works identically across all protocols and evaluation modes
- No protocol changes required

### Implementation Plan

1. Update flagd-schemas with new JSON schema supporting null default variants
2. Implement core logic in flagd to handle null defaults and omit value/variant fields
3. Update flagd-testbed with comprehensive test cases for all evaluation modes
4. Update OpenFeature providers to handle responses without value fields
5. Documentation updates, migration guides, and proto comment additions

Note: No protobuf schema changes are required, but implementation must carefully handle field omission

### Testing Considerations

To ensure correct implementation across all components:

1. **Wire Format Tests**: Verify that protobuf messages with omitted fields are correctly serialized without the fields (not with zero values)
2. **Provider Tests**: Each provider must have tests confirming they detect missing fields correctly in their language
3. **Integration Tests**: End-to-end tests across different language combinations (e.g., Go flagd with Java provider)
4. **OFREP Tests**: Verify JSON responses correctly omit fields (not set to null)
5. **Backward Compatibility Tests**: Ensure old providers handle new responses gracefully
6. **Consistency Tests**: Verify identical behavior across in-process, RPC, and OFREP modes

### Open questions

- How should providers handle responses with missing value fields in strongly-typed languages?
  - We'll handle them the same way as optional fields, using language-specific patterns (e.g., pointers in Go, `hasValue()` in Java).
- Should we support both `null` and absent `defaultVariant` fields, or choose one approach?
  - We'll support both `null` and absent fields to maximize flexibility. An absent `defaultVariant` will be equivalent to `null`.
- What migration path should we recommend for users currently using workarounds?
  - Update the flag configurations to use `defaultVariant: null` and remove any misconfigured rulesets that force code defaults.
- Should this feature be gated behind a configuration flag during initial rollout?
  - We'll avoid public-facing documentation until the feature is fully implemented and tested.
- How do we ensure consistent behavior across all provider implementations?
  - Gherkin tests will be added to the flagd testbed to ensure all providers handle the new behavior consistently.
- Should providers validate that the reason is "DEFAULT" when value is omitted, or accept any omitted value as delegation?
  - Providers should accept any omitted value as delegation.
- How do we handle edge cases where network protocols might strip empty fields?
  - The feature would behave as expected, because the absence of fields is the intended signal.
- When the client uses its code default after receiving a delegation response, what variant should be reported in telemetry/analytics?
  - The variant will be omitted, indicating that the code default was used.
- Should we add explicit proto comments documenting the field omission behavior?
  - Leave this to the implementers, but it would be beneficial to add comments in the proto files to clarify this behavior for future maintainers.

## More Information

- [OpenFeature Specification - Flag Evaluation](https://openfeature.dev/specification/types#flag-evaluation)
- [flagd Flag Definitions Reference](https://flagd.dev/reference/flag-definitions/)
- [flagd JSON Schema Repository](https://github.com/open-feature/flagd-schemas)
- [flagd Testbed](https://github.com/open-feature/flagd-testbed)
