@leoromanovsky leoromanovsky commented Oct 30, 2025

Motivation

  • The flagging library requests remote configuration, but callers cannot pass a context.Context to cancel the request or time out when the service is down. As a result the server can hang, which is an extremely negative customer experience.
  • We merged the ability to pass a context upstream (open-feature/go-sdk@1a0d39e), and it was released in 1.17.0.

What does this PR do?

  • Update openfeature to 1.17.0
  • Pass context during init, shutdown (with a default timeout) and evaluation
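
A minimal sketch of the backward-compatible wrapper described above, assuming a hypothetical `provider` type with a `ready` channel; it is not the PR's actual implementation. The context-free `Init` delegates to `InitWithContext` with a default timeout, so the call returns an error instead of hanging when configuration never arrives. The `defaultTimeout` field is an assumption made so the demo runs quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// provider is a hypothetical stand-in for the Datadog provider.
type provider struct {
	ready          chan struct{} // closed once the first configuration arrives
	defaultTimeout time.Duration // assumption: a field so the demo can use short values
}

// InitWithContext blocks until configuration is ready or ctx is done.
func (p *provider) InitWithContext(ctx context.Context) error {
	select {
	case <-p.ready:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// Init keeps the context-free signature and bounds the wait with the
// default timeout.
func (p *provider) Init() error {
	ctx, cancel := context.WithTimeout(context.Background(), p.defaultTimeout)
	defer cancel()
	return p.InitWithContext(ctx)
}

// initTimesOutWithoutConfig simulates a remote config service that is down.
func initTimesOutWithoutConfig() bool {
	p := &provider{ready: make(chan struct{}), defaultTimeout: 20 * time.Millisecond}
	return errors.Is(p.Init(), context.DeadlineExceeded)
}

// initSucceedsWithConfig simulates configuration that is already available.
func initSucceedsWithConfig() bool {
	p := &provider{ready: make(chan struct{}), defaultTimeout: time.Second}
	close(p.ready)
	return p.Init() == nil
}

func main() {
	fmt.Println("times out without config:", initTimesOutWithoutConfig())
	fmt.Println("succeeds with config:", initSucceedsWithConfig())
}
```

Callers that need a different bound can call the context-aware method directly with their own deadline.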

Reviewer's Checklist

  • Changed code has unit tests for its functionality at or near 100% coverage.
  • System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
  • There is a benchmark for any new code, or changes to existing code.
  • If this interacts with the agent in a new way, a system test has been added.
  • New code is free of linting errors. You can check this by running ./scripts/lint.sh locally.
  • Add an appropriate team label so this PR gets put in the right place for the release notes.
  • Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild.

Unsure? Have a question? Request a review!

pr-commenter bot commented Oct 30, 2025

Benchmarks

Benchmark execution time: 2025-11-06 02:20:29

Comparing candidate commit 89ce456 in PR branch lr/of-context with baseline commit 53b84e7 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 24 metrics, 0 unstable metrics.

@leoromanovsky leoromanovsky changed the title Add ability to pass context into openfeature provider to support canc… feat(openfeature/provider): Add ability to pass context into openfeature provider to support cancellation; pins of go sdk to 1a0d39ea7e4f Oct 30, 2025
@leoromanovsky leoromanovsky changed the title feat(openfeature/provider): Add ability to pass context into openfeature provider to support cancellation; pins of go sdk to 1a0d39ea7e4f feat(openfeature/provider): Add ability to pass context into openfeature provider to support cancellation Nov 5, 2025
- Add ability to pass context into openfeature provider to support cancellation
- Update to OpenFeature v1.17.0 from pinned pre-release version
- Implement ContextAwareStateHandler with InitWithContext and ShutdownWithContext
- Add comprehensive timeout tests for SetProviderWithContextAndWait and ShutdownWithContext
- Simplify InitWithContext implementation using channels instead of sync.Cond
- Replace complex mutex lock/unlock cycles with clean channel-based signaling
- All tests passing with improved error handling for test scenarios
- Reduce initialization timeout from 30s to 5s for faster timeouts
- Reduce shutdown timeout from 10s to 5s for consistency
- Revert InitWithContext from channel approach back to sync.Cond
- sync.Cond supports multiple broadcasts for ongoing config updates
- All timeout tests passing with improved responsiveness
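
For readers comparing the two approaches in the commits above: a closed channel signals only once, while a sync.Cond can broadcast on every configuration update. The sketch below is one way to make a sync.Cond wait cancellable, using context.AfterFunc (Go 1.21+) to broadcast on cancellation. It is an illustration under assumptions, not the code in this PR; `configStore`, `publish`, and `waitFor` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// configStore is a hypothetical holder for remotely delivered configuration.
type configStore struct {
	mu      sync.Mutex
	changed *sync.Cond
	version int
}

func newConfigStore() *configStore {
	s := &configStore{}
	s.changed = sync.NewCond(&s.mu)
	return s
}

// publish records a new configuration version and wakes every waiter.
// It can be called any number of times, unlike closing a channel.
func (s *configStore) publish() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.version++
	s.changed.Broadcast()
}

// waitFor blocks until version >= want or ctx is done.
func (s *configStore) waitFor(ctx context.Context, want int) error {
	// When ctx is done, broadcast so the loop below wakes and re-checks ctx.
	stop := context.AfterFunc(ctx, func() {
		s.mu.Lock()
		defer s.mu.Unlock()
		s.changed.Broadcast()
	})
	defer stop()

	s.mu.Lock()
	defer s.mu.Unlock()
	for s.version < want {
		if err := ctx.Err(); err != nil {
			return err
		}
		s.changed.Wait()
	}
	return nil
}

// waitsThroughUpdates waits across three separate broadcasts.
func waitsThroughUpdates() bool {
	s := newConfigStore()
	go func() {
		for i := 0; i < 3; i++ {
			time.Sleep(5 * time.Millisecond)
			s.publish()
		}
	}()
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return s.waitFor(ctx, 3) == nil
}

// stopsOnDeadline waits for an update that never comes.
func stopsOnDeadline() bool {
	s := newConfigStore()
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	return errors.Is(s.waitFor(ctx, 1), context.DeadlineExceeded)
}

func main() {
	fmt.Println("waits through updates:", waitsThroughUpdates())
	fmt.Println("stops on deadline:", stopsOnDeadline())
}
```

Because the waiter holds the lock from its check until Wait releases it, the cancellation broadcast cannot slip in between the two.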
@github-actions github-actions bot added the apm:ecosystem contrib/* related feature requests or bugs label Nov 6, 2025
@leoromanovsky leoromanovsky added the team:ffe Feature Flagging & Experimentation label Nov 6, 2025
@leoromanovsky leoromanovsky marked this pull request as ready for review November 6, 2025 02:35
@leoromanovsky leoromanovsky requested a review from a team as a code owner November 6, 2025 02:35
Comment on lines +150 to +152
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
_ = p.ShutdownWithContext(ctx)
Contributor

Sadly, I think we need at least 15 seconds here, or more, because remote config can be really slow.

Contributor

Suggested change
- ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
- defer cancel()
- _ = p.ShutdownWithContext(ctx)
+ ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
+ defer cancel()
+ _ = p.ShutdownWithContext(ctx)

Comment on lines +113 to +141
// Check if context was cancelled
select {
case <-ctx.Done():
    return ctx.Err()
default:
}

// Use a condition variable with context support
// We need to handle the case where context gets cancelled while waiting
done := make(chan struct{})
go func() {
    defer close(done)
    p.mu.Lock()
    defer p.mu.Unlock()
    p.configChange.Wait()
}()

// Temporarily unlock to allow the configuration update and context handling
p.mu.Unlock()

select {
case <-ctx.Done():
    // Relock before returning
    p.mu.Lock()
    return ctx.Err()
case <-done:
    // Configuration might have been updated, relock and loop to check
    p.mu.Lock()
}
Contributor

I would prefer that we move the body of the loop into its own function, so we can defer p.mu.Lock() instead of re-locking in each branch of the select statement. Do you agree?
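
A rough sketch of what that refactor could look like, assuming a hypothetical `provider` type and a hypothetical `waitForChange` helper; it is not the code that was merged. The helper is called with p.mu held, releases it while waiting, and re-acquires it on every return path through a single deferred Lock. Two details differ from the snippet under review and are assumptions of this sketch: the waiter re-checks a `version` counter so an update that lands before it starts waiting is not missed, and the cancellation branch broadcasts so the waiter goroutine exits instead of lingering.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// provider is a hypothetical stand-in for the provider under review.
type provider struct {
	mu           sync.Mutex
	configChange *sync.Cond
	version      int // hypothetical counter bumped on every config update
}

func newProvider() *provider {
	p := &provider{}
	p.configChange = sync.NewCond(&p.mu)
	return p
}

// update simulates a configuration update arriving.
func (p *provider) update() {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.version++
	p.configChange.Broadcast()
}

// waitForChange must be called with p.mu held. It returns with p.mu held.
func (p *provider) waitForChange(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	seen := p.version
	done := make(chan struct{})
	go func() {
		defer close(done)
		p.mu.Lock()
		defer p.mu.Unlock()
		for p.version == seen && ctx.Err() == nil {
			p.configChange.Wait()
		}
	}()

	p.mu.Unlock()
	defer p.mu.Lock() // single re-lock covering both select branches

	select {
	case <-ctx.Done():
		// Wake the waiter so it observes the cancelled context and exits.
		p.mu.Lock()
		p.configChange.Broadcast()
		p.mu.Unlock()
		return ctx.Err()
	case <-done:
		return nil
	}
}

// runWait calls the helper with the lock held and reports the error and
// whether the lock was still held when the helper returned.
func runWait(p *provider, timeout time.Duration) (error, bool) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	p.mu.Lock()
	err := p.waitForChange(ctx)
	held := !p.mu.TryLock()
	p.mu.Unlock()
	return err, held
}

func relocksAfterUpdate() bool {
	p := newProvider()
	go func() {
		time.Sleep(10 * time.Millisecond)
		p.update()
	}()
	err, held := runWait(p, time.Second)
	return err == nil && held
}

func relocksAfterTimeout() bool {
	p := newProvider()
	err, held := runWait(p, 20*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded) && held
}

func main() {
	fmt.Println("relocks after update:", relocksAfterUpdate())
	fmt.Println("relocks after timeout:", relocksAfterTimeout())
}
```

The calling loop then reduces to checking the condition and calling the helper, with no lock bookkeeping of its own.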

- func (p *DatadogProvider) Init(openfeature.EvaluationContext) error {
+ func (p *DatadogProvider) Init(evaluationContext openfeature.EvaluationContext) error {
// Use a background context with a reasonable timeout for backward compatibility
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
Contributor

Suggested change
- ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+ ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)

Sadly, I think we need at least 15 seconds here, or more, because remote config can be really slow.

)

var _ openfeature.FeatureProvider = (*DatadogProvider)(nil)
var _ openfeature.ContextAwareStateHandler = (*DatadogProvider)(nil)
Contributor

Suggested change
- var _ openfeature.ContextAwareStateHandler = (*DatadogProvider)(nil)
+ var _ openfeature.ContextAwareStateHandler = (*DatadogProvider)(nil)
+ var _ openfeature.StateHandler = (*DatadogProvider)(nil)

I guess 🤷

Comment on lines +367 to +376
// Check if context was cancelled before starting evaluation
select {
case <-ctx.Done():
    return evaluationResult{
        Value: defaultValue,
        Reason: openfeature.ErrorReason,
        Error: ctx.Err(),
    }
default:
}
Contributor

Thanks I missed this
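
For context, the early check in this thread can be sketched in isolation as below. This is a simplified illustration, not the provider's evaluation code: `evaluate`, the `flags` map, and the string reason constants are hypothetical stand-ins (the real code uses openfeature.ErrorReason). Checking `ctx.Err()` is equivalent to the non-blocking select on `ctx.Done()` shown in the diff.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// evaluationResult mirrors the shape used in the diff; the field types
// here are assumptions.
type evaluationResult struct {
	Value  any
	Reason string
	Error  error
}

// Stand-ins for the OpenFeature reason constants.
const (
	reasonError   = "ERROR"
	reasonStatic  = "STATIC"
	reasonDefault = "DEFAULT"
)

// evaluate returns the default value immediately if ctx is already done,
// before doing any lookup work.
func evaluate(ctx context.Context, flags map[string]any, key string, defaultValue any) evaluationResult {
	if err := ctx.Err(); err != nil {
		return evaluationResult{Value: defaultValue, Reason: reasonError, Error: err}
	}
	if v, ok := flags[key]; ok {
		return evaluationResult{Value: v, Reason: reasonStatic}
	}
	return evaluationResult{Value: defaultValue, Reason: reasonDefault}
}

// cancelledReturnsDefault evaluates an existing flag with a cancelled context.
func cancelledReturnsDefault() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	res := evaluate(ctx, map[string]any{"new-ui": true}, "new-ui", false)
	return res.Value == false && res.Reason == reasonError && errors.Is(res.Error, context.Canceled)
}

// liveReturnsFlag evaluates the same flag with a live context.
func liveReturnsFlag() bool {
	res := evaluate(context.Background(), map[string]any{"new-ui": true}, "new-ui", false)
	return res.Value == true && res.Reason == reasonStatic && res.Error == nil
}

func main() {
	fmt.Println("cancelled returns default:", cancelledReturnsDefault())
	fmt.Println("live returns flag:", liveReturnsFlag())
}
```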

Labels

apm:ecosystem contrib/* related feature requests or bugs team:ffe Feature Flagging & Experimentation

3 participants