
Conversation

Contributor

@rachelyangdog rachelyangdog commented Jan 13, 2026

What does this PR do?

WIP

  • Added the DD_LOGS_OTEL_ENABLED configuration option; when set to true, it enables support for OpenTelemetry logs (default: false)
  • When enabled, the tracer automatically initializes an OpenTelemetry Logs SDK pipeline that exports logs to the Datadog Agent
  • Logs emitted during active spans automatically include trace_id and span_id for both Datadog and OpenTelemetry spans
  • Resource attributes are populated from Datadog env vars (DD_SERVICE, DD_ENV, DD_VERSION, DD_TAGS)
  • OTLP exporter with support for HTTP/JSON, HTTP/protobuf, and gRPC protocols
  • Configurable batching via OTEL_BLRP_* environment variables
  • Reports OTel logs configuration values (OTLP endpoint, protocol, headers, BLRP settings) and tracks the number of exported log records via the otel.log_records counter metric, tagged with protocol (http|grpc) and encoding (json|protobuf).

Configuration

  • DD_LOGS_OTEL_ENABLED - Enables or disables OpenTelemetry logs support (default: false)
  • OTEL_EXPORTER_OTLP_LOGS_ENDPOINT - Custom OTLP endpoint
  • OTEL_EXPORTER_OTLP_LOGS_PROTOCOL - Protocol: http/json, http/protobuf, grpc
  • OTEL_BLRP_* - Batch processor settings

Implementation

  • New package: ddtrace/opentelemetry/log/
  • Reuses existing OpenTelemetry SDK dependencies
  • Automatic bridging between Datadog spans and OpenTelemetry trace context (a usage sketch follows below)
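To make the correlation behavior concrete, here is a minimal usage sketch. It is not code from this PR: the import path for the new package, the otellog alias, and the GetGlobalLoggerProvider accessor are taken from the review discussion below and may differ in the final API; the OTel calls use the go.opentelemetry.io/otel/log bridge API. It assumes the logs pipeline has already been initialized (see the lifecycle notes further down).

package shop // hypothetical example package

import (
	"context"

	otellog "github.com/DataDog/dd-trace-go/v2/ddtrace/opentelemetry/log" // assumed import path
	"github.com/DataDog/dd-trace-go/v2/ddtrace/tracer"
	otelapi "go.opentelemetry.io/otel/log"
)

// processOrder emits an OTel log record while a Datadog span is active; the
// exported record is expected to carry the span's trace_id and span_id.
func processOrder(ctx context.Context) {
	span, ctx := tracer.StartSpanFromContext(ctx, "checkout")
	defer span.Finish()

	logger := otellog.GetGlobalLoggerProvider().Logger("example")

	var rec otelapi.Record
	rec.SetSeverity(otelapi.SeverityInfo)
	rec.SetBody(otelapi.StringValue("processing order"))
	rec.AddAttributes(otelapi.String("order.id", "1234"))

	// ctx carries the active Datadog span, so the correlation described above
	// applies to this record.
	logger.Emit(ctx, rec)
}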

Motivation

The OpenTelemetry Logs API RFC explains that enabling this feature will allow customers using OpenTelemetry standards to migrate to Datadog SDKs with minimal friction while continuing to receive their logs as expected.

Reviewer's Checklist

  • Changed code has unit tests for its functionality at or near 100% coverage.
  • System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
  • There is a benchmark for any new code or changes to existing code.
  • If this interacts with the agent in a new way, a system test has been added.
  • New code is free of linting errors. You can check this by running ./scripts/lint.sh locally.
  • Add an appropriate team label so this PR gets put in the right place for the release notes.
  • Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild.

Unsure? Have a question? Request a review!

@rachelyangdog rachelyangdog requested review from a team as code owners January 13, 2026 21:16
@rachelyangdog rachelyangdog marked this pull request as draft January 13, 2026 21:16

codecov bot commented Jan 13, 2026

Codecov Report

❌ Patch coverage is 83.83234% with 81 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.30%. Comparing base (99e007d) to head (7743f0c).
⚠️ Report is 15 commits behind head on main.

Files with missing lines | Patch % | Lines
ddtrace/opentelemetry/log/exporter.go 74.09% 43 Missing and 7 partials ⚠️
ddtrace/opentelemetry/log/logger_provider.go 84.00% 6 Missing and 2 partials ⚠️
ddtrace/opentelemetry/log/test_exporter.go 78.57% 5 Missing and 1 partial ⚠️
ddtrace/opentelemetry/log/correlation.go 86.20% 3 Missing and 1 partial ⚠️
internal/config/config.go 60.00% 4 Missing ⚠️
ddtrace/opentelemetry/log/logger.go 62.50% 3 Missing ⚠️
ddtrace/opentelemetry/log/resource.go 94.54% 2 Missing and 1 partial ⚠️
ddtrace/opentelemetry/log/integration.go 87.50% 1 Missing and 1 partial ⚠️
ddtrace/opentelemetry/log/telemetry.go 99.08% 1 Missing ⚠️
Additional details and impacted files
Files with missing lines | Coverage Δ
internal/stacktrace/contribs_generated.go 100.00% <100.00%> (ø)
ddtrace/opentelemetry/log/telemetry.go 99.08% <99.08%> (ø)
ddtrace/opentelemetry/log/integration.go 87.50% <87.50%> (ø)
ddtrace/opentelemetry/log/logger.go 62.50% <62.50%> (ø)
ddtrace/opentelemetry/log/resource.go 94.54% <94.54%> (ø)
ddtrace/opentelemetry/log/correlation.go 86.20% <86.20%> (ø)
internal/config/config.go 55.04% <60.00%> (+4.48%) ⬆️
ddtrace/opentelemetry/log/test_exporter.go 78.57% <78.57%> (ø)
ddtrace/opentelemetry/log/logger_provider.go 84.00% <84.00%> (ø)
ddtrace/opentelemetry/log/exporter.go 74.09% <74.09%> (ø)

... and 26 files with indirect coverage changes


@rachelyangdog
Contributor Author

@codex review


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4326458f1e


Comment on lines +47 to +51
config := oteltrace.SpanContextConfig{
	TraceID:    traceID,
	SpanID:     spanID,
	TraceFlags: oteltrace.FlagsSampled, // DD spans are sampled by default
	Remote:     false,


P2: Map DD sampling priority to OTel TraceFlags

This always sets TraceFlagsSampled, so a Datadog trace with sampling priority ≤ 0 (not kept) will still produce an OTel span context marked as sampled. In that case, OTLP log records will look sampled and downstream systems may try to correlate them to traces that will never be sent, which can skew sampling-based behavior. Consider deriving the flag from ddCtx.SamplingPriority() (or equivalent) instead of hard‑coding sampled.
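A possible shape of that fix, sketched against the snippet above (not code from the PR; ddCtx, traceID and spanID come from the surrounding function, and SamplingPriority is the accessor the comment refers to):

// Derive the OTel trace flags from the Datadog sampling priority instead of
// hard-coding sampled.
flags := oteltrace.TraceFlags(0)
if prio, ok := ddCtx.SamplingPriority(); ok && prio > 0 {
	flags = oteltrace.FlagsSampled
}
config := oteltrace.SpanContextConfig{
	TraceID:    traceID,
	SpanID:     spanID,
	TraceFlags: flags,
	Remote:     false,
}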


Comment on lines +43 to +47
func InitGlobalLoggerProvider(ctx context.Context) error {
	var err error
	globalLoggerProviderOnce.Do(func() {
		globalLoggerProviderMu.Lock()
		defer globalLoggerProviderMu.Unlock()


P2: Allow InitGlobalLoggerProvider to retry after failure

Initialization is guarded by sync.Once, but failures inside the Do block still consume the once. If buildResource or newOTLPExporter fails (e.g., transient misconfiguration), subsequent calls return nil without attempting to initialize again, leaving the provider permanently nil for the process. Consider resetting the once on failure or persisting the error so callers can retry or get a consistent error.
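One way to address this, sketched with the identifiers from the snippet above (globalLoggerProviderMu, buildResource, newOTLPExporter) plus an assumed globalLoggerProvider variable and sdklog for go.opentelemetry.io/otel/sdk/log; this is not the PR's actual code and the helper signatures are assumptions:

// Guard with the mutex and an explicit nil check instead of sync.Once, so a
// failed initialization can be retried on a later call.
func InitGlobalLoggerProvider(ctx context.Context) error {
	globalLoggerProviderMu.Lock()
	defer globalLoggerProviderMu.Unlock()
	if globalLoggerProvider != nil {
		return nil // already initialized
	}
	res, err := buildResource(ctx)
	if err != nil {
		return err // state untouched, so the caller may retry
	}
	exp, err := newOTLPExporter(ctx)
	if err != nil {
		return err
	}
	globalLoggerProvider = sdklog.NewLoggerProvider(
		sdklog.WithResource(res),
		sdklog.WithProcessor(sdklog.NewBatchProcessor(exp)),
	)
	return nil
}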


Comment on lines +21 to +25
func StartIfEnabled(ctx context.Context) error {
	cfg := config.Get()
	if !cfg.LogsOtelEnabled() {
		log.Debug("DD_LOGS_OTEL_ENABLED=false, skipping OTel LoggerProvider initialization")
		return nil


P2: Wire StartIfEnabled into tracer lifecycle

This helper is defined but has no call sites in the repository, so setting DD_LOGS_OTEL_ENABLED=true never actually initializes the LoggerProvider unless users call StartIfEnabled manually. That makes the feature flag inert in typical tracer startup/shutdown flows. Consider invoking StartIfEnabled/StopIfEnabled from tracer initialization and shutdown to honor the env var.


Contributor Author


OpenTelemetry Logs support follows the same manual-initialization model as OTel Metrics in dd-trace-go. This means:

  • StartIfEnabled() / StopIfEnabled() are user-facing APIs, not internal hooks
  • Users must explicitly call these functions to initialize the logs pipeline
  • tracer.Start() and tracer.Stop() deliberately do NOT touch logs (a short lifecycle sketch follows below)
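A minimal sketch of that explicit lifecycle, assuming the package import path and the StartIfEnabled/StopIfEnabled names used above (a rename to Start/Stop is discussed later in the review) and the dd-trace-go v2 tracer.Start/Stop signatures:

package main

import (
	"context"

	otellog "github.com/DataDog/dd-trace-go/v2/ddtrace/opentelemetry/log" // assumed import path
	"github.com/DataDog/dd-trace-go/v2/ddtrace/tracer"
)

func main() {
	ctx := context.Background()

	if err := tracer.Start(); err != nil {
		panic(err)
	}
	defer tracer.Stop()

	// Explicit opt-in: with DD_LOGS_OTEL_ENABLED=true this builds the OTLP
	// logs pipeline; with the flag unset or false it is a no-op.
	if err := otellog.StartIfEnabled(ctx); err != nil {
		panic(err)
	}
	defer otellog.StopIfEnabled()

	// ... application code ...
}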


pr-commenter bot commented Jan 13, 2026

Benchmarks

Benchmark execution time: 2026-01-13 22:59:48

Comparing candidate commit 94efc13 in PR branch rachel.yang/opentelemetry-logs-support with baseline commit 4d2c764 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 156 metrics, 8 unstable metrics.

@kakkoyun kakkoyun self-requested a review January 15, 2026 09:22
@rachelyangdog rachelyangdog marked this pull request as ready for review January 21, 2026 19:01
@rachelyangdog rachelyangdog changed the title WIP: feat(otel): adding support for OpenTelemetry logs feat(otel): adding support for OpenTelemetry logs Jan 22, 2026

datadog-datadog-prod-us1 bot commented Jan 26, 2026

⚠️ Tests


⚠️ Warnings

❄️ 2 New flaky tests detected

TestContextWithDDSpan from github.com/DataDog/dd-trace-go/v2/ddtrace/opentelemetry/log (Datadog)
Failed

=== RUN   TestContextWithDDSpan
--- FAIL: TestContextWithDDSpan (0.08s)
TestContextWithDDSpan/preserves_existing_OTel_span from github.com/DataDog/dd-trace-go/v2/ddtrace/opentelemetry/log (Datadog)
Failed

=== RUN   TestContextWithDDSpan/preserves_existing_OTel_span
    correlation_test.go:105: 
        	Error Trace:	D:/a/dd-trace-go/dd-trace-go/ddtrace/opentelemetry/log/correlation_test.go:105
        	Error:      	Not equal: 
        	            	expected: trace.SpanContext{traceID:trace.TraceID{0x69, 0x77, 0xb4, 0xaa, 0x0, 0x0, 0x0, 0x0, 0x35, 0xb7, 0xba, 0xe4, 0x87, 0xee, 0x1a, 0x67}, spanID:trace.SpanID{0x35, 0xb7, 0xba, 0xe4, 0x87, 0xee, 0x1a, 0x67}, traceFlags:0x1, traceState:trace.TraceState{list:[]trace.member{trace.member{Key:"dd", Value:"s:1;p:35b7bae487ee1a67;t.tid:6977b4aa00000000;t.dm:-1"}}}, remote:true}
        	            	actual  : trace.SpanContext{traceID:trace.TraceID{0x69, 0x77, 0xb4, 0xaa, 0x0, 0x0, 0x0, 0x0, 0x35, 0xb7, 0xba, 0xe4, 0x87, 0xee, 0x1a, 0x67}, spanID:trace.SpanID{0x35, 0xb7, 0xba, 0xe4, 0x87, 0xee, 0x1a, 0x67}, traceFlags:0x1, traceState:trace.TraceState{list:[]trace.member{trace.member{Key:"dd", Value:"s:1;p:35b7bae487ee1a67;t.dm:-1;t.tid:6977b4aa00000000"}}}, remote:true}
        	            	
        	            	Diff:
...

ℹ️ Info

🧪 All tests passed

🔗 Commit SHA: 7743f0c

Member

@kakkoyun kakkoyun left a comment


Looks good! I have just some nits.

// DD environment variables for agent configuration
envDDTraceAgentURL = "DD_TRACE_AGENT_URL"
envDDAgentHost = "DD_AGENT_HOST"
envDDTraceAgentPort = "DD_TRACE_AGENT_PORT"
Member


According to my local editor this is not used.

@@ -0,0 +1,76 @@
// Unless explicitly stated otherwise all files in this repository are licensed
Member


We should rename this file so that it ends with _test.go, or simply move it into exporter_test.go.

I assumed this is only used in the tests.

Contributor Author


Yes, this is just a helper file used for the tests.


// IsRecording returns true if the span is recording.
func (w *ddSpanWrapper) IsRecording() bool {
	// DD spans are always recording if not finished
Member


Suggested change
-	// DD spans are always recording if not finished
+	// This always returns true because DD spans don't expose a "finished" state
+	// through the public API. In practice, this is acceptable because logs are
+	// typically emitted while spans are active (before Finish is called).

Comment on lines +420 to +422
	InitialInterval: 1 * time.Second,
	MaxInterval:     30 * time.Second,
	MaxElapsedTime:  5 * time.Minute,
Member


Let's create constants for this and document the choices.

Suggested change
-	InitialInterval: 1 * time.Second,
-	MaxInterval:     30 * time.Second,
-	MaxElapsedTime:  5 * time.Minute,
+	InitialInterval: httpRetryInitialInterval,
+	MaxInterval:     httpRetryMaxInterval,
+	MaxElapsedTime:  httpRetryMaxElapsedTime,

Comment on lines +430 to +432
	InitialInterval: 5 * time.Second,
	MaxInterval:     30 * time.Second,
	MaxElapsedTime:  5 * time.Minute,
Member


ditto

Suggested change
-	InitialInterval: 5 * time.Second,
-	MaxInterval:     30 * time.Second,
-	MaxElapsedTime:  5 * time.Minute,
+	InitialInterval: grpcRetryInitialInterval,
+	MaxInterval:     grpcRetryMaxInterval,
+	MaxElapsedTime:  grpcRetryMaxElapsedTime,
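For reference, the proposed constants might be declared along these lines (a sketch only; the names come from the suggestions above, the values mirror the current literals, a time import is assumed, and the doc comments are placeholders for the rationale the reviewer asked for):

const (
	// HTTP exporter retry backoff.
	httpRetryInitialInterval = 1 * time.Second
	httpRetryMaxInterval     = 30 * time.Second
	httpRetryMaxElapsedTime  = 5 * time.Minute

	// gRPC exporter retry backoff.
	grpcRetryInitialInterval = 5 * time.Second
	grpcRetryMaxInterval     = 30 * time.Second
	grpcRetryMaxElapsedTime  = 5 * time.Minute
)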

	sdklog.Exporter
	telemetry *LogsExportTelemetry
}

Member


Suggested change
+// Compile-time check that telemetryExporter implements sdklog.Exporter.
+var _ sdklog.Exporter = (*telemetryExporter)(nil)

log.Debug("Shutting down OTel LoggerProvider")

// Use a timeout context to avoid blocking indefinitely
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
Member


Same as the above

Suggested change
-	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+	ctx, cancel := context.WithTimeout(context.Background(), shutdownTimeout)

Comment on lines +166 to +170
// msConfig holds a milliseconds configuration value with its origin.
type msConfig struct {
	value  int
	origin telemetry.Origin
}
Member


I think configValue is a better name for this.

Suggested change
-// msConfig holds a milliseconds configuration value with its origin.
-type msConfig struct {
-	value  int
-	origin telemetry.Origin
-}
+// configValue holds a configuration value (typically in milliseconds) with its origin.
+// Used for telemetry reporting to track whether a value came from environment
+// variables or defaults.
+type configValue struct {
+	value  int
+	origin telemetry.Origin
+}

Comment on lines +37 to +52
func StopIfEnabled() {
	provider := GetGlobalLoggerProvider()
	if provider == nil {
		// Not initialized, nothing to do
		return
	}

	log.Debug("Shutting down OTel LoggerProvider")

	// Use a timeout context to avoid blocking indefinitely
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	if err := ShutdownGlobalLoggerProvider(ctx); err != nil {
		log.Warn("Error shutting down OTel LoggerProvider: %v", err)
	}
Member


Could this be simplified to:

Suggested change
-func StopIfEnabled() {
-	provider := GetGlobalLoggerProvider()
-	if provider == nil {
-		// Not initialized, nothing to do
-		return
-	}
-	log.Debug("Shutting down OTel LoggerProvider")
-	// Use a timeout context to avoid blocking indefinitely
-	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
-	defer cancel()
-	if err := ShutdownGlobalLoggerProvider(ctx); err != nil {
-		log.Warn("Error shutting down OTel LoggerProvider: %v", err)
-	}
+// If the provider was not initialized, this is a no-op.
+func StopIfEnabled() error {
+	ctx, cancel := context.WithTimeout(context.Background(), shutdownTimeout)
+	defer cancel()
+	return ShutdownGlobalLoggerProvider(ctx)

//
// If the feature is not enabled, this function is a no-op.
// Returns an error if initialization fails when the feature is enabled.
func StartIfEnabled(ctx context.Context) error {
Member


I think just Start is simpler; we explain everything in the comments. Same for Stop.

Suggested change
-func StartIfEnabled(ctx context.Context) error {
+func Start(ctx context.Context) error {

