[feature request] Make db.statement sanitization configurable for SqlClient/EFCore (preserve SQL comments / EF TagWith tags) #3554

@alexander-kucherov

Component

OpenTelemetry.Instrumentation.SqlClient

Is your feature request related to a problem?

Make db.statement sanitization configurable in .NET (preserve SQL comments / EF TagWith tags)

Summary

The .NET SQL instrumentation currently always sanitizes SQL statements, including stripping comments, before setting db.statement / db.query.text on spans.
This makes it impossible to rely on SQL comments for tagging (e.g., EF Core TagWith) in telemetry, even when users are willing to accept the risk of collecting raw SQL.

This issue proposes making db.statement sanitization configurable in .NET (similar to the existing db-statement-sanitizer feature in the Java agent), while keeping sanitization enabled by default.

Current behavior

SqlProcessor sanitizes SQL by:

  • Replacing all literal values (strings, numbers, hex) with ?.
  • Special-casing IN (...) lists to a single ?.
  • Removing both single-line (-- ...) and multi-line (/* ... */) comments entirely from the statement.

ApplyConventionsForQueryText(Activity activity, string? commandText, bool emitOldAttributes, bool emitNewAttributes, bool sanitizeQuery = true) already exposes a sanitizeQuery boolean, but this parameter is not wired to any public configuration and effectively always uses the default true.

As a result, SQL comments never appear in db.statement / db.query.text on spans, regardless of user intent.
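The effect described above can be approximated with a small, self-contained sketch. It is not the actual SqlProcessor implementation: SketchSanitizer and its regular expressions are hypothetical and deliberately naive (for example, they do not handle "--" inside a string literal), and the sample query only assumes the usual shape of a TagWith-tagged statement, where the tag is a leading "--" comment. Collapsing an IN list to IN (?) is one reading of "a single ?".

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed sample input: a leading "--" tag comment, literals, an IN list,
// and a trailing block comment.
var tagged = "-- GetActiveOrders\n\nSELECT [o].[Id] FROM [Orders] AS [o] " +
             "WHERE [o].[Status] = 'active' AND [o].[Region] IN (1, 2, 3) /* hot path */";

// Prints the statement with the tag, the block comment, and all literals gone.
Console.WriteLine(SketchSanitizer.Sanitize(tagged));

// Hypothetical stand-in for the behavior listed above; NOT the real SqlProcessor.
static class SketchSanitizer
{
    public static string Sanitize(string sql)
    {
        // Remove multi-line comments, then single-line comments.
        sql = Regex.Replace(sql, @"/\*.*?\*/", string.Empty, RegexOptions.Singleline);
        sql = Regex.Replace(sql, @"--[^\r\n]*", string.Empty);

        // Replace string, hex, and numeric literals with "?".
        sql = Regex.Replace(sql, @"'(?:[^']|'')*'", "?");
        sql = Regex.Replace(sql, @"\b0x[0-9A-Fa-f]+\b", "?");
        sql = Regex.Replace(sql, @"\b\d+(?:\.\d+)?\b", "?");

        // Collapse an IN list of placeholders into a single placeholder.
        sql = Regex.Replace(sql, @"\bIN\s*\(\s*\?(?:\s*,\s*\?)*\s*\)", "IN (?)",
            RegexOptions.IgnoreCase);

        return sql.Trim();
    }
}
```

Whatever the exact implementation, the point is the same: the GetActiveOrders tag reaches the database but is gone by the time the text is recorded on the span.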

Problem

Many applications use SQL comments as a lightweight tagging mechanism, for example via EF Core TagWith, to associate queries with features, use-cases, or higher-level application concepts. When the sanitizer strips comments, these tags are lost from telemetry, even though:

  • The actual database sees the tagged SQL and executes it just fine.
  • Some users are willing (or explicitly configured) to allow raw SQL in their observability pipeline.
  • Some deployments already implement downstream redaction at the collector or log processor level.

In short, the current behavior is "always sanitize, always drop comments", with no way to opt out for environments that consciously accept the trade-off.

What is the expected behavior?

Proposed solution

Introduce a configurable "db-statement sanitizer" option for .NET database instrumentations and connect it to the existing sanitizeQuery parameter:

  1. Add a public configuration option on the relevant instrumentation options type(s), for example a boolean property named DbStatementSanitizerEnabled.

  2. Wire this option into the code path that calls ApplyConventionsForQueryText, passing sanitizeQuery: options.DbStatementSanitizerEnabled.

  3. Keep DbStatementSanitizerEnabled defaulting to true to preserve the current safe-by-default behavior and stay consistent with the OpenTelemetry guidance that db.statement should generally be sanitized by default.

  4. Clearly document that:

  • Setting DbStatementSanitizerEnabled = false disables literal replacement and comment stripping.
  • It becomes the application's / operator's responsibility to prevent sensitive data from being emitted in db.statement when sanitization is disabled.
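Steps 1 to 3 might look roughly like the following sketch. Only the property name DbStatementSanitizerEnabled, its default of true, and the db.query.text attribute come from this issue. SketchDbInstrumentationOptions, SketchInstrumentation, RecordQueryText, and the naive Sanitize helper are hypothetical stand-ins for the real options type, the call site of ApplyConventionsForQueryText, and SqlProcessor. The emitOldAttributes / emitNewAttributes switches are left out for brevity.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

var sql = "-- GetActiveOrders\n\nSELECT * FROM Orders WHERE Status = 'active' AND Total > 100";

// Default options: sanitization stays on, so the tag comment is dropped.
var defaults = new SketchDbInstrumentationOptions();
Console.WriteLine(SketchInstrumentation.RecordQueryText(new Activity("db-call"), sql, defaults));

// Explicit opt-out: the raw statement, including the tag comment, is recorded.
var optedOut = new SketchDbInstrumentationOptions { DbStatementSanitizerEnabled = false };
Console.WriteLine(SketchInstrumentation.RecordQueryText(new Activity("db-call"), sql, optedOut));

// Hypothetical options type; the property name follows the proposal.
public sealed class SketchDbInstrumentationOptions
{
    // Defaults to true so existing users keep the safe-by-default behavior.
    public bool DbStatementSanitizerEnabled { get; set; } = true;
}

// Hypothetical call site standing in for the code that invokes
// ApplyConventionsForQueryText with sanitizeQuery taken from the options.
public static class SketchInstrumentation
{
    public static string RecordQueryText(
        Activity activity, string commandText, SketchDbInstrumentationOptions options)
    {
        var text = options.DbStatementSanitizerEnabled ? Sanitize(commandText) : commandText;
        activity.SetTag("db.query.text", text);
        return text;
    }

    // Naive placeholder sanitizer: strips comments, replaces string and numeric literals.
    private static string Sanitize(string sql)
    {
        sql = Regex.Replace(sql, @"/\*.*?\*/|--[^\r\n]*", string.Empty, RegexOptions.Singleline);
        sql = Regex.Replace(sql, @"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b", "?");
        return sql.Trim();
    }
}
```

Keeping the switch on the options object means callers who never touch it see no behavior change, while the opt-out remains an explicit, reviewable line of configuration.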

Java agent

The OpenTelemetry Java agent already implements a configurable DB statement sanitizer:

  • It sanitizes all database statements before setting the db.statement attribute by replacing values (strings, numbers) in the query with ?.

  • This behavior is enabled by default for all DB instrumentations.

  • It can be disabled via configuration (default value: true), as documented under “DB statement sanitization” in the Java agent instrumentation configuration docs:

      • System property: otel.instrumentation.common.db-statement-sanitizer.enabled

      • Environment variable: OTEL_INSTRUMENTATION_COMMON_DB_STATEMENT_SANITIZER_ENABLED

Aligning the .NET instrumentation with this pattern would:

  • Provide feature parity across languages.
  • Preserve safe-by-default behavior.
  • Allow advanced users (who understand the risks and/or already have downstream redaction) to opt out of sanitization to keep SQL comments and full statements in telemetry.

Which alternative solutions or features have you considered?

Custom processor or enricher

Additional context

A particularly important use case is EF Core TagWith, which encodes tags as SQL comments. Because comments are stripped by the current sanitizer implementation, these tags can never appear in db.statement even when users explicitly rely on them for observability and are comfortable with collecting unsanitized SQL.

With a configurable db-statement sanitizer in .NET, users could:

  • Run with DbStatementSanitizerEnabled = true in environments where PII risk is high.
  • Temporarily or selectively disable it in controlled environments (e.g., staging, internal systems) where preserving tags and comments is more valuable than aggressive sanitization, or where other redaction layers are already in place.
