-
Notifications
You must be signed in to change notification settings - Fork 275
[db] Add docs for context info propagation #2236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[db] Add docs for context info propagation #2236
Conversation
docs/database/database-spans.md
Outdated
| | Attribute | Type | Description | Require level | Stability | | ||
| |----------------|--------|---------------------------------------|-------------------|----------------------------------------------------------------| | ||
| | `service.name` | string | Logical name of the service [1] | `Required` |  | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is because of the limitation of actual implementation.
When propagating traceparent via sqlcommenter is not available for certain databases due to the performance impact (google/sqlcommenter#284) and propagating traceparent via SET CONTEXT_INFO is also not available due to the additional network round trip delay, only pushing a service.name (or peer.service) via sqlcommenter become a solution to achieve limited correlation for knowing which service is making the query request.
Some questions here:
- What is the best name to use here?
service.nameorpeer.service.- I slightly prefer
peer.service.
- I slightly prefer
- Should we mark this field as
OptionalorRecommended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@XSAM, it is reasonable to me to have this attribute when traceid cannot be injected due to performance reasons.
What do you think about making it conditionally required, where there is not information about span?
If you have traceparent, is it IMO useless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. In such case service.name feels more like a baggage item that you want to propagate.
Then it could encoded something like
SELECT * FROM songs /* baggage: service.name=music-player:play */
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated this PR and still keep the sheet to represent the minimal data that should be propagated. Please let me know if you feel it's unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I'm not sure if we should be referencing the baggage concept or name for this attribute name.
We somehow need to propagate this information. I was thinking of reusing some existing propagator, instead of creating a new one. The alternative would be probably creating a new "service name propagator".
However, it feels that a baggage propagator can be used for this. From https://www.w3.org/TR/baggage/#example-use-case:
For example, if you need to annotate logs with some request-specific information, you could propagate a property using the baggage header.
I just noticed that here is a nice doc for context propagation for AWS: https://github.com/open-telemetry/semantic-conventions/blob/45e91aedff7d347955ba09269b871f8fd9049ce3/docs/non-normative/compatibility/aws.md#context-propagation
@pellared what do you think of going back to
service.nameor something slightly different likeresource.service.name?
service.name as I originally proposed. Especially that it is already a "reserved" key:
| - [`service.name`](#service) |
Example:
/*baggage='service.name%3Dbar'*/
@pellared, do you know how to express an attribute in baggage in the semantic convention doc?
I think that we just need to write it down as I do not think we have any semantic conventions for baggage key-values so far.
CC @reyang, @dyladan, @yurishkuro, @arminru
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the best name to use here? service.name or peer.service.
I slightly prefer peer.service.
service.name. peer.service applies to the remote (downstream service): link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pellared Do you recommend using the format /*baggage='service.name%3Dbar'*/ instead of /*baggage.service.name='bar'*/? I'm not sure if there is an existing propagator we can use. Just wanted to double-check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest exploring /*baggage='service.name%3Dbar'* and use of the baggage propagator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pellared Updated the examples accordingly.
Co-authored-by: Robert Pająk <[email protected]>
Co-authored-by: Robert Pająk <[email protected]>
docs/database/database-spans.md
Outdated
| | Attribute | Type | Description | Require level | Stability | | ||
| |----------------|--------|---------------------------------------|-------------------|----------------------------------------------------------------| | ||
| | `service.name` | string | Logical name of the service [1] | `Required` |  | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. In such case service.name feels more like a baggage item that you want to propagate.
Then it could encoded something like
SELECT * FROM songs /* baggage: service.name=music-player:play */
|
@open-telemetry/semconv-db-approvers, PTAL |
docs/database/database-spans.md
Outdated
| | Attribute | Type | Description | Require level | Stability | | ||
| |----------------|--------|---------------------------------------|-------------------|----------------------------------------------------------------| | ||
| | `service.name` | string | Logical name of the service [1] | `Required` |  | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@XSAM, it is reasonable to me to have this attribute when traceid cannot be injected due to performance reasons.
What do you think about making it conditionally required, where there is not information about span?
If you have traceparent, is it IMO useless.
docs/database/database-spans.md
Outdated
| | Attribute | Type | Description | Require level | Stability | | ||
| |----------------|--------|---------------------------------------|-------------------|----------------------------------------------------------------| | ||
| | `service.name` | string | Logical name of the service [1] | `Required` |  | | ||
| | `traceparent` | string | The trace context of current span [2] | `Recommended` [3] |  | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open question, not a blocker for the PR, can be considered separately.
Should we support any text-propagators here by using format/names knows from http headers?
Ref: B3/B3multi as an example: https://github.com/openzipkin/b3-propagation?tab=readme-ov-file#single-header
docs/database/database-spans.md
Outdated
| | Attribute | Type | Description | Require level | Stability | | ||
| |------------------------|--------|---------------------------------------|-------------------|----------------------------------------------------------------| | ||
| | `baggage.service.name` | string | Logical name of the service [1] | `Required` |  | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If traceparent is present I do not think that service.name is needed. Isn't it the best way to have one or another?
Do we even need to define "require level"? Maybe we should just describe the meaning of each sqlcommenter attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If
traceparentis present I do not think that service.name is needed. Isn't it the best way to have one or another?
Updated, and I keep the Require level to make it as an annotation as well. What do you think about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we need even the require level. I think these are just different propagation options 😉 But we can keep it for now as I think there are other more important comments to be resolved/addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If traceparent is present I do not think that service.name is needed.
Even if traceparent is present, service.name is still useful. Exact correlation using trace_id requires both a database sample and the full trace for the same request, but because of sampling, many traces and DB executions are missed. Since the chance of both being captured is low (p_trace * p_db_sample), including service.name helps support approximate correlation when exact matching isn’t possible.
Co-authored-by: Robert Pająk <[email protected]>
Co-authored-by: Robert Pająk <[email protected]>
docs/database/sql-server.md
Outdated
|
|
||
| **[3] `traceparent`:** MUST be in the [text format](https://www.w3.org/TR/trace-context/#traceparent-header). | ||
|
|
||
| Instrumentations SHOULD make use of [SET CONTEXT_INFO](https://learn.microsoft.com/en-us/sql/t-sql/statements/set-context-info-transact-sql?view=sql-server-ver16) to add high-cardinality information(e.g. `traceparent`) as adding sql comments to queries changes their unique identifier in SQL Server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add to the guidance here that instrumentations will need to copy over the transaction (if present) to the new set context_info command. When we didn't do this, we ran into customer issues when there was an active transaction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate more about what copy over the transaction is actually doing? Could we just run set context_info before every query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the LOC I'm referencing for our C# client : https://github.com/DataDog/dd-trace-dotnet/pull/6233/files#diff-29ead36b0177265c725705ce75f5a441417f4c22e99f7080bc8ef0f486b723a7R145-R146
When creating a new set context_info command that we run before the original query, we simply copy the Transaction from the original DbCommand object to the new one. Otherwise, when executing the set context_info command we encountered a InvalidOperationException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SET context_info must run on the same physical connection as the SQL statement it’s meant to instrument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
docs/database/database-spans.md
Outdated
| **Status**: [Development][DocumentStatus] | ||
| Instrumentations SHOULD propagate the context information to the SQL queries following [sqlcommenter](https://google.github.io/sqlcommenter/spec/). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to share a learning from my org. We've implemented a sqlcommenter-like approach for our instrumentations and we found it more beneficial to prepend the sql comment rather than append to the query text. This runs counter to the sqlcommenter spec, but we found that most databases truncate the query in views like pg_stat_activity, so prepending the attributes offered a better guarantee that they wouldn't be truncated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zacharycmontoya Thank you for pointing this out!
I think it makes sense to prepend the sql comment instead. I'll update this PR accordingly if there are no objections. cc @open-telemetry/semconv-db-approvers @XSAM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using prepend conflicts the spec of sql commenter. We might need to change the spec of sql commenter if prepend approach is superior than append approach.
@zacharycmontoya Could you elaborate more about this? How serious and often the truncate will happen to certain databases if we use append.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postgres limits query text to 1024 characters by default, but you can increase it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prepended comments break pg_hint_plan functionality:
EXPLAIN /*+ IndexScan(t) */ /* a */ SELECT * FROM t WHERE n = 1;
QUERY PLAN
-------------------------------------------------------------
Index Scan using i on t (cost=0.15..36.38 rows=13 width=4)
Index Cond: (n = 1)
(2 rows)
EXPLAIN /* a */ /*+ IndexScan(t) */ SELECT * FROM t WHERE n = 1;
QUERY PLAN
-----------------------------------------------------------------
Bitmap Heap Scan on t (cost=4.26..14.95 rows=13 width=4)
Recheck Cond: (n = 1)
-> Bitmap Index Scan on i (cost=0.00..4.25 rows=13 width=0)
Index Cond: (n = 1)
(4 rows)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another idea is to make it flexible by allowing the implementation of propagation for a specific database to choose between append or prepend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modified this PR to incorporate the idea.
Co-authored-by: Zach Montoya <[email protected]>
@nenadnoveljic I only added the |
|
I have refactored the PR to the following structure. @trask @nenadnoveljic @zacharycmontoya PTAL
Failed CI is irrelevant to this PR. https://github.com/open-telemetry/semantic-conventions/actions/runs/15172537474/job/42665853199?pr=2236 (fix) |
| ### SQL commenter | ||
| Instrumentations SHOULD propagate the context information to the SQL queries following [SQL commenter](https://google.github.io/sqlcommenter/spec/). The instrumentation implementation MAY choose to either `append` the comment to the end of the query or `prepend` the comment at the beginning of the query, depending on the specific database system's requirements or preferences. | ||
| Instrumentations SHOULD propagate the context information to the SQL queries following [SQL commenter](https://google.github.io/sqlcommenter/spec/). The instrumentation implementation MAY choose to either **append** the comment to the end of the query or **prepend** the comment at the beginning of the query, depending on the specific database system's requirements or preferences. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the parameter isn't set, would it make sense to prepend by default and append if we detect a hint in Postgres (/*+)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will need a following discussion of #2236 (comment).
I did a small experiment on postgres 17.5 by running two long queries with different comments style more than 1024 bytes.
For instance:
- For prepend comment, I ran
/*foo='content longer than 1024 bytes...'*/ SELECT state,query FROM pg_stat_activity; - For append comment, I ran
SELECT state,query FROM pg_stat_activity /*foo='content longer than 1024 bytes...'*/;
The result I got from pg_stat_activity shows both queries get truncated.
- For prepend comment, I got
/*foo='content longer than 1024 bytes...- Only the truncated comment is left.
- For append comment, I got
SELECT state,query FROM pg_stat_activity /*foo='content longer than 1024 bytes...- Query remains untouched, as it is very short, but comment is also truncated.
So, postgres seems truncate anything more than 1024 bytes on pg_stat_activity whatever it is the query statement or comments. That means no matter the comments, a query that is more 1024 bytes is going to be truncated.
Comparing these two approaches
prepend
Pros:
- Always keep the comments. Propagated info won't be lost.
Cons:
- When truncating happens, the query statement get truncated.
- Users won't see a complete query, and the correlation based on query text may not work.
- It breaks pg_hint_plan functionality [db] Add docs for context info propagation #2236 (comment). So the instrumentation needs to actively detect a hint. (though I was not able to reproduce the issue, the hint plans I got are always the same)
append
Pros:
- Original query is kept (assuming the query itself won't be truncated)
Cons:
- Comments might be truncated. Propagated info will be lost.
In short, assuming prepend can break pg_hint_plan while append won't. append seems better for easy implementation. About the truncate part, we could use ALTER SYSTEM SET track_activity_query_size = 16384; to increase the limit (though it needs a restart to be effective)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trask Do you think it is ok to take this as non-blocker to this PR? we can leave this discussion to following issue/development.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lmolkova PTAL, is this PR on a good shape?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not agree with this statement:
Instrumentations SHOULD propagate the context information to the SQL queries following SQL commenter.
- recommending instrumentations to modify SQL queries by default is dangerous and questionable
- it ruins prepared statements caches for most database systems
We can document other context propagation mechanisms (like there is something for MS SQL and postgres) for these specific databases.
It'd be ok-ish with having it opt-in for other instrumentations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be ok-ish with having it opt-in for other instrumentation.
Is it ok to doc Instrumentations MAY propagate the context information to the SQL queries following [SQL commenter](https://google.github.io/sqlcommenter/spec/). Use MAY and an annotation to express that it is an opt-in behavior?
I can also move it to MS SQL if you prefer, though I want to have a unified way to propagate context information.
|
|
||
| | Attribute | Type | Description | Examples | Require level | Stability | | ||
| |------------------------|--------|---------------------------------------|----------|-------------------|----------------------------------------------------------------| | ||
| | [`service.name`](/docs/registry/attributes/service.md) | string | Logical name of the service [22] | `shoppingcart` | `Recommended` |  | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this table talks about attributes, but it's under context propagation - I don't understand the suggestion here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That means when using sqlcommenter for context propagation, we should propagate service.name.
The reason is here. #2236 (comment). This is a tradeoff because the database itself won't provide trace info. We can use database receiver sending SQL to fetch query samples with trace info, but the chance of getting a complete trace (from SQL client to SQL server) is low due to sampling.
Do you think suggesting a service.name propagator would make this sentence easy to understand?
| |------------------------|--------|---------------------------------------|----------|-------------------|----------------------------------------------------------------| | ||
| | [`service.name`](/docs/registry/attributes/service.md) | string | Logical name of the service [22] | `shoppingcart` | `Recommended` |  | | ||
|
|
||
| **[22] `service.name`:** Instrumentations MAY use [SDK-provided resource detectors](https://opentelemetry.io/docs/specs/semconv/resource/#semantic-attributes-with-sdk-provided-default-value) to set the default value for this attribute. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instrumentations don't have access to resources - resources are SDK concept and instrumentations only use API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I forgot this part.
Sounds like service.name should be considered a parameter of the instrumentation that users should provide. Until we have an EntityProvider in the future.
|
|
||
| ### SET CONTEXT_INFO | ||
|
|
||
| Instrumentations MAY make use of [SET CONTEXT_INFO](https://learn.microsoft.com/en-us/sql/t-sql/statements/set-context-info-transact-sql?view=sql-server-ver16) to add text format of [`traceparent`](https://www.w3.org/TR/trace-context/#traceparent-header) before executing a query. This is an opt-in behavior that should be explicitly enabled by the user. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's about tracestate, baggage and other possible headers?
context injection should be propagator concern, not instrumentation one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
context injection should be propagator concern, not instrumentation one
I agree if SQL has a header concept (A list of key-value pairs), so it can have a dynamic space to put tracestate and baggage. We can also have a variety of propagators.
But, SQL does not have such a concept. And CONTEXT_INFO is the binary format in SQL Server, with a maximum size of 128 bytes. It is not feasible to assign more values due to this length limitation. Even if there is no limitation, I don't know what existing format we can use to put a list into a binary value.
So, I propose to use the value traceparent, a static value, for SQL Server.
|
|
||
| ### SQL commenter | ||
|
|
||
| Instrumentations SHOULD **append** the comment to the end of the query. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why position is important? what MS SQL instrumentation should put into the comment? in which format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why position is important?
In Postgres, prepended comments can break query hints, as shown here. Appended comments may be truncated in long queries. Proposal: use prepended comments by default, except for Postgres.
what MS SQL instrumentation should put into the comment?
To transport static tags, such as service and peer.service, because they don't fit in context_info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in which format?
Examples:
append: SELECT * FROM songs /*service.name='shoppingcart'*/
prepend: /*service.name='shoppingcart'*/ SELECT * FROM songs
Another discussion: #2236 (comment)
|
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
|
Just saw this and very interested in any effort to clarify the role of sqlcommenter in OTel. @tammy-baylis-swi and i will be sharing our learnings at the Open Observability Summit June 26th, would love to talk about it with any maintainers or TC members there. |
|
Superseded by #2495 |
Relevant to #2162
Changes
Added documentations for instrumentations to propagate context information to databases
Merge requirement checklist
[chore]