Create span only after record is received while polling #1678
Conversation
Please add a changelog entry
There was probably a reason why span links were used. Please take a look at the change again.
@srikanthccv yes, I wrote a bit in the PR description about the usage of links and why I don't think they are the right choice here. Here's what I think:
Let me know what you think
@mrajashree I think the link code needs to stay; it was mostly correct and mostly follows https://opentelemetry.io/docs/reference/specification/trace/semantic_conventions/messaging/#batch-receiving, since Kafka is a batch receiver. There is also some more background here: https://opentelemetry.io/blog/2022/instrument-kafka-clients/
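For reference, here is a minimal sketch of that batch-receiving pattern, not the instrumentation's actual code: one receive span per poll, with a link back to the producer context carried in each record's headers. The helper name and the header decoding are illustrative assumptions.

```python
from opentelemetry import trace
from opentelemetry.propagate import extract
from opentelemetry.trace import Link, SpanKind


def start_receive_span(tracer, records):
    # Collect one link per record whose headers carry a valid producer context.
    links = []
    for record in records:
        headers = {
            key: value.decode("utf-8") if isinstance(value, bytes) else value
            for key, value in (record.headers() or [])
        }
        span_context = trace.get_current_span(extract(headers)).get_span_context()
        if span_context.is_valid:
            links.append(Link(span_context))
    # A single "receive" span represents the whole batch, per the convention.
    return tracer.start_span("receive", kind=SpanKind.CONSUMER, links=links)
```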
@mrajashree do you have any thoughts on this after the comment from @owenhaynes?
Hi, can anything be done to push this through? It would be great to have the consumer use the context from the Kafka message and be in the same trace as the producer; this would be much more useful in a distributed system for tracing async flows. In this example https://opentelemetry.io/docs/specs/otel/trace/semantic_conventions/messaging/#topic-with-multiple-consumers the spans created by the consumers have the same parent as the producer.
It would be very useful to resolve this issue. As it stands we don't have viable distributed tracing - just separate Producer and Consumer Spans |
Hi @ocelotl and @owenhaynes, as per the OpenTelemetry semantic conventions for messaging, context propagation is a must, which is not being done currently: https://opentelemetry.io/docs/specs/semconv/messaging/messaging-spans/#context-propagation
There's some more background on this here: https://github.com/open-telemetry/oteps/blob/main/text/trace/0205-messaging-semantic-conventions-context-propagation.md. This issue was also fixed in the Java lib (open-telemetry/opentelemetry-java-instrumentation#3529). I think if we need to use links, we also need to figure out a way of propagating context and not starting a new trace each time the consumer receives a message.
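A minimal sketch of the propagation approach being discussed, assuming the producer side injected W3C trace context into the message headers; the helper name is hypothetical and not part of the instrumentation:

```python
from opentelemetry.propagate import extract
from opentelemetry.trace import SpanKind


def start_process_span(tracer, msg):
    # Decode the Kafka headers into a text carrier and extract the producer's
    # context so the consumer span continues the same trace.
    headers = {
        key: value.decode("utf-8") if isinstance(value, bytes) else value
        for key, value in (msg.headers() or [])
    }
    parent_context = extract(headers)
    return tracer.start_as_current_span(
        f"{msg.topic()} process",
        context=parent_context,
        kind=SpanKind.CONSUMER,
    )
```

Used as `with start_process_span(tracer, msg):` around the message handling, the processing span becomes a child of the producer span instead of the root of a new trace.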
Hi @lzchen @ocelotl @shalevr - can you please take a look at this PR and merge it? As the contributor @mrajashree mentioned, this fix has been done in other language libraries and is needed downstream by end users. Thanks for your help in getting this fix in.
Can someone please review this PR? It has been waiting for so long :(
@luisRubiera tests are red; the branch needs to be updated to match whatever changed in the meantime
([#2355](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/2355))
- AwsLambdaInstrumentor sets `cloud.account.id` span attribute
  ([#2367](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/2367))
- Create span only after record is received while polling
Move this to unreleased section
Description
Fixes #1674
The package confluent-kafka has a class `Consumer`, which has a method called `poll`. As per the docs, the recommended usage is to call `consumer.poll` in an infinite loop so that it keeps polling the brokers for messages; each time it is called, the application is supposed to check whether the message is `None` or has an error before trying to process it (ref: https://github.com/confluentinc/confluent-kafka-python#basic-consumer-example).
The package opentelemetry-instrumentation-confluent-kafka offers a wrapper around confluent-kafka's consumer, called `ProxiedConsumer`. `ProxiedConsumer` also has a method called `poll`, which calls `ConfluentKafkaInstrumentor`'s `wrap_poll` method. `wrap_poll` in turn calls the underlying consumer's `poll` method with the user-specified timeout.
Since `confluent_kafka.Consumer.poll` is supposed to be called from an application within an infinite loop, `wrap_poll` is also called on every iteration of that loop. This is the observed behavior with the current implementation of `wrap_poll`:
- `wrap_poll` creates a span each time it is called, even when no record was received. If we go by this example, where `consumer.poll` is called with a timeout of 1 second, the current `wrap_poll` implementation will create a span per second.
- `wrap_poll` returns the record even if the record is `None`; it should only return the record if it exists.
This PR fixes the above issues. This is assuming my understanding and usage of `ProxiedConsumer` is right, please correct me if not. Here's a sample code snippet based on the docs:
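The snippet below is a minimal sketch of that usage, assuming the basic consumer example from the confluent-kafka README; the broker address, group id, and topic name are placeholders, and the `instrument_consumer` call follows the package's documented manual-instrumentation usage.

```python
from confluent_kafka import Consumer
from opentelemetry.instrumentation.confluent_kafka import ConfluentKafkaInstrumentor

consumer = Consumer(
    {
        "bootstrap.servers": "localhost:9092",
        "group.id": "mygroup",
        "auto.offset.reset": "earliest",
    }
)
# Wrap the consumer so every poll() call goes through the instrumentor's wrap_poll.
consumer = ConfluentKafkaInstrumentor().instrument_consumer(consumer)
consumer.subscribe(["mytopic"])

try:
    while True:
        # Called roughly once per second when the topic is idle; with the current
        # implementation this produces a span per iteration even when msg is None.
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        print(f"Received message: {msg.value().decode('utf-8')}")
finally:
    consumer.close()
```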
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Tested this change and verified that the span created by `wrap_poll` stays linked to all spans created previously for the current trace. Also, the spans created after `wrap_poll` stay linked to the same trace for a Kafka message.
Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.