Write-up of context propagation techniques in OBI #8512
dashpole left a comment
This is really useful, and I think starting with this scope is good.
I'm imagining that users will probably come to this page with questions like "will this work with my Java Spring Boot application?" (my Google search suggests it might work with Spring MVC, but not with Spring WebFlux, which is reactive, or with Java virtual threads...). With some digging, I think people can use what you've written to answer their own question (e.g. is my framework reactive?). But I also think that the more we can enumerate popular examples of things that will work, and popular examples of things that won't, the fewer user questions we will get.
| At the time of an outgoing client request, we use the parent-to-child relationship map, up to a depth of 3,
| to look up any still-active incoming request that launched the outgoing request goroutine.
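The lookup quoted in this hunk can be illustrated with a minimal userspace sketch in Go. This is not OBI's eBPF implementation: the `tracker` type, its two maps, and `findIncoming` are hypothetical names, and counting the calling goroutine itself as depth 0 is an assumption.

```go
package main

import "fmt"

// maxDepth mirrors the "up to a depth of 3" limit from the quoted text.
const maxDepth = 3

// goroutineID is a hypothetical identifier for a goroutine.
type goroutineID uint64

// tracker is a hypothetical model of the two pieces of state involved:
// a child -> parent goroutine map, and the still-active incoming
// request (represented by a trace ID) owned by a goroutine.
type tracker struct {
	parent   map[goroutineID]goroutineID
	incoming map[goroutineID]string
}

// findIncoming starts at the goroutine making the outgoing call and
// follows parent links for at most maxDepth hops, returning the first
// active incoming request it finds.
func (t *tracker) findIncoming(g goroutineID) (string, bool) {
	cur := g
	for hop := 0; hop <= maxDepth; hop++ {
		if id, ok := t.incoming[cur]; ok {
			return id, true
		}
		p, ok := t.parent[cur]
		if !ok {
			return "", false // no known parent, stop searching
		}
		cur = p
	}
	return "", false // ancestor chain is deeper than maxDepth
}

func main() {
	tr := &tracker{
		// Goroutine 1 handles the server request; 2, 3, 4 and 5 form a
		// chain of goroutines each launched by the previous one.
		parent:   map[goroutineID]goroutineID{2: 1, 3: 2, 4: 3, 5: 4},
		incoming: map[goroutineID]string{1: "trace-1"},
	}
	id, ok := tr.findIncoming(4) // three hops away from goroutine 1
	fmt.Println(id, ok)
	id, ok = tr.findIncoming(5) // four hops away, beyond the limit
	fmt.Println(id, ok)
}
```

Bounding the walk keeps the lookup cost fixed, at the price of missing requests whose outgoing call runs more than three goroutine launches away from the handler.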
|
| Special consideration is taken for correct context propagation for `gRPC`, because the
Did you already implement the "metadata injection" portion of context propagation for gRPC, or is this referring to Go auto-instrumentation? Out of curiosity, are the languages below not supported because we haven't implemented it yet, or because the approach used in Go won't work in those languages?
We were just discussing this part at the SIG, so I'll summarise here. What we have for Go gRPC at the moment uses the bpf_probe_write_user helper, which is banned on locked kernels and is Go-specific. We have a new design that will use the same approach as for HTTP and implement this for all languages. We need to change a few internal data structures to hold the streamID, since the key is no longer a simple connection-based key.
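To make the last point concrete, here is a small Go sketch of what a stream-aware key could look like. It is only an illustration of the idea: `connKey`, `streamKey`, and `traceStore` are hypothetical names, not OBI's actual data structures, and the real state lives in eBPF maps rather than a Go map.

```go
package main

import "fmt"

// connKey is a hypothetical connection-level key (the usual 4-tuple).
// With HTTP/1.x, one request at a time per connection makes this
// enough to identify a request.
type connKey struct {
	srcIP, dstIP     string
	srcPort, dstPort uint16
}

// streamKey adds the HTTP/2 stream ID, because HTTP/2 and gRPC
// multiplex many concurrent requests over a single connection.
type streamKey struct {
	conn     connKey
	streamID uint32
}

// traceStore maps a stream to the trace context it should carry.
type traceStore map[streamKey]string

func main() {
	conn := connKey{srcIP: "10.0.0.1", dstIP: "10.0.0.2", srcPort: 40000, dstPort: 50051}
	store := traceStore{}

	// Two requests in flight on the same connection; client-initiated
	// HTTP/2 streams use odd stream IDs.
	store[streamKey{conn: conn, streamID: 1}] = "trace-a"
	store[streamKey{conn: conn, streamID: 3}] = "trace-b"

	fmt.Println(len(store), store[streamKey{conn: conn, streamID: 3}])
}
```

Keyed on the connection alone, the second request would overwrite the first; the composite key keeps both trace contexts apart.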
| ## What doesn't work
|
| - Any reactive programming frameworks.
I like the phrasing of "applications that do FOO, such as reactive frameworks" you use below, since it helps me understand what it is doing that OBI can't track. Some popular examples might also be helpful.
| - `.NET` `async`/`await` creates a number of complex thread pools that are unpredictable
|   in how the work is scheduled.
|
| ## Future work
I would probably exclude this from the website, since it is speculative.
| `nginx` correlation doesn't work for `HTTP2`/`gRPC`.
|
| ### Generic approach
Maybe call this language-agnostic?
| by the same thread. It detects if multiple current outgoing requests are handled by
| the same thread and marks the correlation information as invalid. This helps us
| prevent incorrect correlation, when the application framework handles multiple
| connections on the same thread.
👍 that was my biggest question for asyncio, if OBI would generate incorrect parent-child relationships
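The safeguard described in the hunk above can be sketched in Go as follows. This is a simplified userspace model under stated assumptions, not OBI's eBPF code: `correlator`, `threadState`, and the method names are hypothetical, and keeping the thread marked invalid until the next incoming request is an assumption.

```go
package main

import "fmt"

// threadState is a hypothetical per-thread record.
type threadState struct {
	traceID  string // incoming request currently attributed to the thread
	inFlight int    // outgoing requests currently open on the thread
	valid    bool   // false once concurrent outgoing requests were seen
}

// correlator is a hypothetical holder of per-thread state, keyed by thread ID.
type correlator struct {
	threads map[int]*threadState
}

// startIncoming records a new incoming request on a thread and resets
// its state (the reset point is an assumption of this sketch).
func (c *correlator) startIncoming(tid int, traceID string) {
	c.threads[tid] = &threadState{traceID: traceID, valid: true}
}

// startOutgoing returns the parent trace ID for an outgoing request,
// or false when the thread's correlation information cannot be trusted.
func (c *correlator) startOutgoing(tid int) (string, bool) {
	st, ok := c.threads[tid]
	if !ok {
		return "", false
	}
	st.inFlight++
	if st.inFlight > 1 {
		// A second outgoing request while one is still open: the thread
		// is multiplexing, so thread-based correlation is unreliable.
		st.valid = false
	}
	if !st.valid {
		return "", false
	}
	return st.traceID, true
}

// endOutgoing records that an outgoing request on the thread completed.
func (c *correlator) endOutgoing(tid int) {
	if st, ok := c.threads[tid]; ok && st.inFlight > 0 {
		st.inFlight--
	}
}

func main() {
	c := &correlator{threads: map[int]*threadState{}}
	c.startIncoming(7, "trace-a")

	id, ok := c.startOutgoing(7) // first outgoing request on thread 7
	fmt.Println(id, ok)
	id, ok = c.startOutgoing(7) // overlaps with the first one
	fmt.Println(id, ok)
}
```

The design choice is to drop the parent link rather than guess: a missing parent produces a disconnected span, while a wrong parent produces a misleading trace.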
Good point. I'll try to make a list of examples which work and which don't for the various programming languages.
This PR adds OBI documentation on how parent-child correlation works for different technologies, for the purpose of context propagation.
Relates to open-telemetry/opentelemetry-ebpf-instrumentation#903