Distributed Tracing for Entities (Isolated) #1198
Merged: sophiatev merged 71 commits into main from stevosyan/distributed-tracing-for-entities-isolated on May 20, 2025
… that an entity in WebJobs starts an orchestration
bachuv reviewed Mar 31, 2025
… activities to their parents
…to create in orchestration in the dotnet package
…ect the end time of activities (like, when the call/signal request to an entity is actually sent, or when an orchestration is actually created by an entity)
…scheduling a suborchestration
…yan/distributed-tracing-for-entities-isolated
…yan/distributed-tracing-for-entities-isolated
…yan/distributed-tracing-for-entities-isolated
…yan/distributed-tracing-for-entities-isolated
…ied TaskHubClient method
…id ID in the case of a non-null Activity
…yan/distributed-tracing-for-entities-isolated
bachuv reviewed May 12, 2025
…n orchestration signaling an entity is too short, then the message gets redelivered and a trace is created for each redelivery. we fixed this and only make the trace once
bachuv reviewed May 14, 2025
bachuv approved these changes May 15, 2025
…uest times DateTimeOffset
bachuv previously approved these changes May 19, 2025
Base automatically changed from stevosyan/distributed-tracing-for-entities to main May 20, 2025 21:49
bachuv approved these changes May 20, 2025
This PR adds support for distributed tracing for entities in the .NET isolated framework. This repo is where the trace `Activities` are actually created for signaling and calling entities and for entities starting orchestrations from an isolated app.

- `RequestMessage`, `OperationRequest`, `OperationResult`, `SendSignalOperationAction`, and `StartNewOrchestrationOperationAction` were altered to include extra information from the durabletask-dotnet repo, where the requests to entities are actually generated and where the requests are also executed. This extra information includes the time of the requests, the end time of the execution and any error messages, and parent trace contexts.
  - The `CreateTrace` field added to `RequestMessage` is only used in the isolated case to indicate that we want to make an entity-specific trace for this request (it is set to true by the appropriate signal/call methods in durabletask-dotnet). In the in-process case, all of the traces are created in the WebJobs repo, so this field is not populated (and will be false by default).
- `TraceHelper`, `Schema`, and `TraceActivityConstants` were updated with the instantiation of the entity-specific trace `Activities`.
- The `ClientEntityHelpers` and `OrchestrationEntityContext` methods that generate `EntityMessageEvents`, which are used by the durabletask-dotnet repo, were updated to attach the above-mentioned additional information to the message events.
- `TaskEntityDispatcher`, which is where all entity requests end up (orchestrations calling/signaling entities, clients signaling entities, entities signaling other entities, and entities starting orchestrations) and where the entities are actually invoked to fulfill the requests, was updated to instantiate the corresponding traces. One exception is clients signaling entities via gRPC (i.e., when the `DurableEntityClient` is a `GrpcDurableEntityClient`), which is handled in the WebJobs repo, where the call ultimately reaches the `LocalGrpcListener`. The PR for this repo is linked below.
  - `TaskEntityDispatcher.StartTraceActivityForSignalingEntity` is used to create the `Activity` in the case of a client signaling an entity (via `ShimDurableEntityClient` in the dotnet repo, since the gRPC client call is handled by WebJobs) or in the case of an orchestration signaling an entity. In the former case, `ShimDurableEntityClient` has access to the correct parent trace context via `Activity.Current.Context`, so it attaches this context to the request message itself. `StartTraceActivityForSignalingEntity` then parses this context and uses it as the parent of the `Activity` for signaling the entity. For an orchestration signaling an entity, the dotnet repo does not have access to the orchestration trace context, and neither does `TaskEntityDispatcher`. In this case, the parent trace context is attached in `TaskOrchestrationDispatcher.ProcessSendEvent`, where `Activity.Current.Context` holds the orchestration context. This method only has access to the associated `EventRaisedEvent`, so that is what it attaches the parent trace context to, and that is what is eventually parsed and used by `StartTraceActivityForSignalingEntity`. Finally, in the case of an orchestration calling an entity, the `Activity` is only created at the very end, once the call has completed. The code at that point only has access to the `RequestMessage`, so `StartTraceActivityForSignalingEntity` attaches the parent trace context from the `EventRaisedEvent` to the `RequestMessage` so that it can eventually be used when making the `Activity` for the call.

The various other PRs related to this effort are
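To illustrate the parent-context hand-off described in the bullets above, here is a minimal, standalone sketch using only `System.Diagnostics`. It is not the code in this PR: the source name `Sketch.EntityTracing`, the activity names, and the plain `traceParent` string standing in for the field carried on the message are all assumptions made for illustration. The idea is that the sender serializes the ambient context while its `Activity` is current, and the receiver later parses it and passes it as an explicit parent because `Activity.Current` is unrelated at that point.

```csharp
#nullable enable
using System;
using System.Diagnostics;

// Hypothetical source name, used only in this sketch.
const string SourceName = "Sketch.EntityTracing";
Activity.DefaultIdFormat = ActivityIdFormat.W3C;
var source = new ActivitySource(SourceName);

// Without a listener that samples this source, StartActivity returns null.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == SourceName,
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllDataAndRecorded,
};
ActivitySource.AddActivityListener(listener);

// Sender side (standing in for the orchestration or client): while its
// Activity is current, capture the context as a W3C traceparent string.
// In the PR this would travel on the message; here it is a local variable.
string? traceParent;
ActivityTraceId senderTraceId;
ActivitySpanId senderSpanId;
using (Activity? sender = source.StartActivity("orchestration", ActivityKind.Server))
{
    traceParent = Activity.Current?.Id;
    senderTraceId = Activity.Current?.TraceId ?? default;
    senderSpanId = Activity.Current?.SpanId ?? default;
}

// Receiver side (standing in for the dispatcher): the sender's Activity is
// gone, so parse the stored context and use it as the explicit parent.
Activity? signal = null;
if (ActivityContext.TryParse(traceParent, null, out ActivityContext parent))
{
    signal = source.StartActivity("signal_entity", ActivityKind.Producer, parent);
}
Console.WriteLine($"trace={signal?.TraceId} parentSpan={signal?.ParentSpanId}");
signal?.Dispose();
```

Passing the parsed `ActivityContext` explicitly keeps the signal span in the sender's trace even though the two sides never share an ambient `Activity`.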
It is worth noting that the `Activities` for signaling an entity in the isolated case will have longer durations than in the in-process case. In the in-process case, the `Activity` for a signal to an entity is created upon the request and almost immediately disposed. In the isolated case, we cannot immediately dispose the `Activity` upon the request, since this would require creating the `Activity` in the dotnet repo where the request is generated. Instead, we create the `Activity` once the signal request reaches DurableTask.Core and is actually processed by the `TaskEntityDispatcher`, and pass the request time as the start time of the `Activity`. Its end time will therefore be much further from its start time (the request time) than in the in-process case. This is not an issue for calls to entities, since these are only ended once the operation completes (in the isolated case, once we send the result back to the orchestration instance, and in the in-process case, once the entity invocation completes).

This is also true in the case of a client creating an orchestration using the `ShimDurableTaskClient`: we only create the `Activity` for the orchestration once the request reaches DurableTask.Core and is processed by the `TaskOrchestrationDispatcher`. Therefore the duration of the create orchestration `Activity` will be much longer than in all other cases, where the `Activity` is started upon the request and almost immediately ended afterwards.

An example trace generated by this simple orchestration


looks as follows
Each signal request has type `ActivityKind.Producer` and each call request has type `ActivityKind.Client` (an entity starting an orchestration is also of type `ActivityKind.Producer`). When an entity actually processes the request, for a signal the span has type `ActivityKind.Producer` and for a call the span has type `ActivityKind.Server`. Note that the call to `add_to_other_entity_step_1` starts a cascade of entities signaling other entities until eventually the last call is simply an `add` to the third entity.

If instead of starting the orchestration via an HTTP request we signal an entity to start the orchestration, the trace would look like this

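The timing behavior described earlier (a signal `Activity` that is created late but starts at the request time, versus a call `Activity` that ends when the operation completes) can be sketched with the `startTime` parameter of `ActivitySource.StartActivity` and with `Activity.SetEndTime`. This is a rough illustration under stated assumptions, not the PR's implementation: the source name `Sketch.EntityTiming`, the activity names, the two-second delay, and the 500 ms completion time are all invented for the example.

```csharp
#nullable enable
using System;
using System.Diagnostics;

// Hypothetical source name, used only in this sketch.
const string SourceName = "Sketch.EntityTiming";
var source = new ActivitySource(SourceName);

// Without a listener that samples this source, StartActivity returns null.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == SourceName,
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllDataAndRecorded,
};
ActivitySource.AddActivityListener(listener);

// Assume the request was issued two seconds before it is processed here.
DateTimeOffset requestTime = DateTimeOffset.UtcNow.AddSeconds(-2);

// Signal: created only at processing time, backdated to the request time,
// and ended right away, so its duration includes the delivery delay.
Activity? signal = source.StartActivity(
    "signal_entity",
    ActivityKind.Producer,
    parentContext: default,
    startTime: requestTime);
signal?.Dispose(); // the end time defaults to "now"

// Call: created after completion, with start and end both taken from
// recorded times (an assumed 500 ms after the request in this sketch).
DateTimeOffset completionTime = requestTime.AddMilliseconds(500);
Activity? call = source.StartActivity(
    "call_entity",
    ActivityKind.Client,
    parentContext: default,
    startTime: requestTime);
call?.SetEndTime(completionTime.UtcDateTime); // SetEndTime expects a UTC DateTime
call?.Dispose(); // Dispose keeps an end time that was already set

Console.WriteLine($"signal duration: {signal?.Duration}");
Console.WriteLine($"call duration:   {call?.Duration}");
```

With this pattern the signal span's duration reflects how long the request waited before being processed, while the call span's duration reflects only the recorded request and completion times.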