Question: How best to handle "dead-lettering" failed orchestrations? #1766
-
Is your feature request related to a problem? Please describe.
Describe the solution you'd like
Describe alternatives you've considered
Ideas we're looking at:
We shall spike the above techniques but appreciate any feedback in the meantime.
Additional context
Replies: 5 comments 3 replies
-
I too have a general failure strategy that relies on dead letter alerts. However, I mimic the same strategy in Durable Functions by subscribing to the Failed Orchestration lifecycle events, and then publishing a custom Event to Application Insights. I then have an Azure Monitor alert checking for the presence of these events, which then triggers alerts in the same manner as a Service Bus DL. If a failure does occur and you want to try again, the recently introduced orchestration restart API (#1545) can be used to restart the orchestration from the start as though it had never been run before. That is exactly what you would be doing if you re-queued the message intent back into Service Bus after a DL or Failed Orchestration. I think the failure alerting in Durable Functions could be better integrated into Azure Monitor as a first-class feature, IMO. But that aside, the above combination should work without having to keep messages locked and renewed in Service Bus while waiting for an orchestration to complete, or manually enqueuing failed messages back to Service Bus.
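For anyone spiking this, here is a rough Python sketch of the filtering step described above: pick out the Failed lifecycle events and turn each into a custom-event payload that an alert rule could match on. It assumes the events arrive as dictionaries with a `data` object carrying `instanceId`, `functionName`, `runtimeStatus` and `reason`. The event name `OrchestrationFailed` and the function name are made up for illustration, and actually sending the payload to Application Insights is left out.

```python
from typing import Any, Dict, Iterable, List


def failed_orchestration_events(events: Iterable[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Return one custom-event payload per lifecycle event whose status is Failed."""
    payloads: List[Dict[str, str]] = []
    for event in events:
        data = event.get("data") or {}
        # Ignore Running/Completed/Terminated notifications; only failures alert.
        if data.get("runtimeStatus") != "Failed":
            continue
        payloads.append({
            "name": "OrchestrationFailed",  # hypothetical custom event name
            "instanceId": str(data.get("instanceId", "")),
            "functionName": str(data.get("functionName", "")),
            "reason": str(data.get("reason", "")),
        })
    return payloads


# Illustrative input only; the field values are invented.
sample_events = [
    {"data": {"instanceId": "a1", "functionName": "ProcessOrder",
              "runtimeStatus": "Completed", "reason": ""}},
    {"data": {"instanceId": "b2", "functionName": "ProcessOrder",
              "runtimeStatus": "Failed", "reason": "Unhandled exception"}},
]
alerts = failed_orchestration_events(sample_events)
print(alerts)
```

Keeping the instance ID in the payload is what makes a later restart possible without holding on to the original Service Bus message.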
-
I think that @olitomlinson hits the nail on the head here. In general, the restart API should remove much of the logistical pain of restarting an orchestration, as you no longer need to keep your original message. At that point, you just need some way of knowing when your orchestration fails. We currently provide two main ways to identify failed orchestrations:
There are also some very helpful open source tools written by third parties, like Durable Functions Monitor. @olitomlinson, I remember us having a discussion regarding better Azure Monitor integration a long time ago, but I can't seem to find what issue we had that discussion under. I think having an issue for that enhancement would be super helpful, as generally we decide on what new features to work on based on user engagement with the top level tickets.
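As a rough illustration of the "know it failed, then restart" flow discussed in this thread, the Python sketch below takes a batch of failure alerts and asks for each affected instance to be restarted once. Everything here is an assumption for illustration: the alert shape (a dictionary with an `instanceId` key), the function name, and the `restart` callable, which stands in for whatever client call you use to reach the restart API.

```python
from typing import Callable, Dict, Iterable, List


def restart_failed(alerts: Iterable[Dict[str, str]],
                   restart: Callable[[str], None]) -> List[str]:
    """Call restart() once per distinct failed instance and return the IDs handled."""
    restarted: List[str] = []
    seen = set()
    for alert in alerts:
        instance_id = alert.get("instanceId")
        # Skip malformed alerts and duplicate notifications for the same instance.
        if not instance_id or instance_id in seen:
            continue
        seen.add(instance_id)
        restart(instance_id)  # hypothetical hook wrapping the restart API
        restarted.append(instance_id)
    return restarted


# Stand-in for the real restart call so the sketch can run on its own.
calls: List[str] = []
restarted = restart_failed(
    [{"instanceId": "b2"}, {"instanceId": "b2"}, {"instanceId": "c3"}, {}],
    calls.append,
)
print(restarted)
```

Passing the restart call in as a function keeps the decision logic testable without a running task hub.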
-
Not quite the same, but this one comes to mind: #1527 (comment). I think we need an issue in general for first-class Azure Monitor integration with DF, and then we can build a list of the various telemetry series to emit.
-
This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.
-
Thank you for the suggestions @ConnorMcMahon @olitomlinson. We have taken your points on board and are spiking the suggestions above to explore the possible solutions.
@ma499
I too have a general failure strategy that relies on dead letter alerts. However, I mimic the same strategy in Durable Functions by subscribing to the Failed Orchestration lifecycle events, and then publishing a custom Event to Application Insights. I then have an Azure Monitor alert checking for the presence of these events, which then triggers alerts in the same manner as a Service Bus DL. If a failure does occur and you want to try again, the recently introduced orchestration restart API (#1545) can be used to restart the orchestration from the start as though it had never been run before. That is exactly what you would be doing if you re-queued the message intent back into S…