-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Add batch workflow refresh tasks #8793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bfd3e1b to
32e6d5e
Compare
32e6d5e to
94f9a55
Compare
|
Haven't review the actual changes yet but for the tdbg experience, let’s match what we have in the temporal CLI. so something like `tdbg workflow refresh-tasks -q '' means to start a batch operation. and if -q is not specified and --wid is specified, then it's not a batch op, but only operating on that single workflow. |
proto/internal/temporal/server/api/adminservice/v1/request_response.proto
Outdated
Show resolved
Hide resolved
proto/internal/temporal/server/api/adminservice/v1/request_response.proto
Outdated
Show resolved
Hide resolved
yycptt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @prathyushpv Can you help review this PR as well?
Do we care about setting Memos for the Admin Batch Operation, eg. |
service/worker/batcher/activities.go
Outdated
| case *adminservice.StartAdminBatchOperationRequest_RefreshWorkflowTasksOperation: | ||
| return processTask(ctx, limiter, task, | ||
| func(execution *commonpb.WorkflowExecution) error { | ||
| _, err := a.AdminClient.RefreshWorkflowTasks(ctx, &adminservice.RefreshWorkflowTasksRequest{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to be clear, we're okay with any user being able to do this at any time?
refresh tasks is safe, but my concern is that someone is going to add a more dangerous admin operation here and not realize that it's exposed to all users in the namespace.
I would say we need a big warning comment somewhere but I'm not sure where it should go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add a big warning to the proto definition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will need a general way for preventing user from starting this workflow directly, even for normal operations like signal/terminate etc, which allows them to bypass, for example, the concurrent limit.
a7bd46f to
54bd801
Compare
54bd801 to
30b305c
Compare
proto/internal/temporal/server/api/adminservice/v1/request_response.proto
Outdated
Show resolved
Hide resolved
I think it's nice to have them (reason and batch type) in the memo. |
|
|
||
| // StartAdminBatchOperationRequest starts an admin batch operation. | ||
| // WARNING: Batch Operations are exposed to all users of the namespace. Admin Batch Operations should be exercised with caution. | ||
| message StartAdminBatchOperationRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if a field to specify batch operation RPS would be helpful in input. We have max_operations_per_second in StartBatchOperationRequest. We can use this field to control the rate. It may not be strictly necessary since we are marking this batch operation with the lowest priority, and it should not block other operations.
cc: @yycptt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this max_operations_per_second field isn't properly handled by the activity? It seems to always default to the Dynamic Config RPS value https://github.com/temporalio/temporal/blob/admin-batch-refresh-workflows/service/worker/batcher/fx.go#L89.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah looks like it's not used at all right now and we just rely on the dc value to protect us.
|
|
||
| _, err = adminClient.StartAdminBatchOperation(ctx, &adminservice.StartAdminBatchOperationRequest{ | ||
| Namespace: nsName, | ||
| VisibilityQuery: query, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not related to this PR.
Does visibility query parser support BusinessID today? I mean we added support for each component to define it's own alias for the underlying WorkflowID column, but we will also need a default alias I think. Very low priority though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least right now, WorkflowID should be able to be used as a default system search attribute to retrieve the BusinessID, we never made the change to disallow querying against WorkflowID yet, since that might impact scheduler query backwards compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I didn't mean to disallow WorkflowID. just we need to allow BusinessID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha, I just recall a discussion with Roey/Rodrigo about disallowing before, but maybe we just ignore that.
acea49c to
a286808
Compare
|
|
||
| _, err = adminClient.StartAdminBatchOperation(ctx, &adminservice.StartAdminBatchOperationRequest{ | ||
| Namespace: nsName, | ||
| VisibilityQuery: query, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I didn't mean to disallow WorkflowID. just we need to allow BusinessID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consider adding tests for 1. admin batch and normal batch don't share the same limit and 2. ListBatchOperations doesn't show admin batch op.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added tests to validate separate namespace divisions.
3a579fa to
cb3823c
Compare
What changed?
Add batch workflow refresh tasks Admin API.
Add tdbg commands to invoke
StartAdminBatchOperation.Details
AdminBatchOperations are executed in the Batched System Worker, similar to existing Frontend BatchOperations. The worker activity will fetch pages of workflow executions to perform operations on, and periodically heartbeat its results. For RefreshWorkflowTasks, each individual Refresh call will go through the admin handler.
Refreshing Workflow Tasks regenerates all pending tasks of an execution given its mutable state.
CLI usage
Why?
Unblock new Matcher migration and allow for general use case batch Admin calls. Currently, only supports
BatchOperationRefreshWorkflowTasks.How did you test it?