Open
Labels
backend, help wanted, improvement
Search before asking
- I have searched the issues and found no similar feature request.
Description
The current rerun mechanism for workflow instances ignores the pre-configured workerGroup parameter, so tasks are assigned to any idle worker instead of to the specified worker group. This breaks resource isolation and scheduling rules and makes it impossible to control which nodes execute tasks during a rerun.
Issue Description
When a workflow instance is rerun, the system does not honor the workerGroup specified in the startup parameters and instead assigns the task to any idle worker node. This violates the expected resource isolation and scheduling rules, and the task execution environment is no longer guaranteed to be consistent between the first run and the rerun.
What version of DolphinScheduler are you using?
Version: 3.3.2
What Operating System are you using?
OS: Debian 12
What happened?
- Create a workflow and set a specific workerGroup (e.g., "w1") in the startup parameters when running the workflow for the first time;
- The first run correctly executes on the nodes in the specified workerGroup;
- When re-running the failed/finished workflow instance (via the "Rerun" button), the system ignores the workerGroup parameter;
- The rerun task is assigned to any idle worker node, not the specified workerGroup.
What you expected to happen?
- When re-running a workflow instance, the system should inherit and use the workerGroup parameter specified in the original startup parameters;
- The rerun task must be executed only on the nodes in the specified workerGroup, consistent with the first run;
- If the specified workerGroup has no idle nodes, the task should wait in the queue instead of being randomly assigned to other worker groups.
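The expected behavior can be illustrated with a minimal sketch. It is not DolphinScheduler's actual code: `Worker`, `RerunCommand`, `build_rerun_command`, and `pick_worker` are hypothetical names made up for this illustration, and the real dispatch logic may be structured quite differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Worker:
    host: str
    group: str
    idle: bool


@dataclass
class RerunCommand:
    # Hypothetical structure: the rerun keeps the workerGroup of the original run.
    instance_id: int
    worker_group: str


def build_rerun_command(instance_id: int, original_params: Dict[str, str]) -> RerunCommand:
    """Copy workerGroup from the original startup parameters instead of dropping it."""
    return RerunCommand(instance_id, original_params["workerGroup"])


def pick_worker(command: RerunCommand, workers: List[Worker]) -> Optional[Worker]:
    """Return an idle worker from the requested group, or None so the task stays queued."""
    for worker in workers:
        if worker.idle and worker.group == command.worker_group:
            return worker
    # No idle node in the requested group: never fall back to another group.
    return None
```

With both group A nodes busy and a group B node idle, `pick_worker` returns `None`, which models the task waiting in the queue rather than being moved to group B.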
How to reproduce it (as minimally and clearly as possible)?
- Prepare a DolphinScheduler cluster with at least two independent worker groups (e.g., group A: node1/node2, group B: node3/node4);
- Create a simple test workflow (e.g., a shell task that prints the worker node name);
- Submit the workflow instance with startup parameter workerGroup=group A;
- Confirm the first run executes on node1/node2 (group A) by checking the task log;
- After the instance finishes/fails, click the "Rerun" button to re-execute the instance (without modifying any parameters);
- Check the task execution node: the rerun task runs on node3/node4 (group B) instead of group A;
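The check in the last step can be written down as a small helper, assuming the host names of the first run and the rerun have already been read from the task logs by hand. The group layout mirrors the example cluster above; `WORKER_GROUPS` and `check_rerun_placement` are hypothetical names for this sketch, not part of DolphinScheduler.

```python
from typing import Dict, List, Set

# Example cluster layout from the reproduction steps (an assumption of this sketch).
WORKER_GROUPS: Dict[str, Set[str]] = {
    "group A": {"node1", "node2"},
    "group B": {"node3", "node4"},
}


def check_rerun_placement(first_host: str, rerun_host: str, expected_group: str,
                          groups: Dict[str, Set[str]] = WORKER_GROUPS) -> List[str]:
    """Return a list of problems; an empty list means both runs stayed in the group."""
    allowed = groups.get(expected_group, set())
    problems = []
    if first_host not in allowed:
        problems.append(f"first run on {first_host}, outside {expected_group}")
    if rerun_host not in allowed:
        problems.append(f"rerun on {rerun_host}, outside {expected_group}")
    return problems
```

For the behavior reported here, `check_rerun_placement("node1", "node3", "group A")` returns one problem for the rerun, while a correct rerun on node2 returns an empty list.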
Are you willing to submit a PR?
- Yes, I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct