Description
Motivation: Why do you think this is important?
Today, as a Flyte user, I have the following options to set labels/annotations on the pods/CRD objects of K8s tasks in a Flyte workflow execution:
- Set via pod template:
  def task( pod_template=PodTemplate(annotations=..., labels=...) )
  This sets labels/annotations on the pods of individual tasks.
  For distributed tasks (like pytorch, ray, ...) this sets the metadata not on the CRD object but on its pod template spec.
- Set via pyflyte run --labels ... --annotations ...
  This applies the metadata to all K8s objects in a Flyte workflow execution, including task pods and task CRD objects. However, this mechanism doesn't work at the individual task level.
As a Flyte user, I would like to be able to set labels/annotations on individual K8s task CRD objects such as PyTorch jobs, Ray jobs, ... (the same way I already can today for pods via the pod template):
@task(
    task_config=PyTorch(
        num_workers=...,
        ...
        # Proposed addition:
        metadata=ObjectMeta(
            annotations={"kueue.x-k8s.io/queue-name": "queue-name"},
            labels={...}
        )
    )
)

I propose to use the same syntax/flyteidl type for all K8s (non-pod) plugins like Elastic, TfJob, MpiJob, RayJobConfig, ...
In my concrete case, I would like to have this feature in order to leverage Kueue to gang schedule worker pods for distributed pytorch training tasks (e.g. as documented here).
This requires setting a queue name annotation on the underlying PytorchJob CRD object.
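To make the intended effect concrete, here is a minimal sketch of what applying task-level metadata to a CRD manifest could look like. It is not Flyte's implementation: the `ObjectMeta` dataclass and the `apply_object_meta` helper are hypothetical names used only for illustration, and the manifest is a simplified PyTorchJob written as a plain dictionary.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ObjectMeta:
    """Hypothetical container for task-level CRD metadata (not an existing flytekit type)."""

    annotations: Dict[str, str] = field(default_factory=dict)
    labels: Dict[str, str] = field(default_factory=dict)


def apply_object_meta(manifest: Dict[str, Any], meta: ObjectMeta) -> Dict[str, Any]:
    """Return a copy of the manifest with meta merged into its top-level metadata.

    Only the CRD object's own metadata is touched; the pod template spec
    nested under `spec` is left unchanged.
    """
    result = deepcopy(manifest)
    metadata = result.setdefault("metadata", {})
    for key, values in (("annotations", meta.annotations), ("labels", meta.labels)):
        if values:
            merged = dict(metadata.get(key) or {})
            merged.update(values)  # task-level values win on key conflicts
            metadata[key] = merged
    return result


# Simplified PyTorchJob manifest, for illustration only.
pytorch_job = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "PyTorchJob",
    "metadata": {"name": "example-job", "labels": {"existing": "label"}},
    "spec": {
        "pytorchReplicaSpecs": {
            "Worker": {"replicas": 2, "template": {"metadata": {}, "spec": {}}}
        }
    },
}

meta = ObjectMeta(annotations={"kueue.x-k8s.io/queue-name": "queue-name"})
patched = apply_object_meta(pytorch_job, meta)
```

In this sketch the queue name ends up on the job object itself rather than on the worker pod template, which is the distinction the pod template option cannot express today.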
There have been previous asks from the community to enable such a feature/integration:
- Attempts to integrate Yunikorn and Kueue more deeply into flytepropeller, which weren't accepted.
  In contrast, the feature I propose lets users choose to use Kueue without being a Flyte-Kueue integration. It is a general feature that could serve any other integration that uses annotations/labels to select workloads.
- There have been discussions in Slack about using Kueue, suggesting e.g. pyflyte run --labels/--annotations to set the required metadata. However, this is not good enough because it applies the metadata to all nodes in the graph, while you might want queueing/gang scheduling only for a subset.
Describe alternatives you've considered
Add task kwargs for labels and annotations:
@task(
    # If we added these args ...
    labels={...},
    annotations={...},
    # ... for simple python function tasks this would conflict with this existing arg:
    pod_template=PodTemplate(annotations=...)
)

Are you sure this issue hasn't been raised already?
- Yes
Have you read the Code of Conduct?
- Yes