Enable Dynamic Resource Allocation (DRA) #144
Hi @poussa, great PR 🎉🎉🎉 Would it be better to add examples/values-dra.yaml to https://github.com/llm-d-incubation/llm-d-modelservice/blob/main/hack/generate-example-output.sh so that its example output is generated as well?
| class: gpu.nvidia.com | ||
| match: "exactly" | ||
| count: 1 | ||
| selectors: {} |
Is it necessary to enumerate all types? Could we just write them in the comments? :-)
This is a bit tricky. The default values need to be defined somewhere. I was hoping the JSON schema would be the place, but it is not: the schema default values are not propagated to Helm templates. So the default values (e.g. match: "exactly") need to be defined somewhere, either in values.yaml or in the template code in _dra.tpl.
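For illustration, the second option (defaults in template code) could look roughly like the sketch below. It assumes a hypothetical named template, example.draRequestDefaults, that receives one claimTemplates entry as its context; the template name and the output layout are assumptions, not the PR's actual code. The only function relied on is Helm's built-in default.

```yaml
{{/* _dra.tpl (hypothetical excerpt): supply defaults that the JSON schema cannot */}}
{{- define "example.draRequestDefaults" -}}
class: {{ .class }}
# Falls back to "exactly" when the entry omits match
match: {{ .match | default "exactly" | quote }}
# Falls back to 1 when count is unset (note: default also treats 0 as unset)
count: {{ .count | default 1 }}
{{- end }}
```

One trade-off to keep in mind: defaults in values.yaml are visible to chart users, but Helm replaces lists wholesale on override, so per-entry defaults inside claimTemplates would likely still need a template-side fallback like this one.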
Good idea, will do.
DRA is similar to k8s extended resources (device plugins) but has more capabilities. Signed-off-by: Sakari Poussa <[email protected]>
My apologies for my lack of knowledge in this field. The specs look right to me. However, there is an .extraObjects field at the top level for creating custom resources.
examples/values-dra.yaml
Outdated
| dra: | ||
| enabled: true | ||
| type: "intel-gaudi3-x2" | ||
| claimTemplates: | ||
| - name: intel-gaudi3-x2 | ||
| class: gaudi.intel.com | ||
| match: "exactly" | ||
| count: 2 | ||
| selectors: | ||
| - cel: | ||
| expression: device.attributes["gaudi.intel.com"].model == 'Gaudi3' |
Apologies in advance for my lack of knowledge of DRA. The specs look right to me. However, there is an .extraObjects field at the top level for creating custom resources. It looks like there isn't much abstraction going on behind the scenes beyond copying over the DRA definition from values.yaml. Could you clarify your use case?
I'm just wondering which is easier to maintain here: examples with DRA that use .extraObjects, or adding a .dra field. WDYT?
I would prefer a dra object, since we already have an accelerator object for extended resources. With a dra object we get JSON schema validation, which is not possible with extraObjects. Schema validation becomes important once we add more DRA features, since the dra object may become quite complex.
LGTM
examples/values-dra.yaml
Outdated
| modelArtifacts: | ||
| name: meta-llama/Llama-3.3-70B-Instruc | ||
| uri: "pvc+hf://model-pvc/meta-llama/Llama-3.3-70B-Instruc" |
| uri: "pvc+hf://model-pvc/meta-llama/Llama-3.3-70B-Instruc" | |
| uri: "pvc+hf://model-pvc/meta-llama/Llama-3.3-70B-Instruct" |
@jgchn thanks for the approval, but please do not merge yet. I am still testing and fixing the PR.
Signed-off-by: Sakari Poussa <[email protected]>
@poussa it looks like CI is failing. I think the following should fix the lint and pre-commit: and it looks like there is a broken link somewhere.
Implements: #132 (Option 2)
DRA is similar to k8s extended resources (device plugins) but has more capabilities. In this first DRA PR the following capabilities are introduced:
- dra block in values.yaml. If dra.enabled=true this block is used instead of the accelerator block.
- resources.claims and resourcesClaims blocks
- kind: ResourceClaimTemplate with deviceClassName, count, and selector fields.

More DRA capabilities will be added once the direction is set (i.e., is this the right way to enable DRA?).
Thoughts?
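To make the three pieces above concrete, here is a rough sketch of the objects the chart might render for the Gaudi example values (two Gaudi3 devices from the gaudi.intel.com class). The apiVersion, the request name, the Pod and container names, the image, and the mapping of match: "exactly" to allocationMode: ExactCount are all assumptions made for illustration; the PR's actual templates may render something different.

```yaml
# Sketch only: possible rendered output, not taken from the PR.
# The apiVersion is an assumption; check which resource.k8s.io version the cluster serves.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: intel-gaudi3-x2
spec:
  spec:
    devices:
      requests:
        - name: accelerator            # hypothetical request name
          deviceClassName: gaudi.intel.com
          allocationMode: ExactCount   # assumed mapping of match: "exactly"
          count: 2
          selectors:
            - cel:
                expression: device.attributes["gaudi.intel.com"].model == 'Gaudi3'
---
apiVersion: v1
kind: Pod
metadata:
  name: dra-example                    # hypothetical
spec:
  # Pod-level claim that instantiates the template above
  resourceClaims:
    - name: accelerator
      resourceClaimTemplateName: intel-gaudi3-x2
  containers:
    - name: vllm                       # hypothetical container name
      image: example.invalid/vllm:latest   # placeholder image
      resources:
        # Container-level reference to the Pod-level claim by name
        claims:
          - name: accelerator
```

The Pod-level resourceClaims entry and the container-level resources.claims entry are tied together by the shared claim name. Newer resource.k8s.io API versions may nest the request fields under an exactly key, so the layout would need adjusting for those.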