feat: add dynamoModel CRD #4166
Signed-off-by: Julien Mancuso <[email protected]>
Walkthrough
Introduces a new DynamoModel Kubernetes custom resource definition with associated API types, controller, and infrastructure for managing model endpoint discovery and LoRA loading. Adds ModelRef fields to existing CRDs for model association. Includes endpoint discovery utilities, a bounded-concurrency LoRA client, headless service generation, and updates to existing controllers and RBAC rules.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 7
🧹 Nitpick comments (5)
deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponentdeployments.yaml (1)
10005-10018: Tighten modelRef schema (non-empty name, disallow unknown keys).
Looks good overall. To prevent empty strings and catch typos, add minimal validations.
Apply within this block:
     modelRef:
       description: |-
         ModelRef references a model that this component serves
         When specified, a headless service will be created for endpoint discovery
    -  properties:
    +  additionalProperties: false
    +  properties:
         name:
           description: Name is the base model identifier (e.g., "llama-3-70b-instruct-v1")
           type: string
    +      minLength: 1
         revision:
           description: Revision is the model revision/version (optional)
           type: string
    +      minLength: 1
       required:
       - name
       type: object
Optional (nice UX): add an additionalPrinterColumn to surface the model at `kubectl get` time:
    @@
     - additionalPrinterColumns:
       - description: Dynamo component
         jsonPath: .spec.dynamoComponent
         name: DynamoComponent
         type: string
    +  - description: Model
    +    jsonPath: .spec.modelRef.name
    +    name: Model
    +    type: string
Please confirm:
- types.go defines `ModelRef` with `json:"modelRef,omitempty"` and string fields, and the controller tolerates a missing/empty `revision`.
- No model names require characters beyond the basic DNS-1123 label charset; if they do, keep minLength but skip adding a strict pattern.
Based on learnings.
deploy/cloud/operator/internal/modelendpoint/lora.go (1)
66-66: Standardize logging levels for success cases.
Success logging is inconsistent: `loadLoRA` uses `Info` level (line 66) while `unloadLoRA` uses `V(1)` level (line 96). For consistency and operational visibility, both should log at the same level.
Consider standardizing to `Info` level for both operations:
    - logs.V(1).Info("Successfully unloaded LoRA", "address", address, "modelName", modelName)
    + logs.Info("Successfully unloaded LoRA", "address", address, "modelName", modelName)
Also applies to: 96-96
deploy/cloud/operator/config/crd/bases/nvidia.com_dynamomodels.yaml (1)
85-87: Consider adding validation to enforce loraPath usage.
The `loraPath` field is described as "only applicable for lora model type" but there's no schema-level validation to enforce this constraint. Users could accidentally set `loraPath` on base or adapter models.
Add CEL validation to ensure `loraPath` is only set when `modelType` is `lora`:
       loraPath:
         description: LoraPath is the path to the LoRA adapter (only applicable for lora model type)
         type: string
       modelName:
         description: ModelName is the full model identifier (e.g., "meta-llama/Llama-3.3-70B-Instruct-lora")
         type: string
       modelType:
         default: base
         description: ModelType specifies the type of model (e.g., "base", "lora", "adapter")
         enum:
         - base
         - lora
         - adapter
         type: string
     required:
     - baseModelName
     - modelName
     type: object
    +x-kubernetes-validations:
    +  - rule: "self.modelType != 'lora' || has(self.loraPath)"
    +    message: "loraPath is required when modelType is 'lora'"
    +  - rule: "self.modelType == 'lora' || !has(self.loraPath)"
    +    message: "loraPath should only be set when modelType is 'lora'"
Also applies to: 91-98
deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml (2)
10005-10018: Validate and sanitize modelRef for Service/labels; document empty revision semantics
Good addition. Please confirm:
- Reconcile sanitizes modelRef.name (and revision if used) into valid DNS-1123 Service names (lowercase, [a-z0-9-], <=63), with truncation+hash to avoid collisions when names exceed 63 or contain dots/uppercases.
- If modelRef is used in labels/selector values, ensure label constraints (<=63; allowed charset) or apply normalization similarly.
- Clarify behavior when revision is empty (e.g., treated as “latest”, or excluded from identity). Add this to the Go type docstring so controller-gen propagates it.
Optionally, enforce constraints at the API by adding kubebuilder validation on the Go types (e.g., Patterns and MaxLength for name/revision) instead of manual YAML edits.
Based on learnings.
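The normalization asked about above (lowercase, `[a-z0-9-]`, at most 63 characters, truncation plus hash) could look roughly like the sketch below. It is only an illustration under stated assumptions: `sanitizeName`, the 8-character SHA-256 suffix, and the `m-` fallback prefix are hypothetical choices and are not taken from the operator's code.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// maxLabelLen is the DNS-1123 label length limit.
const maxLabelLen = 63

// sanitizeName is a hypothetical helper (not the operator's implementation).
// It maps an arbitrary model name to a DNS-1123 label. Names that are already
// valid are returned unchanged; any name that had to be altered or truncated
// gets a short hash of the original appended, so that two different inputs
// are unlikely to collapse into the same output.
func sanitizeName(name string) string {
	var b strings.Builder
	lastDash := false
	for _, r := range strings.ToLower(name) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
			lastDash = false
		} else if !lastDash {
			// Collapse every run of disallowed characters into one dash.
			b.WriteByte('-')
			lastDash = true
		}
	}
	clean := strings.Trim(b.String(), "-")

	if clean != "" && clean == name && len(clean) <= maxLabelLen {
		return clean
	}

	sum := sha256.Sum256([]byte(name))
	suffix := hex.EncodeToString(sum[:])[:8]
	if clean == "" {
		return "m-" + suffix
	}
	maxPrefix := maxLabelLen - len(suffix) - 1
	if len(clean) > maxPrefix {
		clean = strings.TrimRight(clean[:maxPrefix], "-")
	}
	return clean + "-" + suffix
}

func main() {
	for _, n := range []string{"llama-3-70b", "meta-llama/Llama-3.3-70B-Instruct"} {
		fmt.Printf("%q -> %q\n", n, sanitizeName(n))
	}
}
```

Appending the hash whenever the input was modified is a deliberate choice in this sketch: it keeps `Meta/Llama.3` from colliding with a model that is literally named `meta-llama-3`.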
10005-10018: Improve kubectl UX with printer columns
Consider adding print columns on the Go type for:
- Model (.spec.modelRef.name)
- Revision (.spec.modelRef.revision)
Use +kubebuilder:printcolumn annotations so controller-gen emits them here (don’t hand-edit this YAML). This makes kubectl get dcd more informative.
Based on learnings.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
- deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml (1 hunks)
- deploy/cloud/helm/crds/templates/nvidia.com_dynamographdeployments.yaml (1 hunks)
- deploy/cloud/helm/crds/templates/nvidia.com_dynamomodels.yaml (1 hunks)
- deploy/cloud/helm/platform/components/operator/templates/manager-rbac.yaml (4 hunks)
- deploy/cloud/operator/PROJECT (1 hunks)
- deploy/cloud/operator/api/v1alpha1/dynamo_model_types.go (1 hunks)
- deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (2 hunks)
- deploy/cloud/operator/api/v1alpha1/zz_generated.deepcopy.go (11 hunks)
- deploy/cloud/operator/cmd/main.go (2 hunks)
- deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponentdeployments.yaml (1 hunks)
- deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml (1 hunks)
- deploy/cloud/operator/config/crd/bases/nvidia.com_dynamomodels.yaml (1 hunks)
- deploy/cloud/operator/config/crd/kustomization.yaml (1 hunks)
- deploy/cloud/operator/config/rbac/role.yaml (4 hunks)
- deploy/cloud/operator/config/samples/kustomization.yaml (1 hunks)
- deploy/cloud/operator/internal/consts/consts.go (1 hunks)
- deploy/cloud/operator/internal/controller/dynamo_model_controller.go (1 hunks)
- deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (2 hunks)
- deploy/cloud/operator/internal/controller/dynamographdeployment_controller.go (1 hunks)
- deploy/cloud/operator/internal/dynamo/graph.go (1 hunks)
- deploy/cloud/operator/internal/dynamo/headless_service.go (1 hunks)
- deploy/cloud/operator/internal/modelendpoint/client.go (1 hunks)
- deploy/cloud/operator/internal/modelendpoint/discovery.go (1 hunks)
- deploy/cloud/operator/internal/modelendpoint/lora.go (1 hunks)
- deploy/cloud/operator/internal/modelendpoint/types.go (1 hunks)
- deploy/cloud/operator/internal/workerpool/pool.go (1 hunks)
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 1474
File: deploy/cloud/operator/internal/controller/dynamocomponent_controller.go:1308-1312
Timestamp: 2025-06-11T21:29:28.650Z
Learning: User julienmancuso expects replies in English; avoid switching languages unless explicitly requested.
📚 Learning: 2025-07-18T16:05:05.534Z
Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 2012
File: deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml:1178-1180
Timestamp: 2025-07-18T16:05:05.534Z
Learning: The stopSignal field under lifecycle in DynamoComponentDeployment CRDs is autogenerated due to Kubernetes library upgrades (k8s.io/api and k8s.io/apimachinery from v0.32.3 to v0.33.1), not a manual design decision by the user.
Applied to files:
- deploy/cloud/operator/PROJECT
- deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml
- deploy/cloud/operator/internal/consts/consts.go
- deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponentdeployments.yaml
- deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml
- deploy/cloud/helm/crds/templates/nvidia.com_dynamographdeployments.yaml
📚 Learning: 2025-07-18T16:04:31.771Z
Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 2012
File: deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml:92-98
Timestamp: 2025-07-18T16:04:31.771Z
Learning: CRD schemas in files like deploy/cloud/helm/crds/templates/*.yaml are auto-generated from Kubernetes library upgrades and should not be manually modified as changes would be overwritten during regeneration.
Applied to files:
- deploy/cloud/operator/PROJECT
- deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml
- deploy/cloud/helm/crds/templates/nvidia.com_dynamomodels.yaml
- deploy/cloud/operator/config/crd/kustomization.yaml
- deploy/cloud/helm/crds/templates/nvidia.com_dynamographdeployments.yaml
📚 Learning: 2025-09-04T19:03:06.643Z
Learnt from: biswapanda
Repo: ai-dynamo/dynamo PR: 2872
File: examples/multimodal/deploy/agg_qwen.yaml:53-60
Timestamp: 2025-09-04T19:03:06.643Z
Learning: In the dynamo repository, Kubernetes Custom Resources use `gpu: "1"` format for GPU resource limits and requests, not the standard Kubernetes `nvidia.com/gpu: 1` format. This applies to DynamoGraphDeployment resources and other dynamo CRs.
Applied to files:
- deploy/cloud/operator/PROJECT
- deploy/cloud/helm/crds/templates/nvidia.com_dynamomodels.yaml
- deploy/cloud/operator/internal/consts/consts.go
- deploy/cloud/operator/config/crd/bases/nvidia.com_dynamomodels.yaml
- deploy/cloud/operator/config/samples/kustomization.yaml
- deploy/cloud/operator/config/crd/kustomization.yaml
- deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml
- deploy/cloud/helm/crds/templates/nvidia.com_dynamographdeployments.yaml
📚 Learning: 2025-07-18T16:04:47.465Z
Learnt from: julienmancuso
Repo: ai-dynamo/dynamo PR: 2012
File: deploy/cloud/helm/crds/templates/nvidia.com_dynamographdeployments.yaml:1233-1235
Timestamp: 2025-07-18T16:04:47.465Z
Learning: The `stopSignal` field in Kubernetes CRDs like DynamoGraphDeployment and DynamoComponentDeployment is autogenerated by controller-gen when upgrading Kubernetes library versions, and represents expected upstream API changes rather than manual code that needs custom validation.
Applied to files:
- deploy/cloud/operator/PROJECT
- deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml
- deploy/cloud/operator/config/crd/bases/nvidia.com_dynamocomponentdeployments.yaml
- deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml
- deploy/cloud/helm/crds/templates/nvidia.com_dynamographdeployments.yaml
📚 Learning: 2025-10-24T04:21:08.751Z
Learnt from: biswapanda
Repo: ai-dynamo/dynamo PR: 3858
File: recipes/deepseek-r1/model-cache/model-download.yaml:18-32
Timestamp: 2025-10-24T04:21:08.751Z
Learning: In the recipes directory structure, model-specific recipes (e.g., recipes/deepseek-r1/, recipes/llama-3-70b/) contain hardcoded model names and revisions in their Kubernetes manifests (like model-download.yaml). Each recipe directory is deployment-specific and self-contained, so hardcoding model-specific values is the intended design pattern.
Applied to files:
deploy/cloud/operator/config/crd/kustomization.yaml
🧬 Code graph analysis (12)
deploy/cloud/operator/internal/modelendpoint/discovery.go (2)
- deploy/cloud/operator/internal/modelendpoint/types.go (1): Candidate(21-24)
- deploy/cloud/operator/api/v1alpha1/dynamo_model_types.go (1): DynamoModelList(110-114)
deploy/cloud/operator/cmd/main.go (2)
- deploy/cloud/operator/internal/controller/dynamo_model_controller.go (1): DynamoModelReconciler(63-67)
- deploy/cloud/operator/internal/modelendpoint/client.go (2): Client(43-45), NewClient(48-54)
deploy/cloud/operator/internal/controller/dynamographdeployment_controller.go (1)
- deploy/cloud/operator/internal/dynamo/headless_service.go (1): ReconcileModelServicesForComponents(37-97)
deploy/cloud/operator/internal/workerpool/pool.go (1)
- deploy/cloud/operator/api/dynamo/schemas/schemas.go (1): Duration(38-38)
deploy/cloud/operator/api/v1alpha1/dynamo_model_types.go (1)
- deploy/cloud/operator/api/v1alpha1/groupversion_info.go (1): SchemeBuilder(35-35)
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (1)
- deploy/cloud/operator/internal/dynamo/headless_service.go (2): ReconcileModelServicesForComponents(37-97), AddBaseModelLabel(143-147)
deploy/cloud/operator/api/v1alpha1/zz_generated.deepcopy.go (2)
- deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (1): ModelReference(279-287)
- deploy/cloud/operator/api/v1alpha1/dynamo_model_types.go (6): DynamoModel(99-105), DynamoModelList(110-114), DynamoModelSpec(25-44), ModelSource(47-54), DynamoModelStatus(72-86), EndpointInfo(57-69)
deploy/cloud/operator/internal/controller/dynamo_model_controller.go (5)
- deploy/cloud/operator/internal/modelendpoint/client.go (2): Client(43-45), NewClient(48-54)
- deploy/cloud/operator/api/v1alpha1/dynamo_model_types.go (2): DynamoModel(99-105), EndpointInfo(57-69)
- deploy/cloud/operator/internal/modelendpoint/discovery.go (2): FindModelsForBaseModel(77-112), ExtractCandidates(35-73)
- deploy/cloud/operator/internal/modelendpoint/types.go (1): Candidate(21-24)
- deploy/cloud/operator/internal/consts/consts.go (1): DynamoSystemPort(22-22)
deploy/cloud/operator/internal/dynamo/headless_service.go (3)
- deploy/cloud/operator/internal/controller_common/resource.go (2): Reconciler(49-52), SyncResource(60-195)
- deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (2): DynamoComponentDeploymentSharedSpec(48-111), ModelReference(279-287)
- deploy/cloud/operator/internal/consts/consts.go (3): KubeLabelDynamoBaseModel(41-41), DynamoSystemPortName(23-23), DynamoSystemPort(22-22)
deploy/cloud/operator/internal/dynamo/graph.go (1)
- deploy/cloud/operator/internal/dynamo/headless_service.go (1): AddBaseModelLabel(143-147)
deploy/cloud/operator/internal/modelendpoint/client.go (3)
- deploy/cloud/operator/internal/modelendpoint/types.go (1): Candidate(21-24)
- deploy/cloud/operator/api/v1alpha1/dynamo_model_types.go (2): DynamoModel(99-105), EndpointInfo(57-69)
- deploy/cloud/operator/internal/workerpool/pool.go (2): Task(28-31), Execute(43-102)
deploy/cloud/operator/internal/modelendpoint/lora.go (1)
- deploy/cloud/operator/internal/modelendpoint/client.go (1): Client(43-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: sglang (amd64)
- GitHub Check: trtllm (arm64)
- GitHub Check: trtllm (amd64)
- GitHub Check: vllm (amd64)
- GitHub Check: operator (amd64)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (17)
deploy/cloud/operator/PROJECT (1)
27-34: LGTM!
The DynamoModel resource configuration follows the same structure and conventions as the existing DynamoComponentDeployment and DynamoGraphDeployment resources.
deploy/cloud/operator/internal/controller/dynamographdeployment_controller.go (1)
344-354: LGTM!
The model service reconciliation is appropriately placed after Grove scaling, ensuring that workload resources are created before setting up endpoint discovery services. Error handling follows the established pattern in this controller.
deploy/cloud/operator/internal/consts/consts.go (1)
41-41: LGTM!
The new constant follows the established naming convention and is appropriately positioned with other Kubernetes label constants.
deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go (2)
86-89: LGTM!
The optional `ModelRef` field is well-documented and designed for backward compatibility. The documentation clearly explains its purpose for endpoint discovery via headless services.
278-287: LGTM!
The `ModelReference` type is well-designed with appropriate validation markers. The required `Name` field and optional `Revision` field provide flexibility while ensuring essential information is present.
deploy/cloud/operator/internal/controller/dynamocomponentdeployment_controller.go (2)
330-343: LGTM!
The model service reconciliation is correctly implemented for component-level reconciliation. The componentMap contains only the current component, which is appropriate for this controller's scope.
943-955: Improved label handling.
The function now properly initializes and populates labels instead of returning an empty map. This ensures that:
- Existing component labels are preserved
- Base model labels are added when a ModelRef is specified
This is a positive change that enables proper label propagation throughout the resource hierarchy.
deploy/cloud/operator/cmd/main.go (2)
63-63: LGTM!
The modelendpoint import is correctly added and used for creating the EndpointClient.
564-571: LGTM!
The DynamoModelReconciler setup follows the established pattern for controller initialization. The EndpointClient is appropriately created once and injected into the reconciler.
deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml (1)
10139-10152: Go types are properly defined and aligned with the CRD schema.
The verification confirms that the `ModelReference` struct is correctly defined in `deploy/cloud/operator/api/v1alpha1/dynamocomponentdeployment_types.go` (lines 278–287) with proper kubebuilder annotations (`+kubebuilder:validation:Required` for `name`, `+optional` for `revision`). The `modelRef` field in `DynamoComponentDeploymentSharedSpec` is correctly typed as `*ModelReference` with the `+optional` tag and proper JSON marshaling hints. The autogenerated CRD schema accurately reflects these Go types, and the structure aligns with how `modelRef` is used in the controller code (e.g., the `AddBaseModelLabel` function). No issues found.
deploy/cloud/helm/platform/components/operator/templates/manager-rbac.yaml (1)
65-72: LGTM: RBAC permissions properly scoped for DynamoModel CRD.
The added permissions for EndpointSlices discovery and DynamoModel lifecycle management (including finalizers and status updates) are appropriate and follow standard Kubernetes controller patterns.
Also applies to: 372-372, 387-387, 396-396
deploy/cloud/operator/config/rbac/role.yaml (1)
89-96: LGTM: RBAC permissions consistent with Helm template.
The RBAC additions mirror those in the Helm template and are properly scoped for the DynamoModel controller's operational needs.
Also applies to: 173-173, 188-188, 197-197
deploy/cloud/operator/config/crd/bases/nvidia.com_dynamomodels.yaml (2)
166-182: Verify whether podName should be required in EndpointInfo.
The `podName` field is not in the `required` list (lines 179-181), suggesting it may be optional. However, in a Kubernetes environment with endpoint discovery via EndpointSlices, the pod name should typically always be known and valuable for debugging and observability.
Please confirm whether `podName` can legitimately be absent in any scenario. If not, consider adding it to the required fields:
     required:
     - address
    +- podName
     - ready
174-178: Clarify the design intent for base model endpoint tracking.
The comment states "For base models: always false (no probing performed)," which suggests base model endpoints are tracked but never marked ready. This raises questions about the utility of endpoint tracking for base models and whether the status structure optimally serves both base and LoRA model use cases.
Please clarify the design rationale:
- Why track endpoints for base models if ready is always false?
- Is there a future plan to probe base model readiness?
- Would separate status structures for base vs LoRA models improve clarity?
deploy/cloud/helm/crds/templates/nvidia.com_dynamocomponentdeployments.yaml (1)
10005-10018: RBAC check for headless Service creation
Since modelRef triggers headless service creation for endpoint discovery, verify the PR includes RBAC for Services and EndpointSlices (get/list/watch/create/update/patch) in the operator’s ClusterRole.
Based on learnings.
deploy/cloud/helm/crds/templates/nvidia.com_dynamographdeployments.yaml (1)
10139-10152: Verify that base CRD file has been updated to generate this template change.
Per prior learnings on this codebase, CRD schemas in `deploy/cloud/helm/crds/templates/*.yaml` are auto-generated from base CRD files in `deploy/cloud/operator/config/crd/bases/` and should not be manually edited, as manual changes would be overwritten during regeneration.
The `modelRef` field definition itself appears structurally sound as OpenAPI v3 schema. However, ensure that the corresponding base CRD file (`deploy/cloud/operator/config/crd/bases/nvidia.com_dynamographdeployments.yaml`) has been updated with this field, and that this template change was auto-generated from it rather than manually added.
deploy/cloud/operator/internal/dynamo/headless_service.go (1)
91-105: Review comment is incorrect for Go 1.24.0
The repository declares `go 1.24.0` in deploy/cloud/operator/go.mod, which is well after Go 1.22. Starting with Go 1.22, loop variables are scoped per iteration rather than reused across iterations, so closures over loop variables are safe. The problematic code pattern the review describes is not an issue in this codebase.
Additionally, the review references the wrong file. The actual loops capturing `candidate` are in deploy/cloud/operator/internal/modelendpoint/client.go (LoadLoRA at lines 91–102, UnloadLoRA at lines 152–163), not in headless_service.go.
The code requires no changes.
Likely an incorrect or invalid review comment.
deploy/cloud/operator/internal/controller/dynamo_model_controller.go
    logs.Info("Finalizing DynamoModel", "modelType", model.Spec.ModelType)

    // Only perform cleanup for LoRA models
    if model.Spec.ModelType == "lora" {
Convert the left-hand side to lowercase before comparing.
fixed in 2b9d6c2
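A minimal sketch of the suggestion in this thread, assuming the model type arrives as a plain string. `isLoRA` is a hypothetical helper; the change actually made in 2b9d6c2 is not shown on this page and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// isLoRA reports whether modelType denotes a LoRA model, ignoring case.
// The input is lowercased before being compared with the literal "lora".
func isLoRA(modelType string) bool {
	return strings.ToLower(modelType) == "lora"
}

func main() {
	for _, t := range []string{"lora", "LoRA", "base"} {
		fmt.Printf("%s: %v\n", t, isLoRA(t))
	}
}
```

`strings.EqualFold(modelType, "lora")` is an equivalent alternative that avoids allocating a lowercased copy.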
    logs.Info("Unloading LoRA from endpoints", "endpointCount", len(candidates))

    // Initialize endpoint client if needed
    if r.EndpointClient == nil {
This check happens in the Reconcile and FinalizeResource(). Could a race condition happen?
fixed in 2b9d6c2
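If the nil check can be reached from more than one goroutine, an unsynchronized check-then-assign is a data race. One common remedy is `sync.Once`, sketched below with hypothetical stand-in types (`endpointClient`, `reconciler`, `getClient`). This is not necessarily what 2b9d6c2 did; another option, consistent with the client being created once in main.go and injected, is to drop the lazy initialization entirely.

```go
package main

import (
	"fmt"
	"sync"
)

// endpointClient is a hypothetical stand-in for the real endpoint client.
type endpointClient struct{ id int }

// reconciler is a hypothetical stand-in for the controller struct.
type reconciler struct {
	once    sync.Once
	client  *endpointClient
	created int // number of times the client was constructed
}

// getClient lazily builds the client exactly once, even when called from
// several goroutines at the same time.
func (r *reconciler) getClient() *endpointClient {
	r.once.Do(func() {
		r.created++
		r.client = &endpointClient{id: r.created}
	})
	return r.client
}

// initConcurrently calls getClient from n goroutines and reports how many
// times the client was actually constructed.
func initConcurrently(n int) int {
	r := &reconciler{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = r.getClient()
		}()
	}
	wg.Wait()
	return r.created
}

func main() {
	fmt.Println("constructions:", initConcurrently(50))
}
```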
    candidates, serviceNames, err := r.getEndpointCandidates(ctx, model)
    if err != nil {
        // Error already logged and status updated in helper
        return ctrl.Result{RequeueAfter: 30 * time.Second}, err
It looks like 30 is used more than once. Make it a constant?
fixed in 2b9d6c2
    ctx context.Context,
    reconciler commonController.Reconciler,
    owner client.Object,
    services map[string]*v1alpha1.DynamoComponentDeploymentSharedSpec,
nit: these are components, not services. Rename?
fixed in 2b9d6c2
    // Uses a hash of the model name to avoid label length/character restrictions
    func AddBaseModelLabel(labels map[string]string, modelRef *v1alpha1.ModelReference) {
        if modelRef != nil && modelRef.Name != "" {
            labels[commonconsts.KubeLabelDynamoBaseModelHash] = HashModelName(modelRef.Name)
Could labels be nil?
fixed in 2b9d6c2
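The concern is real in general: assigning to a nil map panics in Go. The sketch below shows one nil-safe shape for such a helper, which returns the map so that a nil input can be replaced. All names here (`addBaseModelLabel`, `hashModelName`, `baseModelHashLabel`) and the 8-character SHA-256 hash are hypothetical; the actual `HashModelName` implementation and the fix in 2b9d6c2 are not visible on this page.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// baseModelHashLabel is a hypothetical label key used only in this sketch.
const baseModelHashLabel = "example.com/base-model-hash"

// hashModelName is an assumed hashing scheme: the first 8 hex characters of
// the SHA-256 digest of the model name.
func hashModelName(name string) string {
	sum := sha256.Sum256([]byte(name))
	return hex.EncodeToString(sum[:])[:8]
}

// addBaseModelLabel sets the hash label and returns the resulting map.
// A nil input map is replaced by a fresh one instead of causing a panic;
// an empty model name leaves the input untouched.
func addBaseModelLabel(labels map[string]string, modelName string) map[string]string {
	if modelName == "" {
		return labels
	}
	if labels == nil {
		labels = make(map[string]string)
	}
	labels[baseModelHashLabel] = hashModelName(modelName)
	return labels
}

func main() {
	var labels map[string]string // nil on purpose
	labels = addBaseModelLabel(labels, "meta-llama/Llama-3.3-70B-Instruct")
	fmt.Println(len(labels), "label(s) set")
}
```

Callers must use the returned map, in the same way they would with `append` on a slice.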
        "sigs.k8s.io/controller-runtime/pkg/log"
    )

    // ReconcileModelServicesForComponents creates headless services for components with modelRef
I think the name "headless_service" exposes implementation detail.
fixed in 2b9d6c2
Overview:
add dynamoModel CRD
Example of a DGD and the associated new DynamoModel CR:
The new controller makes sure that the workers of the DGD (both decode and worker) have the LoRA loaded by calling their POST /v1/loras API.
Internally, it uses a headless Service and the associated EndpointSlices to find the worker endpoints and make sure the LoRAs are loaded on them.
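The fan-out described above (call every discovered endpoint, with a bound on concurrent calls) can be sketched as follows. This is an illustration only: `loadOnAll`, `loadFunc`, `errUnreachable`, and the addresses are hypothetical, and the load callback stands in for the real POST /v1/loras request, whose payload is not shown in this PR description.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errUnreachable is a sample error used by the demo below.
var errUnreachable = errors.New("endpoint unreachable")

// loadFunc stands in for the HTTP call (POST /v1/loras) to one endpoint.
type loadFunc func(address string) error

// loadOnAll invokes load for every address with at most limit calls in
// flight, and returns per-address readiness (true when load succeeded).
func loadOnAll(addresses []string, limit int, load loadFunc) map[string]bool {
	if limit < 1 {
		limit = 1
	}
	ready := make(map[string]bool, len(addresses))
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	sem := make(chan struct{}, limit) // counting semaphore
	for _, addr := range addresses {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit calls are already running
		go func(addr string) {
			defer wg.Done()
			defer func() { <-sem }()
			err := load(addr)
			mu.Lock()
			ready[addr] = err == nil
			mu.Unlock()
		}(addr)
	}
	wg.Wait()
	return ready
}

func main() {
	load := func(addr string) error {
		if addr == "10.0.0.2:9090" {
			return errUnreachable
		}
		return nil
	}
	addrs := []string{"10.0.0.1:9090", "10.0.0.2:9090", "10.0.0.3:9090"}
	for addr, ok := range loadOnAll(addrs, 2, load) {
		fmt.Printf("%s ready=%v\n", addr, ok)
	}
}
```

Recording readiness per address, rather than failing the whole batch on the first error, mirrors the per-endpoint `ready` field discussed in the review of the status schema.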