Proposal for Multi-Cluster InferencePools #1374
Welcome @bexxmodd!
Hi @bexxmodd. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/cc @robscott
/ok-to-test
@bexxmodd can you please remove the .DS_Store files?
// - Ready: at least one EPP/parent is ready.
//
// +kubebuilder:validation:Optional
Conditions []metav1.Condition `json:"conditions,omitempty"`
I don't think the user needs to know the freshness, but the controller will probably need to be able to check it. Is there something (OG, resourceVersion, etc.) that's incremented on status changes?
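One way a controller could check freshness, sketched here under assumptions: `metav1.Condition` carries an `observedGeneration` field that can be compared with the object's `metadata.generation`. Note that `generation` is generally bumped on spec changes rather than status writes, whereas `resourceVersion` changes on every write. The `Condition` struct and `isFresh` helper below are hypothetical stand-ins for illustration, not part of the proposal.

```go
package main

import "fmt"

// Condition mirrors the subset of metav1.Condition used in this sketch.
// The real type lives in k8s.io/apimachinery/pkg/apis/meta/v1.
type Condition struct {
	Type               string
	Status             string // "True", "False", or "Unknown"
	ObservedGeneration int64
}

// isFresh (hypothetical helper) reports whether the named condition exists
// and was computed against the object's current metadata.generation.
func isFresh(conds []Condition, condType string, generation int64) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.ObservedGeneration == generation
		}
	}
	// A missing condition is treated as not fresh.
	return false
}

func main() {
	conds := []Condition{{Type: "Ready", Status: "True", ObservedGeneration: 3}}
	fmt.Println(isFresh(conds, "Ready", 3)) // condition matches the current generation
	fmt.Println(isFresh(conds, "Ready", 4)) // condition lags behind the current generation
}
```

This only detects a condition that lags behind a spec change; it does not by itself say how recently a remote cluster was observed.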
### InferencePoolImport Naming

The exporting controller will create an InferencePoolImport resource using the exported InferencePool namespace and name. A cluster name entry in
`inferencepoolimport.status.clusters[]` is added for each cluster that exports an InferencePool with the same ns/name.
I'm worried that ONLY having cluster name in status is effectively requiring the gateway controller to read from remote API servers whether the gateway controller is operating in endpoint mode or parent mode. Are we sure we want that?
Based on Monday's meeting, it was my understanding that we achieved consensus on starting minimal:
- A list of exporting clusters.
- Implementations are responsible for discovering the exported InferencePool in the cluster.
- InferencePoolImport namespace/name sameness to aid in discovering the exported InferencePool and simplified UX.
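The minimal shape listed above could be sketched roughly as follows: exports observed across clusters are grouped by namespace/name, and each group yields one import whose status lists the exporting clusters. All type and function names here (`ExportedPool`, `ImportStatus`, `buildImports`) are hypothetical and only illustrate the namespace/name sameness rule, not the proposal's actual API.

```go
package main

import (
	"fmt"
	"sort"
)

// NamespacedName identifies a namespaced object.
type NamespacedName struct{ Namespace, Name string }

// ExportedPool is a hypothetical record of an exported InferencePool
// observed in a given cluster.
type ExportedPool struct {
	Cluster string
	Pool    NamespacedName
}

// ImportStatus is a hypothetical, minimal stand-in for the status of an
// InferencePoolImport: just the list of exporting clusters.
type ImportStatus struct {
	Clusters []string
}

// buildImports groups exports by namespace/name, so that each import shares
// the namespace and name of the pools it represents. Cluster names are
// de-duplicated and sorted for stable output.
func buildImports(exports []ExportedPool) map[NamespacedName]ImportStatus {
	seen := map[NamespacedName]map[string]bool{}
	for _, e := range exports {
		if seen[e.Pool] == nil {
			seen[e.Pool] = map[string]bool{}
		}
		seen[e.Pool][e.Cluster] = true
	}
	out := make(map[NamespacedName]ImportStatus, len(seen))
	for key, clusters := range seen {
		names := make([]string, 0, len(clusters))
		for c := range clusters {
			names = append(names, c)
		}
		sort.Strings(names)
		out[key] = ImportStatus{Clusters: names}
	}
	return out
}

func main() {
	key := NamespacedName{Namespace: "inference", Name: "llama"}
	imports := buildImports([]ExportedPool{
		{Cluster: "cluster-b", Pool: key},
		{Cluster: "cluster-a", Pool: key},
	})
	fmt.Println(imports[key].Clusters) // prints [cluster-a cluster-b]
}
```

With only cluster names in status, an implementation still has to locate the exported InferencePool in each listed cluster itself, which is the discovery responsibility noted in the second bullet.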
Yeah, I'm not saying it's a MUST for this initial implementation to support non-global controllers. I'm just saying we should probably document the requirement for implementors.
@nirrozenbaum @kfswain @robscott @bexxmodd @mikemorris @ryanzhang-oss @keithmattix @srampal @elevran thank you for your involvement with this proposal. I have added a topic to tomorrow's community meeting to discuss this PR. The plan is to use the meeting to resolve the final details so we can get this PR across the finish line shortly after.
1. **Export an InferencePool:** An [Inference Platform Owner](https://gateway-api-inference-extension.sigs.k8s.io/concepts/roles-and-personas/)
   exports an InferencePool by annotating it.
2. **Exporting Controller:**
   - Watches exported InferencePool resources (must have access to the K8s API server).
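As a rough illustration of the export step quoted above, an exporting controller could filter the pools it watches by the presence of an export annotation. The annotation key below is a placeholder, since this excerpt does not name the real one, and `Pool`, `isExported`, and `exportedPools` are hypothetical names used only for this sketch.

```go
package main

import "fmt"

// exportAnnotation is a placeholder key; the excerpt does not name the
// annotation the proposal actually uses.
const exportAnnotation = "example.com/export"

// Pool is a minimal stand-in for an InferencePool's metadata.
type Pool struct {
	Namespace, Name string
	Annotations     map[string]string
}

// isExported reports whether the pool carries the export annotation with a
// non-empty value. Reading from a nil map is safe in Go and yields "".
func isExported(p Pool) bool {
	return p.Annotations[exportAnnotation] != ""
}

// exportedPools returns only the pools marked for export, preserving order.
func exportedPools(pools []Pool) []Pool {
	var out []Pool
	for _, p := range pools {
		if isExported(p) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pools := []Pool{
		{Namespace: "inference", Name: "llama", Annotations: map[string]string{exportAnnotation: "true"}},
		{Namespace: "inference", Name: "local-only"},
	}
	for _, p := range exportedPools(pools) {
		fmt.Printf("%s/%s\n", p.Namespace, p.Name)
	}
}
```

A real controller would apply the same predicate inside a watch or informer event handler rather than over a static slice.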
- If this controller needs to watch all the InferencePool resources, SIG Multicluster has a KEP to avoid storing secrets in the controller: https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/5339-clusterprofile-plugin-credentials
- Most of the multi-cluster manager projects (Kubefleet, OCM, etc.) use pull mode instead of push, so they don't watch any of the resources in the working cluster.
// - Ready: at least one EPP/parent is ready.
//
// +kubebuilder:validation:Optional
Conditions []metav1.Condition `json:"conditions,omitempty"`
btw, just moving the cluster names to the state won't really solve the freshness problem.
Signed-off-by: Daneyon Hansen <[email protected]>
Mc inference gw danehans
Updates status.clusters refs
Removes controller type and adds top-level conditions
@@ -0,0 +1,165 @@
# Multi-Cluster Inference Gateways
In the review meeting, we decided to rename this to Multi-Cluster Inference Pools, so please rename this title to Multi-Cluster Inference Pools as well to be accurate. Calling it Multi-Cluster Inference Gateways is inaccurate, since there are other ways to do multi-cluster inference without multi-cluster InferencePools, which are the main focus of this proposal. Also, please add a line stating that Multi-Cluster Inference Pools are one possible way of providing Multi-Cluster Inference Gateways.
@srampal you must be referencing an outdated version of the proposal. The proposal is named `docs/proposals/1374-multi-cluster-inference` and not `docs/proposals/XXXX-mc-inference-gateways`.
@danehans Yes, I see you are right. However, the comment applies to the new version as well, since even in the new proposal the title is "Multi-Cluster Inference Gateways". It's a nit, I suppose, so I won't be picky about it, but I felt it should not inadvertently convey that this is the only or even primary way to implement multi-cluster inference gateways when it is really about extending InferencePools. Your call. :-)
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: bexxmodd, danehans, robscott.
Initial design doc: https://docs.google.com/document/d/1QGvG9ToaJ72vlCBdJe--hmrmLtgOV_ptJi9D58QMD2w/edit?usp=sharing