Proposal for Multi-Cluster InferencePools #1374
/cc @robscott
/ok-to-test
@bexxmodd can you please remove the .DS_Store files?
1. **Export an InferencePool:** An [Inference Platform Owner](https://gateway-api-inference-extension.sigs.k8s.io/concepts/roles-and-personas/)
   exports an InferencePool by annotating it.
2. **Exporting Controller:**
   - Watches exported InferencePool resources (must have access to the K8s API server).
- If this controller needs to watch all the InferencePool resources, SIG Multicluster has a KEP to avoid storing secrets in the controller: https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/5339-clusterprofile-plugin-credentials
- Most multi-cluster manager projects (KubeFleet, OCM, etc.) use pull mode instead of push, so they don't watch any resources in the working cluster.
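For reference, the "export by annotating" step quoted above could be sketched roughly as follows. This is only an illustration: the annotation key `example.com/export`, the `Pool` type, and the helper names are hypothetical stand-ins, not the key or types the proposal defines, and a real controller would use a Kubernetes client and informers rather than an in-memory slice.

```go
package main

import "fmt"

// exportAnnotation is a HYPOTHETICAL key used only for this sketch; the quoted
// proposal text says pools are exported "by annotating" them but the key is
// not shown in this excerpt.
const exportAnnotation = "example.com/export"

// Pool is a minimal stand-in for an InferencePool's metadata (assumption).
type Pool struct {
	Namespace   string
	Name        string
	Annotations map[string]string
}

// IsExported reports whether the pool carries a non-empty export annotation.
func IsExported(p Pool) bool {
	v, ok := p.Annotations[exportAnnotation]
	return ok && v != ""
}

// ExportedPools filters the pools an exporting controller would act on.
func ExportedPools(pools []Pool) []Pool {
	var out []Pool
	for _, p := range pools {
		if IsExported(p) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pools := []Pool{
		{Namespace: "default", Name: "llama", Annotations: map[string]string{exportAnnotation: "ClusterSet"}},
		{Namespace: "default", Name: "local-only"},
	}
	for _, p := range ExportedPools(pools) {
		fmt.Printf("%s/%s\n", p.Namespace, p.Name)
	}
}
```

The filter itself is the same in pull mode or push mode; what changes is which component runs it and where the credentials live.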
// - Ready: at least one EPP/parent is ready.
//
// +kubebuilder:validation:Optional
Conditions []metav1.Condition `json:"conditions,omitempty"`
btw, just moving the cluster names to the state won't really solve the freshness problem.
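To make the "Ready: at least one EPP/parent is ready" rule from the quoted field comment concrete, here is a minimal sketch of how a top-level condition might be derived from per-parent readiness. The `Condition` and `ParentStatus` types are local stand-ins (the real field uses `metav1.Condition`), and the reason strings are invented for illustration. It does not address the freshness concern raised above, since the result is only as current as the parent statuses it is computed from.

```go
package main

import "fmt"

// ConditionStatus and Condition are simplified local stand-ins for the
// metav1.Condition type referenced in the diff (assumption for this sketch).
type ConditionStatus string

const (
	ConditionTrue  ConditionStatus = "True"
	ConditionFalse ConditionStatus = "False"
)

type Condition struct {
	Type   string
	Status ConditionStatus
	Reason string
}

// ParentStatus is a hypothetical per-parent (or per-EPP) readiness record.
type ParentStatus struct {
	Name  string
	Ready bool
}

// ReadyCondition returns Ready=True when at least one parent is ready,
// mirroring the rule in the quoted comment. Reason strings are illustrative.
func ReadyCondition(parents []ParentStatus) Condition {
	for _, p := range parents {
		if p.Ready {
			return Condition{Type: "Ready", Status: ConditionTrue, Reason: "AtLeastOneParentReady"}
		}
	}
	return Condition{Type: "Ready", Status: ConditionFalse, Reason: "NoParentReady"}
}

func main() {
	c := ReadyCondition([]ParentStatus{
		{Name: "gateway-a", Ready: false},
		{Name: "gateway-b", Ready: true},
	})
	fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
}
```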
Mc inference gw danehans
Updates status.clusters refs
Removes controller type and adds top-level conditions
Signed-off-by: Daneyon Hansen <[email protected]>
robscott left a comment
@@ -0,0 +1,165 @@
# Multi-Cluster Inference Gateways
In the review meeting, we decided to rename this to Multi-Cluster Inference Pools, so please rename this title to Multi-Cluster Inference Pools as well. Calling it Multi-Cluster Inference Gateways is inaccurate, since there are other ways to do multi-cluster inference without multi-cluster InferencePools, which are the main focus of this proposal. Also, please add a line stating that Multi-Cluster Inference Pools are one possible way of providing Multi-Cluster Inference Gateways.
@srampal you must be referencing an outdated version of the proposal. The proposal is named docs/proposals/1374-multi-cluster-inference and not docs/proposals/XXXX-mc-inference-gateways.
@danehans Yes, I see you are right. However, the comment applies to the new version as well, since even in the new proposal the title is "Multi-Cluster Inference Gateways". It's a nit, I suppose, so I won't be picky about it, but I felt it should not inadvertently convey that this is the only or even primary way to implement multi-cluster inference gateways when it is really about extending InferencePools. Your call. :-)
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: bexxmodd, danehans, robscott
* Proposal for mc inference gateway.
* Formatting updates.
* Give PR number to the proposal.
* Adding author(s)
* Removed ds_store files.
* Removed ds_store file
* Formatting updates.
* Replaced lists to use asterisks.
* Removed local from supported values.
* Add details to MCI proposal
  Signed-off-by: Daneyon Hansen <[email protected]>
* Makes EPP Port Singular
  Signed-off-by: Daneyon Hansen <[email protected]>
* Removes Routing Mode Config and Resolved Open Questions
  Signed-off-by: Daneyon Hansen <[email protected]>
* Update docs/proposals/1374-multi-cluster-inference/README.md
  Co-authored-by: Ryan Zhang <[email protected]>
* Implement Sept 15 Meeting Feedback
  Signed-off-by: Daneyon Hansen <[email protected]>
* Resolves review feedback
  Signed-off-by: Daneyon Hansen <[email protected]>
* Adds TBD InferencePool status condition
  Signed-off-by: Daneyon Hansen <[email protected]>
* Adds sync topology and mods workflow
  Signed-off-by: Daneyon Hansen <[email protected]>
* Updates InferencePoolImport status to include parents
  Signed-off-by: Daneyon Hansen <[email protected]>
* Minor fixed based on robscott review feedback
  Signed-off-by: Daneyon Hansen <[email protected]>
* Refactors InferencePoolImport status
  Signed-off-by: Daneyon Hansen <[email protected]>
* Updates status.clusters refs
  Signed-off-by: Daneyon Hansen <[email protected]>
* Removes controller type and adds top-level conditions
  Signed-off-by: Daneyon Hansen <[email protected]>

---------

Signed-off-by: Daneyon Hansen <[email protected]>
Co-authored-by: Daneyon Hansen <[email protected]>
Co-authored-by: Ryan Zhang <[email protected]>
Initial design doc: https://docs.google.com/document/d/1QGvG9ToaJ72vlCBdJe--hmrmLtgOV_ptJi9D58QMD2w/edit?usp=sharing