bexxmodd
Contributor

@bexxmodd bexxmodd commented Aug 14, 2025

netlify bot commented Aug 14, 2025

Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 75ed0b8 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68d6d18da5f6120008f669b7 |
| 😎 Deploy Preview | https://deploy-preview-1374--gateway-api-inference-extension.netlify.app |

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 14, 2025
@k8s-ci-robot
Contributor

Welcome @bexxmodd!

It looks like this is your first PR to kubernetes-sigs/gateway-api-inference-extension 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/gateway-api-inference-extension has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 14, 2025
@k8s-ci-robot
Contributor

Hi @bexxmodd. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 14, 2025
@bexxmodd
Contributor Author

/cc @robscott

@k8s-ci-robot k8s-ci-robot requested a review from robscott August 14, 2025 00:50
@robscott
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Aug 14, 2025
@nirrozenbaum
Contributor

@bexxmodd can you please remove the .DS_Store files?

@bexxmodd
Contributor Author

bexxmodd commented Aug 14, 2025

> @bexxmodd can you please remove the .DS_Store files?

Removed.

Also, I created a PR to gitignore macOS-generated files: #1378

// - Ready: at least one EPP/parent is ready.
//
// +kubebuilder:validation:Optional
Conditions []metav1.Condition `json:"conditions,omitempty"`
Contributor

I don't think the user needs to know the freshness, but the controller will probably need to be able to check it. Is there something (OG, resourceVersion, etc.) that's incremented on status changes?
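
For illustration only, here is a minimal Go sketch of one common Kubernetes convention a controller could use to judge staleness: comparing a condition's `observedGeneration` with the object's `metadata.generation`. The local `Condition` type and the `isFresh` helper are hypothetical stand-ins (mirroring a few fields of `metav1.Condition`) so the sketch compiles without Kubernetes dependencies. Note the limitation relevant to the question above: for resources with a status subresource, `metadata.generation` typically changes on spec updates rather than status updates, and `resourceVersion` changes on every write but is meant to be treated as opaque, so this only detects conditions that lag behind the spec.

```go
package main

import "fmt"

// Condition is a hypothetical local stand-in that mirrors a few fields of
// metav1.Condition. It is not the type used by the proposal.
type Condition struct {
	Type               string
	Status             string
	ObservedGeneration int64
}

// isFresh reports whether the condition of the given type was computed
// against at least the given object generation. A missing condition is
// treated as not fresh.
func isFresh(conds []Condition, condType string, generation int64) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.ObservedGeneration >= generation
		}
	}
	return false
}

func main() {
	conds := []Condition{{Type: "Ready", Status: "True", ObservedGeneration: 3}}

	// The condition was computed for generation 3, which is current.
	fmt.Println(isFresh(conds, "Ready", 3))

	// The object has moved to generation 4 since the condition was written.
	fmt.Println(isFresh(conds, "Ready", 4))
}
```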

### InferencePoolImport Naming

The exporting controller will create an InferencePoolImport resource using the exported InferencePool namespace and name. A cluster name entry in
`inferencepoolimport.status.clusters[]` is added for each cluster that exports an InferencePool with the same ns/name.
Contributor

I'm worried that ONLY having cluster name in status is effectively requiring the gateway controller to read from remote API servers whether the gateway controller is operating in endpoint mode or parent mode. Are we sure we want that?

Contributor

Based on Monday's meeting, it was my understanding that we achieved consensus on starting minimal:

  1. A list of exporting clusters.
  2. Implementations are responsible for discovering the exported InferencePool in the cluster.
  3. InferencePoolImport namespace/name sameness to aid in discovering the exported InferencePool and simplified UX.
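
As a rough sketch of the minimal model listed above (not the proposal's actual API), the Go snippet below keeps one import per namespace/name and records each exporting cluster once. The `InferencePoolImport` and `ExportingCluster` structs, the `recordExport` helper, and the cluster and pool names are all hypothetical simplifications chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// ExportingCluster and InferencePoolImport are hypothetical, simplified
// shapes; the real API types are defined by the proposal, not here.
type ExportingCluster struct {
	Name string
}

type InferencePoolImport struct {
	Namespace string
	Name      string
	Clusters  []ExportingCluster // stands in for status.clusters[]
}

// key builds the namespace/name key shared by the exported pool and its import.
func key(namespace, name string) string {
	return namespace + "/" + name
}

// recordExport ensures an import exists for the pool's namespace/name and
// lists the exporting cluster exactly once, keeping entries sorted by name.
func recordExport(imports map[string]*InferencePoolImport, cluster, namespace, name string) {
	k := key(namespace, name)
	imp, ok := imports[k]
	if !ok {
		imp = &InferencePoolImport{Namespace: namespace, Name: name}
		imports[k] = imp
	}
	for _, c := range imp.Clusters {
		if c.Name == cluster {
			return // already recorded
		}
	}
	imp.Clusters = append(imp.Clusters, ExportingCluster{Name: cluster})
	sort.Slice(imp.Clusters, func(i, j int) bool {
		return imp.Clusters[i].Name < imp.Clusters[j].Name
	})
}

func main() {
	imports := map[string]*InferencePoolImport{}
	recordExport(imports, "cluster-b", "inference", "llama-pool")
	recordExport(imports, "cluster-a", "inference", "llama-pool")
	recordExport(imports, "cluster-a", "inference", "llama-pool") // duplicate, ignored

	imp := imports[key("inference", "llama-pool")]
	for _, c := range imp.Clusters {
		fmt.Println(imp.Namespace+"/"+imp.Name, "exported by", c.Name)
	}
}
```

Discovery of the exported InferencePool inside each listed cluster is deliberately left out, since item 2 assigns that to implementations.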

Contributor

Yeah, I'm not saying it's a MUST for this initial implementation to support non-global controllers. I'm just saying we should probably document the requirement for implementors.

@danehans
Contributor

@nirrozenbaum @kfswain @robscott @bexxmodd @mikemorris @ryanzhang-oss @keithmattix @srampal @elevran Thank you for your involvement with this proposal. I have added a topic to tomorrow's community meeting to discuss this PR. The plan is to use the meeting to resolve the final details so we can get this PR across the finish line shortly after.

1. **Export an InferencePool:** An [Inference Platform Owner](https://gateway-api-inference-extension.sigs.k8s.io/concepts/roles-and-personas/)
exports an InferencePool by annotating it.
2. **Exporting Controller:**
- Watches exported InferencePool resources (must have access to the K8s API server).
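
To make the "export by annotating" step concrete, here is a small Go sketch of how a controller might filter the pools it watches. It is an assumption-laden illustration: the annotation key `example.com/export` is a placeholder (the excerpt above does not name the real key), and the `InferencePool` struct, `isExported`, and `exportedPools` are hypothetical helpers rather than the project's API.

```go
package main

import "fmt"

// exportAnnotation is a placeholder key used only for this sketch; the
// actual annotation is defined by the proposal and is not shown here.
const exportAnnotation = "example.com/export"

// InferencePool is a hypothetical, trimmed-down view of the resource that
// carries only what this sketch needs.
type InferencePool struct {
	Namespace   string
	Name        string
	Annotations map[string]string
}

// isExported reports whether the pool carries the export annotation.
// Only presence is checked; how values are interpreted is left open.
func isExported(p InferencePool) bool {
	_, ok := p.Annotations[exportAnnotation]
	return ok
}

// exportedPools returns the subset of pools that are marked for export.
func exportedPools(pools []InferencePool) []InferencePool {
	var out []InferencePool
	for _, p := range pools {
		if isExported(p) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pools := []InferencePool{
		{Namespace: "inference", Name: "pool-a", Annotations: map[string]string{exportAnnotation: "true"}},
		{Namespace: "inference", Name: "pool-b"},
	}
	for _, p := range exportedPools(pools) {
		fmt.Println("exported:", p.Namespace+"/"+p.Name)
	}
}
```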
Contributor

  1. If this controller needs to watch all the InferencePool resources, SIG Multicluster has a KEP to avoid storing secrets in the controller: https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/5339-clusterprofile-plugin-credentials
  2. Most multi-cluster manager projects (Kubefleet, OCM, etc.) use pull mode instead of push, so they don't watch any resources in the working cluster.

// - Ready: at least one EPP/parent is ready.
//
// +kubebuilder:validation:Optional
Conditions []metav1.Condition `json:"conditions,omitempty"`
Contributor

btw, just moving the cluster names to the state won't really solve the freshness problem.

@bexxmodd bexxmodd changed the title Proposal for Multi-Cluster Inference Gateways Proposal for Multi-Cluster Inference Pooling Sep 18, 2025
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 25, 2025
@danehans danehans changed the title Proposal for Multi-Cluster Inference Pooling Proposal for Multi-Cluster InferencePools Sep 25, 2025
Removes controller type and adds top-level conditions
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 26, 2025
Member

@robscott robscott left a comment

Thanks @bexxmodd and @danehans! This proposal LGTM, we can continue to iterate on this over the next month, but I think it's in a good enough state to merge as is.

@@ -0,0 +1,165 @@
# Multi-Cluster Inference Gateways

Contributor

@srampal srampal Sep 29, 2025

In the review meeting, we decided to rename this to Multi-Cluster Inference Pools. Please rename this title to Multi-Cluster Inference Pools as well, to be accurate. Calling it Multi-Cluster Inference Gateways is inaccurate, since there are other ways to do multi-cluster inference without multi-cluster InferencePools, which are the main focus of this proposal. Also add a line stating that Multi-Cluster Inference Pools are one possible way of providing Multi-Cluster Inference Gateways.

Contributor

@srampal you must be referencing an outdated version of the proposal. The proposal is named docs/proposals/1374-multi-cluster-inference and not docs/proposals/XXXX-mc-inference-gateways.

Contributor

@danehans Yes, I see you are right. However, the comment applies to the new version as well, since even in the new proposal the title is "Multi-Cluster Inference Gateways". It's a nit, I suppose, so I won't be picky about it, but I felt it should not inadvertently convey that this is the only or even the primary way to implement multi-cluster inference gateways when it is really about extending InferencePools. Your call. :-)

Contributor

@srampal I don't see the title discrepancy that you're referencing. I plan on merging the PR since the intent of the proposal has been extensively reviewed and approved by @robscott. If I'm missing something about the naming, please open a follow-on PR to fix any naming issues you may have.

@danehans
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 30, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bexxmodd, danehans, robscott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 30, 2025
@k8s-ci-robot k8s-ci-robot merged commit bfd979d into kubernetes-sigs:main Sep 30, 2025
10 checks passed