@delavet commented Nov 12, 2025

This proposal aims to address the Active-Active HA issue raised in #1406. The core idea is to implement a distributed state store that guarantees eventual consistency, built on memberlist and CRDTs.

We have completed an initial PoC validation of this solution and are therefore submitting the proposal to the community. We look forward to feedback from reviewers. If there is anything I should know or do, please feel free to leave a comment.
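
To make the CRDT idea above concrete, here is a minimal sketch of a last-writer-wins (LWW) map, one common CRDT that converges once replicas exchange state. It is an illustration only, not code from this proposal: the thread does not say which CRDT type the PoC uses, and the class name `LWWMap`, the replica names, and the keys such as `pod-1/queue_depth` are hypothetical. The gossip transport (memberlist) is not modeled; `merge` stands in for whatever state exchange it would carry.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple

# One stored entry: (logical timestamp, writer node id, value).
Entry = Tuple[int, str, Any]


@dataclass
class LWWMap:
    """Hypothetical last-writer-wins map replica (illustrative only)."""

    node_id: str
    entries: Dict[str, Entry] = field(default_factory=dict)

    def put(self, key: str, value: Any, timestamp: int) -> None:
        """Record a local write tagged with this replica's node id."""
        self._apply(key, (timestamp, self.node_id, value))

    def get(self, key: str) -> Optional[Any]:
        entry = self.entries.get(key)
        return None if entry is None else entry[2]

    def merge(self, other: "LWWMap") -> None:
        """Fold another replica's state into this one.

        The merge is commutative, associative, and idempotent, so replicas
        converge regardless of the order or repetition of exchanges.
        """
        for key, candidate in other.entries.items():
            self._apply(key, candidate)

    def _apply(self, key: str, candidate: Entry) -> None:
        current = self.entries.get(key)
        # Compare (timestamp, node_id) only; the node id breaks timestamp
        # ties so every replica picks the same winner.
        if current is None or candidate[:2] > current[:2]:
            self.entries[key] = candidate


# Two replicas accept concurrent writes, then exchange state.
a = LWWMap("replica-a")
b = LWWMap("replica-b")
a.put("pod-1/queue_depth", 3, timestamp=1)
b.put("pod-1/queue_depth", 5, timestamp=2)
b.put("pod-2/queue_depth", 0, timestamp=1)

a.merge(b)
b.merge(a)
print(a.entries == b.entries)
```

After both merges the two replicas hold identical entries, which is the eventual-consistency property the proposal relies on. A real deployment would also need to decide how timestamps are generated and how deleted keys are represented; neither is addressed in this sketch.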

@k8s-ci-robot added the labels cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) and needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test) on Nov 12, 2025.
@k8s-ci-robot added the label size/L (denotes a PR that changes 100-499 lines, ignoring generated files) on Nov 12, 2025.

netlify bot commented Nov 12, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 0f105bc
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/691446304aef7f0008a48c65
😎 Deploy Preview https://deploy-preview-1851--gateway-api-inference-extension.netlify.app
# Active-Active HA Deployment Architecture Proposal
Author(s): @delavet

## Proposal Status
Contributor

I'd be happy to see some analysis of the tradeoffs in this proposal, for example:

  • What is the overhead of Redis-based sync vs. implementing our own gossip and memberlist?
  • How much extra time would be spent on synchronization?
  • What are the alternatives? For example, if we can make active-passive work with excellent performance, and on a crash of the active instance the passive one takes over as expected within a millisecond, is the described effort still worth it?

I went over the proposal, and it seems like a lot of complexity and a lot of code.
We need HA in the long term for sure; the question I'm trying to answer is whether the current proposal is justified.

Contributor Author

We really do need this analysis! I will put together the relevant analysis and data soon and add them to the proposal.
