[DRAFT LFX] Refactor Webhook Certs to Secrets & Persist Network Allocations via CRD#231

Open
ballista01 wants to merge 1 commit into openkruise:master from ballista01:feature/webhook-network

@ballista01
Member

Addresses:

Motivation / Problem Statement:

This PR tackles two key challenges to enhance the robustness and scalability of the Kruise Game controller manager:

  1. Webhook Certificate Management for High Availability: In a multi-replica kruise-game-controller-manager deployment, each replica generating its own self-signed certificate can lead to TLS verification errors (x509: certificate signed by unknown authority). This occurs because the Kubernetes API server, when calling the webhook service, might be routed to different pods with different, untrusted CAs.
  2. Network Allocation Persistence for Controller Scalability (Issue #220, "[feat] Kruise Game Multi-Replicas Migration"): Currently, network port allocation information within cloud provider plugins (e.g., for NLB/CLB ports) is often cached in memory. This makes the controller stateful, prevents safe multi-replica deployments (due to inconsistent caches), and can lead to allocation data loss upon controller restarts.

Proposed Changes:

This PR introduces the following key changes:

  1. Webhook Certificate Management via Kubernetes Secrets:

    • What: The webhook certificate generation and storage mechanism has been refactored from a purely filesystem-based approach (FSCertWriter) to using a SecretCertWriter. This new writer ensures that the TLS certificate and private key for the webhook server are generated and stored within a Kubernetes Secret (e.g., webhook-server-cert in the kruise-game-system namespace).
    • Why:
      • Consistency: By sourcing the certificate from a single, shared Secret, all replicas of the kruise-game-controller-manager will use the exact same TLS certificate. This allows the Kubernetes API server's caBundle (for the webhook configurations) to consistently trust all webhook server instances.
      • Foundation for HA: This is a prerequisite for running multiple webhook server replicas reliably.
      • Cloud-Native Storage: Secrets are the standard Kubernetes way to store sensitive data like TLS certificates.
  2. Introduction of NetworkAllocation CRD for Port Persistence:

    • What: A new Custom Resource Definition, NetworkAllocation (game.kruise.io/v1alpha1), has been introduced.
      • NetworkAllocationSpec defines the desired allocation, including LbID, Port, Protocol, and a PodRef linking to the Pod that owns the allocation.
      • Cloud provider plugins (initially JDCloud NLB and TencentCloud CLB) have been modified to:
        • Create a NetworkAllocation CR instance when a port is successfully allocated to a Pod.
        • Delete the corresponding NetworkAllocation CR when the port is deallocated.
    • Why:
      • Durability: Persists vital port allocation state beyond the controller's memory, preventing loss on restarts.
      • Observability: Allows administrators and other components to inspect current port allocations using kubectl get networkallocations.
      • Step towards HA (Issue #220, "[feat] Kruise Game Multi-Replicas Migration"): Moves critical state out of individual controller replica memory, which is essential for enabling safe multi-replica deployments of the controller manager. Storing this information in CRDs leverages Kubernetes' own extensibility and storage (etcd) as a first step, reducing immediate external dependencies for this persistence layer.

Key Benefits:

  • Improved reliability and trustworthiness of webhook TLS, especially in preparation for multi-replica deployments.
  • Enhanced durability and observability of network port allocations made by cloud provider plugins.
  • Provides a significant architectural step towards full multi-replica high availability for the kruise-game-controller-manager.

Limitations and Known Issues (Work in Progress):

  1. Network Allocation Decision Logic (Multi-Replica Safety):

    • While this PR successfully persists the results of port allocations using the NetworkAllocation CRD, the port allocation decision-making logic within the cloud provider plugins (e.g., the in-memory c.cache in jdcloud/nlb.go and tencentcloud/clb.go) is not yet fully multi-replica safe.
    • In a multi-replica scenario, controllers might still make allocation decisions based on their potentially stale local in-memory caches, which could lead to port conflicts or inefficiencies. This PR focuses on persisting the outcome as a foundational step.
  2. NetworkAllocation CRD Scalability:

    • The current implementation creates one NetworkAllocation CR per allocated port. For Pods requiring numerous ports or in large-scale deployments, this could result in a high volume of CRs, potentially impacting API server and etcd performance.
  3. Certificate Lifecycle Management:

    • The SecretCertWriter ensures consistent certificate generation and storage in a Secret. However, for full lifecycle management, including automated rotation and integration with trusted CAs (like Let's Encrypt), further integration with a tool like cert-manager would be beneficial.

Future Work and Proposed Next Steps:

  1. Achieve Full HA for Network Allocation Logic:
    • Refactor the port allocation decision-making within cloud provider plugins to be multi-replica safe. This would involve:
      • Consulting existing NetworkAllocation CRs (e.g., by listing them) to determine port availability before making an allocation.
      • Potentially implementing a distributed locking mechanism (e.g., using Lease objects or an external system) if strict atomicity during allocation is required across replicas.
  2. Optimize NetworkAllocation CRD:
    • Explore aggregating multiple port allocations for a single Pod into a single NetworkAllocation CR (e.g., by making spec.ports a list) to reduce the overall number of CR instances.
  3. Implement OwnerReferences:
    • Ensure NetworkAllocation CRs have OwnerReferences set to their respective Pod objects to enable automatic garbage collection when Pods are deleted. (Note: PodRef is in the spec, but explicit metadata.ownerReferences is needed for GC).
  4. Full cert-manager Integration:
    • Transition from the custom SecretCertWriter to leveraging cert-manager for managing the lifecycle of webhook TLS certificates, including automated provisioning and rotation.
  5. Enhanced NetworkAllocationStatus:
    • Add more detailed fields to NetworkAllocationStatus to reflect the actual state, potential conflicts, or last validation time.

Request for Feedback & Collaboration Offer:

This is a draft PR intended to showcase my understanding of the issues and a potential direction for solutions, particularly in the context of my LFX Mentorship application for OpenKruiseGame.

I have thought carefully about this approach and its implications. If the community or mentors are interested, I would be very happy to draft a more detailed design document covering the architecture, API considerations, alternative approaches considered, and a long-term roadmap for these features. I'm eager to share this in any relevant working group or community forum to gather broader feedback and refine the solution collaboratively.

I welcome all feedback, critiques, and suggestions on this initial approach. Thank you for your consideration!

feat(network): Introduce NetworkAllocation CRD for port persistence
@kruise-bot kruise-bot requested review from furykerry and zmberg May 28, 2025 22:04
@kruise-bot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign zmberg for approval by writing /assign @zmberg in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kruise-bot

Welcome @ballista01! It looks like this is your first PR to openkruise/kruise-game 🎉

@kruise-bot

@ballista01: PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@chrisliu1995
Member

@ballista01 Thank you for your contribution. Your proposal is inspiring. However, CRDs are not suitable for network plugins because different cloud providers have different parameters.

I appreciate your PR. If you are still interested in OKG, you can work on other issues. Please leave your email, and I will contact you.

@ballista01
Member Author

@chrisliu1995 Hi, thanks for the explanation. Here's my email: ballista01@outlook.com.
