Description
Is this a feature request?
Yes
Problem
Currently, the Knative Revision is tightly coupled to the Kubernetes Deployment resource. The revision controller's reconciliation logic directly creates and manages a Deployment to scale the number of pods. This hardcoded dependency prevents users from applying Knative's features to other types of workloads that might be better suited to specific use cases (e.g., request-based autoscaling, or resuming and restoring AI agent pods).
Proposal
We propose to decouple the Revision from the Deployment by introducing a pluggable workload interface. This would allow users or other controllers to specify which kind of workload a Revision should manage.
Our suggested approach is to introduce a "Workload" or "Scaler" abstraction layer. The Revision controller would interact with this abstract interface rather than directly with Deployments.
- Define a Workload Interface: Create a Go interface that abstracts the necessary operations for a scalable resource, for example:

```go
// Workload defines the interface for a resource that can be scaled by a Knative Revision.
type Workload interface {
	// Reconcile ensures the underlying workload resource matches the desired state.
	Reconcile(ctx context.Context, rev *v1.Revision) error

	// GetReadyReplicas returns the number of ready replicas for the workload.
	GetReadyReplicas(ctx context.Context, rev *v1.Revision) (int32, error)

	// GetStatus returns the current status of the workload.
	GetStatus(ctx context.Context, rev *v1.Revision) (*WorkloadStatus, error)
}
```
- Default Implementation: The default implementation of this interface would encapsulate the existing logic that manages `Deployments`, ensuring full backward compatibility.
- Pluggable Mechanism: Allow users to specify the desired workload type via an annotation on the `Service` or `Revision` object. For example:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-stateful-service
  annotations:
    serving.knative.dev/workload-type: "Sandbox" # or "CustomJob", etc.
    serving.knative.dev/workload-apiversion: "agents.x-k8s.io/v1alpha1"
spec:
  template:
    # ... revision template spec
```
- Controller Logic: The `Revision` controller would check for this annotation. If present, it would use the corresponding `Workload` implementation. If absent, it would fall back to the default `Deployment` implementation.
Use Case: AI Agent Runtimes with Kubernetes Sandboxes
A compelling use case is managing AI Agent runtimes. The ideal model for this scenario is "one session, one pod," where each agent session gets a dedicated, isolated, and stateful environment. A standard Deployment is ill-suited for this, as it's designed for stateless, replicated services.
A much better fit would be a StatefulSet of size 1, or a custom resource like Sandbox from the kubernetes-sigs/agent-sandbox project, which provides features like hibernation and strong isolation with gVisor.
With the current architecture, we are forced to choose:
- Use Knative and accept the limitations of a `Deployment` for a stateful, session-based workload.
- Use the more appropriate `Sandbox` workload but lose all the benefits of Knative's autoscaling and traffic management.
By making the workload type pluggable, we could combine the best of both worlds: using Knative to manage Sandbox resources, allowing us to scale agent environments to zero when inactive and efficiently manage traffic to them.
Benefits
- Flexibility: Enables Knative to manage a wider variety of workloads, including stateful applications, AI/ML models, and sandboxed environments.
- Extensibility: Allows the community and other projects to integrate new and custom workload types with Knative Serving without requiring changes to the core codebase.
- Unlocks New Use Cases: Makes Knative a more versatile and powerful platform for a broader range of serverless and event-driven applications beyond simple stateless services.
/area API
/kind feature-request