Feature Proposal: Decouple Revision from Deployment to Support Pluggable Workloads #16307

@alickreborn0

Is this a feature request?

Yes

Problem

Currently, the Knative Revision is tightly coupled to the Kubernetes Deployment resource. The Revision controller's reconciliation logic directly creates and manages a Deployment to scale the number of pods. This hardcoded dependency prevents users from applying Knative's features to other workload types that may be better suited to specific use cases (e.g., request-based autoscaling, or suspending and restoring AI agent pods).

Proposal

We propose to decouple the Revision from the Deployment by introducing a pluggable workload interface. This would allow users or other controllers to specify which kind of workload a Revision should manage.

Our suggested approach is to introduce a "Workload" or "Scaler" abstraction layer. The Revision controller would interact with this abstract interface rather than directly with Deployments.

  1. Define a Workload Interface: Create a Go interface that abstracts the necessary operations for a scalable resource, for example:

    // Workload defines the interface for a resource that can be scaled by a Knative Revision.
    type Workload interface {
        // Reconcile ensures the underlying workload resource matches the desired state.
        Reconcile(ctx context.Context, rev *v1.Revision) error

        // GetReadyReplicas returns the number of ready replicas for the workload.
        GetReadyReplicas(ctx context.Context, rev *v1.Revision) (int32, error)

        // GetStatus returns the current status of the workload.
        // WorkloadStatus is a new type that this proposal would need to define.
        GetStatus(ctx context.Context, rev *v1.Revision) (*WorkloadStatus, error)
    }
  2. Default Implementation: The default implementation of this interface would encapsulate the existing logic that manages Deployments, ensuring full backward compatibility.

  3. Pluggable Mechanism: Allow users to specify the desired workload type via an annotation on the Service or Revision object. For example:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: my-stateful-service
      annotations:
        serving.knative.dev/workload-type: "Sandbox" # or "CustomJob", etc.
        serving.knative.dev/workload-apiversion: "agents.x-k8s.io/v1alpha1"
    spec:
      template:
        # ... revision template spec
  4. Controller Logic: The Revision controller would check for this annotation. If present, it would use the corresponding Workload implementation. If absent, it would fall back to the default Deployment implementation.
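
The selection step in item 4 could look roughly like the sketch below. It is only an illustration under stated assumptions: `Revision` is a simplified stand-in for `*v1.Revision`, `StubWorkload`, `Registry`, `NewRegistry`, `Register`, and `For` are hypothetical names that do not exist in Knative Serving, and returning an error for an unregistered type is a choice made here, since the proposal does not say what should happen in that case.

```go
package main

import (
	"context"
	"fmt"
)

// WorkloadTypeAnnotation is the annotation key from the proposal's example.
const WorkloadTypeAnnotation = "serving.knative.dev/workload-type"

// Revision is a simplified stand-in for *v1.Revision; only the fields this
// sketch needs are modelled (assumption).
type Revision struct {
	Name        string
	Annotations map[string]string
}

// Workload is a trimmed copy of the proposed interface (GetStatus omitted).
type Workload interface {
	Reconcile(ctx context.Context, rev *Revision) error
	GetReadyReplicas(ctx context.Context, rev *Revision) (int32, error)
}

// StubWorkload is a hypothetical no-op implementation used for illustration.
type StubWorkload struct{ Kind string }

func (s StubWorkload) Reconcile(ctx context.Context, rev *Revision) error { return nil }

func (s StubWorkload) GetReadyReplicas(ctx context.Context, rev *Revision) (int32, error) {
	return 0, nil
}

// Registry maps workload-type annotation values to implementations.
type Registry struct {
	fallback Workload
	byType   map[string]Workload
}

// NewRegistry returns a Registry that uses fallback when no annotation is set.
func NewRegistry(fallback Workload) *Registry {
	return &Registry{fallback: fallback, byType: map[string]Workload{}}
}

// Register associates an annotation value such as "Sandbox" with w.
func (r *Registry) Register(kind string, w Workload) {
	r.byType[kind] = w
}

// For selects the Workload for rev. A missing annotation yields the fallback;
// an unregistered value is an error (assumption: the proposal does not say).
func (r *Registry) For(rev *Revision) (Workload, error) {
	kind, ok := rev.Annotations[WorkloadTypeAnnotation]
	if !ok || kind == "" {
		return r.fallback, nil
	}
	w, found := r.byType[kind]
	if !found {
		return nil, fmt.Errorf("revision %q: unknown workload type %q", rev.Name, kind)
	}
	return w, nil
}
```

In this sketch the Revision controller would build one `Registry` at startup, with the Deployment-backed implementation as the fallback, and call `For(rev)` at the top of each reconcile. Existing Revisions carry no annotation, so they would keep the current Deployment behavior.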

Use Case: AI Agent Runtimes with Kubernetes Sandboxes

A compelling use case is managing AI Agent runtimes. The ideal model for this scenario is "one session, one pod," where each agent session gets a dedicated, isolated, and stateful environment. A standard Deployment is ill-suited for this, as it's designed for stateless, replicated services.

A much better fit would be a StatefulSet of size 1, or a custom resource like Sandbox from the kubernetes-sigs/agent-sandbox project, which provides features like hibernation and strong isolation with gVisor.

With the current architecture, we are forced to choose:

  • Use Knative and accept the limitations of a Deployment for a stateful, session-based workload.
  • Use the more appropriate Sandbox workload but lose all the benefits of Knative's autoscaling and traffic management.

By making the workload type pluggable, we could combine the best of both worlds: using Knative to manage Sandbox resources, allowing us to scale agent environments to zero when inactive and efficiently manage traffic to them.

Benefits

  • Flexibility: Enables Knative to manage a wider variety of workloads, including stateful applications, AI/ML models, and sandboxed environments.
  • Extensibility: Allows the community and other projects to integrate new and custom workload types with Knative Serving without requiring changes to the core codebase.
  • Unlocks New Use Cases: Makes Knative a more versatile and powerful platform for a broader range of serverless and event-driven applications beyond simple stateless services.

/area API
/kind feature-request

Labels: area/API, kind/feature, kind/feature-request