Commit 72bdb12

Merge pull request #44 from hzxuzhonghu/agent-code-runtime
Add runtime api design proposal
2 parents a4bbec6 + 3d918e3 commit 72bdb12

3 files changed

+520
-0
lines changed

Lines changed: 334 additions & 0 deletions
@@ -0,0 +1,334 @@
---
title: Sandbox Template for Agent and CodeInterpreter Runtimes
authors:
- "@hzxuzhonghu"
reviewers:
- "@volcano-sh/agentcube-approvers"
- TBD
approvers:
- "@volcano-sh/agentcube-approvers"
- TBD
creation-date: 2025-11-27

---

## Declarative API for Agent and CodeInterpreter Runtimes

### Summary

This proposal introduces a declarative API for managing Agent and CodeInterpreter runtimes in AgentCube. The goal is to let users define the desired state of their sandboxes so that the system can create them automatically when the first invocation arrives. The end-to-end workflow and the overall system architecture are out of scope here and will be covered in a separate proposal; this document focuses on the declarative API design.

### Motivation

#### Goals

- Provide a declarative API for developers to specify desired runtime states for Agent and CodeInterpreter sandboxes.
- Enable automatic sandbox creation upon first invocation based on the defined runtime templates.
- Ensure seamless integration with existing AgentCube components and workflows.

#### Non-Goals
- This proposal does not cover the entire workflow of sandbox lifecycle management, focusing solely on the declarative API design.
- It does not address `Function` runtimes.
- It does not include implementation details for the automatic creation mechanism triggered by first invocation.
- It does not cover the workflow details of the dataplane.

### Proposal

#### User Stories (Optional)

##### Story 1

As an agentic AI developer, I want a security-isolated runtime for executing LLM-generated code, so that potentially untrusted analysis code can run safely without impacting other tenants, control-plane components, or production data.
I should be able to declare this isolated runtime as part of a sandbox template, without manually managing pods, images, or security settings each time I add or update a code-interpreter-capable agent.

##### Story 2

As an agentic AI developer, I want to deploy my agents on the serverless platform in a declarative manner, specifying runtime configurations such as resource limits, environment variables, and security contexts in a sandbox template, so that I can ensure consistent and repeatable deployments across different environments without manual intervention.

### Design Details

#### Why do we need a separate API rather than the existing SandboxTemplate?

There is an existing [`SandboxTemplate`](http://github.com/kubernetes-sigs/agent-sandbox/blob/main/extensions/api/v1alpha1/sandboxtemplate_types.go#L57) in the kubernetes-sigs/agent-sandbox project. However, it is designed as a generic template for various sandbox types and may not cater to the specific requirements of Agent and CodeInterpreter runtimes.

- `SandboxTemplate` simply reuses the pod template, which makes it hard to express multi-version runtimes. Because it is common to run multiple versions of an Agent or CodeInterpreter runtime, a more specialized template is needed to manage these variations effectively.
- Different runtimes may have distinct configuration needs that a generic pod template does not address well. For example, we will likely need a warm pool for the code interpreter runtime but not for the agent runtime, because a code interpreter needs very low cold-start latency while an agent runtime can tolerate a longer cold start.
- Different runtimes may serve different protocols or endpoints that require specific handling not covered by a generic template.

By introducing a dedicated runtime template, we can tailor the API to better suit the specific needs of these runtimes, such as specialized configuration options, lifecycle management, and integration points.

#### Agent Runtime CRD

```go
// AgentRuntime defines the desired state of an agent runtime environment.
type AgentRuntime struct {
	metav1.TypeMeta `json:",inline"`
	// metadata is the standard object metadata.
	// +optional
	metav1.ObjectMeta `json:"metadata,omitempty,omitzero" protobuf:"bytes,1,opt,name=metadata"`
	// Spec defines the desired state of the AgentRuntime.
	Spec AgentRuntimeSpec `json:"spec" protobuf:"bytes,2,opt,name=spec"`
	// Status represents the current state of the AgentRuntime.
	Status AgentRuntimeStatus `json:"status" protobuf:"bytes,3,opt,name=status"`
}


type AgentRuntimeSpec struct {
	// Ports is a list of ports that the agent runtime will expose.
	Ports []TargetPort `json:"ports,omitempty" protobuf:"bytes,4,rep,name=ports"`

	// Template describes the template that will be used to create an agent sandbox.
	// +kubebuilder:validation:Required
	Template *SandboxTemplate `json:"template" protobuf:"bytes,1,opt,name=template"`

	// SessionTimeout describes the duration after which an inactive session will be terminated.
	// +kubebuilder:validation:Required
	// +kubebuilder:default="15m"
	SessionTimeout *metav1.Duration `json:"sessionTimeout,omitempty" protobuf:"bytes,2,opt,name=sessionTimeout"`

	// MaxSessionDuration describes the maximum duration for a session.
	// After this duration, the session will be terminated regardless of whether it is active or inactive.
	// +kubebuilder:validation:Required
	// +kubebuilder:default="8h"
	MaxSessionDuration *metav1.Duration `json:"maxSessionDuration,omitempty" protobuf:"bytes,3,opt,name=maxSessionDuration"`
}

type ProtocolType string

const (
	ProtocolTypeHTTP  ProtocolType = "HTTP"
	ProtocolTypeHTTPS ProtocolType = "HTTPS"
)

type TargetPort struct {
	// PathPrefix is the path prefix to route to this port.
	// For example, if PathPrefix is "/api", requests to "/api/..." will be routed to this port.
	// +optional
	PathPrefix string `json:"pathPrefix,omitempty" protobuf:"bytes,4,opt,name=pathPrefix"`
	// Name is the name of the port.
	// +optional
	Name string `json:"name,omitempty" protobuf:"bytes,1,opt,name=name"`
	// Port is the port number.
	Port uint32 `json:"port" protobuf:"varint,2,opt,name=port"`
	// Protocol is the protocol of the port.
	// +kubebuilder:default=HTTP
	// +kubebuilder:validation:Enum=HTTP;HTTPS
	Protocol ProtocolType `json:"protocol" protobuf:"bytes,3,opt,name=protocol"`
}

type SandboxTemplate struct {
	// Labels to apply to the sandbox Pod.
	// +optional
	Labels map[string]string `json:"labels,omitempty" protobuf:"bytes,1,rep,name=labels"`

	// Annotations to apply to the sandbox Pod.
	// +optional
	Annotations map[string]string `json:"annotations,omitempty" protobuf:"bytes,2,rep,name=annotations"`

	// Spec is the Pod's spec
	// +kubebuilder:validation:Required
	Spec corev1.PodSpec `json:"spec" protobuf:"bytes,3,opt,name=spec"`
}
```
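
The `PathPrefix` comment on `TargetPort` implies prefix-based routing to a port. The following is a minimal sketch of how a router might pick the port for a request path. It is not part of the proposal: `matchPort` and `hasPathPrefix` are hypothetical helpers, the types are trimmed copies of the ones above, and the "longest prefix wins, match only on a path-segment boundary" rule is an assumption.

```go
package main

import "strings"

// ProtocolType and TargetPort are trimmed copies of the proposal's types
// (struct tags omitted for brevity).
type ProtocolType string

type TargetPort struct {
	PathPrefix string
	Name       string
	Port       uint32
	Protocol   ProtocolType
}

// matchPort returns the TargetPort whose PathPrefix matches the request path.
// Assumption: the longest matching prefix wins. An empty PathPrefix (or "/")
// matches every path and therefore acts as a catch-all.
func matchPort(ports []TargetPort, path string) (TargetPort, bool) {
	var best TargetPort
	bestLen := -1
	for _, p := range ports {
		prefix := strings.TrimSuffix(p.PathPrefix, "/")
		if !hasPathPrefix(path, prefix) {
			continue
		}
		if len(prefix) > bestLen {
			best, bestLen = p, len(prefix)
		}
	}
	return best, bestLen >= 0
}

// hasPathPrefix reports whether prefix matches path on a segment boundary,
// so "/api" matches "/api" and "/api/v1" but not "/apiv2".
func hasPathPrefix(path, prefix string) bool {
	if prefix == "" {
		return true
	}
	if !strings.HasPrefix(path, prefix) {
		return false
	}
	return len(path) == len(prefix) || path[len(prefix)] == '/'
}
```

With ports declared for `/api` (8080) and `/metrics` (9090), a request for `/api/v1/chat` would resolve to port 8080 under these assumptions, while `/apiv2` would match no port.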

Below is an example of how to define an `AgentRuntime` resource for an agent runtime environment:

```yaml
apiVersion: runtime.agentcube.io/v1alpha1 # adjust to your actual group/version
kind: AgentRuntime
metadata:
  name: foo
  labels:
    app: foo
spec:
  # Ports exposed by the runtime; your router/proxy will use these.
  ports:
    - name: http
      port: 8080
      protocol: HTTP
      pathPrefix: /api
    - name: metrics
      port: 9090
      protocol: HTTP
      pathPrefix: /metrics

  # Template used to create the sandbox pod per session
  template:
    labels:
      app: code-foo
      component: runtime
    spec:
      containers:
        - name: runtime
          image: ghcr.io/your-org/foo-runtime:latest
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          env:
            - name: PYTHONUNBUFFERED
              value: "1"
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "2"
              memory: "4Gi"
          securityContext:
            runAsNonRoot: true
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
      restartPolicy: Always

  # After 15 minutes of inactivity, terminate the session's sandbox
  sessionTimeout: 15m

  # Hard cap: session will be terminated after 8 hours regardless of activity
  maxSessionDuration: 8h
```

With the `AgentRuntime` published, callers can access the agent runtime through the endpoint `https://<agent-frontend>:<frontend-port>/v1/namespaces/{agentNamespace}/agent-runtimes/{agentName}/invocations/<agent specific path>`.
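
As an illustration of the endpoint layout, the sketch below assembles an invocation URL from its parts. `invocationURL` is a hypothetical helper, not an AgentCube API, and the host and port in the usage note are placeholders; the `resource` parameter is the path segment naming the runtime kind (`agent-runtimes` here).

```go
package main

import (
	"net/url"
	"strings"
)

// invocationURL builds the invocation endpoint for a published runtime.
// frontend is "<host>:<port>" of the agent frontend, resource is the
// path segment for the runtime kind, and subPath is the runtime-specific
// path appended after /invocations/.
func invocationURL(frontend, resource, namespace, name, subPath string) string {
	u := url.URL{
		Scheme: "https",
		Host:   frontend,
		Path: "/v1/namespaces/" + namespace + "/" + resource + "/" + name +
			"/invocations/" + strings.TrimPrefix(subPath, "/"),
	}
	return u.String()
}
```

For the `foo` runtime in the `default` namespace, `invocationURL("frontend.example.com:8443", "agent-runtimes", "default", "foo", "/api/chat")` yields `https://frontend.example.com:8443/v1/namespaces/default/agent-runtimes/foo/invocations/api/chat`.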

#### CodeInterpreter CRD

```go
// CodeInterpreter defines the desired state of a code interpreter runtime environment.
//
// This runtime is designed for running potentially untrusted, LLM-generated code in an
// isolated sandbox, typically per user/session, with stricter security and resource controls
// compared to a generic AgentRuntime.
type CodeInterpreter struct {
	metav1.TypeMeta `json:",inline"`
	// metadata is the standard object metadata.
	// +optional
	metav1.ObjectMeta `json:"metadata,omitempty,omitzero" protobuf:"bytes,1,opt,name=metadata"`
	// Spec defines the desired state of the CodeInterpreter.
	Spec CodeInterpreterSpec `json:"spec" protobuf:"bytes,2,opt,name=spec"`
	// Status represents the current state of the CodeInterpreter.
	Status CodeInterpreterStatus `json:"status" protobuf:"bytes,3,opt,name=status"`
}

// CodeInterpreterSpec describes how to create and manage code-interpreter sandboxes.
type CodeInterpreterSpec struct {
	// Ports is a list of ports that the code interpreter runtime will expose.
	// These ports are typically used by the router / apiserver to proxy HTTP or gRPC
	// traffic into the sandbox (e.g., /execute, /files, /health).
	// If not specified, the defaults of AgentCube's code interpreter are used.
	// +optional
	Ports []TargetPort `json:"ports,omitempty" protobuf:"bytes,1,rep,name=ports"`

	// Template describes the template that will be used to create a code interpreter sandbox.
	// This SHOULD be more locked down than a generic agent runtime (e.g. no hostPath,
	// restricted capabilities, read-only root filesystem, etc.).
	// +kubebuilder:validation:Required
	Template *CodeInterpreterSandboxTemplate `json:"template" protobuf:"bytes,2,opt,name=template"`

	// SessionTimeout describes the duration after which an inactive code-interpreter
	// session will be terminated. Any sandbox that has not received requests within
	// this duration is eligible for cleanup.
	// +kubebuilder:validation:Required
	// +kubebuilder:default="15m"
	SessionTimeout *metav1.Duration `json:"sessionTimeout,omitempty" protobuf:"bytes,3,opt,name=sessionTimeout"`

	// MaxSessionDuration describes the maximum duration for a code-interpreter session.
	// After this duration, the session will be terminated regardless of activity, to
	// prevent long-lived sandboxes from accumulating unbounded state.
	// +kubebuilder:validation:Required
	// +kubebuilder:default="8h"
	MaxSessionDuration *metav1.Duration `json:"maxSessionDuration,omitempty" protobuf:"bytes,4,opt,name=maxSessionDuration"`

	// WarmPoolSize specifies the number of pre-warmed sandboxes to maintain
	// for this code interpreter runtime. Pre-warmed sandboxes can reduce startup
	// latency for new sessions at the cost of additional resource usage.
	// +optional
	WarmPoolSize *int32 `json:"warmPoolSize,omitempty" protobuf:"varint,5,opt,name=warmPoolSize"`
}

// CodeInterpreterSandboxTemplate mirrors SandboxTemplate but is kept separate in case
// we want to evolve code-interpreter specific defaults independently in the future.
// For now, it can be used in controllers or validation if a distinct type is helpful.
type CodeInterpreterSandboxTemplate struct {
	// Labels to apply to the sandbox Pod.
	// +optional
	Labels map[string]string `json:"labels,omitempty" protobuf:"bytes,1,rep,name=labels"`

	// Annotations to apply to the sandbox Pod.
	// +optional
	Annotations map[string]string `json:"annotations,omitempty" protobuf:"bytes,2,rep,name=annotations"`

	// RuntimeClassName specifies the Kubernetes RuntimeClass used to run the sandbox
	// (e.g., for kuasar / Kata-based isolation).
	// +optional
	RuntimeClassName *string `json:"runtimeClassName,omitempty" protobuf:"bytes,3,opt,name=runtimeClassName"`

	// Image indicates the container image to use for the code interpreter runtime.
	Image string `json:"image,omitempty" protobuf:"bytes,4,opt,name=image"`

	// Environment specifies the environment variables to set in the code interpreter runtime.
	Environment []corev1.EnvVar `json:"environment,omitempty" protobuf:"bytes,5,rep,name=environment"`

	// Entrypoint array. Not executed within a shell.
	// The container image's ENTRYPOINT is used if this is not provided.
	// +optional
	// +listType=atomic
	Command []string `json:"command,omitempty" protobuf:"bytes,6,rep,name=command"`

	// Arguments to the entrypoint.
	// The container image's CMD is used if this is not provided.
	// +optional
	Args []string `json:"args,omitempty" protobuf:"bytes,7,rep,name=args"`

	// Compute Resources required by this container.
	// More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
	// +optional
	Resources corev1.ResourceRequirements `json:"resources,omitempty" protobuf:"bytes,8,opt,name=resources"`
}
```
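
The `WarmPoolSize` field trades idle resource usage for lower startup latency. As a rough sketch of what maintaining the pool could mean for a controller, the hypothetical helper below computes how many pre-warmed sandboxes are missing. The function name and the notion of an "idle" count are assumptions; the proposal leaves the warm pool mechanism unspecified.

```go
package main

// sandboxesToCreate returns how many additional pre-warmed sandboxes are
// needed to bring the pool up to warmPoolSize. idle is the number of
// pre-warmed sandboxes that are currently unassigned.
// Assumption: an unset (nil) WarmPoolSize means no warm pool is kept.
func sandboxesToCreate(warmPoolSize *int32, idle int32) int32 {
	if warmPoolSize == nil || *warmPoolSize <= idle {
		return 0
	}
	return *warmPoolSize - idle
}
```

With `warmPoolSize: 2`, an empty pool needs two new sandboxes, a pool with one idle sandbox needs one more, and a pool that already holds two or more needs none.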

Similarly, below is an example of how to define a `CodeInterpreter` for a code interpreter runtime environment:

```yaml
apiVersion: runtime.agentcube.io/v1alpha1 # adjust to your actual group/version
kind: CodeInterpreter
metadata:
  name: example-code-interpreter
spec:
  ports:
    - name: http
      port: 8080
  template:
    labels:
      app: example-code-interpreter
    annotations:
      description: "An example code interpreter runtime"
    runtimeClassName: "kata"
    image: "example/code-interpreter:latest"
    environment:
      - name: EXAMPLE_ENV
        value: "example"
    command:
      - "/usr/local/bin/code-interpreter"
    args:
      - "--config"
      - "/etc/code-interpreter/config.yaml"
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"
      limits:
        cpu: "1"
        memory: "1Gi"
  sessionTimeout: "15m"
  maxSessionDuration: "8h"
  warmPoolSize: 2
```

With the `CodeInterpreter` published, callers can access the runtime through the endpoint `https://<agent-frontend>:<frontend-port>/v1/namespaces/{namespace}/code-interpreters/{name}/invocations/<code interpreter specific path>`.

### Alternatives

We could also build a RESTful API server to manage the lifecycle of agent runtimes, which would make runtime management more flexible. However, it would introduce additional complexity in deploying, scaling, and maintaining that server. So in the first stage, we use Kubernetes CRDs and an operator to manage the different kinds of runtime.
