| 1 | +# Go Client SDK for Agent Sandbox |
| 2 | + |
| 3 | +This Go client provides a simple, high-level interface for creating and interacting with |
| 4 | +sandboxes managed by the Agent Sandbox controller. It handles the full SandboxClaim lifecycle |
| 5 | +(creation, readiness, cleanup) so callers only need to think about running commands and |
| 6 | +transferring files. |
| 7 | + |
| 8 | +It supports three connectivity modes: **Gateway** (Kubernetes Gateway API), **Port-Forward** |
| 9 | +(native SPDY tunnel), and **Direct URL** (in-cluster or custom domain). |
| 10 | + |
| 11 | +## Architecture |
| 12 | + |
| 13 | +The client operates in three modes: |
| 14 | + |
| 15 | +1. **Production (Gateway Mode):** Traffic flows from the Client -> Cloud Load Balancer (Gateway) |
| 16 | + -> Router Service -> Sandbox Pod. The client watches the Gateway resource for an external IP. |
| 17 | +2. **Development (Port-Forward Mode):** Traffic flows from the Client -> SPDY tunnel -> Router |
| 18 | + Service -> Sandbox Pod. Uses `client-go/tools/portforward` natively — no `kubectl` required. |
| 19 | +3. **Advanced / Internal Mode:** The client connects directly to a provided `APIURL`, bypassing |
| 20 | + discovery. Useful for in-cluster agents or custom domains. |
| 21 | + |
| 22 | +## Prerequisites |
| 23 | + |
| 24 | +- A running Kubernetes cluster with a valid kubeconfig (or in-cluster config). This is required even in Direct URL mode because the client creates Kubernetes clientsets for SandboxClaim lifecycle management. |
| 25 | +- The [**Agent Sandbox Controller**](https://github.com/kubernetes-sigs/agent-sandbox?tab=readme-ov-file#installation) installed. |
| 26 | +- The **Sandbox Router** deployed in the target namespace (`sandbox-router-svc`). |
| 27 | +- A `SandboxTemplate` created in the target namespace. |
| 28 | +- Go 1.26+. |
| 29 | + |
| 30 | +## Installation |
| 31 | + |
| 32 | +```bash |
| 33 | +go get sigs.k8s.io/agent-sandbox/clients/go/sandbox |
| 34 | +``` |
| 35 | + |
| 36 | +## Usage Examples |
| 37 | + |
| 38 | +### 1. Production Mode (Gateway) |
| 39 | + |
| 40 | +Use this when running against a cluster with a public Gateway IP. The client automatically |
| 41 | +discovers the Gateway address. |
| 42 | + |
| 43 | +```go |
| 44 | +client, err := sandbox.NewClient(sandbox.Options{ |
| 45 | + TemplateName: "my-sandbox-template", |
| 46 | + GatewayName: "external-http-gateway", |
| 47 | + GatewayNamespace: "default", |
| 48 | + Namespace: "default", |
| 49 | +}) |
| 50 | +if err != nil { log.Fatal(err) } |
| 51 | +defer client.Close(context.Background()) |
| 52 | + |
| 53 | +ctx := context.Background() |
| 54 | +if err := client.Open(ctx); err != nil { log.Fatal(err) } |
| 55 | + |
| 56 | +result, err := client.Run(ctx, "echo 'Hello from Cloud!'") |
| 57 | +if err != nil { log.Fatal(err) } |
| 58 | +fmt.Println(result.Stdout) |
| 59 | +``` |
| 60 | + |
| 61 | +### 2. Developer Mode (Port-Forward) |
| 62 | + |
| 63 | +Use this for local development or CI. If you omit `GatewayName` and `APIURL`, the client |
| 64 | +automatically establishes an SPDY port-forward tunnel to the Router Service. |
| 65 | + |
| 66 | +```go |
| 67 | +client, err := sandbox.NewClient(sandbox.Options{ |
| 68 | + TemplateName: "my-sandbox-template", |
| 69 | + Namespace: "default", |
| 70 | +}) |
| 71 | +if err != nil { log.Fatal(err) } |
| 72 | +defer client.Close(context.Background()) |
| 73 | + |
| 74 | +ctx := context.Background() |
| 75 | +if err := client.Open(ctx); err != nil { log.Fatal(err) } |
| 76 | + |
| 77 | +result, err := client.Run(ctx, "echo 'Hello from Local!'") |
| 78 | +if err != nil { log.Fatal(err) } |
| 79 | +fmt.Println(result.Stdout) |
| 80 | +``` |
| 81 | + |
| 82 | +### 3. Advanced / Internal Mode |
| 83 | + |
| 84 | +Use `APIURL` to bypass discovery entirely. Useful for: |
| 85 | + |
| 86 | +- **Internal Agents:** Running inside the cluster (connect via K8s DNS). |
| 87 | +- **Custom Domains:** Connecting via HTTPS (e.g., `https://sandbox.example.com`). |
| 88 | + |
| 89 | +```go |
| 90 | +client, err := sandbox.NewClient(sandbox.Options{ |
| 91 | + TemplateName: "my-sandbox-template", |
| 92 | + APIURL: "http://sandbox-router-svc.default.svc.cluster.local:8080", |
| 93 | + Namespace: "default", |
| 94 | +}) |
| 95 | +if err != nil { log.Fatal(err) } |
| 96 | +defer client.Close(context.Background()) |
| 97 | + |
| 98 | +ctx := context.Background() |
| 99 | +if err := client.Open(ctx); err != nil { log.Fatal(err) } |
| 100 | + |
| 101 | +entries, err := client.List(ctx, ".") |
| 102 | +if err != nil { log.Fatal(err) } |
| 103 | +fmt.Println(entries) |
| 104 | +``` |
| 105 | + |
| 106 | +### 4. Custom Ports |
| 107 | + |
| 108 | +If your sandbox runtime listens on a port other than 8888, specify `ServerPort`. |
| 109 | + |
| 110 | +```go |
| 111 | +client, err := sandbox.NewClient(sandbox.Options{ |
| 112 | + TemplateName: "my-sandbox-template", |
| 113 | + ServerPort: 3000, |
| 114 | +}) |
| 115 | +``` |
| 116 | + |
| 117 | +### File Operations |
| 118 | + |
| 119 | +```go |
| 120 | +// Write a file (only the base filename is sent; directory components are discarded). |
| 121 | +// Paths like "", ".", "..", and "/" are rejected with an error. |
| 122 | +err := client.Write(ctx, "script.py", []byte("print('hello')")) |
| 123 | + |
| 124 | +// Read a file |
| 125 | +data, err := client.Read(ctx, "script.py") |
| 126 | + |
| 127 | +// Check existence |
| 128 | +exists, err := client.Exists(ctx, "script.py") |
| 129 | +``` |
| 130 | + |
| 131 | +### 5. Custom TLS / Transport |
| 132 | + |
| 133 | +If your Gateway uses HTTPS with a private CA, provide a custom transport: |
| 134 | + |
| 135 | +```go |
| 136 | +tlsConfig := &tls.Config{RootCAs: myCAPool} |
| 137 | +client, err := sandbox.NewClient(sandbox.Options{ |
| 138 | + TemplateName: "my-sandbox-template", |
| 139 | + GatewayName: "external-https-gateway", |
| 140 | + GatewayScheme: "https", |
| 141 | + HTTPTransport: &http.Transport{TLSClientConfig: tlsConfig}, |
| 142 | +}) |
| 143 | +``` |
| 144 | + |
| 145 | +## Configuration |
| 146 | + |
| 147 | +All options are documented on the `Options` struct in |
| 148 | +[options.go](sandbox/options.go). Key fields: |
| 149 | + |
| 150 | +- `TemplateName` *(required)* — name of the `SandboxTemplate`. |
| 151 | +- `GatewayName` — set to enable Gateway mode. |
| 152 | +- `APIURL` — set for Direct URL mode (takes precedence over `GatewayName`). |
| 153 | +- `EnableTracing` / `TracerProvider` — OpenTelemetry integration. |
| 154 | + |
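The precedence described above (an explicit `APIURL` wins over `GatewayName`, and port-forward is the fallback when neither is set) can be sketched as a small selection function. The `options` struct and the `mode` names are stand-ins invented for this illustration; they are not the SDK's types.

```go
package main

import "fmt"

// options mirrors only the two fields relevant to mode selection.
// It is a stand-in for sandbox.Options, not the real struct.
type options struct {
	APIURL      string
	GatewayName string
}

// mode is a hypothetical label for the three connectivity modes.
type mode string

const (
	modeDirectURL   mode = "direct-url"
	modeGateway     mode = "gateway"
	modePortForward mode = "port-forward"
)

// selectMode applies the documented precedence: APIURL first, then
// GatewayName, and port-forward when neither is set.
func selectMode(o options) mode {
	switch {
	case o.APIURL != "":
		return modeDirectURL
	case o.GatewayName != "":
		return modeGateway
	default:
		return modePortForward
	}
}

func main() {
	fmt.Println(selectMode(options{}))
	fmt.Println(selectMode(options{GatewayName: "external-http-gateway"}))
	fmt.Println(selectMode(options{
		APIURL:      "http://sandbox-router-svc.default.svc.cluster.local:8080",
		GatewayName: "external-http-gateway",
	}))
}
```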
| 155 | +Any operation accepts `WithTimeout` to override the default request timeout: |
| 156 | + |
| 157 | +```go |
| 158 | +result, err := client.Run(ctx, "make build", sandbox.WithTimeout(10*time.Minute)) |
| 159 | +``` |
| 160 | + |
| 161 | +## Retry Behavior |
| 162 | + |
| 163 | +Operations are automatically retried on 5xx responses and connection errors with |
| 164 | +exponential backoff. See constants in [transport.go](sandbox/transport.go) for details. |
| 165 | + |
| 166 | +## Port-Forward Recovery |
| 167 | + |
| 168 | +In port-forward mode, a background monitor detects tunnel death and clears the |
| 169 | +client's ready state. Subsequent operations fail immediately with `ErrNotReady` |
| 170 | +(wrapping `ErrPortForwardDied`) instead of timing out. |
| 171 | + |
| 172 | +To recover, call `Open()` again — the client will verify the claim and sandbox |
| 173 | +still exist, then establish a new tunnel: |
| 174 | + |
| 175 | +```go |
| 176 | +result, err := client.Run(ctx, "echo hi") |
| 177 | +if errors.Is(err, sandbox.ErrNotReady) { |
| 178 | + // Port-forward died; reconnect. |
| 179 | + if reconnErr := client.Open(ctx); reconnErr != nil { |
| 180 | + if errors.Is(reconnErr, sandbox.ErrOrphanedClaim) { |
| 181 | + // Sandbox no longer ready or verification failed; clean up and start fresh. |
| 182 | + client.Close(ctx) |
| 183 | + reconnErr = client.Open(ctx) |
| 184 | + } |
| 185 | + if reconnErr != nil { |
| 186 | + log.Fatalf("reconnect failed: %v", reconnErr)
| 187 | + } |
| 188 | + } |
| 189 | + result, err = client.Run(ctx, "echo hi") |
| 190 | +} |
| 191 | +``` |
| 192 | + |
| 193 | +If `Close()` fails to delete the claim (e.g., API server unavailable), the client |
| 194 | +preserves the claim name so `Close()` can be retried to clean up the orphaned claim. |
| 195 | +Calling `Open()` on a client with an orphaned claim returns `ErrOrphanedClaim`. |
| 196 | + |
| 197 | +## Error Sentinel Reference |
| 198 | + |
| 199 | +| Error | Meaning | |
| 200 | +|-------|---------| |
| 201 | +| `ErrNotReady` | Client is not open or transport died. Call `Open()`. | |
| 202 | +| `ErrAlreadyOpen` | `Open()` called on an already-open client. Call `Close()` first. | |
| 203 | +| `ErrOrphanedClaim` | A previous claim could not be cleaned up (failed `Close()`, failed `Open()` rollback, or sandbox disappeared during reconnect); call `Close()` to retry deletion. | |
| 204 | +| `ErrTimeout` | Sandbox or Gateway did not become ready within the configured timeout. | |
| 205 | +| `ErrClaimFailed` | SandboxClaim creation was rejected by the API server. | |
| 206 | +| `ErrPortForwardDied` | The SPDY tunnel dropped. Call `Open()` to reconnect. | |
| 207 | +| `ErrRetriesExhausted` | All HTTP retry attempts failed. | |
| 208 | +| `ErrSandboxDeleted` | The Sandbox was deleted before becoming ready. | |
| 209 | +| `ErrGatewayDeleted` | The Gateway was deleted during address discovery. | |
| 210 | + |
| 211 | +## Testing / Mocking |
| 212 | + |
| 213 | +The package exports two interfaces: |
| 214 | + |
| 215 | +- **`Client`** — the core API (`Open`, `Close`, `Run`, `Read`, `Write`, `List`, |
| 216 | + `Exists`, `IsReady`). Accept this in your APIs to enable testing with fakes. |
| 217 | +- **`SandboxInfo`** — read-only identity accessors (`ClaimName`, `SandboxName`, |
| 218 | + `PodName`, `Annotations`). These are on the concrete `*SandboxClient` (and the |
| 219 | + `SandboxInfo` interface) rather than `Client`, so adding new accessors is not |
| 220 | + a breaking change for mock implementors. |
| 221 | + |
| 222 | +```go |
| 223 | +// Accept the narrow Client interface for testability. |
| 224 | +func ProcessInSandbox(ctx context.Context, sb sandbox.Client) error { |
| 225 | + if err := sb.Open(ctx); err != nil { |
| 226 | + return err |
| 227 | + } |
| 228 | + defer sb.Close(context.Background()) |
| 229 | + _, err := sb.Run(ctx, "echo hello") // inspect the result as needed
| 230 | + return err
| 231 | +} |
| 232 | + |
| 233 | +// When you need identity metadata, accept the concrete type or SandboxInfo. |
| 234 | +func LogSandboxIdentity(info sandbox.SandboxInfo) { |
| 235 | + log.Printf("claim=%s sandbox=%s pod=%s", info.ClaimName(), info.SandboxName(), info.PodName()) |
| 236 | +} |
| 237 | +``` |
| 238 | + |
| 239 | +## Running Tests |
| 240 | + |
| 241 | +### Unit Tests |
| 242 | + |
| 243 | +```bash |
| 244 | +go test ./clients/go/sandbox/ -v -count=1 |
| 245 | +``` |
| 246 | + |
| 247 | +### Integration Tests |
| 248 | + |
| 249 | +Integration tests require a running cluster with the Agent Sandbox controller and a |
| 250 | +`SandboxTemplate` installed. They are behind the `integration` build tag. |
| 251 | + |
| 252 | +```bash |
| 253 | +# Dev mode (port-forward) |
| 254 | +INTEGRATION_TEST=1 go test ./clients/go/sandbox/ -tags=integration -v -timeout=300s |
| 255 | + |
| 256 | +# Gateway mode |
| 257 | +go test ./clients/go/sandbox/ -tags=integration -v -timeout=300s \ |
| 258 | + -args --gateway-name=external-http-gateway --gateway-namespace=default |
| 259 | + |
| 260 | +# Direct URL mode |
| 261 | +go test ./clients/go/sandbox/ -tags=integration -v -timeout=300s \ |
| 262 | + -args --api-url=http://sandbox-router:8080 |
| 263 | +``` |