| 1 | +--- |
| 2 | +title: Enterprise environments (DEV/UAT/PROD) |
| 3 | +subtitle: Promotion and configuration management for assistants and squads |
| 4 | +--- |
| 5 | + |
| 6 | +## Purpose |
| 7 | + |
| 8 | +Provide enterprise teams a repeatable, auditable way to build, test, and promote assistant and squad configurations across environments. |
| 9 | + |
| 10 | +## Audience |
| 11 | + |
| 12 | +- **Platform admins**: environment boundaries, access control, and compliance |
| 13 | +- **DevOps/Eng**: CI/CD and automation |
| 14 | +- **Forward-deployed engineers**: day-to-day configuration and migrations |
| 15 | + |
| 16 | +## Principles |
| 17 | + |
| 18 | +- **Isolation**: Separate organizations per environment: `dev`, `uat` (or `staging`), `prod`. |
| 19 | +- **Config as Code**: Store assistant/squad/tool/knowledge-base configs as JSON/YAML in Git. |
| 20 | +- **Immutability + Promotion**: Create in `dev`, validate in `uat`, promote to `prod` via automation. |
| 21 | +- **Least privilege**: RBAC, secrets isolation, and data boundaries per environment. |
| 22 | +- **Reproducibility**: Idempotent apply, drift detection, and rollbacks from Git history. |
| 23 | + |
| 24 | +## Environment topology |
| 25 | + |
| 26 | +- **Organizations**: One org per environment, e.g., `acme-dev`, `acme-uat`, `acme-prod`. |
| 27 | +- **Networking & data**: |
| 28 | + - `dev`: synthetic or scrubbed data |
| 29 | + - `uat`: production-like data; avoid real PII where possible |
| 30 | + - `prod`: real data with strict logging/audit |
| 31 | +- **Access**: |
| 32 | + - `dev`: engineers only |
| 33 | + - `uat`: QA, SMEs |
| 34 | + - `prod`: restricted operators; changes via CI/CD only |
| 35 | + |
| 36 | +## Resources under management |
| 37 | + |
| 38 | +Treat these as declarative resources: |
| 39 | +- **Assistants**: system prompt, tools, routing, grounding, safety settings |
| 40 | +- **Squads/Teams**: membership and permissions |
| 41 | +- **Tools/Integrations**: function schemas, external service configs |
| 42 | +- **Knowledge Bases**: document sources, embedding settings |
| 43 | +- **Runtimes/Policies**: rate limits, safety policies, fallback models |
| 44 | + |
| 45 | +Reference resources by stable logical names (slugs) in config; resolve to IDs at apply time. |
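As a rough illustration of resolving logical names at apply time, the sketch below maps `ref` slugs to environment-specific IDs. The `idsBySlug` map, the ID values, and the `resolveRefs` helper are hypothetical; a real deployer would likely build the map from a list call against the target org.

```javascript
// Hypothetical slug -> ID map for one environment (all IDs are made up).
const idsBySlug = {
  tool: { jira: 'tool_123', zendesk: 'tool_456' },
  knowledge: { 'product-faqs': 'kb_789' },
};

// Replace each { ref: slug } entry with the ID registered for that slug.
// An unknown slug throws, so a typo fails the apply instead of
// producing a dangling reference.
function resolveRefs(refs, kind, ids) {
  return (refs ?? []).map(({ ref }) => {
    const id = ids[kind]?.[ref];
    if (!id) throw new Error(`Unknown ${kind} ref: ${ref}`);
    return { slug: ref, id };
  });
}

const spec = {
  tools: [{ ref: 'jira' }, { ref: 'zendesk' }],
  knowledge: [{ ref: 'product-faqs' }],
};

const resolvedTools = resolveRefs(spec.tools, 'tool', idsBySlug);
console.log(resolvedTools);
```

Because the spec only ever names slugs, the same YAML can be applied to `dev`, `uat`, and `prod` with a different ID map per org.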
| 46 | + |
| 47 | +## Repository structure (example) |
| 48 | + |
| 49 | +```text |
| 50 | +/platform |
| 51 | + /assistants |
| 52 | + order-agent.yaml |
| 53 | + support-agent.yaml |
| 54 | + /squads |
| 55 | + support-level1.yaml |
| 56 | + /tools |
| 57 | + jira.yaml |
| 58 | + zendesk.yaml |
| 59 | + /knowledge |
| 60 | + product-faqs.yaml |
| 61 | + /policies |
| 62 | + safety.yaml |
| 63 | + environments.yaml # maps env → org IDs, model defaults, endpoints |
| 64 | + schemas/ # JSONSchema for validation |
| 65 | +``` |
| 66 | + |
| 67 | +Do not commit secrets. Store them in your secret manager (e.g., Vault, AWS Secrets Manager, GCP Secret Manager) and reference via placeholders. |
| 68 | + |
| 69 | +## Config format (YAML examples) |
| 70 | + |
| 71 | +```yaml |
| 72 | +kind: Assistant |
| 73 | +apiVersion: v1 |
| 74 | +metadata: |
| 75 | + name: order-agent |
| 76 | + description: Handles order inquiries |
| 77 | +spec: |
| 78 | + systemPromptRef: prompts/order-agent.md |
| 79 | + model: gpt-4.1 |
| 80 | + tools: |
| 81 | + - ref: jira |
| 82 | + - ref: zendesk |
| 83 | + knowledge: |
| 84 | + - ref: product-faqs |
| 85 | + safetyPolicyRef: policies/safety.yaml |
| 86 | +``` |
| 87 | + |
| 88 | +```yaml |
| 89 | +kind: Tool |
| 90 | +apiVersion: v1 |
| 91 | +metadata: |
| 92 | + name: jira |
| 93 | +spec: |
| 94 | + type: http |
| 95 | + authRef: secrets/jira-token # resolved from secret manager |
| 96 | + endpoint: https://jira.example.com |
| 97 | + operations: |
| 98 | + - name: createIssue |
| 99 | + method: POST |
| 100 | + path: /rest/api/3/issue |
| 101 | + schemaRef: schemas/jira-create-issue.json |
| 102 | +``` |
| 103 | + |
| 104 | +## Promotion workflow |
| 105 | + |
| 106 | +1. **Develop in DEV** |
| 107 | + - Create/modify configs in Git. |
| 108 | + - Run local validation (schema/lint) and a plan/diff against `dev`. |
| 109 | + - Apply to `dev`; run unit/integration tests and data access checks. |
| 110 | +2. **Promote to UAT** |
| 111 | + - Open a PR; CI runs `plan` against `uat` and posts a diff. |
| 112 | + - On approval, CI applies to `uat` using a service principal for the `uat` org. |
| 113 | +3. **Promote to PROD** |
| 114 | + - Change window + ticket if required. |
| 115 | + - CI runs `plan` against `prod` and requires approval from owners. |
| 116 | + - CI applies to `prod`; record the change set and artifacts. |
| 117 | +4. **Rollback** |
| 118 | + - Revert Git commit → CI reapplies previous config (idempotent). |
| 119 | + - Keep backup exports from each apply job for audit. |
| 120 | + |
| 121 | +## Applying configs via API |
| 122 | + |
| 123 | +Use a small deployer that: |
| 124 | +- Reads YAML/JSON |
| 125 | +- Resolves references and secrets for the target environment |
| 126 | +- Translates to API payloads |
| 127 | +- Uses idempotency keys and labels to detect drift |
| 128 | + |
| 129 | +Example pseudo-commands: |
| 130 | + |
| 131 | +```bash |
| 132 | +# Export (backup) |
| 133 | +curl -sS -H "Authorization: Bearer $TOKEN" \ |
| 134 | + "https://api.vendor.com/v1/assistants?label=order-agent" > backups/order-agent-dev.json |
| 135 | + |
| 136 | +# Apply (create or update) |
| 137 | +curl -sS -H "Authorization: Bearer $TOKEN" -H "Idempotency-Key: $KEY" \ |
| 138 | + -H "Content-Type: application/json" \ |
| 139 | + -X PUT "https://api.vendor.com/v1/assistants/order-agent" \ |
| 140 | + --data-binary @rendered/order-agent.dev.json |
| 141 | +``` |
| 142 | + |
| 143 | +Recommendations: |
| 144 | +- **Idempotency**: One key per resource per pipeline run |
| 145 | +- **Labeling**: Tag resources with `env`, `app`, `owner`, `sha` for traceability |
| 146 | +- **Drift**: Fetch current → compute diff → fail pipeline on unmanaged drift |
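The drift and idempotency recommendations above could be sketched roughly as follows. This assumes the deployer keeps the payload it last applied and can fetch the live resource; `driftedFields`, `idempotencyKey`, the key format, and the sample payloads are all illustrative, not part of any vendor API.

```javascript
// Sort object keys recursively so that key order alone never counts as drift.
function canonical(value) {
  if (Array.isArray(value)) return value.map(canonical);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.keys(value).sort().map((key) => [key, canonical(value[key])])
    );
  }
  return value;
}

// Return the managed fields whose live value differs from what the last apply wrote.
function driftedFields(lastApplied, live) {
  return Object.keys(lastApplied).filter(
    (key) =>
      JSON.stringify(canonical(lastApplied[key])) !==
      JSON.stringify(canonical(live[key]))
  );
}

// One key per resource per pipeline run (hypothetical format).
function idempotencyKey(runId, kind, name) {
  return `${runId}:${kind}:${name}`;
}

// Sample data: someone added a tool by hand in the console.
const lastApplied = {
  model: 'gpt-4.1',
  tools: [{ name: 'jira' }],
  labels: { env: 'prod', sha: 'abc123' },
};
const live = {
  model: 'gpt-4.1',
  tools: [{ name: 'jira' }, { name: 'zendesk' }],
  labels: { sha: 'abc123', env: 'prod' },
};

const drift = driftedFields(lastApplied, live);
if (drift.length > 0) {
  // A CI job would exit non-zero here to fail the pipeline.
  console.error(`Unmanaged drift in: ${drift.join(', ')}`);
}
console.log(idempotencyKey('run-42', 'Assistant', 'order-agent'));
```

Only fields present in the last applied payload are compared, so server-generated fields such as timestamps do not register as drift.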
| 147 | + |
| 148 | +## CI/CD example (GitHub Actions) |
| 149 | + |
| 150 | +```yaml |
| 151 | +name: Platform Deploy |
| 152 | + |
| 153 | +on: |
| 154 | + pull_request: |
| 155 | + push: |
| 156 | + branches: [ main ] |
| 157 | + |
| 158 | +jobs: |
| 159 | + plan: |
| 160 | + runs-on: ubuntu-latest |
| 161 | + steps: |
| 162 | + - uses: actions/checkout@v4 |
| 163 | + - uses: actions/setup-node@v4 |
| 164 | + with: { node-version: 20 } |
| 165 | + - run: npm ci |
| 166 | + - name: Validate |
| 167 | + run: npm run validate:all |
| 168 | + - name: Plan UAT |
| 169 | + env: |
| 170 | + ORG_ID: ${{ secrets.UAT_ORG_ID }} |
| 171 | + API_TOKEN: ${{ secrets.UAT_TOKEN }} |
| 172 | + run: npm run plan -- --env uat --out plan-uat.txt |
| 173 | + - uses: actions/upload-artifact@v4 |
| 174 | + with: { name: plan-uat, path: plan-uat.txt } |
| 175 | + |
| 176 | + deploy-prod: |
| 177 | + if: github.ref == 'refs/heads/main' |
| 178 | + needs: [ plan ] |
| 179 | + permissions: { contents: read } |
| 180 | + runs-on: ubuntu-latest |
| 181 | + environment: |
| 182 | + name: prod |
| 183 | + url: https://console.vendor.com/orgs/${{ vars.PROD_ORG_ID }} |
| 184 | + steps: |
| 185 | + - uses: actions/checkout@v4 |
| 186 | + - uses: actions/setup-node@v4 |
| 187 | + with: { node-version: 20 } |
| 188 | + - run: npm ci |
| 189 | + - name: Apply PROD |
| 190 | + env: |
| 191 | + ORG_ID: ${{ secrets.PROD_ORG_ID }} |
| 192 | + API_TOKEN: ${{ secrets.PROD_TOKEN }} |
| 193 | + run: npm run apply -- --env prod --approve |
| 194 | +``` |
| 195 | + |
| 196 | +## Naming and referencing |
| 197 | + |
| 198 | +- Use the same slug (e.g., `order-agent`) in every environment; it only needs to be unique within each org |
| 199 | +- Prefer logical refs in specs; map to environment-specific IDs at render/apply time |
| 200 | +- Save Git commit SHA as a label on each resource for traceability |
| 201 | + |
| 202 | +## Security and compliance |
| 203 | + |
| 204 | +- **RBAC**: Developers write to `dev`, read `uat`, no direct `prod` writes; CI principals per org |
| 205 | +- **Secrets**: Keep out of Git. Resolve via `secrets://path` at apply time; rotate per policy |
| 206 | +- **Audit**: Keep apply logs, request/response checksums, and exported snapshots per run; enable API audit logs in each org |
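One possible shape for resolving `secrets://path` placeholders at apply time is sketched below. The `resolveSecrets` helper and the in-memory `fakeStore` are stand-ins for illustration only. A real deployer would call its secret manager client here, and that call would normally be asynchronous; the lookup is kept synchronous to keep the sketch short.

```javascript
const SECRET_PREFIX = 'secrets://';

// Walk a rendered payload and swap every "secrets://path" string for the
// value returned by getSecret(path). Other values pass through unchanged.
function resolveSecrets(value, getSecret) {
  if (typeof value === 'string' && value.startsWith(SECRET_PREFIX)) {
    return getSecret(value.slice(SECRET_PREFIX.length));
  }
  if (Array.isArray(value)) {
    return value.map((item) => resolveSecrets(item, getSecret));
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, resolveSecrets(item, getSecret)])
    );
  }
  return value;
}

// Hypothetical in-memory store standing in for Vault or a cloud secret manager.
const fakeStore = new Map([['jira/token', 'example-token-value']]);
function getSecret(path) {
  if (!fakeStore.has(path)) throw new Error(`Secret not found: ${path}`);
  return fakeStore.get(path);
}

const rendered = {
  endpoint: 'https://jira.example.com',
  auth: { token: 'secrets://jira/token' },
};

const resolved = resolveSecrets(rendered, getSecret);
console.log(Object.keys(resolved.auth)); // log key names only, never secret values
```

Failing on a missing secret before any API call is made keeps a half-configured tool from reaching the target org.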
| 207 | + |
| 208 | +## Testing and validation |
| 209 | + |
| 210 | +- **Static**: JSONSchema validation; lint refs and schema compatibility |
| 211 | +- **Dynamic**: Dry-run/plan renders and diffs |
| 212 | +- **Behavioral**: Golden-path chat transcripts in `dev` and `uat`; tool execution smoke tests; canary in `prod` |
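A static reference lint can be as small as the sketch below, which checks that every `ref` in an assistant points at a config document that exists in the repository. The `lintRefs` helper is hypothetical, and the `KnowledgeBase` kind name is an assumption, since only `Assistant` and `Tool` appear in the examples in this guide.

```javascript
// Return a list of human-readable errors for refs that point at nothing.
function lintRefs(docs) {
  const known = new Set(docs.map((d) => `${d.kind}/${d.metadata.name}`));
  const errors = [];
  for (const doc of docs.filter((d) => d.kind === 'Assistant')) {
    for (const { ref } of doc.spec.tools ?? []) {
      if (!known.has(`Tool/${ref}`)) {
        errors.push(`${doc.metadata.name}: unknown tool "${ref}"`);
      }
    }
    for (const { ref } of doc.spec.knowledge ?? []) {
      // "KnowledgeBase" is an assumed kind name.
      if (!known.has(`KnowledgeBase/${ref}`)) {
        errors.push(`${doc.metadata.name}: unknown knowledge base "${ref}"`);
      }
    }
  }
  return errors;
}

// Parsed config documents, as they would look after loading the YAML files.
const docs = [
  { kind: 'Tool', metadata: { name: 'jira' }, spec: {} },
  { kind: 'KnowledgeBase', metadata: { name: 'product-faqs' }, spec: {} },
  {
    kind: 'Assistant',
    metadata: { name: 'order-agent' },
    spec: {
      tools: [{ ref: 'jira' }, { ref: 'zendesk' }],
      knowledge: [{ ref: 'product-faqs' }],
    },
  },
];

const lintErrors = lintRefs(docs);
lintErrors.forEach((message) => console.error(message));
```

Here the lint reports that `order-agent` references a `zendesk` tool with no matching document, which is the kind of error worth catching before a plan ever runs.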
| 213 | + |
| 214 | +## Operational runbooks |
| 215 | + |
| 216 | +- **Create a new assistant**: add YAML → PR → CI plans → approve → deploy to `uat` → UAT signoff → deploy to `prod` |
| 217 | +- **Change a tool**: update tool YAML; update each assistant's `spec.tools` entry that references it; ensure backward compatibility; run smoke tests |
| 218 | +- **Incident rollback**: revert commit; re-run apply; confirm labels reverted |
| 219 | + |
| 220 | +## FAQ |
| 221 | + |
| 222 | +- **How do we copy an assistant to another environment?** Export from source org (GET), normalize to YAML/JSON, check into Git, then apply to target org via CI using the deployer. |
| 223 | +- **What exactly is the “config”?** The full API payload needed to create/update the assistant, its referenced tools, knowledge bases, and policies. Store it declaratively and resolve environment-specific references at apply time. |
| 224 | +- **We don’t have built-in versioning yet. What should we do now?** Use Git as the source of truth, add labels with commit SHAs to resources, and require CI-only writes to `prod`. |
| 225 | +- **How do we handle environment-specific differences (models, endpoints)?** Parameterize via `environments.yaml` and templates; keep the logical spec identical across envs, only vary parameters. |
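To make the last answer concrete, here is a minimal sketch of parameterizing one logical spec per environment. The `${name}` placeholder syntax, the parameter names, and the per-environment values are assumptions for illustration; `environments` stands in for the parsed contents of `environments.yaml`.

```javascript
// Stand-in for parsed environments.yaml (all values are made up).
const environments = {
  dev: { model: 'gpt-4.1', jiraEndpoint: 'https://jira-dev.example.com' },
  prod: { model: 'gpt-4.1', jiraEndpoint: 'https://jira.example.com' },
};

// Recursively replace ${name} tokens in strings with environment parameters.
// A missing parameter throws so an incomplete environment fails at render time.
function renderTemplate(value, params) {
  if (typeof value === 'string') {
    return value.replace(/\$\{(\w+)\}/g, (match, name) => {
      if (!(name in params)) throw new Error(`Missing parameter: ${name}`);
      return String(params[name]);
    });
  }
  if (Array.isArray(value)) return value.map((item) => renderTemplate(item, params));
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, renderTemplate(item, params)])
    );
  }
  return value;
}

// The logical spec is identical for every environment; only parameters vary.
const toolSpec = {
  type: 'http',
  endpoint: '${jiraEndpoint}',
  operations: [{ name: 'createIssue', method: 'POST', path: '/rest/api/3/issue' }],
};

const devTool = renderTemplate(toolSpec, environments.dev);
const prodTool = renderTemplate(toolSpec, environments.prod);
console.log(devTool.endpoint, prodTool.endpoint);
```

Keeping substitution this simple means a diff between two rendered environments shows parameter differences only, which makes plan output easier to review.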
| 226 | + |
| 227 | +## Promotion checklist |
| 228 | + |
| 229 | +- **Config**: validated and reviewed |
| 230 | +- **Secrets**: present in target environment |
| 231 | +- **Diff**: plan shows expected changes only |
| 232 | +- **Tests**: UAT signoff recorded |
| 233 | +- **Approvals**: change ticket and reviewers complete |
| 234 | +- **Backups**: exported current `prod` state saved |
| 235 | +- **Monitoring**: alerts enabled for error rate and tool failures |
| 236 | + |
| 237 | +## Minimal example: render + apply (Node) |
| 238 | + |
| 239 | +```javascript |
| 240 | +import { readFileSync } from 'fs'; |
| 241 | +import yaml from 'js-yaml'; |
| 242 | +import fetch from 'node-fetch'; |
| 243 | +
|
| 244 | +const token = process.env.API_TOKEN; |
| 245 | +const orgId = process.env.ORG_ID; |
| 246 | +
|
| 247 | +async function upsertAssistant(doc) { |
| 248 | + const url = `https://api.vendor.com/v1/assistants/${doc.metadata.name}?org=${orgId}`; |
| 249 | + const res = await fetch(url, { |
| 250 | + method: 'PUT', |
| 251 | + headers: { |
| 252 | + Authorization: `Bearer ${token}`, |
| 253 | + 'Content-Type': 'application/json', |
| 254 | + 'Idempotency-Key': process.env.IDEMPOTENCY_KEY |
| 255 | + }, |
| 256 | + body: JSON.stringify(render(doc)) |
| 257 | + }); |
| 258 | + if (!res.ok) throw new Error(`Apply failed: ${res.status} ${await res.text()}`); |
| 259 | +} |
| 260 | + |
| 261 | +function render(doc) { |
| 262 | + return { |
| 263 | + name: doc.metadata.name, |
| 264 | + description: doc.metadata.description, |
| 265 | + model: doc.spec.model, |
| 266 | + tools: (doc.spec.tools ?? []).map(t => ({ name: t.ref })), |
| 267 | + labels: { env: process.env.ENV, sha: process.env.GIT_SHA } |
| 268 | + }; |
| 269 | +} |
| 270 | + |
| 271 | +const doc = yaml.load(readFileSync(process.argv[2], 'utf8')); |
| 272 | +upsertAssistant(doc) |
| 273 | + .then(() => console.log('Applied')) |
| 274 | + .catch(e => { console.error(e); process.exit(1); }); |
| 275 | +``` |