487 changes: 409 additions & 78 deletions darwin-cluster-manager/charts/darwin-fastapi-serve/README.md

Large diffs are not rendered by default.

107 changes: 107 additions & 0 deletions darwin-cluster-manager/charts/darwin-fastapi-serve/VALIDATION.md
@@ -0,0 +1,107 @@
# Helm Template Validation Report

## Test Summary

All template rendering tests passed. The chart renders the expected workload, Service, Ingress, and HPA resources for each deployment strategy.

## Test Results

### 1. Default (Kubernetes Deployment + RollingUpdate)

**Command:**
```bash
helm template test-ml-serve . --set name=test-deployment --set image.tag=test
```

**Results:**
- ✅ Renders `Deployment` resource
- ✅ Renders single `Service` resource
- ✅ Strategy is `RollingUpdate` with configurable `maxSurge` and `maxUnavailable`
- ✅ HPA targets `Deployment` kind
- ✅ Ingress backend points to main service
- ✅ PDB selects pods correctly
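
The resource kinds listed above can be spot-checked directly from the rendered output. The following is a minimal sketch, assuming the chart emits standard top-level `kind:` lines; `count_kinds` is a hypothetical helper for illustration and is not part of the chart.

```bash
# Hypothetical helper: summarize the resource kinds in rendered manifests read from stdin.
# Only top-level `kind:` lines (column 0) are counted, so nested fields such as
# `scaleTargetRef.kind`, which are indented, are ignored.
count_kinds() {
  grep -E '^kind:' | awk '{print $2}' | sort | uniq -c | awk '{print $2 "=" $1}'
}

# Usage (requires helm; run from the chart directory):
#   helm template test-ml-serve . --set name=test-deployment --set image.tag=test | count_kinds
```

For the default configuration the summary should include a `Deployment` entry and no `Rollout` entry.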

### 2. Kubernetes Recreate Strategy

**Command:**
```bash
helm template test-recreate . --set name=test --set image.tag=v1 --set deployment.kubernetes.type=Recreate
```

**Results:**
- ✅ Renders `Deployment` resource
- ✅ Strategy is `Recreate` (no rollingUpdate configuration)
- ✅ Single service and standard ingress configuration

### 3. Argo Rollouts Canary + ALB

**Command:**
```bash
helm template test-canary . \
  --set name=test \
  --set image.tag=v1 \
  --set deployment.strategy=argo-rollouts \
  --set deployment.rollouts.strategy=canary \
  --set deployment.rollouts.trafficRouting.provider=alb
```

**Results:**
- ✅ Renders `Rollout` resource (not Deployment)
- ✅ Renders `service-stable.yaml` (stable service)
- ✅ Renders `service-canary.yaml` (canary service)
- ✅ Renders `service-root.yaml` (ALB root service with `use-annotation` port)
- ✅ Does NOT render main `service.yaml` (correctly conditional)
- ✅ Ingress backend service name uses root service
- ✅ Ingress backend port name is `use-annotation` (required for ALB action-based routing)
- ✅ HPA targets `Rollout` kind (apiVersion: argoproj.io/v1alpha1)
- ✅ Rollout includes `trafficRouting.alb` configuration with ingress references
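
Some of the canary + ALB expectations above could be asserted in a script instead of being read off by eye. This is a rough sketch under the assumption that the rendered manifests use top-level `kind:` lines and contain the literal string `use-annotation`; `check_canary_alb` is a hypothetical name, and the check is deliberately text-based rather than a full YAML parse.

```bash
# Hypothetical check: read rendered manifests from stdin and verify the
# canary + ALB expectations that can be detected with plain text matching.
check_canary_alb() {
  manifests="$(cat)"

  # A Rollout must be rendered.
  printf '%s\n' "$manifests" | grep -qE '^kind: Rollout[[:space:]]*$' \
    || { echo "missing Rollout"; return 1; }

  # A Deployment must not be rendered alongside it.
  if printf '%s\n' "$manifests" | grep -qE '^kind: Deployment[[:space:]]*$'; then
    echo "unexpected Deployment"
    return 1
  fi

  # ALB action-based routing relies on the use-annotation port name.
  printf '%s\n' "$manifests" | grep -q 'use-annotation' \
    || { echo "missing use-annotation port"; return 1; }

  echo "ok"
}

# Usage (requires helm; run from the chart directory):
#   helm template test-canary . --set name=test --set image.tag=v1 \
#     --set deployment.strategy=argo-rollouts \
#     --set deployment.rollouts.strategy=canary \
#     --set deployment.rollouts.trafficRouting.provider=alb | check_canary_alb
```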

### 4. Argo Rollouts Blue/Green

**Command:**
```bash
helm template test-bluegreen . \
  --set name=test \
  --set image.tag=v1 \
  --set deployment.strategy=argo-rollouts \
  --set deployment.rollouts.strategy=blueGreen
```

**Results:**
- ✅ Renders `Rollout` resource (not Deployment)
- ✅ Renders `service-active.yaml` (active/blue service)
- ✅ Renders `service-preview.yaml` (preview/green service)
- ✅ Does NOT render canary or root services (correctly conditional)
- ✅ Ingress backend points to active service
- ✅ HPA targets `Rollout` kind
- ✅ Rollout includes `blueGreen.activeService` and `blueGreen.previewService` references
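
The HPA target kind is nested inside `scaleTargetRef`, so a top-level `kind:` match does not show it. One possible way to extract it is sketched below; `hpa_target_kind` is a hypothetical helper, and it assumes space-indented YAML in which `kind:` appears somewhere after the `scaleTargetRef:` line.

```bash
# Hypothetical helper: print the `kind` found under each `scaleTargetRef:` block
# in manifests read from stdin. This is a line-based scan, not a YAML parser.
hpa_target_kind() {
  awk '
    /^ *scaleTargetRef:/     { in_ref = 1; next }            # entering the reference block
    in_ref && /^ *kind:/     { print $2; in_ref = 0 }        # first kind after it is the target
  '
}

# Usage (requires helm; run from the chart directory):
#   helm template test-bluegreen . --set name=test --set image.tag=v1 \
#     --set deployment.strategy=argo-rollouts \
#     --set deployment.rollouts.strategy=blueGreen | hpa_target_kind
```

For both Argo Rollouts modes the expected value is `Rollout`; for the Kubernetes modes it is `Deployment`.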

## Resource Conditionals Summary

| Strategy Mode | Workload | Services | Ingress Backend |
|--------------|----------|----------|-----------------|
| kubernetes (default) | Deployment | main service | main service |
| kubernetes (Recreate) | Deployment | main service | main service |
| argo-rollouts (canary + ALB) | Rollout | stable, canary, root | root service (port: use-annotation) |
| argo-rollouts (canary + NGINX) | Rollout | stable, canary | stable service |
| argo-rollouts (blueGreen) | Rollout | active, preview | active service |
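
If these checks are ever scripted, the table above could be encoded as a small lookup. The sketch below is illustrative only: the mode labels (`kubernetes`, `canary-alb`, `canary-nginx`, `bluegreen`) are hypothetical shorthand for the rows of the table, not values accepted by the chart.

```bash
# Hypothetical lookup: expected Service set for each strategy mode in the table.
expected_services() {
  case "$1" in
    kubernetes)   echo "main" ;;                # default and Recreate
    canary-alb)   echo "stable canary root" ;;  # root service fronts the ALB
    canary-nginx) echo "stable canary" ;;
    bluegreen)    echo "active preview" ;;
    *)            echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Example: expected_services canary-alb
```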

## Validation Status

✅ **All template rendering tests passed**

The chart correctly:
- Renders the appropriate workload kind based on strategy
- Creates strategy-specific services
- Configures ingress backends appropriately
- Updates HPA scaleTargetRef to match workload kind
- Maintains backwards compatibility (default behavior unchanged)

## Manual Testing Required

The following tests require a live cluster and are documented but not automated:

- **9.2**: Manual validation in staging cluster for canary rollout behavior (stepWeight progression, pause, rollback, HPA scaling)
- **9.3**: Manual validation in staging cluster for blue/green behavior (preview service accessibility, promotion, traffic cutover)

These will be performed during the staging deployment phase.
@@ -0,0 +1,99 @@
# Blue/Green deployment with manual promotion
# This example shows blue/green strategy where the new version (green/preview)
# is deployed and accessible for testing before manually promoting to active (blue)

deployment:
  strategy: argo-rollouts
  rollouts:
    strategy: blueGreen
    blueGreen:
      # Manual promotion (operator must explicitly promote after validation)
      autoPromotionEnabled: false

      # Optional: Auto-promote after N seconds if no manual action
      # autoPromotionSeconds: 300

      # Optional: Keep old version for N seconds after promotion for quick rollback
      # scaleDownDelaySeconds: 30

      # Optional: Preview ingress for testing the new version before promotion
      previewIngress:
        enabled: true
        namespace: darwin
        ingressClass: alb
        hosts:
          - ml-serve-preview.darwin.dream11-k8s.local
        annotations:
          external-dns.alpha.kubernetes.io/hostname: ml-serve-preview.darwin.dream11-k8s.local
        path: "/*"
        pathType: ImplementationSpecific

# Internal ingress (routes to active service)
ingressInt:
  enabled: true
  ingressClass: alb
  namespace: darwin
  path: "/*"
  pathType: ImplementationSpecific
  hosts:
    - ml-serve.darwin.dream11-k8s.local
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ml-serve.darwin.dream11-k8s.local
  healthcheckPath: "/healthcheck"
  tags: "Environment=prod, Service=ml-serve-bluegreen"

# External ingress disabled for this example
ingressExt:
  enabled: false

# HPA settings (scales both blue and green ReplicaSets)
hpa:
  enabled: true
  maxReplicas: 5
  cpu: 70

# Application settings
name: ml-serve-bluegreen
replicaCount: 2
image:
  repository: my-registry/ml-serve
  tag: v1.0.0
  pullPolicy: Always

service:
  enabled: true
  type: ClusterIP
  httpPort: 8000
  externalPort: 80

# PDB for availability
pdb:
  enabled: true
  minAvailable: 1

# Resources
resources:
  limits:
    cpu: 2
    memory: 4G
  requests:
    cpu: 1
    memory: 2G

---
# How to use this configuration:
#
# 1. Deploy initial version:
#    helm upgrade --install ml-serve ./darwin-fastapi-serve -f bluegreen.yaml
#
# 2. Deploy new version (creates green/preview environment):
#    helm upgrade ml-serve ./darwin-fastapi-serve -f bluegreen.yaml --set image.tag=v2.0.0
#
# 3. Test preview environment:
#    curl https://ml-serve-preview.darwin.dream11-k8s.local/healthcheck
#
# 4. Promote to active (the active service switches to the new version):
#    kubectl argo rollouts promote ml-serve-bluegreen -n darwin
#
# 5. Abort before promotion if validation fails (the active service keeps serving the old version):
#    kubectl argo rollouts abort ml-serve-bluegreen -n darwin
#    After promotion, roll back by redeploying the previous image tag.
@@ -0,0 +1,85 @@
# Canary deployment with ALB (internal + external ingresses)
# This example shows progressive traffic shifting on both internal and external ingresses

deployment:
  strategy: argo-rollouts
  rollouts:
    strategy: canary
    trafficRouting:
      provider: alb
      servicePort: 80
    canary:
      steps:
        - setWeight: 10
        - pause: {duration: 2m}
        - setWeight: 25
        - pause: {duration: 5m}
        - setWeight: 50
        - pause: {duration: 5m}
        - setWeight: 75
        - pause: {duration: 2m}

# Internal ingress configuration
ingressInt:
  enabled: true
  ingressClass: alb
  namespace: darwin
  path: "/*"
  pathType: ImplementationSpecific
  hosts:
    - ml-serve-internal.darwin.dream11-k8s.local
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ml-serve-internal.darwin.dream11-k8s.local
  healthcheckPath: "/healthcheck"
  tags: "Environment=prod, Service=ml-serve-canary"

# External ingress configuration (public-facing)
ingressExt:
  enabled: true
  ingressClass: alb
  namespace: darwin
  path: "/*"
  pathType: ImplementationSpecific
  hosts:
    - ml-serve.api.example.com
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ml-serve.api.example.com
  healthcheckPath: "/healthcheck"
  tags: "Environment=prod, Service=ml-serve-canary, Visibility=public"

# HPA settings (targets Rollout automatically)
hpa:
  enabled: true
  maxReplicas: 10
  cpu: 70
  memory:
    target: Utilization
    value: 80

# Application settings
name: ml-serve-canary-public
replicaCount: 3
image:
  repository: my-registry/ml-serve
  tag: v1.0.0
  pullPolicy: Always

service:
  enabled: true
  type: ClusterIP
  httpPort: 8000
  externalPort: 80

# PDB for high availability
pdb:
  enabled: true
  minAvailable: 2

# Resource limits for production
resources:
  limits:
    cpu: 4
    memory: 8G
  requests:
    cpu: 2
    memory: 4G
@@ -0,0 +1,55 @@
# Canary deployment with ALB (internal ingress only)
# This example shows progressive traffic shifting with AWS Load Balancer Controller

deployment:
  strategy: argo-rollouts
  rollouts:
    strategy: canary
    trafficRouting:
      provider: alb
      servicePort: 80
    canary:
      steps:
        - setWeight: 20
        - pause: {duration: 1m}
        - setWeight: 40
        - pause: {duration: 1m}
        - setWeight: 60
        - pause: {duration: 1m}
        - setWeight: 80
        - pause: {duration: 1m}

# Internal ingress configuration
ingressInt:
  enabled: true
  ingressClass: alb
  path: "/*"
  pathType: ImplementationSpecific
  hosts:
    - ml-serve-internal.darwin.dream11-k8s.local
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ml-serve-internal.darwin.dream11-k8s.local

# External ingress disabled for internal-only deployments
ingressExt:
  enabled: false

# HPA settings (works with Rollouts)
hpa:
  enabled: true
  maxReplicas: 5
  cpu: 60

# Application settings
name: ml-serve-canary-internal
replicaCount: 2
image:
  repository: my-registry/ml-serve
  tag: v1.0.0
  pullPolicy: Always

service:
  enabled: true
  type: ClusterIP
  httpPort: 8000
  externalPort: 80