Commit 2c388e5

robholland and carlydf authored
Add migration doc draft. (#91)
Fixes #34
Co-authored-by: Carly de Frondeville <[email protected]>
1 parent 751dc8f commit 2c388e5

4 files changed

+1128
-0
lines changed

docs/README.md

Lines changed: 3 additions & 0 deletions
@@ -11,6 +11,9 @@ This documentation structure is designed to support various types of technical d
1111

1212
## Index
1313

14+
### [Migration Guide](migration-guide.md)
15+
Comprehensive guide for migrating from existing unversioned worker deployment systems to the Temporal Worker Controller. Includes step-by-step instructions, configuration mapping, and common patterns.
16+
1417
### [Limits](limits.md)
1518
Technical constraints and limitations of the Temporal Worker Controller system, including maximum field lengths and other operational boundaries.
1619

docs/concepts.md

Lines changed: 143 additions & 0 deletions
@@ -0,0 +1,143 @@
1+
# Temporal Worker Controller Concepts
2+
3+
This document defines key concepts and terminology used throughout the Temporal Worker Controller documentation.
4+
5+
## Core Terminology
6+
7+
### Temporal Worker Deployment
8+
A logical grouping in Temporal that represents a collection of workers that are deployed together and should be versioned together. Examples include "payment-processor", "notification-sender", or "data-pipeline-worker". This is a concept within Temporal itself, not specific to Kubernetes. See https://docs.temporal.io/production-deployment/worker-deployments/worker-versioning for more details.
9+
10+
**Key characteristics:**
11+
- Identified by a unique worker deployment name (e.g., "payment-processor/staging")
12+
- Can have multiple concurrent worker versions running simultaneously
13+
- Versions of a Worker Deployment are identified by Build IDs (e.g., "v1.5.1", "v1.5.2")
14+
- Temporal routes workflow executions to worker versions based on the `RoutingConfig` of the Worker Deployment that contains those versions
15+
16+
### `TemporalWorkerDeployment` CRD
17+
The Kubernetes Custom Resource Definition that manages one Temporal Worker Deployment. This is the primary resource you interact with when using the Temporal Worker Controller.
18+
19+
**Key characteristics:**
20+
- One `TemporalWorkerDeployment` Custom Resource per Temporal Worker Deployment
21+
- Manages the lifecycle of all versions for that worker deployment
22+
- Defines rollout strategies, resource requirements, and connection details
23+
- Controller creates and manages multiple Kubernetes `Deployment` resources based on this spec
24+
25+
### Kubernetes `Deployment`
The actual Kubernetes `Deployment` resources that run worker pods. The controller creates these automatically; you don't manage them directly.
26+
27+
**Key characteristics:**
28+
- Multiple Kubernetes `Deployment` resources per `TemporalWorkerDeployment` Custom Resource (one per version)
29+
- Named with the pattern: `{worker-deployment-name}-{build-id}` (e.g., `staging/payment-processor-v1.5.1`)
30+
- Each runs a specific version of your worker code
31+
32+
### Key Relationship
33+
**One `TemporalWorkerDeployment` Custom Resource → Multiple Kubernetes `Deployment` resources (managed by controller)**
34+
35+
Make changes to the spec of your `TemporalWorkerDeployment` Custom Resource, and the controller handles all the underlying Kubernetes `Deployment` resources for different versions.
36+
37+
## Version States
38+
39+
Worker deployment versions progress through various states during their lifecycle:
40+
41+
### NotRegistered
42+
The version has been specified in the `TemporalWorkerDeployment` custom resource but hasn't been registered with Temporal yet. This typically happens when:
43+
- The worker pods are still starting up
44+
- There are connectivity issues to Temporal
45+
- The worker code has errors preventing registration
46+
47+
### Inactive
48+
The version is registered with Temporal but isn't automatically receiving any new workflow executions through the Worker Deployment's `RoutingConfig`. This is the initial state for new versions before they are promoted via Versioning API calls. Inactive versions can receive workflow executions via `VersioningOverride` only.
49+
50+
### Ramping
51+
The version is receiving a percentage of new workflow executions. If managed by a Progressive rollout, the percentage gradually increases according to the configured rollout steps. If the rollout is Manual, the user is responsible for setting the ramp percentage and ramping version.
52+
53+
### Current
54+
The current version receives all new workflow executions except those routed to the Ramping version. This is the "stable" version that handles the majority of traffic - all new workflows not being ramped to a newer version, plus all existing AutoUpgrade workflows running on the task queues in this Worker Deployment.
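The split between the Current and Ramping versions can be illustrated with a minimal Go sketch. It is only a simplified model: the `routingConfig` struct, its field names, and the `pickVersion` function are hypothetical stand-ins and not the real Temporal `RoutingConfig` type or routing algorithm. The build IDs reuse the example values from earlier in this document.

```go
package main

import "fmt"

// routingConfig is a simplified, hypothetical stand-in for a Worker
// Deployment's RoutingConfig; the real Temporal type has more fields.
type routingConfig struct {
	CurrentBuildID string  // version in the Current state
	RampingBuildID string  // version in the Ramping state, "" if none
	RampPercentage float64 // share of new executions sent to the Ramping version, 0-100
}

// pickVersion returns the build ID a new workflow execution would be routed
// to under this model. roll is a value in [0, 100), for example a random
// draw; it is passed in so the function stays deterministic.
func pickVersion(rc routingConfig, roll float64) string {
	if rc.RampingBuildID != "" && roll < rc.RampPercentage {
		return rc.RampingBuildID
	}
	return rc.CurrentBuildID
}

func main() {
	rc := routingConfig{CurrentBuildID: "v1.5.1", RampingBuildID: "v1.5.2", RampPercentage: 10}
	fmt.Println(pickVersion(rc, 5))  // v1.5.2
	fmt.Println(pickVersion(rc, 50)) // v1.5.1
}
```

With a 10% ramp, only rolls below 10 land on the Ramping version; everything else goes to the Current version.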
55+
56+
### Draining
57+
The version is no longer receiving new workflow executions but may still be processing existing workflows.
58+
59+
### Drained
60+
All Pinned workflows on this version have completed. The version is ready for cleanup according to the sunset configuration.
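As a compact summary of the states above, here is a minimal Go sketch. The type, constant, and function names are illustrative assumptions, not the controller's actual API; only the state names themselves come from this document.

```go
package main

import "fmt"

// versionState is a hypothetical enum mirroring the states described above.
type versionState string

const (
	stateNotRegistered versionState = "NotRegistered"
	stateInactive      versionState = "Inactive"
	stateRamping       versionState = "Ramping"
	stateCurrent       versionState = "Current"
	stateDraining      versionState = "Draining"
	stateDrained       versionState = "Drained"
)

// receivesNewExecutions reports whether the Worker Deployment's RoutingConfig
// sends new workflow executions to a version in this state. Inactive versions
// are excluded because they are reachable only via VersioningOverride.
func receivesNewExecutions(s versionState) bool {
	return s == stateRamping || s == stateCurrent
}

// readyForCleanup reports whether the sunset configuration applies.
func readyForCleanup(s versionState) bool {
	return s == stateDrained
}

func main() {
	states := []versionState{
		stateNotRegistered, stateInactive, stateRamping,
		stateCurrent, stateDraining, stateDrained,
	}
	for _, s := range states {
		fmt.Printf("%-13s new=%t cleanup=%t\n", s, receivesNewExecutions(s), readyForCleanup(s))
	}
}
```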
61+
62+
## Rollout Strategies
63+
64+
### Manual Strategy
65+
Requires explicit human intervention to promote versions. New versions remain in the `Inactive` state until manually promoted.
66+
67+
**Use cases:**
68+
- Advanced deployment scenarios that the other strategies do not support (e.g., running custom testing and validation before changing how workflow traffic is routed)
69+
70+
### AllAtOnce Strategy
71+
Immediately routes 100% of new workflow executions to the target version once it's healthy and registered.
72+
73+
**Use cases:**
74+
- Non-production environments
75+
- Low-risk deployments
76+
- When you want immediate cutover without gradual rollout
77+
78+
### Progressive Strategy
79+
Gradually increases the percentage of new workflow executions routed to the new version according to configured steps.
80+
81+
**Use cases:**
82+
- Production deployments where you want to validate new versions gradually
83+
- When you want automated rollouts with built-in safety checks
84+
- Deployments that benefit from canary analysis
85+
86+
## Configuration Concepts
87+
88+
### Worker Options
89+
Configuration that tells the controller how to connect to the same Temporal cluster and namespace that the worker is connected to:
90+
- **connection**: Reference to a `TemporalConnection` custom resource
91+
- **temporalNamespace**: The Temporal namespace to connect to
92+
- **deploymentName**: The logical deployment name in Temporal (auto-generated if not specified)
93+
94+
### Rollout Configuration
95+
Defines how new versions are promoted:
96+
- **strategy**: Manual, AllAtOnce, or Progressive
97+
- **steps**: For Progressive strategy, defines ramp percentages and pause durations
98+
- **gate**: Optional workflow that must succeed on all task queues in the target Worker Deployment Version before promotion continues
99+
100+
### Sunset Configuration
101+
Defines how Drained versions are cleaned up:
102+
- **scaledownDelay**: How long to wait after a version has been Drained before scaling pods to zero
103+
- **deleteDelay**: How long to wait after a version has been Drained before deleting the Kubernetes `Deployment`
104+
105+
### Template
106+
The pod template used for the target version of this worker deployment. Similar to the pod template used in a standard Kubernetes `Deployment`, but managed by the controller.
107+
108+
## Environment Variables
109+
110+
The controller automatically sets these environment variables for all worker pods:
111+
112+
### TEMPORAL_ADDRESS
113+
The host and port of the Temporal server, derived from the `TemporalConnection` custom resource.
114+
The worker must connect to this Temporal endpoint. Because the endpoint is provided by the user rather than generated by the controller, the worker does not need to read this env var if it already obtains the endpoint another way.
115+
116+
### TEMPORAL_NAMESPACE
117+
The Temporal namespace the worker should connect to, from `spec.workerOptions.temporalNamespace`.
118+
The worker must connect to this Temporal namespace. Because the namespace is provided by the user rather than generated by the controller, the worker does not need to read this env var if it already obtains the namespace another way.
119+
120+
### TEMPORAL_DEPLOYMENT_NAME
121+
The worker deployment name in Temporal, auto-generated from the `TemporalWorkerDeployment` name and Kubernetes namespace.
122+
The worker *must* use this to configure its `worker.DeploymentOptions`.
123+
124+
### TEMPORAL_WORKER_BUILD_ID
125+
The build ID for this specific version, derived from the container image tag and hash of the target pod template.
126+
The worker *must* use this to configure its `worker.DeploymentOptions`.
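A worker might read these variables at startup roughly as in the Go sketch below. It uses only the standard library: the `workerEnv` struct and `loadWorkerEnv` function are hypothetical helpers, and passing the values on to the SDK's `worker.DeploymentOptions` is left out because the exact field layout depends on the SDK version.

```go
package main

import (
	"fmt"
	"os"
)

// workerEnv is a hypothetical holder for the controller-provided settings.
type workerEnv struct {
	Address        string
	Namespace      string
	DeploymentName string
	BuildID        string
}

// loadWorkerEnv reads the four variables through getenv (os.Getenv in
// production, a fake in tests). Only the deployment name and build ID are
// required here, because they are generated by the controller and the worker
// has no other way to learn them.
func loadWorkerEnv(getenv func(string) string) (workerEnv, error) {
	cfg := workerEnv{
		Address:        getenv("TEMPORAL_ADDRESS"),
		Namespace:      getenv("TEMPORAL_NAMESPACE"),
		DeploymentName: getenv("TEMPORAL_DEPLOYMENT_NAME"),
		BuildID:        getenv("TEMPORAL_WORKER_BUILD_ID"),
	}
	if cfg.DeploymentName == "" {
		return workerEnv{}, fmt.Errorf("TEMPORAL_DEPLOYMENT_NAME is not set")
	}
	if cfg.BuildID == "" {
		return workerEnv{}, fmt.Errorf("TEMPORAL_WORKER_BUILD_ID is not set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadWorkerEnv(os.Getenv)
	if err != nil {
		// Outside a controller-managed pod the variables are usually absent.
		fmt.Println("worker config:", err)
		return
	}
	fmt.Printf("deployment=%s buildID=%s\n", cfg.DeploymentName, cfg.BuildID)
}
```

Taking `getenv` as a parameter keeps the loader testable without mutating the process environment.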
127+
128+
## Resource Management Concepts
129+
130+
### Rainbow Deployments
131+
The pattern of running multiple versions of the same service simultaneously. Running multiple versions of your workers simultaneously is essential for supporting Pinned workflows in Temporal, as Pinned workflows must continue executing on the worker version they started on.
132+
133+
### Version Lifecycle Management
134+
The automated process of:
135+
1. Registering new versions with Temporal
136+
2. Gradually routing traffic to new versions
137+
3. Cleaning up resources for drained versions
138+
139+
### Controller-Managed Resources
140+
Resources that are created, updated, and deleted automatically by the controller:
141+
- `TemporalWorkerDeployment` custom resources, to update their status
142+
- Kubernetes `Deployment` resources for each version
143+
- Labels and annotations for tracking and management
