Merged
Changes from 26 commits
Commits
27 commits
88f638f
Add migration doc draft.
robholland Jul 23, 2025
9bf4fd7
Split concepts into separate file.
robholland Aug 22, 2025
ed9a510
Remove prom annotations.
robholland Aug 22, 2025
e21e16e
Remove Next Steps which isn't really related to migration per se.
robholland Aug 22, 2025
a88aa9a
Update doc reference.
robholland Aug 26, 2025
dd2fcc2
Address some feedback.
robholland Aug 26, 2025
f400ebc
Apply suggestions from code review
robholland Aug 26, 2025
d624086
Merge branch 'main' into rh-migration
robholland Sep 4, 2025
a9c2b8b
More consistent use of custom resource.
robholland Sep 4, 2025
cd05bb6
Amendments based on feedback.
robholland Sep 5, 2025
fa035b8
Add a bit more detail about the spec triggering deploys.
robholland Sep 5, 2025
3e4843e
Don't imply things are optional.
robholland Sep 5, 2025
15d660f
Update go worker code.
robholland Sep 5, 2025
270e35e
Cutover -> rollout.
robholland Sep 5, 2025
97c9da7
Correct code.
robholland Sep 5, 2025
0a3867a
Var cleanup.
robholland Sep 5, 2025
1a22a3e
Clarity.
robholland Sep 5, 2025
81d58b9
Clarity.
robholland Sep 5, 2025
ade47a1
Remove core k8s concepts.
robholland Sep 5, 2025
0126b9e
Scaling to 1 isn't useful.
robholland Sep 5, 2025
7f3f7e5
Feedback.
robholland Sep 7, 2025
1358194
Recommend progressive rather than manual.
robholland Sep 7, 2025
0d05f12
Split configuration reference into it's own file.
robholland Sep 7, 2025
07313eb
Config reference iteration.
robholland Sep 7, 2025
443575f
Add TLS to example custom resource.
robholland Sep 7, 2025
c9b4394
Merge branch 'main' into rh-migration
robholland Sep 7, 2025
31ce011
Recommend conservative Progressive instead of Manual initial rollout
carlydf Sep 8, 2025
3 changes: 3 additions & 0 deletions docs/README.md
@@ -11,6 +11,9 @@ This documentation structure is designed to support various types of technical d

## Index

### [Migration Guide](migration-guide.md)
Comprehensive guide for migrating from existing unversioned worker deployment systems to the Temporal Worker Controller. Includes step-by-step instructions, configuration mapping, and common patterns.

### [Limits](limits.md)
Technical constraints and limitations of the Temporal Worker Controller system, including maximum field lengths and other operational boundaries.

144 changes: 144 additions & 0 deletions docs/concepts.md
@@ -0,0 +1,144 @@
# Temporal Worker Controller Concepts

This document defines key concepts and terminology used throughout the Temporal Worker Controller documentation.

## Core Terminology

### Temporal Worker Deployment
A logical grouping in Temporal that represents a collection of workers that are deployed together and should be versioned together. Examples include "payment-processor", "notification-sender", or "data-pipeline-worker". This is a concept within Temporal itself, not specific to Kubernetes. See the [Worker Versioning documentation](https://docs.temporal.io/production-deployment/worker-deployments/worker-versioning) for more details.

**Key characteristics:**
- Identified by a unique worker deployment name (e.g., "payment-processor/staging")
- Can have multiple concurrent worker versions running simultaneously
- Versions of a Worker Deployment are identified by Build IDs (e.g., "v1.5.1", "v1.5.2")
- Temporal routes workflow executions to appropriate worker versions based on the `RoutingConfig` of the Worker Deployment that the versions are in.

### `TemporalWorkerDeployment` CRD
The Kubernetes Custom Resource Definition that manages one Temporal Worker Deployment. This is the primary resource you interact with when using the Temporal Worker Controller.

**Key characteristics:**
- One `TemporalWorkerDeployment` Custom Resource per Temporal Worker Deployment
- Manages the lifecycle of all versions for that worker deployment
- Defines rollout strategies, resource requirements, and connection details
- Controller creates and manages multiple Kubernetes `Deployment` resources based on this spec

### Kubernetes `Deployment`
The actual Kubernetes `Deployment` resources that run worker pods. The controller creates these automatically; you don't manage them directly.

**Key characteristics:**
- Multiple Kubernetes `Deployment` resources per `TemporalWorkerDeployment` Custom Resource (one per version)
- Named with the pattern: `{worker-deployment-name}-{build-id}` (e.g., `staging/payment-processor-v1.5.1`)
- Each runs a specific version of your worker code

### Key Relationship
**One `TemporalWorkerDeployment` Custom Resource → Multiple Kubernetes `Deployment` resources (managed by controller)**

Make changes to the spec of your `TemporalWorkerDeployment` Custom Resource, and the controller handles all the underlying Kubernetes `Deployment` resources for different versions.

## Version States

Worker deployment versions progress through various states during their lifecycle:

### NotRegistered
The version has been specified in the `TemporalWorkerDeployment` custom resource but hasn't been registered with Temporal yet. This typically happens when:
- The worker pods are still starting up
- There are connectivity issues to Temporal
- The worker code has errors preventing registration

### Inactive
The version is registered with Temporal but isn't automatically receiving any new workflow executions through the Worker Deployment's `RoutingConfig`. This is the initial state for new versions before they are promoted via Versioning API calls. Inactive versions can receive workflow executions via `VersioningOverride` only.

### Ramping
The version is receiving a percentage of new workflow executions. If managed by a Progressive rollout, the percentage gradually increases according to the configured rollout steps. If the rollout is Manual, the user is responsible for setting the ramp percentage and ramping version.

### Current
The Current version receives all new workflow executions except those routed to the Ramping version. This is the "stable" version that handles the majority of traffic: all new workflows not being ramped to a newer version, plus all existing AutoUpgrade workflows running on the task queues in this Worker Deployment.

### Draining
The version is no longer receiving new workflow executions but may still be processing existing workflows.

### Drained
All Pinned workflows on this version have completed. The version is ready for cleanup according to the sunset configuration.
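
The states above can be summarized in a small Go sketch. The type, constant, and function names here are hypothetical and chosen for illustration; they are not the controller's API. The sketch only encodes what this section says: Ramping and Current versions receive new executions through the `RoutingConfig`, and only Drained versions are candidates for cleanup.

```go
package main

import "fmt"

// VersionState mirrors the lifecycle states described in this section.
// The type and constant names are illustrative, not the controller's API.
type VersionState string

const (
	StateNotRegistered VersionState = "NotRegistered"
	StateInactive      VersionState = "Inactive"
	StateRamping       VersionState = "Ramping"
	StateCurrent       VersionState = "Current"
	StateDraining      VersionState = "Draining"
	StateDrained       VersionState = "Drained"
)

// ReceivesRoutedWorkflows reports whether new workflow executions reach the
// version through the Worker Deployment's RoutingConfig. Inactive versions
// are excluded: they only receive work via VersioningOverride.
func ReceivesRoutedWorkflows(s VersionState) bool {
	return s == StateRamping || s == StateCurrent
}

// ReadyForCleanup reports whether the sunset configuration applies yet.
func ReadyForCleanup(s VersionState) bool {
	return s == StateDrained
}

func main() {
	states := []VersionState{
		StateNotRegistered, StateInactive, StateRamping,
		StateCurrent, StateDraining, StateDrained,
	}
	for _, s := range states {
		fmt.Printf("%-13s routed=%-5t cleanup=%t\n",
			s, ReceivesRoutedWorkflows(s), ReadyForCleanup(s))
	}
}
```

Modeling the states as a string type keeps the values readable in logs and status output; an integer enum would work equally well for this sketch.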

## Rollout Strategies

### Manual Strategy
Requires explicit human intervention to promote versions. New versions remain in the `Inactive` state until manually promoted.

**Use cases:**
- During migration from manual deployment systems
- Testing and validation scenarios

### AllAtOnce Strategy
Immediately routes 100% of new workflow executions to the target version once it's healthy and registered.

**Use cases:**
- Non-production environments
- Low-risk deployments
- When you want immediate cutover without gradual rollout

### Progressive Strategy
Gradually increases the percentage of new workflow executions routed to the new version according to configured steps.

**Use cases:**
- Production deployments where you want to validate new versions gradually
- When you want automated rollouts with built-in safety checks
- Deployments that benefit from canary analysis

## Configuration Concepts

### Worker Options
Configuration that tells the controller how to connect to the same Temporal cluster and namespace that the worker is connected to:
- **connection**: Reference to a `TemporalConnection` custom resource
- **temporalNamespace**: The Temporal namespace to connect to
- **deploymentName**: The logical deployment name in Temporal (auto-generated if not specified)

### Rollout Configuration
Defines how new versions are promoted:
- **strategy**: Manual, AllAtOnce, or Progressive
- **steps**: For Progressive strategy, defines ramp percentages and pause durations
- **gate**: Optional workflow that must succeed on all task queues in the target Worker Deployment Version before promotion continues
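
To make the Progressive steps concrete, here is a minimal Go sketch of how a list of ramp percentages and pause durations could translate into a ramp percentage over time. It assumes that each step holds its percentage for its pause duration and that the version is promoted to Current after the last pause; it ignores the optional gate workflow. The `Step` type, `RampAt` function, and `exampleSteps` values are hypothetical and not taken from the controller.

```go
package main

import (
	"fmt"
	"time"
)

// Step is an illustrative stand-in for one Progressive rollout step.
type Step struct {
	RampPercentage int
	PauseDuration  time.Duration
}

// exampleSteps is a hypothetical, conservative rollout plan.
var exampleSteps = []Step{
	{RampPercentage: 10, PauseDuration: 5 * time.Minute},
	{RampPercentage: 50, PauseDuration: 10 * time.Minute},
}

// RampAt returns the ramp percentage in effect after `elapsed` time since the
// rollout started. Once every pause has passed, it reports 100% and
// promoted=true, meaning the version would become Current (an assumption of
// this sketch).
func RampAt(steps []Step, elapsed time.Duration) (percent int, promoted bool) {
	for _, s := range steps {
		if elapsed < s.PauseDuration {
			return s.RampPercentage, false
		}
		elapsed -= s.PauseDuration
	}
	return 100, true
}

func main() {
	for _, elapsed := range []time.Duration{0, 6 * time.Minute, 20 * time.Minute} {
		percent, promoted := RampAt(exampleSteps, elapsed)
		fmt.Printf("after %v: ramp=%d%% promoted=%t\n", elapsed, percent, promoted)
	}
}
```

Longer pauses and smaller early percentages make a rollout more conservative, at the cost of a slower promotion.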

### Sunset Configuration
Defines how Drained versions are cleaned up:
- **scaledownDelay**: How long to wait after a version has been Drained before scaling pods to zero
- **deleteDelay**: How long to wait after a version has been Drained before deleting the Kubernetes `Deployment`
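
The two delays can be read as thresholds on the time since a version became Drained. The following Go sketch illustrates that reading; the `SunsetAction` type and `NextSunsetAction` function are hypothetical names, and the choice to let deletion take precedence when both delays have elapsed is an assumption of this sketch rather than documented controller behavior.

```go
package main

import (
	"fmt"
	"time"
)

// SunsetAction is an illustrative label for what happens to a Drained version.
type SunsetAction string

const (
	ActionKeep        SunsetAction = "keep"
	ActionScaleToZero SunsetAction = "scale-to-zero"
	ActionDelete      SunsetAction = "delete"
)

// NextSunsetAction decides what to do with a version that has been Drained
// for sinceDrained. Both delays are measured from the moment the version
// became Drained. Deletion is checked first so that it wins if deleteDelay
// is shorter than scaledownDelay (an assumption of this sketch).
func NextSunsetAction(sinceDrained, scaledownDelay, deleteDelay time.Duration) SunsetAction {
	switch {
	case sinceDrained >= deleteDelay:
		return ActionDelete
	case sinceDrained >= scaledownDelay:
		return ActionScaleToZero
	default:
		return ActionKeep
	}
}

func main() {
	scaledownDelay, deleteDelay := time.Hour, 24*time.Hour
	for _, elapsed := range []time.Duration{10 * time.Minute, 2 * time.Hour, 25 * time.Hour} {
		fmt.Printf("%v after Drained: %s\n",
			elapsed, NextSunsetAction(elapsed, scaledownDelay, deleteDelay))
	}
}
```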

### Template
The pod template used for the target version of this worker deployment. Similar to the pod template used in a standard Kubernetes `Deployment`, but managed by the controller.

## Environment Variables

The controller automatically sets these environment variables for all worker pods:

### TEMPORAL_ADDRESS
The host and port of the Temporal server, derived from the `TemporalConnection` custom resource.
The worker must connect to this Temporal endpoint. Because the value is user-provided rather than controller-generated, the worker does not need to read this environment variable if it already obtains the endpoint another way.

### TEMPORAL_NAMESPACE
The Temporal namespace the worker should connect to, from `spec.workerOptions.temporalNamespace`.
The worker must connect to this Temporal namespace. Because the value is user-provided rather than controller-generated, the worker does not need to read this environment variable if it already obtains the namespace another way.

### TEMPORAL_DEPLOYMENT_NAME
The worker deployment name in Temporal, auto-generated from the `TemporalWorkerDeployment` name and Kubernetes namespace.
The worker *must* use this to configure its `worker.DeploymentOptions`.

### TEMPORAL_WORKER_BUILD_ID
The build ID for this specific version, derived from the container image tag and hash of the target pod template.
The worker *must* use this to configure its `worker.DeploymentOptions`.
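
As a minimal sketch of how a Go worker might read these variables at startup, the example below loads all four and fails fast when the two controller-generated ones are missing. The `WorkerEnv` type and `LoadWorkerEnv` function are hypothetical helpers, not part of the Temporal SDK or the controller; wiring the resulting values into `worker.DeploymentOptions` is left out so the sketch stays standard-library only.

```go
package main

import (
	"fmt"
	"os"
)

// WorkerEnv holds the values the controller injects into worker pods.
// This struct is an illustrative helper, not an SDK type.
type WorkerEnv struct {
	Address        string // TEMPORAL_ADDRESS (may be obtained another way)
	Namespace      string // TEMPORAL_NAMESPACE (may be obtained another way)
	DeploymentName string // TEMPORAL_DEPLOYMENT_NAME (required)
	BuildID        string // TEMPORAL_WORKER_BUILD_ID (required)
}

// LoadWorkerEnv reads the variables through lookup, which has the same
// signature as os.LookupEnv so it can be replaced in tests. It returns an
// error if either controller-generated value is missing or empty.
func LoadWorkerEnv(lookup func(string) (string, bool)) (WorkerEnv, error) {
	var env WorkerEnv
	env.Address, _ = lookup("TEMPORAL_ADDRESS")
	env.Namespace, _ = lookup("TEMPORAL_NAMESPACE")
	env.DeploymentName, _ = lookup("TEMPORAL_DEPLOYMENT_NAME")
	env.BuildID, _ = lookup("TEMPORAL_WORKER_BUILD_ID")

	if env.DeploymentName == "" {
		return WorkerEnv{}, fmt.Errorf("TEMPORAL_DEPLOYMENT_NAME is not set")
	}
	if env.BuildID == "" {
		return WorkerEnv{}, fmt.Errorf("TEMPORAL_WORKER_BUILD_ID is not set")
	}
	return env, nil
}

func main() {
	env, err := LoadWorkerEnv(os.LookupEnv)
	if err != nil {
		fmt.Println("not running under the controller:", err)
		return
	}
	// DeploymentName and BuildID are the values the worker would pass to
	// its worker.DeploymentOptions.
	fmt.Printf("deployment=%q buildID=%q\n", env.DeploymentName, env.BuildID)
}
```

Passing the lookup function as a parameter keeps the helper testable without mutating the process environment.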

## Resource Management Concepts

### Rainbow Deployments
The pattern of running multiple versions of the same service simultaneously. Running multiple versions of your workers simultaneously is essential for supporting Pinned workflows in Temporal, as Pinned workflows must continue executing on the worker version they started on.

### Version Lifecycle Management
The automated process of:
1. Registering new versions with Temporal
2. Gradually routing traffic to new versions
3. Cleaning up resources for drained versions

### Controller-Managed Resources
Resources that are created, updated, and deleted automatically by the controller:
- `TemporalWorkerDeployment` custom resources (the controller updates their status)
- Kubernetes `Deployment` resources for each version
- Labels and annotations for tracking and management