
Conversation

@Shivs11 (Member) commented Aug 1, 2025

What was changed

  • Handle updates to TemporalConnection, for TWDs that are associated with one, so that these updates are propagated to their deployments.
  • This was achieved by adding the hashed contents of a TemporalConnection as pod annotations. Thus, if a TemporalConnection is updated, the hash changes, resulting in new deployments with the updated annotation. The updated secret is also mounted so that workers start with the latest secrets. (A minimal sketch of this wiring follows below.)
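
For illustration, a minimal, self-contained sketch of the annotation mechanism, using the SHA-256 hash shown later in this thread. `podAnnotationsFor` and the annotation key value are illustrative stand-ins, not the actual controller code:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Illustrative stand-in for the controller's k8s.ConnectionSpecHashAnnotation key.
const connectionSpecHashAnnotation = "temporal.io/connection-spec-hash"

// podAnnotationsFor (hypothetical helper) stamps the hashed connection spec
// onto the pod template. Changing HostPort or MutualTLSSecret changes the
// hash, which changes the pod template, which triggers a rolling update.
func podAnnotationsFor(hostPort, mtlsSecret string) map[string]string {
	h := sha256.New()
	h.Write([]byte(hostPort))
	h.Write([]byte(mtlsSecret))
	return map[string]string{
		connectionSpecHashAnnotation: hex.EncodeToString(h.Sum(nil)),
	}
}

func main() {
	// Illustrative host/port and secret name.
	fmt.Println(podAnnotationsFor("my-ns.tmprl.cloud:7233", "temporal-cloud-mtls"))
}
```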

Why?

  • core functionality

Checklist

  1. Closes
  2. How was this tested:
  • Added unit tests and also tested this locally. The only thing I was not able to do was add an integration test, since I was not able to configure our temporaltest server to use mTLS.

Here's how I tested this locally:

  • I ran a worker pod which was using the temporal-cloud-mtls secret to connect to Temporal.
  • I updated the namespace the worker was connected to by changing the mTLS secret. This resulted in the controller and the worker pods throwing connection errors (expected).
  • I updated the local secret in the local k8s cluster (and also changed its name to temporal-cloud-mtls-1, since we want our controller to also refresh the client it's using).
  • Observed that a rolling deployment was conducted and everything stabilized.

Proof of a rolling update:
[screenshot: rolling update in progress]

  3. Any docs updates needed?
  • None

@Shivs11 Shivs11 marked this pull request as ready for review August 1, 2025 14:31
@Shivs11 Shivs11 requested review from a team and jlegrone as code owners August 1, 2025 14:31
Namespace: workerDeploy.Spec.WorkerOptions.TemporalNamespace,
HostPort: temporalConnection.Spec.HostPort,
Namespace: workerDeploy.Spec.WorkerOptions.TemporalNamespace,
MutualTLSSecret: temporalConnection.Spec.MutualTLSSecret,
@Shivs11 (Member Author) commented:

@jlegrone - curious to hear your thoughts on this:

I added the name of the MutualTLSSecret as a key here, since a change to the secret name, for the same connectionSpec object, should trigger our worker controller to refresh the client it is using. This is required because there could be a world where the old secret is not expired but has simply been replaced.

However, one case where this breaks is if someone replaces the contents of an existing secret without changing its name; the controller then never gets a new client.

Collaborator commented:

Could you solve the problem of content changing without name changing by storing a hash of the name + contents instead of just the name?

Or, if that's inefficient because it means you have to read the secret contents all the time just to generate the map key, we could handle the contents changing differently: if an error occurs when calling the temporal APIs that suggests "wrong credentials", we could try reloading them once just in case they've changed, then do a namespace describe to make sure they work, and if they do, restart all the workers.

I don't think this needs to be solved in this PR, but if you don't solve the "changed content, not expired, same name" case, please create an issue about it.
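
A minimal sketch of the name-plus-contents hashing idea, assuming the controller reads the Secret via client-go; `secretCacheKey` is a hypothetical helper, not existing code:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"

	corev1 "k8s.io/api/core/v1"
)

// secretCacheKey (hypothetical) keys a cached Temporal client on a hash of
// the secret's name plus its data, so that both a rename and an in-place
// content change invalidate the cached client.
func secretCacheKey(secret *corev1.Secret) string {
	h := sha256.New()
	h.Write([]byte(secret.Name))
	// Iterate data keys in sorted order so the hash is deterministic.
	keys := make([]string, 0, len(secret.Data))
	for k := range secret.Data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write(secret.Data[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Illustrative secret; in the controller this would come from the cluster.
	s := &corev1.Secret{}
	s.Name = "temporal-cloud-mtls"
	s.Data = map[string][]byte{"tls.crt": []byte("cert"), "tls.key": []byte("key")}
	fmt.Println(secretCacheKey(s))
}
```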

@Shivs11 (Member Author) replied:

Thanks for responding to this!

Yes, I did think about the "store a hash of the name + contents" idea, but I decided against it because in one of my earlier PRs we discussed that reading the contents of a secret on every reconciliation loop could be intensive.

Having said that, I like the "on an error that suggests wrong credentials, try reloading them once" idea, but I don't think it should come in this PR. I can make a separate issue for this! (A rough sketch of that fallback follows below.)
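
Not for this PR, but for the issue: a rough, compilable sketch of what the fallback could look like, assuming the Temporal Go SDK client and gRPC status codes; `refreshOnAuthError` and the `rebuild` callback are hypothetical, not the controller's actual API:

```go
package controller

import (
	"context"

	"go.temporal.io/api/workflowservice/v1"
	"go.temporal.io/sdk/client"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// refreshOnAuthError (hypothetical) implements the "reload once on wrong
// credentials" fallback: if a Temporal API call failed with an auth-flavored
// gRPC code, rebuild the client from the freshly read secret and verify it
// with a lightweight DescribeNamespace before restarting workers.
func refreshOnAuthError(ctx context.Context, namespace string, err error, rebuild func() (client.Client, error)) (client.Client, error) {
	switch status.Code(err) {
	case codes.Unauthenticated, codes.PermissionDenied:
		// Plausibly stale credentials; re-read the secret and rebuild once.
	default:
		return nil, err // not a credentials problem; surface the error
	}
	c, rerr := rebuild()
	if rerr != nil {
		return nil, rerr
	}
	// Probe with a cheap call to confirm the new credentials actually work.
	req := &workflowservice.DescribeNamespaceRequest{Namespace: namespace}
	if _, derr := c.WorkflowService().DescribeNamespace(ctx, req); derr != nil {
		return nil, derr
	}
	return c, nil
}
```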


// If the connection spec hash has changed, update the deployment
currentHash := k8s.ComputeConnectionSpecHash(connection)
if currentHash != existingDeployment.Spec.Template.Annotations[k8s.ConnectionSpecHashAnnotation] {
Collaborator commented:

Is it possible for annotations to ever be nil, e.g. if the user doesn't provide any? Could you make sure to handle the case where the user provides no initial annotations?

@Shivs11 (Member Author) replied:

Here is the code that computes the hash used for the pod annotation:

func ComputeConnectionSpecHash(connection temporaliov1alpha1.TemporalConnectionSpec) string {
	// HostPort is required, but MutualTLSSecret can be empty for non-mTLS connections
	if connection.HostPort == "" {
		return ""
	}

	hasher := sha256.New()

	// Hash connection spec fields in deterministic order
	_, _ = hasher.Write([]byte(connection.HostPort))
	_, _ = hasher.Write([]byte(connection.MutualTLSSecret))

	return hex.EncodeToString(hasher.Sum(nil))
}

Right now, pod annotations will actually never be nil. The connection.HostPort == "" check is purely defensive programming: in the odd world where some client were to forget to specify the host/port in the TemporalConnection spec, the controller would error out earlier in the reconciliation loop.

Specifically, this would happen when it tries to create/fetch a client, since an empty host/port would force the controller to create a new SDK client pointing at an empty Temporal host/port, which will fail.
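
As an aside, even if the annotations map were nil, the comparison above would still be safe: indexing a nil map in Go yields the zero value. A minimal demonstration (the annotation key string here is illustrative):

```go
package main

import "fmt"

func main() {
	// A nil map reads as empty in Go: indexing yields the zero value ("").
	// So comparing against a computed hash simply reports "changed" when no
	// annotation has been set yet, which triggers the initial update path.
	var annotations map[string]string // nil: no annotations provided
	fmt.Printf("%q\n", annotations["temporal.io/connection-spec-hash"]) // ""
}
```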

@Shivs11 Shivs11 merged commit 5d9fc97 into temporalio:main Aug 15, 2025
11 checks passed
