Commit 91391fe

cicoyle and dapr-bot authored
[Master] Respect disabling placement with global.actors.enabled (dapr#8589)
* allow globally disabling placement connection
  Signed-off-by: Cassandra Coyle <[email protected]>
* add test
  Signed-off-by: Cassandra Coyle <[email protected]>
* cherrypick
  Signed-off-by: Cassandra Coyle <[email protected]>

---------

Signed-off-by: Cassandra Coyle <[email protected]>
Co-authored-by: Dapr Bot <[email protected]>
1 parent 7d779db commit 91391fe

3 files changed, +111 -14 lines changed
docs/release_notes/v1.15.4.md

Lines changed: 89 additions & 6 deletions
@@ -3,8 +3,12 @@
 This update includes bug fixes:
 
 - [Fix degradation of Workflow runtime performance over time](#fix-degradation-of-workflow-runtime-performance-over-time)
-- [Allow Service Account for MetalBear mirrord operator in sidecar injector](#allow-service-account-for-metalbear-mirrord-operator-in-sidecar-injector)
+- [Fix remote Actor invocation 500 retry](#fix-remote-actor-invocation-500-retry)
+- [Fix Global Actors Enabled Configuration](#fix-global-actors-enabled-configuration)
 - [Prevent panic of reminder operations on slow Actor Startup](#prevent-panic-of-reminder-operations-on-slow-actor-startup)
+- [Remove client-side rate limiter from Sentry](#remove-client-side-rate-limiter-from-sentry)
+- [Allow Service Account for MetalBear mirrord operator in sidecar injector](#allow-service-account-for-metalbear-mirrord-operator-in-sidecar-injector)
+- [Fix Scheduler Client connection pruning](#fix-scheduler-client-connection-pruning)
 
 ## Fix degradation of Workflow runtime performance over time
 
@@ -26,23 +30,48 @@ This caused Jobs to fail, and enter failure policy retry loops.
 
 Refactor the Scheduler connection pool logic to properly prune stale connections to prevent job execution occurring on stale connections and causing failure policy loops.
 
-## Allow Service Account for MetalBear mirrord operator in sidecar injector
+## Fix remote Actor invocation 500 retry
 
 ### Problem
 
-Mirrord Operator is not on the allow list of Service Accounts for the dapr sidecar injector.
+An actor invocation across hosts which result in a 500 HTTP header response code would result in the request being retried 5 times.
 
 ### Impact
 
-Running mirrord in `copy_target` mode would cause the pod to initalise with without the dapr container.
+Services which return a 500 HTTP header response code would result in requests under normal operation to return slowly, and request the service on the same request multiple times.
 
 ### Root cause
 
-Mirrord Operator is not on the allow list of Service Accounts for the dapr sidecar injector.
+The Actor engine considered a 500 HTTP header response code to be a retriable error, rather than a successful request which returned a non-200 status code.
 
 ### Solution
 
-Add the Mirrord Operator into the allow list of Service Accounts for the dapr sidecar injector.
+Remove the 500 HTTP header response code from the list of retriable errors.
+
+### Problem
+
+## Fix Global Actors Enabled Configuration
+
+### Problem
+
+When `global.actors.enabled` was set to `false` via Helm or the environment variable `ACTORS_ENABLED=false`, the Dapr sidecar would still attempt to connect to the placement service, causing readiness probe failures and repeatedly logged errors about failing to connect to placement.
+Fixes this [issue](https://github.com/dapr/dapr/issues/8551).
+
+### Impact
+
+Dapr sidecars would fail their readiness probes and log errors like:
+```
+Failed to connect to placement dns:///dapr-placement-server.dapr-system.svc.cluster.local:50005: failed to create placement client: rpc error: code = Unavailable desc = last resolver error: produced zero addresses
+```
+
+### Root cause
+
+The sidecar injector was not properly respecting the global actors enabled configuration when setting up the placement service connection.
+
+### Solution
+
+The sidecar injector now properly respects the `global.actors.enabled` helm configuration and `ACTORS_ENABLED` environment variable. When set to `false`, it will not attempt to connect to the placement service, allowing the sidecar to start successfully without actor functionality.
+
 
 ## Prevent panic of reminder operations on slow Actor Startup
 
@@ -61,3 +90,57 @@ The Dapr runtime would attempt to use the reminder service before it was initial
 ### Solution
 
 Correctly return an errors that the actor runtime was not ready in time for the reminder operation.
+
+## Remove client-side rate limiter from Sentry
+
+### Problem
+
+A cold start of many Dapr deployments would take a long time, and even cause some crash loops.
+
+### Impact
+
+A large Dapr deployment would take a non-linear more amount of time that a smaller one to completely roll out.
+
+### Root cause
+
+The Sentry Kubernetes client was configured with a rate limiter which would be exhausted when services all new Dapr deployment at once, cause many client to wait significantly.
+
+### Solution
+
+Remove the client-side rate limiting from the Sentry Kubernetes client.
+
+## Allow Service Account for MetalBear mirrord operator in sidecar injector
+
+### Problem
+
+Mirrord Operator is not on the allow list of Service Accounts for the dapr sidecar injector.
+
+### Impact
+
+Running mirrord in `copy_target` mode would cause the pod to initalise without the dapr container.
+
+### Root cause
+
+Mirrord Operator is not on the allow list of Service Accounts for the dapr sidecar injector.
+
+### Solution
+
+Add the Mirrord Operator into the allow list of Service Accounts for the dapr sidecar injector.
+
+## Fix Scheduler Client connection pruning
+
+### Problem
+
+Daprd would attempt to connect to stale Scheduler addresses.
+
+### Impact
+
+Network resource usage and error reporting from service mesh sidecars.
+
+### Root cause
+
+Daprd would not close Scheduler gRPC connections to hosts which no longer exist.
+
+### Solution
+
+Daprd now closes connections to Scheduler hosts when they are no longer in the list of active hosts.

pkg/injector/service/config_test.go

Lines changed: 17 additions & 6 deletions
@@ -26,12 +26,27 @@ import (
 )
 
 func TestGetInjectorConfig(t *testing.T) {
+	t.Setenv("NAMESPACE", "test-namespace")
+	t.Setenv("SIDECAR_IMAGE", "daprd-test-image")
+
+	t.Run("respect globally disabling placement", func(t *testing.T) {
+		t.Setenv("ACTORS_ENABLED", "false")
+		cfg, err := GetConfig()
+		require.NoError(t, err)
+		assert.False(t, cfg.parsedActorsEnabled)
+		assert.Equal(t, "false", cfg.ActorsEnabled)
+	})
+	t.Run("default placement is enabled", func(t *testing.T) {
+		cfg, err := GetConfig()
+		require.NoError(t, err)
+		assert.Empty(t, cfg.ActorsEnabled)
+		assert.True(t, cfg.parsedActorsEnabled)
+	})
+
 	t.Run("with kube cluster domain env", func(t *testing.T) {
 		t.Setenv("TLS_CERT_FILE", "test-cert-file")
 		t.Setenv("TLS_KEY_FILE", "test-key-file")
-		t.Setenv("SIDECAR_IMAGE", "daprd-test-image")
 		t.Setenv("SIDECAR_IMAGE_PULL_POLICY", "Always")
-		t.Setenv("NAMESPACE", "test-namespace")
 		t.Setenv("KUBE_CLUSTER_DOMAIN", "cluster.local")
 		t.Setenv("ALLOWED_SERVICE_ACCOUNTS", "test1:test-service-account1,test2:test-service-account2")
 		t.Setenv("ALLOWED_SERVICE_ACCOUNTS_PREFIX_NAMES", "namespace:test-service-account1,namespace2*:test-service-account2")
@@ -49,9 +64,7 @@ func TestGetInjectorConfig(t *testing.T) {
 	t.Run("not set kube cluster domain env", func(t *testing.T) {
 		t.Setenv("TLS_CERT_FILE", "test-cert-file")
 		t.Setenv("TLS_KEY_FILE", "test-key-file")
-		t.Setenv("SIDECAR_IMAGE", "daprd-test-image")
 		t.Setenv("SIDECAR_IMAGE_PULL_POLICY", "IfNotPresent")
-		t.Setenv("NAMESPACE", "test-namespace")
 		t.Setenv("KUBE_CLUSTER_DOMAIN", "")
 
 		cfg, err := GetConfig()
@@ -65,8 +78,6 @@ func TestGetInjectorConfig(t *testing.T) {
 	t.Run("sidecar run options not set", func(t *testing.T) {
 		t.Setenv("TLS_CERT_FILE", "test-cert-file")
 		t.Setenv("TLS_KEY_FILE", "test-key-file")
-		t.Setenv("SIDECAR_IMAGE", "daprd-test-image")
-		t.Setenv("NAMESPACE", "test-namespace")
 
 		// Default values are true
 		t.Setenv("SIDECAR_RUN_AS_NON_ROOT", "")
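
The two new subtests pin down the behaviour of the parsed flag: `ACTORS_ENABLED=false` yields `parsedActorsEnabled == false`, and an unset variable yields `true`. The commit page does not show the parsing code itself, so the following is only a minimal sketch of logic that would satisfy those two assertions. The helper name `parseActorsEnabled` and the fallback for unparseable values are assumptions made for illustration, not the injector's actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseActorsEnabled is a hypothetical helper (not taken from the dapr
// codebase). It treats an empty value as "enabled", matching the
// "default placement is enabled" subtest.
func parseActorsEnabled(raw string) bool {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		// Unset or empty: actors stay enabled by default.
		return true
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		// Assumption: an unparseable value falls back to the default
		// rather than silently disabling actors.
		return true
	}
	return enabled
}

func main() {
	for _, v := range []string{"", "false", "true"} {
		fmt.Printf("ACTORS_ENABLED=%q -> enabled=%v\n", v, parseActorsEnabled(v))
	}
}
```

Defaulting to enabled keeps existing deployments unchanged when the variable is never set, which is what the second subtest asserts.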

pkg/injector/service/pod_patch.go

Lines changed: 5 additions & 2 deletions
@@ -80,10 +80,13 @@ func (i *injector) getPodPatchOperations(ctx context.Context, ar *admissionv1.Ad
 	sidecar.CurrentTrustAnchors = trustAnchors
 	sidecar.DisableTokenVolume = !token.HasKubernetesToken()
 
-	// Set addresses for actor services
+	// Set addresses for actor services only if it's not explicitly globally disabled
 	// Even if actors are disabled, however, the placement-host-address flag will still be included if explicitly set in the annotation dapr.io/placement-host-address
 	// So, if the annotation is already set, we accept that and also use placement for actors services
-	if sidecar.PlacementAddress == "" {
+	if !i.config.GetActorsEnabled() {
+		sidecar.ActorsService = ""
+		sidecar.PlacementAddress = ""
+	} else if sidecar.PlacementAddress == "" {
 		// Set configuration for the actors service
 		actorsSvcName, actorsSvc := i.config.GetActorsService()
 		actorsSvcAddr := actorsSvc.Address(i.config.Namespace, i.config.KubeClusterDomain)
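
The branch order in this hunk can be sketched in isolation. The snippet below is a simplified model, not the injector's code: `sidecarConfig`, `applyActorsSetting`, and the `defaultService` parameter are hypothetical names, and the real `else if` body (only partly visible in the hunk) is reduced to a single assignment. It illustrates that, as the hunk reads, the global switch is checked first, so both fields are cleared even when a placement address was already set.

```go
package main

import "fmt"

// sidecarConfig is a hypothetical stand-in for the two sidecar fields
// touched by the hunk.
type sidecarConfig struct {
	ActorsService    string
	PlacementAddress string
}

// applyActorsSetting mirrors the branch order of the hunk:
//  1. actors globally disabled -> clear both fields;
//  2. otherwise, no explicit placement address -> use the default service;
//  3. otherwise, keep the explicitly provided placement address.
func applyActorsSetting(sc sidecarConfig, actorsEnabled bool, defaultService string) sidecarConfig {
	switch {
	case !actorsEnabled:
		sc.ActorsService = ""
		sc.PlacementAddress = ""
	case sc.PlacementAddress == "":
		// Assumption: the real code derives this from the configured
		// actors service name and address.
		sc.ActorsService = defaultService
	}
	return sc
}

func main() {
	explicit := sidecarConfig{PlacementAddress: "custom-placement:50005"}
	fmt.Printf("%+v\n", applyActorsSetting(explicit, false, "placement:default"))
	fmt.Printf("%+v\n", applyActorsSetting(sidecarConfig{}, true, "placement:default"))
	fmt.Printf("%+v\n", applyActorsSetting(explicit, true, "placement:default"))
}
```

Note that the retained comment still says an explicit `dapr.io/placement-host-address` annotation is honoured when actors are disabled, while the new first branch appears to clear `PlacementAddress` regardless. That may be worth confirming against the full function.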
