Commit 4022cb8

fix: retarget blue-green previewService before scaling up preview ReplicaSet (argoproj#1368)
Signed-off-by: Jesse Suen <[email protected]>
1 parent 1e52def commit 4022cb8

11 files changed: +63 -35 lines changed

docs/features/bluegreen.md

Lines changed: 20 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -68,6 +68,25 @@ spec:
6868
scaleDownDelayRevisionLimit: *int32
6969
```
7070
71+
## Sequence of Events
72+
73+
The following describes the sequence of events that happen during a blue-green update.
74+
75+
1. Beginning at a fully promoted steady state, the revision 1 ReplicaSet is pointed to by both the `activeService` and `previewService`.
76+
1. A user initiates an update by modifying the pod template (`spec.template.spec`).
77+
1. The revision 2 ReplicaSet is created with size 0.
78+
1. The `previewService` is modified to point to the revision 2 ReplicaSet. The `activeService` continues to point to revision 1.
79+
1. The revision 2 ReplicaSet is scaled to either `spec.replicas` or `previewReplicaCount` if set.
80+
1. Once revision 2 ReplicaSet Pods are fully available, `prePromotionAnalysis` begins.
81+
1. Upon success of `prePromotionAnalysis`, the blue-green rollout pauses if `autoPromotionEnabled` is false or `autoPromotionSeconds` is non-zero.
82+
1. The rollout is resumed either manually by a user or automatically once `autoPromotionSeconds` has elapsed.
83+
1. If the `previewReplicaCount` feature was used, the revision 2 ReplicaSet is scaled to `spec.replicas`.
84+
1. The rollout "promotes" the revision 2 ReplicaSet by updating the `activeService` to point to it. At this point, no services point to revision 1.
85+
1. `postPromotionAnalysis` begins.
86+
1. Once `postPromotionAnalysis` completes successfully, the update is successful and the revision 2 ReplicaSet is marked as stable. The rollout is considered fully promoted.
87+
1. After waiting `scaleDownDelaySeconds` (default 30 seconds), the revision 1 ReplicaSet is scaled down.
88+
89+
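The sequence above can be traced with a small toy model. This is only a sketch for following the order of events: the `rollout` type, `blueGreenUpdate`, and the step labels are hypothetical and are not part of the controller, analysis runs and pauses appear only as comments, and a `previewReplicaCount` of 0 or less stands in for "unset".

```go
package main

import "fmt"

// rollout is a toy model, not the controller's real data structure.
type rollout struct {
	active, preview int         // revision selected by each Service
	size            map[int]int // revision -> ReplicaSet replica count
}

// snapshot renders the current state under a step label.
func (r *rollout) snapshot(step string) string {
	return fmt.Sprintf("%s: active=rev%d preview=rev%d rev1=%d rev2=%d",
		step, r.active, r.preview, r.size[1], r.size[2])
}

// blueGreenUpdate replays the documented order of events for a revision 1 -> 2
// update. previewReplicaCount <= 0 is a simplification meaning "not set".
func blueGreenUpdate(specReplicas, previewReplicaCount int) []string {
	r := &rollout{active: 1, preview: 1, size: map[int]int{1: specReplicas}}
	log := []string{r.snapshot("steady state")}

	r.size[2] = 0 // the new ReplicaSet is created with size 0
	log = append(log, r.snapshot("create rev2"))

	r.preview = 2 // the preview Service is retargeted before any scale-up
	log = append(log, r.snapshot("retarget preview"))

	r.size[2] = specReplicas
	if previewReplicaCount > 0 {
		r.size[2] = previewReplicaCount
	}
	log = append(log, r.snapshot("scale up rev2"))

	// prePromotionAnalysis and the optional pause would happen here.

	r.size[2] = specReplicas // full size is reached before promotion
	r.active = 2             // promotion: the active Service now selects rev2
	log = append(log, r.snapshot("promote"))

	// postPromotionAnalysis runs, then the scaleDownDelaySeconds wait.
	r.size[1] = 0
	log = append(log, r.snapshot("scale down rev1"))
	return log
}

func main() {
	for _, line := range blueGreenUpdate(3, 1) {
		fmt.Println(line)
	}
}
```

Running it with `spec.replicas` of 3 and a preview count of 1 prints one line per step, which makes it easy to see that the preview Service already selects revision 2 while that ReplicaSet still has 0 replicas.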
7190
### autoPromotionEnabled
7291
The AutoPromotionEnabled will make the rollout automatically promote the new ReplicaSet to the active service once the new ReplicaSet is healthy. This field is defaulted to true if it is not specified.
7392

@@ -111,15 +130,6 @@ This feature is used to provide an endpoint that can be used to test a new versi
111130

112131
Defaults to an empty string
113132

114-
Here is a timeline of how the active and preview services work (if you use a preview service):
115-
116-
1. During the Initial deployment there is only one ReplicaSet. Both active and preview services point to it. This is the **old** version of the application.
117-
1. A change happens in the Rollout resource. A new ReplicaSet is created. This is the **new** version of the application. The preview service is modified to point to the new ReplicaSet. The active service still points to the old version.
118-
1. The blue/green deployment is "promoted". Both active and preview services are pointing to the new version. The old version is still there but no service is pointing at it.
119-
1. Once the blue/green deployment is scaled down (see the `scaleDownDelaySeconds` field), the old ReplicaSet has 0 replicas and we are back to the initial state. Both active and preview services point to the new version (which is the only one present anyway).
120-
121-
122-
123133
### previewReplicaCount
124134
The PreviewReplicaCount field will indicate the number of replicas that the new version of an application should run. Once the application is ready to promote to the active service, the controller will scale the new ReplicaSet to the value of the `spec.replicas`. The rollout will not switch over the active service to the new ReplicaSet until it matches the `spec.replicas` count.
125135

@@ -136,3 +146,4 @@ Defaults to 30
136146
The ScaleDownDelayRevisionLimit limits the number of old active ReplicaSets to keep scaled up while they wait for the scaleDownDelay to pass after being removed from the active service.
137147

138148
If omitted, all ReplicaSets will be retained for the specified scaleDownDelay
149+

rollout/analysis_test.go

Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1517,6 +1517,7 @@ func TestDoNotCreateBackgroundAnalysisRunOnNewCanaryRollout(t *testing.T) {
15171517

15181518
f.expectCreateReplicaSetAction(rs1)
15191519
f.expectUpdateRolloutStatusAction(r1) // update conditions
1520+
f.expectUpdateReplicaSetAction(rs1) // scale replica set
15201521
f.expectPatchRolloutAction(r1)
15211522
f.run(getKey(r1, t))
15221523
}
@@ -1551,6 +1552,7 @@ func TestDoNotCreateBackgroundAnalysisRunOnNewCanaryRolloutStableRSEmpty(t *test
15511552

15521553
f.expectCreateReplicaSetAction(rs1)
15531554
f.expectUpdateRolloutStatusAction(r1) // update conditions
1555+
f.expectUpdateReplicaSetAction(rs1) // scale replica set
15541556
f.expectPatchRolloutAction(r1)
15551557
f.run(getKey(r1, t))
15561558
}
@@ -1686,6 +1688,7 @@ func TestDoNotCreatePrePromotionAnalysisRunOnNewRollout(t *testing.T) {
16861688

16871689
f.expectCreateReplicaSetAction(rs)
16881690
f.expectUpdateRolloutStatusAction(r)
1691+
f.expectUpdateReplicaSetAction(rs) // scale RS
16891692
f.expectPatchRolloutAction(r)
16901693
f.run(getKey(r, t))
16911694
}

rollout/bluegreen.go

Lines changed: 6 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,12 @@ func (c *rolloutContext) rolloutBlueGreen() error {
2626
return err
2727
}
2828

29+
// This must happen right after the new replicaset is created
30+
err = c.reconcilePreviewService(previewSvc)
31+
if err != nil {
32+
return err
33+
}
34+
2935
if replicasetutil.CheckPodSpecChange(c.rollout, c.newRS) {
3036
return c.syncRolloutStatusBlueGreen(previewSvc, activeSvc)
3137
}
@@ -40,11 +46,6 @@ func (c *rolloutContext) rolloutBlueGreen() error {
4046
return err
4147
}
4248

43-
err = c.reconcilePreviewService(previewSvc)
44-
if err != nil {
45-
return err
46-
}
47-
4849
c.reconcileBlueGreenPause(activeSvc, previewSvc)
4950

5051
err = c.reconcileActiveService(activeSvc)
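The effect of moving `reconcilePreviewService` earlier can be illustrated with a stand-alone sketch. It assumes, based on the commit title, that the scale-up of the new ReplicaSet happens in the code between the two hunks; the `step` type, `reconcile`, `blueGreenSteps`, and the step names are hypothetical and only mimic the early-return pattern visible in the diff. The pause step is omitted for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// step is a hypothetical named reconcile action; the real controller calls
// methods on rolloutContext directly.
type step struct {
	name string
	run  func() error
}

// reconcile runs the steps in order and stops at the first error, mirroring
// the `if err != nil { return err }` pattern in the diff.
func reconcile(steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

// blueGreenSteps lists the assumed order after this commit. failPreview
// simulates a failing preview Service patch.
func blueGreenSteps(failPreview bool) []step {
	ok := func() error { return nil }
	preview := func() error {
		if failPreview {
			return errors.New("patch failed")
		}
		return nil
	}
	return []step{
		{"create new ReplicaSet (size 0)", ok},
		{"retarget preview Service", preview},
		{"scale up new ReplicaSet", ok},
		{"reconcile active Service", ok},
	}
}

func main() {
	done, err := reconcile(blueGreenSteps(true))
	fmt.Println("completed:", done)
	fmt.Println("error:", err)
}
```

With this ordering, a failed preview Service update stops the reconcile before the scale-up step, so in this model no new Pods start while the preview Service still selects the old revision.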

rollout/bluegreen_test.go

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -54,6 +54,7 @@ func TestBlueGreenCreatesReplicaSet(t *testing.T) {
5454

5555
f.expectCreateReplicaSetAction(rs)
5656
servicePatchIndex := f.expectPatchServiceAction(previewSvc, rsPodHash)
57+
f.expectUpdateReplicaSetAction(rs) // scale up RS
5758
updatedRolloutIndex := f.expectUpdateRolloutStatusAction(r)
5859
expectedPatchWithoutSubs := `{
5960
"status":{

rollout/canary_test.go

Lines changed: 11 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -77,15 +77,19 @@ func TestCanaryRolloutBumpVersion(t *testing.T) {
7777
f.replicaSetLister = append(f.replicaSetLister, rs1)
7878

7979
createdRSIndex := f.expectCreateReplicaSetAction(rs2)
80+
updatedRSIndex := f.expectUpdateReplicaSetAction(rs2) // scale up RS
8081
updatedRolloutRevisionIndex := f.expectUpdateRolloutAction(r2) // update rollout revision
8182
updatedRolloutConditionsIndex := f.expectUpdateRolloutStatusAction(r2) // update rollout conditions
8283
f.expectPatchRolloutAction(r2)
8384
f.run(getKey(r2, t))
8485

8586
createdRS := f.getCreatedReplicaSet(createdRSIndex)
86-
assert.Equal(t, int32(1), *createdRS.Spec.Replicas)
87+
assert.Equal(t, int32(0), *createdRS.Spec.Replicas)
8788
assert.Equal(t, "2", createdRS.Annotations[annotations.RevisionAnnotation])
8889

90+
updatedRS := f.getUpdatedReplicaSet(updatedRSIndex)
91+
assert.Equal(t, int32(1), *updatedRS.Spec.Replicas)
92+
8993
updatedRollout := f.getUpdatedRollout(updatedRolloutRevisionIndex)
9094
assert.Equal(t, "2", updatedRollout.Annotations[annotations.RevisionAnnotation])
9195

@@ -475,6 +479,7 @@ func TestCanaryRolloutCreateFirstReplicasetNoSteps(t *testing.T) {
475479
rs := newReplicaSet(r, 1)
476480

477481
f.expectCreateReplicaSetAction(rs)
482+
f.expectUpdateReplicaSetAction(rs) // scale up rs
478483
updatedRolloutIndex := f.expectUpdateRolloutStatusAction(r)
479484
patchIndex := f.expectPatchRolloutAction(r)
480485
f.run(getKey(r, t))
@@ -514,6 +519,7 @@ func TestCanaryRolloutCreateFirstReplicasetWithSteps(t *testing.T) {
514519
rs := newReplicaSet(r, 1)
515520

516521
f.expectCreateReplicaSetAction(rs)
522+
f.expectUpdateReplicaSetAction(rs) // scale up rs
517523
updatedRolloutIndex := f.expectUpdateRolloutStatusAction(r)
518524
patchIndex := f.expectPatchRolloutAction(r)
519525
f.run(getKey(r, t))
@@ -559,12 +565,15 @@ func TestCanaryRolloutCreateNewReplicaWithCorrectWeight(t *testing.T) {
559565
f.replicaSetLister = append(f.replicaSetLister, rs1)
560566

561567
createdRSIndex := f.expectCreateReplicaSetAction(rs2)
568+
updatedRSIndex := f.expectUpdateReplicaSetAction(rs2)
562569
updatedRolloutIndex := f.expectUpdateRolloutStatusAction(r2)
563570
f.expectPatchRolloutAction(r2)
564571
f.run(getKey(r2, t))
565572

566573
createdRS := f.getCreatedReplicaSet(createdRSIndex)
567-
assert.Equal(t, int32(1), *createdRS.Spec.Replicas)
574+
assert.Equal(t, int32(0), *createdRS.Spec.Replicas)
575+
updatedRS := f.getUpdatedReplicaSet(updatedRSIndex)
576+
assert.Equal(t, int32(1), *updatedRS.Spec.Replicas)
568577

569578
updatedRollout := f.getUpdatedRollout(updatedRolloutIndex)
570579
progressingCondition := conditions.GetRolloutCondition(updatedRollout.Status, v1alpha1.RolloutProgressing)

rollout/ephemeralmetadata_test.go

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -37,6 +37,7 @@ func TestSyncCanaryEphemeralMetadataInitialRevision(t *testing.T) {
3737

3838
f.expectUpdateRolloutStatusAction(r1)
3939
idx := f.expectCreateReplicaSetAction(rs1)
40+
f.expectUpdateReplicaSetAction(rs1)
4041
_ = f.expectPatchRolloutAction(r1)
4142
f.run(getKey(r1, t))
4243
createdRS1 := f.getCreatedReplicaSet(idx)
@@ -75,8 +76,9 @@ func TestSyncBlueGreenEphemeralMetadataInitialRevision(t *testing.T) {
7576

7677
f.expectUpdateRolloutStatusAction(r1)
7778
idx := f.expectCreateReplicaSetAction(rs1)
78-
_ = f.expectPatchRolloutAction(r1)
79+
f.expectPatchRolloutAction(r1)
7980
f.expectPatchServiceAction(previewSvc, rs1.Labels[v1alpha1.DefaultRolloutUniqueLabelKey])
81+
f.expectUpdateReplicaSetAction(rs1) // scale replicaset
8082
f.run(getKey(r1, t))
8183
createdRS1 := f.getCreatedReplicaSet(idx)
8284
expectedLabels := map[string]string{
@@ -209,6 +211,7 @@ func TestSyncBlueGreenEphemeralMetadataSecondRevision(t *testing.T) {
209211
f.expectUpdateRolloutStatusAction(r2) // Update Rollout conditions
210212
rs2idx := f.expectCreateReplicaSetAction(rs2) // Create revision 2 ReplicaSet
211213
f.expectPatchServiceAction(previewSvc, rs2PodHash) // Update preview service to point at revision 2 replicaset
214+
f.expectUpdateReplicaSetAction(rs2) // scale revision 2 ReplicaSet up
212215
f.expectListPodAction(r1.Namespace) // list pods to patch ephemeral data on revision 1 ReplicaSet's pods
213216
podIdx := f.expectUpdatePodAction(&pod) // Update pod with ephemeral data
214217
rs1idx := f.expectUpdateReplicaSetAction(rs1) // update stable replicaset with stable metadata

rollout/experiment_test.go

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -519,6 +519,7 @@ func TestRolloutDoNotCreateExperimentWithoutStableRS(t *testing.T) {
519519
f.expectCreateReplicaSetAction(rs2)
520520
f.expectUpdateRolloutAction(r2) // update revision
521521
f.expectUpdateRolloutStatusAction(r2) // update progressing condition
522+
f.expectUpdateReplicaSetAction(rs2) // scale replicaset
522523
f.expectPatchRolloutAction(r1)
523524
f.run(getKey(r2, t))
524525
}

rollout/sync.go

Lines changed: 3 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -159,13 +159,7 @@ func (c *rolloutContext) createDesiredReplicaSet() (*appsv1.ReplicaSet, error) {
159159
Template: newRSTemplate,
160160
},
161161
}
162-
allRSs := append(c.allRSs, newRS)
163-
newReplicasCount, err := replicasetutil.NewRSNewReplicas(c.rollout, allRSs, newRS)
164-
if err != nil {
165-
return nil, err
166-
}
167-
168-
newRS.Spec.Replicas = pointer.Int32Ptr(newReplicasCount)
162+
newRS.Spec.Replicas = pointer.Int32Ptr(0)
169163
// Set new replica set's annotation
170164
annotations.SetNewReplicaSetAnnotations(c.rollout, newRS, newRevision, false)
171165

@@ -250,12 +244,10 @@ func (c *rolloutContext) createDesiredReplicaSet() (*appsv1.ReplicaSet, error) {
250244
return nil, err
251245
}
252246

253-
if !alreadyExists && newReplicasCount > 0 {
247+
if !alreadyExists {
254248
revision, _ := replicasetutil.Revision(createdRS)
255-
c.recorder.Eventf(c.rollout, record.EventOptions{EventReason: conditions.NewReplicaSetReason}, conditions.NewReplicaSetDetailedMessage, createdRS.Name, revision, newReplicasCount)
256-
}
249+
c.recorder.Eventf(c.rollout, record.EventOptions{EventReason: conditions.NewReplicaSetReason}, conditions.NewReplicaSetDetailedMessage, createdRS.Name, revision)
257250

258-
if !alreadyExists {
259251
msg := fmt.Sprintf(conditions.NewReplicaSetMessage, createdRS.Name)
260252
condition := conditions.NewRolloutCondition(v1alpha1.RolloutProgressing, corev1.ConditionTrue, conditions.NewReplicaSetReason, msg)
261253
conditions.SetRolloutCondition(&c.rollout.Status, *condition)
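The change in `createDesiredReplicaSet` splits what used to be one action into two: a create at size 0 followed by a separate scaling update, which is why the tests in this commit gain an extra `expectUpdateReplicaSetAction`. The sketch below shows that pattern in isolation. `replicaSet`, `fakeClient`, and `createThenScale` are hypothetical stand-ins, not the Kubernetes or Argo Rollouts types.

```go
package main

import "fmt"

// replicaSet is a hypothetical stand-in for appsv1.ReplicaSet.
type replicaSet struct {
	Name     string
	Replicas int32
}

// fakeClient records actions in order, loosely like the test fixture does.
type fakeClient struct {
	actions []string
}

func (c *fakeClient) create(rs replicaSet) replicaSet {
	c.actions = append(c.actions, fmt.Sprintf("create %s replicas=%d", rs.Name, rs.Replicas))
	return rs
}

func (c *fakeClient) update(rs replicaSet) replicaSet {
	c.actions = append(c.actions, fmt.Sprintf("update %s replicas=%d", rs.Name, rs.Replicas))
	return rs
}

// createThenScale always creates the ReplicaSet empty and sizes it in a
// separate update, leaving room for other work between the two calls.
func createThenScale(c *fakeClient, name string, desired int32) replicaSet {
	rs := c.create(replicaSet{Name: name, Replicas: 0})
	// In the blue-green path, the preview Service retarget would go here.
	if desired != rs.Replicas {
		rs.Replicas = desired
		rs = c.update(rs)
	}
	return rs
}

func main() {
	c := &fakeClient{}
	createThenScale(c, "rs2", 1)
	for _, a := range c.actions {
		fmt.Println(a)
	}
}
```

The recorded actions match the shape of the updated assertions in `canary_test.go`: the created object has 0 replicas and the updated object carries the desired count.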

rollout/sync_test.go

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -256,14 +256,17 @@ func TestCanaryPromoteFull(t *testing.T) {
256256
f.kubeobjects = append(f.kubeobjects, rs1)
257257
f.replicaSetLister = append(f.replicaSetLister, rs1)
258258

259-
createdRS2Index := f.expectCreateReplicaSetAction(rs2) // create new ReplicaSet (surge to 10)
259+
createdRS2Index := f.expectCreateReplicaSetAction(rs2) // create new ReplicaSet (size 0)
260260
f.expectUpdateRolloutAction(r2) // update rollout revision
261261
f.expectUpdateRolloutStatusAction(r2) // update rollout conditions
262+
updatedRS2Index := f.expectUpdateReplicaSetAction(rs2) // scale new ReplicaSet to 10
262263
patchedRolloutIndex := f.expectPatchRolloutAction(r2)
263264
f.run(getKey(r2, t))
264265

265266
createdRS2 := f.getCreatedReplicaSet(createdRS2Index)
266-
assert.Equal(t, int32(10), *createdRS2.Spec.Replicas) // verify we ignored steps
267+
assert.Equal(t, int32(0), *createdRS2.Spec.Replicas)
268+
updatedRS2 := f.getUpdatedReplicaSet(updatedRS2Index)
269+
assert.Equal(t, int32(10), *updatedRS2.Spec.Replicas) // verify we ignored steps and fully scaled it
267270

268271
patchedRollout := f.getPatchedRolloutAsObject(patchedRolloutIndex)
269272
assert.Equal(t, int32(2), *patchedRollout.Status.CurrentStepIndex) // verify we updated to last step

test/e2e/functional_test.go

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -90,10 +90,12 @@ spec:
9090
ExpectRevisionPodCount("2", 1).
9191
ExpectRolloutEvents([]string{
9292
"RolloutUpdated", // Rollout updated to revision 1
93-
"NewReplicaSetCreated", // Created ReplicaSet abort-retry-promote-698fbfb9dc (revision 1) with size 1
93+
"NewReplicaSetCreated", // Created ReplicaSet abort-retry-promote-698fbfb9dc (revision 1)
94+
"ScalingReplicaSet", // Scaled up ReplicaSet abort-retry-promote-698fbfb9dc (revision 1) from 0 to 1
9495
"RolloutCompleted", // Rollout completed update to revision 1 (698fbfb9dc): Initial deploy
9596
"RolloutUpdated", // Rollout updated to revision 2
96-
"NewReplicaSetCreated", // Created ReplicaSet abort-retry-promote-75dcb5ddd6 (revision 2) with size 1
97+
"NewReplicaSetCreated", // Created ReplicaSet abort-retry-promote-75dcb5ddd6 (revision 2)
98+
"ScalingReplicaSet", // Scaled up ReplicaSet abort-retry-promote-75dcb5ddd6 (revision 2) from 0 to 1
9799
"RolloutStepCompleted", // Rollout step 1/2 completed (setWeight: 50)
98100
"RolloutPaused", // Rollout is paused (CanaryPauseStep)
99101
"ScalingReplicaSet", // Scaled down ReplicaSet abort-retry-promote-75dcb5ddd6 (revision 2) from 1 to 0
@@ -696,11 +698,13 @@ func (s *FunctionalSuite) TestBlueGreenUpdate() {
696698
ExpectReplicaCounts(3, 6, 3, 3, 3).
697699
ExpectRolloutEvents([]string{
698700
"RolloutUpdated", // Rollout updated to revision 1
699-
"NewReplicaSetCreated", // Created ReplicaSet bluegreen-7dcd8f8869 (revision 1) with size 3
701+
"NewReplicaSetCreated", // Created ReplicaSet bluegreen-7dcd8f8869 (revision 1)
702+
"ScalingReplicaSet", // Scaled up ReplicaSet bluegreen-7dcd8f8869 (revision 1) from 0 to 3
700703
"RolloutCompleted", // Rollout completed update to revision 1 (7dcd8f8869): Initial deploy
701704
"SwitchService", // Switched selector for service 'bluegreen' from '' to '7dcd8f8869'
702705
"RolloutUpdated", // Rollout updated to revision 2
703-
"NewReplicaSetCreated", // Created ReplicaSet bluegreen-5498785cd6 (revision 2) with size 3
706+
"NewReplicaSetCreated", // Created ReplicaSet bluegreen-5498785cd6 (revision 2)
707+
"ScalingReplicaSet", // Scaled up ReplicaSet bluegreen-5498785cd6 (revision 2) from 0 to 3
704708
"SwitchService", // Switched selector for service 'bluegreen' from '7dcd8f8869' to '6c779b88b6'
705709
"RolloutCompleted", // Rollout completed update to revision 2 (6c779b88b6): Completed blue-green update
706710
})
