Commit 9d4deb4

Merge pull request kubernetes#2553 from chaodaiG/prow-deploy-address-comment
KEP-2539: Addressing comments from kubernetes#2540
2 parents 841863e + 70e9e00 commit 9d4deb4

2 files changed: +44 -21 lines changed

keps/sig-testing/2539-continuously-deploy-k8s-prow/README.md

Lines changed: 41 additions & 18 deletions
@@ -14,6 +14,10 @@
 - [Design Details](#design-details)
 - [Automated Merging of Prow Autobump PRs](#automated-merging-of-prow-autobump-prs)
 - [Roll Back Process](#roll-back-process)
+- [Graduation Criteria](#graduation-criteria)
+- [Alpha -> Beta Graduation](#alpha---beta-graduation)
+- [Beta -> GA Graduation](#beta---ga-graduation)
+- [Announcement](#announcement)
 - [Implementation History](#implementation-history)
 - [Alternatives](#alternatives)
 - [A new tool merges autobump PRs](#a-new-tool-merges-autobump-prs)
@@ -82,7 +86,7 @@ Shouldn’t see any change, prow breakage should be discovered by prow monitorin
 - What’s Not Changed
 - React to prow alerts and take actions.
 - What’s Changed
-- No more manual inspecting prow healthiness.
+- Decouple prow logs inspection from prow bump.
 - No more manual lgtm/approve/retest autobump PRs.
 - No more manual Slack posting.
 
@@ -94,7 +98,7 @@ Change how prow is released.
 
 ## Proposal
 
-Prow autobump PRs are automatically merged every hour, only on working hours of working days.
+Prow autobump PRs are automatically merged every 3 hours, only on working hours of working days.
 
 ### Notes/Constraints/Caveats (Optional)
 
@@ -114,36 +118,55 @@ One possible way of dealing with breaking changes, is:
 
 This approach uses tide auto-merge feature, so that no need to worry about repo requirements such as need more than one approver etc.
 
-```
-<<[UNRESOLVED (spiffxp) ]>>
-Suggestion: how to keep slack reports on each automated bump.
-<<[/UNRESOLVED]>>
-```
-
 #### Roll Back Process
 
 When prow stopped functioning after a bump, prow oncall should:
 - Stop auto-deploying by commenting `/hold` on latest autobump PR.
 - Manually create rollback PR for rolling back to known good version.
-- Manually apply the changes from rollback PR.
+- Prow is not super actively developed currently, normally there are not many
+changes between bumps, and it should be easy to identify culprit.
+- General rule of thumb is we can assume last bump was good.
+- Manually apply the changes from rollback PR by running [`prow/bump.sh`](https://github.com/kubernetes/test-infra/blob/master/prow/deploy.sh)
+
+### Graduation Criteria
+
+#### Alpha -> Beta Graduation
+
+- Low frequency continuous deployment bumped prow as expected
+- Known prow failures are captured by alerts ahead of non-oncall human
+
+#### Beta -> GA Graduation
 
-```
-<<[UNRESOLVED]>>
-Which version to roll back. This is generally not a problem due to low release volume of prow. @alvaroaleman suggested 6 hours intervals.
-<<[/UNRESOLVED]>>
-```
+- High frequency continuous deployment bumped prow as expected
+- Testgrid displays prow plank version
+
+#### Announcement
+
+Before enabling Alpha phase, this will be announced:
+- On #prow and #testing-ops channel on Slack
+- Via email to the entire [email protected] group
 
 ## Implementation History
 
 
 ## Alternatives
 
-
 #### A new tool merges autobump PRs
-This method is independent of tide, which makes sure it works on every prow instance.
+
+Instead of letting tide merge PR, an alternative idea is to created a dedicated
+continuous deploy job that takes full control:
+- Merge autobump PR on a fixed schedule
 
 ##### Pros:
-Not relying on tide, works really well with prow instances that don't have tide.
+- This method is independent of tide, which makes sure it works on every prow instance.
 
 ##### Cons:
-Probably have significantly divergent code paths for finding and approving PRs on Gerrit vs PRs on GitHub.
+- The tools is pretty similar to tide, means there will be lots of duplicated
+logic with tide.
+
+The biggest pros of this approach, is that it works better with prow instance
+that doesn't have tide support yet, for example prow that works with gerrit.
+However, there are two reasons for not going this path:
+- The current design is targeting k8s prow, which does have tide.
+- Tide will eventually come to gerrit and this can be evaluated later which
+should be done first: tide for gerrit, or continuous deploy prow with gerrit.

keps/sig-testing/2539-continuously-deploy-k8s-prow/kep.yaml

Lines changed: 3 additions & 3 deletions
@@ -6,13 +6,13 @@ owning-sig: sig-testing
 participating-sigs:
 - sig-testing
 - sig-release
-status: provisional
+status: implementable
 creation-date: 2021-02-23
 reviewers:
 - "@spiffxp" # Sig-testing chair
-- "@ameukam" # Sig-release chair
+- "@justaugustus" # Sig-release chair
 - "@alvaroaleman" # Prow approver
 approvers:
 - "@spiffxp" # Sig-testing chair
-- "@ameukam" # Sig-release chair
+- "@justaugustus" # Sig-release chair
 - "@alvaroaleman" # Prow approver
