|
| 1 | +# WG Reliability Charter |
| 2 | + |
| 3 | +This charter adheres to the conventions described in the [Kubernetes Charter README] |
| 4 | +and uses the Roles and Organization Management outlined in [sig-governance]. |
| 5 | + |
| 6 | +[sig-governance]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/sig-governance.md |
| 7 | +[Kubernetes Charter README]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/README.md |
| 8 | + |
| 9 | +## Scope |
| 10 | + |
| 11 | +The Reliability Working Group (WG Reliability) is organized with the goal of |
| 12 | +allowing users to safely use Kubernetes for managing production workloads by |
| 13 | +ensuring Kubernetes is stable and reliable. |
| 14 | + |
| 15 | +### In Scope |
| 16 | + |
| 17 | +- Defining what reliability means for Kubernetes and how to measure it |
| 18 | +- Measuring Kubernetes reliability in tests |
| 19 | +- Introducing criteria for blocking the release if the reliability is |
| 20 | + below the bar |
| 21 | +- Building a list of end-user outages and reliability issues |
| 22 | + (if applicable with mitigations and/or workarounds) |
| 23 | +- Creating and prioritizing a list of areas that require reliability |
| 24 | + investments |
| 25 | +- Working with relevant SIGs on delivering necessary infrastructure |
| 26 | + (e.g. test frameworks) to unblock further steps |
| 27 | +- Initiating and driving cross-SIG reliability improvements |
| 28 | + |
| 29 | +### Out of Scope |
| 30 | + |
| 31 | +- Designing and executing improvements clearly falling into individual SIG |
| 32 | + responsibilities. |
| 33 | + |
| 34 | +## Special Powers |
| 35 | + |
| 36 | +The Reliability WG has the power to block feature-oriented contributions from |
| 37 | +any SIG if requested reliability-related improvements are not being addressed. |
| 38 | +Before this power can be exercised, SIG Architecture must approve the criteria |
| 39 | +proposed by this working group. |
| 40 | + |
| 41 | +Because WGs are by definition temporary, on WG Reliability retirement we will |
| 42 | +pass this responsibility to the SIG Architecture Production Readiness subproject |
| 43 | +or to SIG Architecture generally, for reassignment at the leads’ discretion. |
| 44 | + |
| 45 | +## Stakeholders |
| 46 | + |
| 47 | +Stakeholders in this working group span multiple SIGs. |
| 48 | + |
| 49 | +In the first phase (defining reliability for Kubernetes and building a list of |
| 50 | +reliability gaps and areas for investment), the following SIGs will be |
| 51 | +involved: |
| 52 | + |
| 53 | +- SIG Architecture |
| 54 | + High-level input on requirements. |
| 55 | +- SIG Scalability |
| 56 | + Input on scale test gaps and reliability issues at scale. |
| 57 | +- SIG Cluster Lifecycle |
| 58 | + Input on cluster setup and upgrade mechanics. |
| 59 | +- SIG Release |
| 60 | + Input on blocking and soak requirements. |
| 61 | +- SIG Testing |
| 62 | + Input on testing mechanics, missing frameworks, etc. |
| 63 | +- SIG * |
| 64 | + Input on reliability gaps in their areas. |
| 65 | + |
| 66 | +The group will also reach out to users and cluster operators |
| 67 | +(e.g. via surveys) to build the full picture. |
| 68 | + |
| 69 | +In the later phase (improving reliability), any SIG may be involved, |
| 70 | +depending on the findings from the initial phase. |
| 71 | + |
| 72 | +## Deliverables |
| 73 | + |
| 74 | +The artifacts the group is expected to deliver include: |
| 75 | +- A document defining what reliability means for Kubernetes and how to measure it |
| 76 | +- A list of known user outages and potential failure modes |
| 77 | +- A list of specific investments that should happen to improve reliability |
| 78 | +- A set of processes to introduce in Kubernetes to avoid degradation of |
| 79 | + reliability over time |
| 80 | + |
| 81 | +The actual investments will be owned by the corresponding SIGs. |
| 82 | + |
| 83 | +## Roles and Organization Management |
| 84 | + |
| 85 | +This working group adheres to the Roles and Organization Management outlined in |
| 86 | +[sig-governance] and opts in to updates and modifications to [sig-governance]. |
| 87 | + |
| 88 | +[sig-governance]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/sig-governance.md |
| 89 | + |
| 90 | +## Timelines and Disbanding |
| 91 | + |
| 92 | +The exact timeline for the existence of this working group is hard to predict at |
| 93 | +this time. |
| 94 | + |
| 95 | +The group will start by working on the deliverables mentioned above. Once the |
| 96 | +group is satisfied with their shape and no additional coordination on their |
| 97 | +execution is needed, we will retire the working group and pass oversight of |
| 98 | +reliability to the SIG Architecture Production Readiness (PRR) subproject. |