110 changes: 45 additions & 65 deletions wg-manifests/charter.md
@@ -1,70 +1,50 @@
# WG Manifests Charter


This charter adheres to the conventions, roles and organization management
outlined in [wg-governance].

## Scope

- Provide a catalog (centralized repository) of Kubeflow application manifests.
- Provide a catalog of third-party apps for common services.
We simply (automatically) synchronize the application and dependency manifests and then elaborately combine (configure) them for a full platform experience.
Providing a consistent and tested end-to-end multi-tenant experience is the most important task of the platform/manifests WG.
To achieve this we maintain an extensive testing suite that covers most basic scenarios users would expect from a Platform for ML orchestration.
We also provide documentation regarding, but not limited to, installation, extension, security and architecture to enable users to run their own ML Platform on Kubernetes.
Users may choose to derive from platform/manifests to create so-called distributions, which are opinionated to satisfy individual requirements.
Users may also choose to install individual components without the benefits of the platform.
Comment on lines +9 to +14
Member

Can we format this as a list of bullet points, and combine the ones which are the same idea?

This makes it easier to have discussions about each specific element of the scope, as some new elements are being proposed.


### In scope

#### Code, Binaries and Services

- Maintain tooling to automate copying manifests from upstream app repos.
- Maintain a catalog that will allow users to install Kubeflow apps and
common services easily on Kubernetes, either on the cloud or on-prem, without
depending on external cloud services or closed source solutions. Those
manifests are deployed using `kubectl` and `kustomize` and include:
Member

We also need to ensure that kustomize/kubectl remains the primary "scope" of the manifests repo, as this is what all existing distributions/users are based on.

Also, are we allowing other deployment tools beyond kubectl and kustomize (e.g. helm, argo cd, flux cd), because this is a big scope change if so?

1. A common set of manifests for the current official Kubeflow applications:
- Training Operators
- Kubeflow Pipelines (KFP)
- Notebooks
- KFServing
- Katib
- Central Dashboard
- Profile Controller
- PodDefaults Controller
1. Manifests for a set of specific common services:
- Istio
- KNative
- Dex
- Cert-Manager

#### Cross-cutting and Externally Facing Processes

##### With Application Owners

- Aid application owners in creating kustomize manifests for their application,
inside the app repo, if those don't exist already.
- Communicate with application owners to agree upon the version they want to be
included in the next Kubeflow release.

##### With Distribution Owners

- Coordinate with distribution owners, to make sure they are in-sync about the
release schedule and have time to test and bring their distributions
up-to-date.
- Enable users / distributions to install, extend and maintain Kubeflow as an end-to-end multi-tenant platform for multiple users
- This includes dependencies, security efforts and exemplary integration with popular tools and frameworks.
- Users can also install individual components without the benefits of the platform, but then they could also just directly fetch them from the WG releases.
- Synchronize the manifests between working groups and make sure via integration tests that the components work end-to-end together as a multi-tenant platform
- Release tested releases of the Kubeflow platform for downstream consumption
Member

We need to clarify what is meant by "Kubeflow Platform", because this is not defined, or just not use that term.

- We try to be compatible with the popular Kubernetes clusters (Kind, Rancher, AKS, EKS, GKE, ...)
Member

Why is this necessary for the manifest working group?

We intentionally excluded this goal from the original manifest wg charter to prevent unnecessary focus on vendor-specific issues.

Member Author (@juliusvonkohout, Mar 18, 2025)

In practice the users are going to have different Kubernetes layers (Kind, Rancher, AKS, EKS, GKE, ...) but this only covers Kubernetes, not AWS managed databases or so. We definitely try to be compatible with the most popular ones although we cannot guarantee it. Right now it works on Kind, Rancher, AKS, EKS, GKE for me and this is also what most users expect. So it is a "soft goal" we try for our users, but we do not guarantee it.

In the end, this is done by volunteers, and this is what we want to work on. This is where we see the value in contributing to Kubeflow. Anyone who wants to focus on something else is free to do whatever is sustainable and valuable for them. No one is forced to work on this.

Member

I thought the kubeflow/manifests were only meant to be a "minimum viable deployment" for testing purposes on Kind clusters?

Should we say that instead?

- We provide hints and experimental examples of how a user / distribution could integrate non-default external authentication (e.g. a company's identity provider) and popular non-default services on their own
- We in general document the installation of Kubeflow as a platform and / or individual components including common problems and architectural overviews.
- There is the evolving and not exhaustive list of dependencies for a proper multi-tenant platform installation: Istio, KNative, Dex, Oauth2-proxy, Cert-Manager, ...
- There is the evolving and not exhaustive list of applications: KFP, Trainer, Dashboard, Workspaces / Notebooks, KServe, Spark, ...
Comment on lines +20 to +29
Member

Let's try to list these as a specific list of "responsibilities" (like the current ones).

Words like "enable", "hints", and "experimental examples" are not very clear.


## Cross-cutting and Externally Facing Processes

### With Application Owners

- Aid the application owner in creating manifests (Helm, Kustomize) for his application
Member

I'm not sure that requiring the manifests WG to support the upstream manifests is sustainable.

But obviously, it is something that the individuals who are participating might also choose to do if they are so inclined.

Member Author (@juliusvonkohout, Mar 18, 2025)

You always have to keep in mind that we are volunteers. All of this is best-effort. We try it. Sometimes other working groups need help to understand, for example, the `securityContext` of a pod, since they are rather focused on the source code. Or we help them fix the kustomize 5 warnings.

Member

I get that many contributors are volunteers, but either way, the WG charters are governance documents.

It's important for the health of the WG (and project) that we set reasonable expectations for the working group members.

I am not sure it's sustainable to include the expectation of upstream manifest maintenance, this is why the original charter focused only on "aggregating manifests".

Member Author

Over the last five years it has been very sustainable and we have made great progress.

- Aid the application owner regarding security best practices
- Communicate with the application owner regarding releases and versioning

### With Users / Distribution Owners
- Distributions are opinionated derivatives of Kubeflow platform/manifests, for example replacing all databases with closed source managed databases from AWS, GKE, Azure, ...
- A distribution can be created by an arbitrary number of users / companies in private or in public by deriving from Kubeflow platform/manifests, see the definition above
Comment on lines +40 to +41
Member

If we are going to define "distribution" here, let's be as generic as possible:

  • distributions are all downstream derivatives of the kubeflow manifests which are not maintained by the kubeflow community

We could also define it, and other terms at the top of the document.

- Coordinate with "distribution owners" / users to take part in the testing of Kubeflow releases.

### Out of scope

This WG is NOT going to:
- Maintain deployment-specific tools like `kfctl`.
- Maintain distribution-specific manifests.
- Decide which applications to include in Kubeflow.
- Decide which variant of an application to include (e.g., KFP Standalone vs
KFP with Istio).
- Create and maintain one or more Kubeflow distributions.
Member

(Opening a new thread because the old one was marked as resolved)

It is critical that we still explicitly exclude the creation of a distribution by the manifest working group, as this would create a massive conflict of interest.

- Support configurations with environment-specific requirements, like special
hardware, different versions of third-party apps (e.g., Istio, KNative, etc.)
or custom OIDC providers.
- Support and promote a specific deployment tool (e.g., `kfctl`). Opinionated
deployment tools can extend the base kustomizations to create manifests that
support their methods.
- For example, people invested in `kfctl` can create overlays that enable
the use of `kfctl`'s parameter substitution, which expects a specific
folder structure (`params.env`).
- We do not support a specific deployment tool (e.g., ArgoCD, Flux)
Member

This needs to be clarified. What do we mean by "support"?

  • what is a "deployment tool"?
  • in what way are we "not supporting" deployment tools?

- The default installation shall not contain deep integration with external cloud services or closed source solutions, instead we aim for Kubernetes-native solutions and light authentication and authorization integration with external IDPs
Member

Kubeflow does not have a "default installation". What is meant by this?

Also, let's try to be specific about what is "out of scope" rather than using phrases like "aim for", which are very imprecise. Can we split this into specific bullet points?


## Roles and Organization Management

@@ -76,36 +56,36 @@ The positions of the Chairs and TLs are granted to the organizations and compani
Kubeflow's [governance model](https://github.com/kubeflow/community/blob/master/wgs/wg-governance.md)
includes a plethora of different leadership roles.
This section aims to provide a clear description of what these roles mean for
this repo, as well as set expectations from people with these roles and requirements
this repository, as well as set expectations from people with these roles and requirements
for people to be promoted in a role.

A Working Group lead is considered someone that has either the role of
**Subproject Owner**, **Tech Lead** or **Chair**. These roles were defined by trying
to provide different responsibility levels for repo owners. For the Manifests WG
we'd like to start by treating *approvers* in the root [OWNERS](https://github.com/kubeflow/manifests/blob/master/OWNERS),
to provide different responsibility levels for repository owners. For the Manifests WG
we would like to start by treating *approvers* in the root [OWNERS](https://github.com/kubeflow/manifests/blob/master/OWNERS),
as Subproject Owners, Tech Leads and Chairs. This is done to ensure we have a
simple enough model to start that people can understand and get used to. So for
the Manifests WG we only have Manifests WG Leads, which are the root approvers.

The following sections will aim to define the requirements for someone to become
a reviewer and an approver in the root OWNERS file (Manifests WG Lead).

### Manifests WG Lead Requirements
### Platform/Manifests WG Lead Requirements
Member

Let's keep the name of the WG the same, as discussed above.


The requirements for someone to be a Lead come from the processes and work required
to be done in this repo. The main goal with having multiple Leads is to ensure
to be done in this repository. The main goal with having multiple Leads is to ensure
that in case there's an absence of one of the Leads the rest will be able to ensure
the established processes and the health of the repo will be preserved.
the established processes and the health of the repository will be preserved.

With the above the main pillars of work and responsibilities that we've seen for
this repo throughout the years are the following:
1. Being involved with the release team, since the [release process](https://github.com/kubeflow/community/tree/master/releases) is tightly intertwined with the manifests repo
2. Testing methodologies (GitHub Actions, E2E testing with AWS resources etc)
3. Processes regarding the [contrib/addon](https://github.com/kubeflow/manifests/blob/master/contrib) components
4. [Common manifests](https://github.com/kubeflow/manifests/tree/master/common) maintained by Manifests WG (Istio, Knative, Cert Manager etc)
this repository throughout the years are the following:
1. Being involved with the release team, since the [release process](https://github.com/kubeflow/community/tree/master/releases) is tightly intertwined with the manifests/platform repository
2. Testing methodologies (GitHub Actions)
3. Processes regarding the [experimental](https://github.com/kubeflow/manifests/blob/master/experimental) components
4. [Platform manifests](https://github.com/kubeflow/manifests/tree/master/common) maintained irectly by Manifests/Platform WG (Istio, Knative, Cert Manager etc.)
Member

There is a typo, and also let's keep the name as Manifests WG.

5. Community and health of the project

Root approvers, or Manifests WG Leads, are expected to have expertise and be able
Root approvers, or Manifests/Platform WG Leads, are expected to have expertise and be able
Member

Let's keep the name as Manifests WG.

to drive all the above areas. Root reviewers on the other hand are expected to
have knowledge in all the above and have as a goal to grow into the approvers
role by helping with reviews throughout the project.
@@ -120,7 +100,7 @@ role by helping with reviews throughout the project.

The goal of the requirements is to quantify the main pillars that we documented
above. The high-level reasoning is that approvers should have led efforts and
have expertise in the different processes and artefacts maintained in this repo
have expertise in the different processes and artefacts maintained in this repository
as well as be invested in the community of the WG.

* Need to be a root reviewer