atheo89
Member

@atheo89 atheo89 commented Apr 8, 2025

Related to: https://issues.redhat.com/browse/RHOAIENG-22918
CI-config PR: openshift/release#63599

Description

Sync kubeflow repository with upstream 1.10.0 Release

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

mishraprafful and others added 30 commits June 24, 2024 21:58
)

* fix: notebook server images with non-root SecurityContext

Signed-off-by: Mathew Wicks <[email protected]>

* fix: set `S6_BEHAVIOUR_IF_STAGE2_FAILS` to `2`

Signed-off-by: Mathew Wicks <[email protected]>

* fix: permissions for gid 0

Signed-off-by: Mathew Wicks <[email protected]>

* fix: rstudio with random uid

Signed-off-by: Mathew Wicks <[email protected]>

* fix: capture STDERR in jupyter and code-server

Signed-off-by: Mathew Wicks <[email protected]>

---------

Signed-off-by: Mathew Wicks <[email protected]>
* Disable issue creation and update README

Signed-off-by: Andrey Velichkevich <[email protected]>

* Add kubeflow/community repo

Signed-off-by: Andrey Velichkevich <[email protected]>

---------

Signed-off-by: Andrey Velichkevich <[email protected]>
* Add Prometheus metrics to CRUD backend

Use prometheus_flask_exporter library to add Prometheus metrics to
CRUD backend. With this approach all CRUD backends will be able to
enable metrics.

Signed-off-by: Robert Gildein <[email protected]>
Signed-off-by: Robert Gildein <[email protected]>

* KF-6122 Add short doc about metrics and improve code

Add note to README.md about metrics and link the source code for more
information. Fix small issue and missing dependency for Python < 3.8.

Signed-off-by: Robert Gildein <[email protected]>

* fix getting backend version from Python < 3.8

Signed-off-by: Robert Gildein <[email protected]>

* Enable metrics by default and increase backend version to 1.2

Signed-off-by: Robert Gildein <[email protected]>

* switch to group by rule instead of path

Signed-off-by: Robert Gildein <[email protected]>

* fix yaml files

Signed-off-by: Robert Gildein <[email protected]>

---------

Signed-off-by: Robert Gildein <[email protected]>
Signed-off-by: Robert Gildein <[email protected]>
* Remove unused files

Signed-off-by: Mathew Wicks <[email protected]>

* Move `security` folder to `kubeflow/community`

Signed-off-by: Mathew Wicks <[email protected]>

---------

Signed-off-by: Mathew Wicks <[email protected]>
…eflow#7639)

* Expose metrics with prom-client

Expose default and custom metrics with prom-client [1].
Custom metrics:
- rest_http_request_duration_seconds (Histogram) - response time
- rest_http_request_total (Counter) - response count
- app_info (Gauge) - app information: app name and version

---
[1]: https://www.npmjs.com/package/prom-client

Signed-off-by: Robert Gildein <[email protected]>

* add simple test and apply suggestions

Signed-off-by: Robert Gildein <[email protected]>

* apply suggestions

Signed-off-by: Robert Gildein <[email protected]>

---------

Signed-off-by: Robert Gildein <[email protected]>
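
The three custom metrics named in the prom-client commit above correspond to the standard Prometheus metric types. As a rough illustration only, here is a stdlib-only Python sketch of how such metrics could be tracked and rendered in the Prometheus text exposition format. It is not the prom-client code from the commit: the bucket boundaries, the `RequestMetrics` class, the `app_info` label names, and the default app name and version are all assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds in seconds; the commit does not state them.
BUCKETS = (0.1, 0.5, 1.0)


class RequestMetrics:
    """Tracks request count and latency, mimicking a Counter and a Histogram."""

    def __init__(self, buckets=BUCKETS):
        self.buckets = tuple(buckets)
        # One slot per finite bucket plus a final slot for +Inf.
        self.bucket_counts = [0] * (len(self.buckets) + 1)
        self.total = 0
        self.duration_sum = 0.0

    def observe(self, seconds):
        """Record one finished request and its response time."""
        self.total += 1
        self.duration_sum += seconds
        # bisect_left finds the first bucket whose upper bound is >= seconds.
        self.bucket_counts[bisect_left(self.buckets, seconds)] += 1

    def render(self, app_name="example-app", app_version="0.0.0"):
        """Return the metrics in Prometheus text exposition format."""
        name = "rest_http_request_duration_seconds"
        lines = [f"# TYPE {name} histogram"]
        cumulative = 0
        labels = [str(b) for b in self.buckets] + ["+Inf"]
        for le, count in zip(labels, self.bucket_counts):
            cumulative += count  # histogram buckets are cumulative
            lines.append(f'{name}_bucket{{le="{le}"}} {cumulative}')
        lines.append(f"{name}_sum {self.duration_sum}")
        lines.append(f"{name}_count {self.total}")
        lines.append("# TYPE rest_http_request_total counter")
        lines.append(f"rest_http_request_total {self.total}")
        # Info-style gauge: constant value 1, details carried in labels (assumed names).
        lines.append("# TYPE app_info gauge")
        lines.append(f'app_info{{name="{app_name}",version="{app_version}"}} 1')
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    metrics = RequestMetrics()
    for seconds in (0.05, 0.3, 2.0):
        metrics.observe(seconds)
    print(metrics.render(), end="")
```

The histogram buckets are cumulative, so the `+Inf` bucket always equals the total request count; a real client library handles this bookkeeping, along with labels and registries.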
* Improve README and GitHub issue templates

Signed-off-by: Mathew Wicks <[email protected]>

* Use italics instead of bold

Signed-off-by: Mathew Wicks <[email protected]>

* Implement suggestions

Signed-off-by: Mathew Wicks <[email protected]>

* Remove redundant link on "Kubeflow Components" and make into sentence separate.

Signed-off-by: Mathew Wicks <[email protected]>

* Add manifests docs link

Signed-off-by: Mathew Wicks <[email protected]>

---------

Signed-off-by: Mathew Wicks <[email protected]>
* feat(dashboard): add Habana Gaudi as an option for GPUs

Signed-off-by: Tuomas Katila <[email protected]>

* feat(notebooks): add Intel Gaudi pytorch containers

Signed-off-by: Mathew Wicks <[email protected]>
Signed-off-by: Tuomas Katila <[email protected]>

* feat(notebooks): add gaudi jupyter containers to be built and published

Signed-off-by: Tuomas Katila <[email protected]>

* feat(notebooks): add a note about hugepages with Gaudi workloads

Signed-off-by: Tuomas Katila <[email protected]>

---------

Signed-off-by: Tuomas Katila <[email protected]>
Signed-off-by: Mathew Wicks <[email protected]>
* feat: update example notebook images

Signed-off-by: Mathew Wicks <[email protected]>

* Remove incorrect flow-chart png

Signed-off-by: Mathew Wicks <[email protected]>

* Update `kfp` python package to `2.9.0`

Signed-off-by: Mathew Wicks <[email protected]>

* Allow RStudio image to start outside Kubernetes

Signed-off-by: Mathew Wicks <[email protected]>

* Put note about why RStudio has WARN log-level

Signed-off-by: Mathew Wicks <[email protected]>

* Allow users to specify their own Reticulate python

Signed-off-by: Mathew Wicks <[email protected]>

* Fix installing packages on RStudio

Signed-off-by: Mathew Wicks <[email protected]>

* Add `r-shiny` to base RStudio image

Signed-off-by: Mathew Wicks <[email protected]>

---------

Signed-off-by: Mathew Wicks <[email protected]>
* using GET instead of POST method to handle logout

Signed-off-by: Tarek Abouzeid <[email protected]>

* adding logoutURL template to href

Signed-off-by: Tarek Abouzeid <[email protected]>

* refactor logout

Signed-off-by: Tarek Abouzeid <[email protected]>

---------

Signed-off-by: Tarek Abouzeid <[email protected]>
* upgrade node from 12 to 16

Signed-off-by: tariq-hasan <[email protected]>

* replace cypress with playwright for e2e tests in jupyter

Signed-off-by: tariq-hasan <[email protected]>

* update base image for node

Signed-off-by: tariq-hasan <[email protected]>

* update build scripts for tensorboard

Signed-off-by: tariq-hasan <[email protected]>

* update build scripts for jupyter

Signed-off-by: tariq-hasan <[email protected]>

---------

Signed-off-by: tariq-hasan <[email protected]>
Signed-off-by: Francisco Javier Arceo <[email protected]>
…er-2025

Updating OWNERS for 2025 KSC members
* Updating to include emeritus_approvers

Signed-off-by: Francisco Javier Arceo <[email protected]>

* removing one line break

Signed-off-by: Francisco Javier Arceo <[email protected]>

---------

Signed-off-by: Francisco Javier Arceo <[email protected]>
* Update tag for v1.10.0-rc.0

Signed-off-by: Kimonas Sotirchos <[email protected]>

* Use Ubuntu 22.04 for building multi-arch

Refs kubeflow#7679 (comment)

Signed-off-by: Kimonas Sotirchos <[email protected]>

---------

Signed-off-by: Kimonas Sotirchos <[email protected]>
…beflow#7669)

* chore: Add securitycontext for PSS PoC

Signed-off-by: biswassri <[email protected]>

* mathew: explicitly use non-root base images and set user/group in Dockerfile

Signed-off-by: Mathew Wicks <[email protected]>

---------

Signed-off-by: biswassri <[email protected]>
Signed-off-by: Mathew Wicks <[email protected]>
Co-authored-by: Mathew Wicks <[email protected]>
(cherry picked from commit a42250e)
Signed-off-by: Mathew Wicks <[email protected]>
(cherry picked from commit 09c8ee3)
* ci: always trigger tests on release

Signed-off-by: Mathew Wicks <[email protected]>

* ci: update release guide

Signed-off-by: Mathew Wicks <[email protected]>

* ci: add workflow to approve runs on `ok-to-test`

Signed-off-by: Mathew Wicks <[email protected]>

* ci: add workflow to enforce semantic PR titles

Signed-off-by: Mathew Wicks <[email protected]>

---------

Signed-off-by: Mathew Wicks <[email protected]>
(cherry picked from commit 886466b)
Signed-off-by: Mathew Wicks <[email protected]>
(cherry picked from commit 828dace)
* ci: downgrade qemu, fix arm64 build segfaults

Signed-off-by: Mathew Wicks <[email protected]>

* noop change to trigger notebook images to build

Signed-off-by: Mathew Wicks <[email protected]>

---------

Signed-off-by: Mathew Wicks <[email protected]>
(cherry picked from commit b7f1b5c)
Signed-off-by: Tuomas Katila <[email protected]>
(cherry picked from commit 348b082)
Signed-off-by: Mathew Wicks <[email protected]>
(cherry picked from commit bdc9bb2)
* chore: migrate docker images to ghcr

Signed-off-by: Eder Ignatowicz <[email protected]>

* adding dash for central dashboard

Signed-off-by: Eder Ignatowicz <[email protected]>

* update the spawner images

Signed-off-by: Eder Ignatowicz <[email protected]>

* try to fix qemu

Signed-off-by: Eder Ignatowicz <[email protected]>

* fix notebook-servers GHCR image path

Signed-off-by: Mathew Wicks <[email protected]>

* update example-notebook-servers README

Signed-off-by: Mathew Wicks <[email protected]>

* fix missed `IMG` updates

Signed-off-by: Mathew Wicks <[email protected]>

* remove remaining cases of `kubeflownotebookswg`

Signed-off-by: Mathew Wicks <[email protected]>

* fix typos

Signed-off-by: Mathew Wicks <[email protected]>

* remove `DOCKER_USER` env-var from workflows

Signed-off-by: Mathew Wicks <[email protected]>

---------

Signed-off-by: Eder Ignatowicz <[email protected]>
Signed-off-by: Mathew Wicks <[email protected]>
Co-authored-by: Mathew Wicks <[email protected]>

(cherry picked from commit 4df4c97)
Signed-off-by: Mathew Wicks <[email protected]>
@openshift-ci openshift-ci bot requested review from andyatmiami and harshad16 April 8, 2025 11:54
@openshift-ci openshift-ci bot added the size/xxl label Apr 8, 2025
@openshift-ci openshift-ci bot added size/xxl and removed size/xxl labels Apr 8, 2025
@atheo89 atheo89 changed the title from "Sync kubeflow repository with upstream 1.10.0 Release" to "Sync kubeflow repository with upstream 1.10.0 release" Apr 8, 2025
@openshift-ci openshift-ci bot added size/xxl and removed size/xxl labels Apr 8, 2025
@jstourac
Member

jstourac commented Apr 8, 2025

I checked the actual changes between 1.9 and 1.10 very briefly, and this sync seems legitimate for our case. So if the CI is happy, I'll be happy too.

Thank you.

/lgtm

@@ -10,7 +10,7 @@ spec:
     spec:
       containers:
       - name: kube-rbac-proxy
-        image: gcr.io/kubebuilder/kube-rbac-proxy:v0.4.0
+        image: quay.io/brancz/kube-rbac-proxy:v0.4.0
Member

brancz?

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 33.15%. Comparing base (03fbca3) to head (253d64c).

❗ There is a different number of reports uploaded between BASE (03fbca3) and HEAD (253d64c). Click for more details.

HEAD has 1 upload less than BASE
Flag    BASE (03fbca3)    HEAD (253d64c)
        2                 1
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #582       +/-   ##
===========================================
- Coverage   56.32%   33.15%   -23.17%     
===========================================
  Files           9        2        -7     
  Lines        2356      941     -1415     
===========================================
- Hits         1327      312     -1015     
+ Misses        928      598      -330     
+ Partials      101       31       -70     
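
The percentages in the coverage diff above can be reproduced from the raw counts: the line total is hits plus misses plus partials, and coverage is hits divided by that total. The short Python sketch below redoes the arithmetic. The helper name `coverage_bp` is hypothetical, and truncating (rather than rounding) to two decimals is an assumption inferred from the reported 33.15%, not documented Codecov behaviour.

```python
def coverage_bp(hits, misses, partials):
    """Return coverage in hundredths of a percent, truncated (assumed, see lead-in)."""
    lines = hits + misses + partials
    # Integer arithmetic avoids floating-point surprises near the truncation point.
    return hits * 10000 // lines


def as_percent(bp):
    """Format hundredths of a percent as a percentage string."""
    return f"{bp / 100:.2f}%"


base = coverage_bp(hits=1327, misses=928, partials=101)  # 2356 lines on main
head = coverage_bp(hits=312, misses=598, partials=31)    # 941 lines on the PR head

print("base:", as_percent(base))          # base: 56.32%
print("head:", as_percent(head))          # head: 33.15%
print("diff:", as_percent(head - base))   # diff: -23.17%
```

Most of the drop comes from the shrinking denominator and numerator together: with one fewer report uploaded for HEAD, 7 of 9 files vanish from the comparison, so the figure says little about the code changed in this PR.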


@atheo89
Member Author

atheo89 commented Apr 8, 2025

/override "Red Hat Konflux / kubeflow-enterprise-contract / pr group"


openshift-ci bot commented Apr 8, 2025

@atheo89: Overrode contexts on behalf of atheo89: Red Hat Konflux / kubeflow-enterprise-contract / pr group

In response to this:

/override "Red Hat Konflux / kubeflow-enterprise-contract / pr group"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jiridanek
Member

jiridanek commented Apr 8, 2025

/override "Red Hat Konflux / odh-notebook-controller-on-pull-request"

That should be an infra issue with building on non-amd64 arches, which is also haunting our devops team w.r.t. releasing.


openshift-ci bot commented Apr 8, 2025

@jiridanek: Overrode contexts on behalf of jiridanek: Red Hat Konflux / odh-notebook-controller-on-pull-request

In response to this:

/override "Red Hat Konflux / odh-notebook-controller-on-pull-request"

That should be an infra issue with building on non-amd64 arches, which is haunting our devops team w.r.t. releasing.


@jiridanek
Member

/lgtm

nothing really changed

@atheo89
Member Author

atheo89 commented Apr 8, 2025

Thank you all! I will merge this in so we can move forward with the rest.

/approve


openshift-ci bot commented Apr 8, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: atheo89

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Apr 8, 2025
@jiridanek
Member

/override "Red Hat Konflux / kubeflow-enterprise-contract / pr group"

this is a weird one


openshift-ci bot commented Apr 8, 2025

@jiridanek: /override requires failed status contexts, check run or a prowjob name to operate on.
The following unknown contexts/checkruns were given:

  • Red Hat Konflux / kubeflow-enterprise-contract / pr group

Only the following failed contexts/checkruns were expected:

  • ci/prow/images
  • ci/prow/kf-notebook-controller-pr-image-mirror
  • ci/prow/kf-notebook-controller-unit
  • ci/prow/odh-notebook-controller-e2e
  • ci/prow/odh-notebook-controller-pr-image-mirror
  • ci/prow/odh-notebook-controller-unit
  • pull-ci-opendatahub-io-kubeflow-main-images
  • pull-ci-opendatahub-io-kubeflow-main-kf-notebook-controller-pr-image-mirror
  • pull-ci-opendatahub-io-kubeflow-main-kf-notebook-controller-unit
  • pull-ci-opendatahub-io-kubeflow-main-odh-notebook-controller-e2e
  • pull-ci-opendatahub-io-kubeflow-main-odh-notebook-controller-pr-image-mirror
  • pull-ci-opendatahub-io-kubeflow-main-odh-notebook-controller-unit
  • tide

If you are trying to override a checkrun that has a space in it, you must put a double quote on the context.
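
The quoting rule in the bot's message can be illustrated with a small Python sketch. This is not Prow's implementation (Prow is written in Go); the helper names `parse_override` and `unknown_contexts` are hypothetical, and shell-style tokenizing via `shlex` is only an approximation of how a quoted context with spaces stays in one piece.

```python
import shlex


def parse_override(comment_line):
    """Split a '/override ...' comment line into the contexts it names."""
    tokens = shlex.split(comment_line)
    if not tokens or tokens[0] != "/override":
        return []
    return tokens[1:]


def unknown_contexts(requested, failed):
    """Return the requested contexts that are not currently failing."""
    return [context for context in requested if context not in failed]


# A subset of the failed contexts listed in the bot's reply above.
failed = {"ci/prow/images", "ci/prow/odh-notebook-controller-e2e", "tide"}

quoted = parse_override(
    '/override "Red Hat Konflux / kubeflow-enterprise-contract / pr group"'
)
unquoted = parse_override(
    "/override Red Hat Konflux / kubeflow-enterprise-contract / pr group"
)

print(len(quoted))    # one context, kept whole by the double quotes
print(len(unquoted))  # eight separate tokens without the quotes
print(unknown_contexts(quoted, failed))
```

In this thread the second `/override` attempt was rejected even though the context was quoted correctly: it was not in the list of failed contexts the bot expected at that point, which is the case `unknown_contexts` models.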

In response to this:

/override "Red Hat Konflux / kubeflow-enterprise-contract / pr group"

this is a weird one


@openshift-merge-bot openshift-merge-bot bot merged commit 7245828 into opendatahub-io:main Apr 8, 2025
21 of 22 checks passed
jstourac pushed a commit to jstourac/kubeflow that referenced this pull request Aug 27, 2025
…r digest to 11db23b (opendatahub-io#582)

Signed-off-by: konflux-internal-p02 <170854209+konflux-internal-p02[bot]@users.noreply.github.com>
Co-authored-by: konflux-internal-p02[bot] <170854209+konflux-internal-p02[bot]@users.noreply.github.com>