Conversation
|
Merging this PR will trigger the following deployment actions.

Support deployments: No support upgrades will be triggered
Staging deployments
Production deployments
|
|
FYI we've had two outages recently due to regressions in Helm null handling. I noticed it first with Project Pythia (incident report), and @GeorgianaElena noticed it in EarthScope. Although merging that PR has meant that EarthScope no longer has any null problems, we still see regressions for Project Pythia:

```bash
function test-helm() {
    local version="${1:?need arg}"
    test -d outputs && rm -r outputs
    mkdir outputs
    echo "Testing $version"
    podman run --rm -it -v "$PWD/outputs:/outputs" -v "$PWD:/app" -w /app \
        "alpine/helm:$version" template helm-charts/basehub \
        --values=config/clusters/projectpythia-binder/binderhub.values.yaml \
        --output-dir /outputs >/dev/null
    # note: the ** glob needs bash with `shopt -s globstar` enabled
    grep 'hub\.jupyter\.org/node-purpose' outputs/**/*.yaml
}

versions=("3.17.0" "3.17.1" "3.20.0" "4.1.1")
for version in "${versions[@]}"; do
    test-helm "$version"
    printf "\n\n"
done
```

I dug into this, and it's caused by changes to sibling merging (of values set in a child chart). Here's a reproducer: https://github.com/agoose77/reproducer-helm-merge-changes

I've opened a bug report here: helm/helm#31919

Anecdotally, it feels like Helm has been grappling with these bugs for a while.

So what do we do? It seems like this is only a bug in this particular merging scenario. We could either update the binderhub chart to filter out these nulls, or opt to set the value only once via e.g. jsonnet. Alternatively, we wait for Helm to fix this ... but that will likely take quite a while. I'm happy to dig in and figure out the fix for Helm, but I'm not sure that's the best use of our time. |
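For context on why the cluster config relies on null at all: Helm documents that setting a value to `null` in an overriding values file deletes the corresponding key during values coalescing. Here's a minimal Python sketch of that documented behaviour (a toy model, not Helm's actual implementation; the `nodeSelector` example mirrors the basehub default but is illustrative):

```python
def coalesce(base: dict, override: dict) -> dict:
    """Toy model of Helm values coalescing: the override wins,
    nested dicts merge recursively, and an explicit None
    (YAML null) deletes the key entirely."""
    result = dict(base)
    for key, value in override.items():
        if value is None:
            result.pop(key, None)  # null deletes the default
        elif isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = coalesce(result[key], value)
        else:
            result[key] = value
    return result

# Hypothetical chart default, mirroring basehub's node-purpose selector:
chart_defaults = {"nodeSelector": {"hub.jupyter.org/node-purpose": "core"}}
# A hub's values file sets the selector to null to drop it:
hub_values = {"nodeSelector": None}

print(coalesce(chart_defaults, hub_values))  # → {} : the selector is gone
```

The regression is that in the affected sibling-merge scenario, the null no longer deletes the default, so the selector leaks back into the rendered manifests.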
|
@agoose77 would upgrading to helm cause this regression to come back? Or is the concern that there will be other bugs that we run into? |
|
I think merging via helm 4 would break at least one cluster at the moment, given the open bug report. |
|
@agoose77 I looked into that, and I agree. I also think we can't fix this with |
|
@agoose77, I believe the only place where this breaks is when we set the selector to null here, right?

infrastructure/config/clusters/projectpythia-binder/binderhub.values.yaml Lines 122 to 124 in a22201b

As a workaround, until Helm fixes this, why not stop setting this node-purpose selector in the basehub:

infrastructure/helm-charts/basehub/values.yaml Lines 45 to 46 in a22201b

and instead set it in each hub's config? I know it's extra work, but we can write a script that does it, and it shouldn't take us much time. |
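The migration script mentioned above could be quite small. A hedged sketch (the directory layout and the exact YAML shape of the selector snippet are assumptions based on the paths quoted in this thread, so verify them against the repo before running anything like this):

```python
from pathlib import Path

# YAML snippet each hub would carry explicitly once the basehub chart
# stops providing it as a default. The key layout here is an assumption.
SELECTOR_SNIPPET = (
    "jupyterhub:\n"
    "  hub:\n"
    "    nodeSelector:\n"
    "      hub.jupyter.org/node-purpose: core\n"
)

def add_selector(values_text: str) -> str:
    """Append the selector snippet unless the file already mentions the
    label (including hubs that deliberately set it to null)."""
    if "hub.jupyter.org/node-purpose" in values_text:
        return values_text  # already configured one way or another - skip
    return values_text.rstrip("\n") + "\n\n" + SELECTOR_SNIPPET

def migrate(clusters_dir: str) -> list[str]:
    """Rewrite every *.values.yaml under the clusters directory,
    returning the paths that were changed."""
    changed = []
    for path in Path(clusters_dir).glob("*/*.values.yaml"):
        original = path.read_text()
        updated = add_selector(original)
        if updated != original:
            path.write_text(updated)
            changed.append(str(path))
    return changed
```

A real version would merge into the existing `jupyterhub:` block rather than appending a new one, which naive text appending gets wrong for files that already have that top-level key.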
|
@GeorgianaElena I'm a bit nervous because it might happen to other hubs. We'd need to validate that each hub is not currently broken by the upgrade. Future hub changes could also trigger these bugs, but we'd catch them in production just as we would in local development. (Not ideal to know about a Helm bug that we might step on at any time, but manageable.) I haven't yet taken the view "we're going to do this, how do we do it safely" — let me put that hat on now. |
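Validating each hub could be mechanised along the lines of the earlier grep: render every cluster with both Helm versions via `helm template --output-dir`, then scan the rendered manifests for the selector and diff the reports. A sketch of the scanning half (the label string is taken from the grep in this thread; everything else is illustrative):

```python
from pathlib import Path

LABEL = "hub.jupyter.org/node-purpose"

def find_selector_leaks(rendered_dir: str) -> dict[str, list[int]]:
    """Scan a directory of rendered manifests and report every line
    still carrying the node-purpose selector, keyed by file path.
    Running this on outputs from two Helm versions and comparing the
    results flags any hub where the null stopped taking effect."""
    hits: dict[str, list[int]] = {}
    for path in Path(rendered_dir).rglob("*.yaml"):
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            if LABEL in line:
                hits.setdefault(str(path), []).append(lineno)
    return hits
```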
Tested it locally, and seems fine. Also bumps the kubectl version.

Removes a lot of superfluous comments that had drifted and were no longer accurate.

Once this is merged, we would need to make sure everyone upgrades their local install of helm to v4 as well.

Ref 2i2c-org#7470
|
I see this as basically us having found an issue in helm, and @agoose77 opening the issue upstream clearly helped - there's a fix coming in helm/helm#31946. We could just wait for that to land, and then deploy it. And yes, this is something we have to do, sooner rather than later :) |
|
Ok, I'm not sure we can wait for helm/helm#31946 - it looks LLM-generated, it's the author's first PR in helm, and they've recently opened a few hundred PRs across a few hundred other repos. I do believe this will eventually get fixed in helm, but that PR may not be it. Another option for us to consider is to move it to basehub/values.jsonnet and selectively apply it based on the name of the cluster. Normally I wouldn't suggest doing that, but given this is a time-limited regression, I'm ok with it. We'd also need to add a piece of documentation to any jetstream cluster saying that it needs an exception written in there. |
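The selective-apply idea could look roughly like the following. This is a Python stand-in for the proposed basehub/values.jsonnet conditional; the exception list and key layout are hypothetical, named here only to show the shape of the logic:

```python
# Clusters that must NOT get the selector (hypothetical list; in the
# jsonnet version this would be the documented per-cluster exception).
EXCEPTIONS = {"projectpythia-binder"}

def node_selector_values(cluster_name: str) -> dict:
    """Stand-in for a values.jsonnet conditional: emit the node-purpose
    selector only for non-exception clusters, so no hub ever needs to
    null it out and the buggy merge path is never exercised."""
    if cluster_name in EXCEPTIONS:
        return {}
    return {"nodeSelector": {"hub.jupyter.org/node-purpose": "core"}}
```

The win is that the default moves from "set everywhere, deleted via null where unwanted" to "set only where wanted", sidestepping null coalescing entirely.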