
post-start silently ignores uncordon failures: should fail fast instead #78

@gberche-orange

Description


Expected behavior

As an operator
In order to notice failures to uncordon nodes at startup
And in order to respect the bosh canary update process and avoid propagating errors to the whole cluster
I need the bosh job status to surface those failures

Current behavior

On new strimzi coab instances, the smoke test fails: fresh service instances fail to deploy.

We suspect that the k3s server takes time to start and become ready to accept k8s api requests, so the uncordon of the master node in the post-start bosh action runs too early. As a result, the uncordon request fails.

However, post-start silently ignores uncordon failures:

```shell
#uncordon
/var/vcap/packages/k3s/k3s kubectl --kubeconfig=/var/vcap/data/k3s-agent/drain-kubeconfig.yaml uncordon $K3S_NODE_NAME \
  >> $JOB_DIR/post-start.log \
  2>> $JOB_DIR/post-start-stderr.log
```
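A minimal fail-fast sketch of what the issue asks for, assuming the post-start script may exit non-zero to fail the bosh job (`run_or_fail` is a hypothetical helper, and `true` stands in for the real kubectl uncordon call):

```shell
#!/bin/sh
# Hypothetical helper: run a command and, instead of silently ignoring a
# failure, log it and exit non-zero so bosh marks post-start as failed.
run_or_fail() {
  if ! "$@"; then
    echo "post-start: command failed: $*" >&2
    exit 1
  fi
}

# Stub standing in for:
#   k3s kubectl ... uncordon $K3S_NODE_NAME >> $JOB_DIR/post-start.log ...
run_or_fail true
echo "uncordon succeeded"
```

Because the script's exit status propagates to bosh, a failed uncordon would then stop the canary instead of spreading through the deployment.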

```shell
#wait for k8s api to be available, wait for 5 min max
<% if_p('k3s.master_vip_api') do |vip| %>
timeout 300 sh -c 'until nc -z <%= vip %> 6443; do sleep 1; done'
/var/vcap/packages/k3s/k3s kubectl --kubeconfig=/var/vcap/store/k3s-server/kubeconfig.yml get pods --all-namespaces
<% end %>
#uncordon
/var/vcap/packages/k3s/k3s kubectl --kubeconfig=/var/vcap/store/k3s-server/kubeconfig.yml uncordon $K3S_NODE_NAME
```
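One way to address the suspected startup race would be to retry the uncordon until it succeeds or a deadline passes, and only then fail. A sketch under those assumptions (`retry_until` is a hypothetical helper; `true` stands in for the real kubectl call):

```shell
#!/bin/sh
# Hypothetical retry loop: re-run a command every second until it succeeds
# or the deadline (first argument, in seconds) passes, then report failure.
retry_until() {
  deadline=$(( $(date +%s) + $1 ))
  shift
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "post-start: giving up on: $*" >&2
      return 1
    fi
    sleep 1
  done
}

# Stub in place of: k3s kubectl ... uncordon "$K3S_NODE_NAME"
retry_until 5 true || exit 1
echo "uncordon confirmed"
```

Combined with a non-zero exit on failure, this tolerates a slow k3s api startup while still surfacing a genuine uncordon failure to bosh.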

As a result, we see additional downstream side effects of the uncordon failures (typically timeouts while completing kustomizations, i.e. running k8s jobs/workloads).

Metadata

Labels: bug