Usage of Ansible for configurations and Splitting of Day2 activities #816

dan1el-k · 2023-05-22T23:01:26Z

Motivation

While preparing PR #778, I ran into a lot of SSH timeouts as well as connection drops from Hetzner. These caused the Terraform provisioners to fail, so the whole run had to be restarted every time and, in my case, never finished with "one click".

This is cumbersome and time-consuming for end users, because they have to dig deep to find the potential root cause of the installation failures.

I should also mention that most of the failures happened in the final steps, at the "null resources kustomization" in init.tf, which deploys the Kubernetes manifests for Day1 and some of the Day2 resources.

Proposals

1. Use a more mature tool (like Ansible) for the configuration and let Terraform just do the infrastructure creation and management

The Terraform docs themselves don't recommend relying heavily on provisioners (https://developer.hashicorp.com/terraform/language/resources/provisioners/file), only as a last resort:

Important: Use provisioners as a last resort. There are better alternatives for most situations. Refer to Declaring Provisioners for more details.

I would propose using Ansible

as a "glue" to orchestrate the whole installation,
to steer the Terraform executions,
to use the Terraform outputs to populate the Ansible inventory,
to do the whole configuration of the nodes,
to do the actual installation of k3s,
to handle reboot orchestration and so on.

It is much more mature when it comes to configuration activities, remote node management and SSH connectivity, and it even allows retries :).

Yes, it would be another tool and dependency. But we could put Ansible and Terraform together in a nice container image and let the user just run the container locally to do the installation. That would also improve the stability of the installation, as we could make sure that a specific Terraform and Ansible version works and was tested for a given release.
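As a rough illustration of "use the Terraform outputs to populate the Ansible inventory", here is a minimal Python sketch. It assumes the module exposes two list outputs named `control_plane_ips` and `agent_ips` (hypothetical names, not taken from the project) and converts the parsed result of `terraform output -json` into the JSON structure that an Ansible inventory script returns for `--list`. The group names and the `ansible_user` default are likewise assumptions.

```python
import json


def build_inventory(tf_outputs, ssh_user="root"):
    """Turn parsed `terraform output -json` data into an Ansible inventory dict."""

    def ips(name):
        # Terraform wraps every output as {"value": ..., "type": ..., "sensitive": ...}.
        return list(tf_outputs.get(name, {}).get("value", []))

    control_planes = ips("control_plane_ips")  # hypothetical output name
    agents = ips("agent_ips")                  # hypothetical output name

    return {
        "control_planes": {"hosts": control_planes},
        "agents": {"hosts": agents},
        # Parent group so that shared variables are defined only once.
        "k3s_cluster": {
            "children": ["control_planes", "agents"],
            "vars": {"ansible_user": ssh_user},
        },
        # An empty _meta.hostvars tells Ansible not to query each host separately.
        "_meta": {"hostvars": {}},
    }
```

An inventory script built around this would read the output of `terraform output -json`, pass it through `json.loads`, and print `json.dumps(build_inventory(...))`.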

2. Create an "install.sh" script

Currently, users need to execute multiple chained steps in order to create a cluster:

Install packer, terraform, kubectl
Create an SSH keypair (if it doesn't exist already)
Create MicroOS snapshots using packer
terraform init --upgrade
terraform validate
terraform apply -auto-approve
Restart terraform apply -auto-approve in case of failures

If we go for Proposal 1, most of these could be covered by Ansible, but it would still be nice to have a single "install.sh" that starts the container image and does the whole installation with one click.

Without Proposal 1, I would at least bundle the snapshot creation and the Terraform execution in a script that greps the Terraform output for error messages and, if needed, restarts the installation automatically.
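The "grep the output and restart automatically" idea could look roughly like the following, sketched in Python. The list of transient-failure markers is an assumption for illustration and would need to be matched against the messages Terraform actually prints; the function names are hypothetical as well. Only failures that look transient are retried, so a genuine configuration error still stops the run immediately.

```python
import subprocess
import time

# Assumed markers for transient failures; real Terraform messages may differ.
TRANSIENT_MARKERS = ("timeout", "connection reset", "connection refused")


def is_transient(output):
    """Return True if the command output looks like a connection problem."""
    lowered = output.lower()
    return any(marker in lowered for marker in TRANSIENT_MARKERS)


def run_step(cmd):
    """Run one command and return (exit_code, combined stdout/stderr)."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.returncode, proc.stdout


def apply_with_retries(cmd, attempts=3, delay=30.0, run=run_step, sleep=time.sleep):
    """Re-run cmd while it fails with a transient error; return the attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        code, output = run(cmd)
        if code == 0:
            return attempt
        # Give up on non-transient errors or once the retry budget is used up.
        if attempt == attempts or not is_transient(output):
            raise RuntimeError(f"{cmd[0]} failed on attempt {attempt}:\n{output}")
        sleep(delay)
```

A wrapper would then call, for example, `apply_with_retries(["terraform", "apply", "-auto-approve"])` after the snapshot creation step.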

3. Split of Day1 and Day2 activities

Right now, after the nodes are provisioned and the k3s cluster is set up, a bunch of kustomizations get executed at the end.

We can basically group them in:

Ingress configurations
CNI driver
CSI driver
System upgrade and reboot configurations

As I wrote at the beginning, this is where the installer fails most of the time because of connection issues (at least in my case). At a minimum, I would split the "null_resources" into separate ones based on category and adapt the triggers accordingly, to reduce the failure surface.

Furthermore, I would argue that things like ingress configuration are already a Day2 activity, which I personally always take care of via GitOps (using ArgoCD) anyway. In my opinion, the installation of a new cluster should cover only the Day1 activities, i.e. the bare minimum required to run the k3s cluster on that infrastructure.

So I would definitely agree to treat the following as Day1 activities:

CNI driver
CSI driver

The system-upgrade-controller and kured are a bit of a gray area, but I would also see them as part of the base features of this project.

However, I would propose to kick out the ingress, or at least make it configurable so that none is deployed at all, in which case the provisioners for it would not need to run either.
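To make the proposed split concrete, here is a small Python sketch of the selection logic. The option name `ingress`, the component labels, and the `ingress:<name>` format are all hypothetical and only illustrate the idea that Day1 components are always deployed while the ingress is opt-in.

```python
from dataclasses import dataclass
from typing import List, Optional

# Day1: the bare minimum to run k3s on the infrastructure.
DAY1 = ("cni", "csi")
# Gray area, but proposed above as base features of the project.
BASE_FEATURES = ("system-upgrade-controller", "kured")


@dataclass(frozen=True)
class InstallOptions:
    # Hypothetical setting: None means "deploy no ingress at all".
    ingress: Optional[str] = None


def components_to_deploy(opts: InstallOptions) -> List[str]:
    """List the components the installer would apply for the given options."""
    selected = list(DAY1 + BASE_FEATURES)
    if opts.ingress is not None:
        # Only in this case would the ingress provisioners run.
        selected.append(f"ingress:{opts.ingress}")
    return selected
```

With the default options, no ingress entry is returned, so nothing ingress-related would need to be provisioned; a user who still wants one would set, for example, `InstallOptions(ingress="traefik")`.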

What do you think about these ideas? Let's brainstorm together. 👍

mysticaltech · 2025-08-04T14:58:06Z

Maintainer

Using Ansible for "Day 2" activities is indeed appealing and should be considered.
