SPDX-License-Identifier: Apache-2.0
Copyright (c) 2021 Intel Corporation
- General guidelines
- Keep output files in one place
- “block” usage
- Use command's 'creates' param when possible
- Ansible module instead of command
- Look for role in baseline-ansible
- "become: yes" usage
- Check error (failed_when) instead of using ignore_errors
- Helm values
- External sources
- File/dir permissions
- Don't remove temporary files if possible
- Data manipulation
- /tmp/ directory usage
- Set defaults for variables
- Handle network reliability
- Block/rescue for better debugging
- Task names
- Role guidelines
- Multi OS guidelines
- Playbook organization
- Redeployment
Tasks should keep their files in a well-known directory tree. It is recommended to use:
- project_dir - General location in which a role/component should create its own directory and store its files

      _git_repo_dest_harbor: "{{ project_dir }}/harbor"

- ne_helm_charts_default_dir - For all Helm charts

      _pcss_chart_dir: "{{ ne_helm_charts_default_dir }}/pccs"

    - name: copy helm chart files
      copy:
        src: "{{ item }}"
        dest: "{{ _pcss_chart_dir }}"
        directory_mode: u+rwx
        mode: u+rw
      loop:
        - Chart.yaml
        - templates

This makes it easier to track what was changed on the system.

"block" should be used if it improves readability and removes code duplication. Blocks can help to group tasks that are executed under a single condition. When there are too many blocks in a single file, consider splitting them into separate files.

    - name: update if rule changed
      block:
        - name: reload udev rules
          command: udevadm control --reload-rules
          changed_when: true
        - name: retrigger udev
          command: udevadm trigger
          changed_when: true
      become: yes
      when:
        - add_kvm_rule.changed

If a file specified by 'creates' already exists, this step will not be run.

    - name: setup bash completion
      shell:
        cmd: cmctl completion bash > /etc/bash_completion.d/cmctl
        creates: /etc/bash_completion.d/cmctl
      become: yes

For almost every command there is an Ansible module that can perform the same action. Only modules that don't require additional installation may be used. Don't use command or shell when an Ansible module is available.

Example: Instead of:

    command: service auditd restart

Use:

    service:
      name: auditd
      state: restarted

Instead of:

    command:
      cmd: make -j modules_install
      chdir: "{{ tmp_dir.path }}/quickassist/qat"

Use:

    make:
      chdir: "{{ tmp_dir.path }}/quickassist/qat"
      target: modules_install
    environment:
      "MAKEFLAGS": "-j{{ nproc_out.stdout|int + 1 }}"
    become: yes

Before writing your own role, check whether such a role already exists in baseline-ansible.

“become: yes” should be used only when absolutely necessary. It should also be set on the smallest element possible. This can be a whole role if “become: yes” is needed for all tasks in the role.
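
The scoping rule above can be sketched as follows. This is an illustrative fragment, not taken from an existing role: the rule file name is hypothetical, and the playbook entry only assumes the controller_group inventory group and the infrastructure/firewall_open_ports role mentioned elsewhere in this document.

```yaml
# Task level: only the task that writes to /etc is escalated.
- name: read kernel version
  command: uname -r
  register: kernel_version
  changed_when: false

- name: install udev rule
  copy:
    src: 99-kvm.rules                       # hypothetical file name
    dest: /etc/udev/rules.d/99-kvm.rules
    mode: a=r,u+w
  become: yes
  register: add_kvm_rule

# Role level (in a playbook): use only when every task in the role needs it.
- hosts: controller_group
  roles:
    - role: infrastructure/firewall_open_ports
      become: yes
```

Keeping escalation on the smallest element makes it obvious during review which steps actually modify privileged parts of the system.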

We sometimes check whether a feature is already installed by calling a command that can result in an error. We should narrow down the accepted error and fail in all other cases.

    failed_when: kernel_version.stdout is version('5.11', '<')

It is advised to write the values file as a template with Ansible variables and then use ansible.builtin.template to copy it to the Helm charts directory.

    - name: template and copy values.yaml
      template:
        src: "values.yaml.j2"
        dest: "{{ ne_helm_charts_default_dir }}/telegraf/values.yaml"
        mode: preserve

It is prohibited to push external sources to repositories. It is mandatory to use a specific commit, tag, or release version of an external source. If third-party components with source code are needed, the files should be downloaded (git cloned, etc.) to project_dir and the customization applied there. To customize downloaded code it is suggested to use:
- patch
- kustomize
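
A minimal sketch of fetching an external source at a pinned version is shown here. The variables _git_repo_url_harbor and _git_repo_version_harbor are hypothetical names introduced for illustration; _git_repo_dest_harbor, number_of_retries and retry_delay are the names used in other examples in this document.

```yaml
- name: clone Harbor sources at a pinned version
  git:
    repo: "{{ _git_repo_url_harbor }}"          # hypothetical variable
    dest: "{{ _git_repo_dest_harbor }}"
    version: "{{ _git_repo_version_harbor }}"   # hypothetical: a tag or commit hash, not a moving branch
  register: result
  retries: "{{ number_of_retries }}"
  until: result is succeeded
  delay: "{{ retry_delay }}"
```

Pinning to a tag or commit keeps deployments reproducible, and any customization can then be applied on top of a known tree.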
Example of kustomize:

    - op: add
      path: /spec/containers/0/volumeMounts/-
      value:
        mountPath: {{ isecl_k8s_extensions }}
        name: extendedsched
        readOnly: true
    - op: add
      path: /spec/containers/0/command/-
      value: --policy-config-file={{ isecl_k8s_extensions }}/scheduler-policy.json
    - op: add
      path: /spec/volumes/-
      value:
        hostPath:
          path: {{ isecl_k8s_extensions }}
          type:
        name: extendedsched

When setting mode for files, symbolic notation should be used. There are 4 types of owner symbols:
- a - all
- u - user
- g - group
- o - other
NOTE: Remember to set the mode for all owners! Either use a and then adjust a specific owner type:

    mode: a=rx,u+w

Or explicitly state all of them:

    mode: u=rw,g=r,o=

Temporary files make debugging a lot easier.

Instead of using sed, awk, grep, etc., try using what Ansible provides. Ansible provides Jinja2 templating, which is very powerful for manipulating data.
See the following links for more information:
- https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters.html
- https://docs.ansible.com/ansible/latest/user_guide/complex_data_manipulation.html
- https://jinja.palletsprojects.com/en/3.0.x/templates/
Example of complex variable manipulation:

    - name: build full packages list
      set_fact:
        install_dependencies_full_list:
          "{{
            install_dependencies[ansible_os_family] | default([]) | select('string') | list +
            install_dependencies | json_query(distribution_query) | default([]) +
            install_dependencies | json_query(distribution_major_version_query) | default([]) +
            install_dependencies | json_query(distribution_version_query) | default([])
          }}"
      vars:
        distribution_query: "{{ ansible_os_family }}[*].{{ ansible_distribution }}[][]"
        distribution_major_version_query: "{{ ansible_os_family }}[*].{{ ansible_distribution }}_{{ ansible_distribution_major_version }}[][]"
        distribution_version_query: "{{ ansible_os_family }}[*].{{ ansible_distribution }}_{{ ansible_distribution_version | replace('.','_') }}[][]"

Example of string manipulation:

    dest: "{{ tmp_dir.path }}/{{ item | basename | regex_replace('\\.j2$', '') }}"

Do not use the /tmp/ directory explicitly. Either create a directory in project_dir or use ansible.builtin.tempfile.

    - name: create temp directory for pip2 installation
      tempfile:
        state: directory
        prefix: pip2-
      register: pip2_temp_dir

When reading a variable which is not set by the role, always add | default().

    - role: infrastructure/git_repo_tool
      when: "platform_attestation_controller | default(False) or platform_attestation_node | default(False)"

Network issues can often occur during some operations. To handle such cases, use retries and delay.

    - name: setup repository for kernel | get repository file
      get_url:
        url: "{{ kernel_repo_url }}"
        dest: "{{ _kernel_repo_dest }}"
        mode: a=r,u+w
      register: result
      retries: "{{ number_of_retries }}"
      until: result is succeeded
      delay: "{{ retry_delay }}"

NOTE: Remember to register the variable and add until; otherwise Ansible will silently switch retries to 1.

When something fails but the failing command might not provide enough information, block/rescue can be used to provide more.

    - name: ensure that main, restricted, universe and multiverse repositories are enabled
      block:
        - name: add repository
          become: yes
          apt_repository:
            repo: "{{ item }}"
          loop:
            - "deb http://archive.ubuntu.com/ubuntu {{ ansible_distribution_release }} main universe"
      rescue:
        - name: run apt update
          apt:
            update_cache: yes
          register: error_output
          become: yes
        - name: fail run apt update
          fail:
            msg: "{{ error_output }}"

Task names should start with a lowercase letter.

    - name: open port for Grafana

Roles should not contain any checks on which host runs them. The playbook should be the place where a role is mapped to a host. NOTE: In synchronization cases delegate_to is acceptable, but if possible synchronization should be done in playbooks.
A role written for one scenario can be used in multiple scenarios with some tuning. In such a case the role should take parameters passed to it through include_role or the roles section. These variables should tune, enable or disable parts of the role.

    - name: open port for Grafana
      include_role:
        name: infrastructure/firewall_open_ports
      vars:
        fw_open_ports: "{{ grafana_open_ports }}"

Tasks in one file should have a single, clear responsibility. Otherwise, the tasks in that file should be split based on their responsibility.
If a file contains more than 8 tasks, consider splitting it into separate files.
Check whether the host state that the tasks are trying to set is already active. Example: when tasks build a library/application, they should first check whether that library/application is already installed/built in the intended version.

    - name: check current git version
      command: git --version
      register: git_version_command
      changed_when: false
    - name: set current_git_version
      set_fact:
        current_git_version: "{{ git_version_command.stdout.split()[-1] }}"
    # The version test compares version numbers; a plain '<' would compare strings.
    - name: install git from source
      include_tasks: install.yml
      when: current_git_version is version(git_version, '<')

There should be one main file which controls the flow of the role. This file should check whether the component is already installed in the desired version. All other elements in main should only include sub-tasks for particular stages of component deployment. This is also a good place to consider support for multiple OSes.
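
A main file following this layout might look like the sketch below. The file names check_installed.yml and prerequisites.yml and the fact component_installed are hypothetical, introduced only to illustrate the flow; install.yml and the OS-specific include follow the patterns used in other examples in this document.

```yaml
# tasks/main.yml (sketch): only flow control, no implementation details
- name: check if component is already installed in desired version
  include_tasks: check_installed.yml      # hypothetical; assumed to set component_installed

- name: prepare {{ ansible_os_family }}-based distro
  include_tasks: "{{ ansible_os_family | lower }}.yml"
  when: not (component_installed | default(false))

- name: install prerequisites
  include_tasks: prerequisites.yml        # hypothetical stage file
  when: not (component_installed | default(false))

- name: build and install component
  include_tasks: install.yml
  when: not (component_installed | default(false))
```

With this shape a reader can see the deployment stages at a glance, and each stage file stays small enough to keep a single responsibility.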
Roles should be written so that they also work in a multi-node deployment.
There are a couple of places where variables are defined:
- inventory/default/group_vars/(all,controller_group,edgenode_group)/10-default.yml - Variables that let the end user enable and configure a feature (targeted at the end user)
- inventory/default/host_vars/(host name)/10-default.yml - Variables for specific host. Used in very specific cases.
- role/vars - Variables that will likely not change (only changed by DEK developer)
- role/defaults - Variables that might change for example in case of version change (only changed by DEK developer)
All variables should be declared without any prefix which would suggest that the variable is private.
Use:

    public_path: '/foo/bar'

instead of:

    _private_path: '/foo/bar'

DEK supports only a limited set of operating systems, but all roles should be written with multi-OS support in mind.
Main tasks should be generic and, if needed, include specialized tasks for a given OS.
The following options are possible:
- In case there are very few OS dependencies: Use when with ansible_os_family for specific tasks

      - name: allow traffic from Kubernetes subnets in ufw
        ufw:
          rule: allow
          from: "{{ item }}"
        with_items: "{{ fw_open_subnets }}"
        when: ansible_os_family == "Debian"

- In case there are many OS dependencies: Create an OS-specific file and use include_tasks

      - name: prepare {{ ansible_os_family }}-based distro
        include_tasks: "{{ ansible_os_family | lower }}.yml"

- Create a set of variables dependent on the OS

      - name: unmask, enable and start firewall service
        systemd:
          name: "{{ firewall_service[ansible_os_family] }}"
          masked: false
          enabled: true
          state: started

- With more variables, move the variables to files and include them using include_vars with with_first_found

      - name: load OS specific vars
        include_vars: "{{ item }}"
        with_first_found:
          - files:
              - "{{ ansible_distribution|lower }}{{ ansible_distribution_major_version|lower }}.yml"
              - "{{ ansible_os_family|lower }}.yml"

Each option should be revised individually.
OS abstraction should be as general as possible. In corner cases it is possible to use ansible_distribution and ansible_distribution_major_version.
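
As a rough illustration of such a corner case, a task can be limited to one distribution release. The included file name is hypothetical; the two facts are standard Ansible facts, and ansible_distribution_major_version is a string, so it is compared against a quoted value.

```yaml
- name: apply workaround needed only on CentOS 7
  include_tasks: centos7_workaround.yml     # hypothetical file name
  when:
    - ansible_distribution == "CentOS"
    - ansible_distribution_major_version == "7"
```

Conditions this narrow are best kept rare, so that the generic ansible_os_family path stays the default.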
Roles should install needed packages using install_dependencies role (check role README for details).
install_dependencies allows specifying packages for specific distributions and versions.

    install_dependencies:
      RedHat:
        - package_for_all_redhat_os
        - CentOS:
            - packet_for_all_versions_of_centos
        - CentOS_7:
            - packet_for_centos_version_7
      Debian: []

There are sub-playbooks in DEK. Playbooks organize roles by their functionality. The split between single-node and multi-node deployment is still problematic, and the end goal should be to remove this separation through proper organization of playbooks.
Playbooks should be the place that defines where a particular role is run. Playbooks should also be used to synchronize the execution of tasks.
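
A minimal sketch of this mapping is shown below, assuming the controller_group and edgenode_group inventory groups and reusing role names from earlier examples; which role belongs to which group here is purely illustrative.

```yaml
# First play: runs on all controller hosts.
- hosts: controller_group
  roles:
    - role: infrastructure/git_repo_tool
      when: platform_attestation_controller | default(False)

# Second play: starts only after the first play has finished on every controller.
- hosts: edgenode_group
  roles:
    - role: infrastructure/firewall_open_ports
      vars:
        fw_open_ports: "{{ grafana_open_ports }}"
```

Because plays execute one after another, ordering work across host groups this way keeps synchronization in the playbook and host checks out of the roles.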
DEK does not officially support redeployment, but there are clear gains when roles and tasks are written with possible redeployment in mind.
When developing new features, redeployments happen a lot. If roles were not written with support for redeployment, a clean system would be required each time, which would add a lot of time overhead.
Writing a role to support redeployment also naturally leads to well-organized role code.
This product changes a lot, and there might be cases in the future where redeployment will be needed.