Ansible AT NWZ - Ansible Basics

Ansible is an open source automation tool for orchestration and general configuration and administration of computers (this and more at https://de.wikipedia.org/wiki/Ansible ). It offers a number of advantages over administration with your own script collection:

  • Infrastructure is configured declaratively ('infrastructure as code'). Especially in combination with managing the configuration files in Git repositories, this enables very reliable deployment: the correctness and security of configurations can be checked systematically and automatically before use, all changes are logged and reported, the four-eyes principle is easier to implement, etc.
  • This also makes it easier to treat client computers like 'immutable infrastructure': computers can be replaced or swapped more easily because all relevant state is stored in the configuration files. (The computer itself is transformed from an irreplaceable 'pet' into expendable 'livestock' ...)

  • Idempotency, i.e. the ability to apply instructions repeatedly without consequences once the target state has been reached, can be exploited further: ideally, you write your most important playbooks and roles so that they are as free of side effects as possible (restarts only when really necessary, as much as possible solved via Ansible builtins rather than shell commands, etc.). Then the Ansible output also shows whether systems are still in a good state, as only a few changes are reported.

 

To get started: (The following is based on an article: “Automatisierung mit Ansible”, c't Cloud 2021 special issue. Many thanks to Julian Jeggle for helpful additions).

Ansible is used for the automatic configuration of operating systems, applications, etc., as an alternative to e.g. CFEngine or Puppet. It is consistently modularized and uses standard communication paths (SSH or Windows Remote Management WinRM), i.e. no agents need to be installed on clients. Any system with the Ansible package including modules can be used as a control host, i.e. no dedicated machine is required. Only a directory for the configuration files is required on the control host; however, it is better to store these in a Git repository, at the University of MS at https://zivgitlab.uni-muenster.de/, see also https://confluence.uni-muenster.de/pages/viewpage.action?pageId=26673453

Ansible is then based on the following structural elements:

1. Inventory: directory of the target hosts, in the simplest case a text file in INI style (YAML is also possible if you want to use the same markup language as for the playbooks):

pc0.example.com
[group1]
pc1.example.com
pc2.example.com

[group2]
pc3.example.com
pc4.example.com

Subgroups can also be formed, variables defined and parameters specified. An inventory can also be generated dynamically with scripts. Ansible creates a configuration skeleton under /etc/ansible/; user-specific .ansible.cfg files in the home directory are also possible.
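If you prefer YAML throughout, the same inventory might look like this (a sketch; the host and group names are taken from the INI example above):

all:
  hosts:
    pc0.example.com:
  children:
    group1:
      hosts:
        pc1.example.com:
        pc2.example.com:
    group2:
      hosts:
        pc3.example.com:
        pc4.example.com: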

2. Tasks: Tasks for the target hosts

For communication, SSH keys without password request should be set up for an unprivileged user on the target hosts. (This account should only be used for Ansible.) The name (remote_user) and the private key (private_key_file) of this user can be entered in the [defaults] section of .ansible.cfg. Temporary root rights should be granted to this user via sudo.

Note 1: This is not necessary for systems with a central user administration such as the University of MS, as the ansible user on the target system will usually be identical to the ansible user on the source system. If it really is a separate user (e.g. on a server that is not in the domain), the user name is probably specific to the respective playbook and should then either be directly in the playbook or possibly in a project-specific ansible.cfg.

Note 2: In principle, it is also possible to work directly as root and without a key; if the sshpass package is installed on the control host, you can log in with -k or --ask-pass using a password. Trying out Ansible therefore requires almost no effort. If the local user has no (sudo) privileges, however, only correspondingly unprivileged tasks are possible. This could still be useful if, for example, you want to roll out your dotfiles via Ansible in order to be able to switch computers quickly without having to reconfigure everything first.
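A dotfile rollout of this kind could look roughly like the following sketch (file names and paths are hypothetical):

- name: Roll out dotfiles (no root rights required)
  hosts: all
  tasks:
    - name: Copy .bashrc into the user's home directory
      ansible.builtin.copy:
        src: dotfiles/bashrc
        dest: "{{ ansible_env.HOME }}/.bashrc"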

You now tell Ansible which target hosts should execute which tasks (wildcards are allowed). This can be done either via ad-hoc commands or via playbooks, which can contain multi-step instructions. The following command lists which clients fall under a pattern such as all or group1:

ansible all|group1 -i inventory --list-hosts

The following command executes a shell command on the matching hosts via the command module:

ansible <pattern> -i inventory -m command -a "shellcommand" -o

(-m specifies the Ansible module, -a the options for this module, -o single-line output per host)

There are Ansible modules for all typical tasks. When executing Ansible commands, only reaching the defined target state counts: by default, Ansible does not report what had to be done to get there or whether errors occurred along the way that required detours, only whether the target state was reached or not. The first point of contact is the module index of the Ansible documentation, which is organized into so-called collections (with many examples for the respective modules).

https://docs.ansible.com/ansible-core/devel/index.html -> https://docs.ansible.com/ansible/latest/collections/index.html

 

3. Playbooks: Structured collections of tasks in YAML format

The following playbook installs the package <package> on all hosts of the group group1 via the package manager yum:

---
- name: Install <'trivial name' of the package>
  hosts: group1
  become: yes

  tasks:
  - name: Install Latest Version
    ansible.builtin.yum:
      name: <package>
      state: present

(--- -> starts the YAML document; become -> execute the tasks with elevated (root) rights; multiple tasks are possible)
Note 1: If sudo requires a password (usually the default), you have to call ansible-playbook with -K or --ask-become-pass.
Note 2: It is recommended to use complete module names, so this is not just yum (which would also work), but ansible.builtin.yum.

The following task activates and starts the service via the Ansible module service (see documentation for options):

---

- name: Enable and Start <service>
  ansible.builtin.service:
    name: <service>
    enabled: true
    state: started

Playbooks are executed with ansible-playbook -i inventory <playbook>.yml (possibly with -k and/or -K if passwords need to be passed). The Ansible user can be defined in the playbook via remote_user.

Many playbooks will require variables. The content of a variable is referenced via {{ variable }}. Variables can be defined on the command line, in the playbook, in the inventory or in special files. Best practice is to use host and group variables: for this purpose, the directories host_vars and group_vars (possibly also common_vars) are created at the level of the inventory file. Files such as all, group1, etc. with the corresponding variable definitions (again in YAML format) can then be stored in the group_vars directory. The following playbook is an example of the use of variables:

- name: create a file
  hosts: all
  become: yes

  tasks:
  - name: create file
    copy:
      content: text including {{ variable }}
      dest: /<path>
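The variable used above could then be defined, for example, in a file group_vars/all (a sketch; the value is a hypothetical example):

---
variable: "Hello from group_vars"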

4. Facts: system information

ansible <pattern> -m setup displays all facts of the matching hosts. In playbooks, individual facts (e.g. the default IP) can be referenced as variables (here via the debug module for an output):

- ansible.builtin.debug:
    msg: "{{ factname }}"

Facts can also be defined in /etc/ansible/facts.d/ and/or collected by modules, e.g. the module ansible.builtin.package_facts collects information about the installed packages.
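For example, package facts can be gathered and then queried like any other fact (a sketch; the package name openssl is a hypothetical example):

- name: Gather facts about installed packages
  ansible.builtin.package_facts:
    manager: auto

- name: Show the installed openssl version(s)
  ansible.builtin.debug:
    msg: "{{ ansible_facts.packages['openssl'] | map(attribute='version') | list }}"
  when: "'openssl' in ansible_facts.packages"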

The following page of the documentation is helpful for dealing with Facts: https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_vars_facts.html

In tasks, facts can also be used in when statements (the syntax of Jinja2 is used for this, as already mentioned above for the use of variables and elsewhere in Ansible; its control structures largely correspond to those of Python), e.g.:

when: ansible_memtotal_mb > 3000

when conditions can also be used to check whether other tasks have run successfully, by registering their results.
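For example (a minimal sketch; the check command is hypothetical):

- name: Run a health check (hypothetical command)
  ansible.builtin.command: /usr/local/bin/healthcheck
  register: health
  changed_when: false   # a pure check should never report a change
  failed_when: false    # evaluate the result ourselves instead of aborting

- name: Continue only if the check succeeded
  ansible.builtin.debug:
    msg: "Health check passed"
  when: health.rc == 0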

Facts can also be used in loop statements (the older with_* keywords are obsolete), e.g. to ensure that certain services are running (item is replaced by the loop elements; the correct indentation is important):

tasks:
- name: check for services
  ansible.builtin.service:
    name: "{{ item }}"
    state: started
  loop:
    - sshd
    - httpd

 

Test setup:

In connection with another article (“Schnellstart mit Ansible”, c't Cloud 2021 special issue) some 'bootstrap' scripts for the typical basic tasks were provided: ct.de/wmvu

debianstrap.sh - prepares the target host

myroot.yml - sets the SSH pubkey for root, disables SSH password login

hostname.yml - sets the hostname, customizes /etc/hosts

dockersetup.yml - sets up Docker

You can then proceed as follows:

- install Ansible on the control host (desktop, notebook or Linux VM), create directory structure for Ansible files (ideally using Git): mkdir ~/ansible && mkdir ~/ansible/.ssh, possibly also ~/ansible/host_vars and ~/ansible/group_vars or ~/ansible/common_vars

Note: Ansible can be installed as a Python package without root rights: on old systems or in venvs via pip install ansible, on newer systems (after installing pipx) via pipx install --include-deps ansible.

- Create SSH key pair (ansible/ansible.pub) without passphrase in the ssh directory, create second key pair with passphrase for root access, possibly more for monitoring etc.

- debianstrap.yaml (executed on the target host) creates the user ansible on the target host, disables login by password, stores ansible.pub (the key must be specified as PUBKEY in debianstrap.yaml), and gives the ansible user the required sudo rights

- create the file inventory.ini in the ansible directory on the control host and enter the target hosts there as "<name> ansible_host=<address>" (DNS names are also OK; for registered end devices <name> or <name>.uni-muenster.de is usually sufficient)

- Create the .ansible.cfg file on the control host in the home directory:

[defaults]
remote_user = ansible
private_key_file = ~/ansible/.ssh/ansible

(to test with: ansible '*' -i inventory.ini -m ping)

- myroot.yml allows you to set the SSH pubkey for root and disable SSH password login; to be called with ansible-playbook -i inventory.ini myroot.yml (run several times to test idempotency)

- hostname.yml allows you to set a hostname and then customize /etc/hosts (the order is important!); the old hostname (as an 'Ansible fact') is {{ ansible_hostname }}, the new one is taken from the variable {{ hostname }}, which can be defined on the control host in ansible/host_vars/<host> (---; hostname: <new name>).
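A minimal host variable file of this kind might look as follows (the value is a hypothetical example):

---
hostname: labpc42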

- dockersetup.yml automates the multi-step setup of Docker on the target hosts via the apt module.

 

Additional material:

Ansible tutorial: https://docs.ansible.com/ansible/latest/getting_started/index.html

YAML syntax: https://en.wikipedia.org/wiki/YAML

ANSIBLE AT NWZ - FROM THE BASICS TO YOUR OWN CONFIGURATION

The Ansible repository of the Institute for Theoretical Physics (ITP) provided below is already very extensive and therefore not necessarily suitable for a first start. Depending on your previous knowledge, it may make more sense to build up the repository for your own computer administration piece by piece instead of starting directly with the overall configuration of the ITP (including Git submodules, pipelines and pre-commit hooks). You can then concentrate on interesting aspects and see how these have been solved in the ITP.


The following materials may prove helpful on the way to your own configuration:

- getting started with Ansible: https://docs.ansible.com/ansible/latest/getting_started/index.html

- step-by-step YouTube tutorial(s): https://youtu.be/3RiVKs8GHYQ

  (another one: https://youtu.be/goclfp6a2IQ?list=PL2_OBreMn7FqZkvMYt6ATmgC0KAGGJNAN etc.)

- Ansible udemy course: https://www.udemy.com/course/learn-ansible/?couponCode=ST11MT91624A

 

Basic aspects: comments on the ITP repository

- installing a desktop: ...

- mounting of NFS shares: ...

- setting standard applications: ...

 

Advanced aspects: comments on the ITP repository

- Automatic rollout: With ansible-pull you can also set up the computer administration so that the target hosts regularly obtain and execute the current configuration: https://docs.ansible.com/ansible/latest/cli/ansible-pull.html. The details of how ansible-pull is configured in the ITP can be found in roles/workstation/tasks/ansible-pull.yaml.
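For illustration only (not the ITP's actual implementation, which uses a systemd-based update service): a regular pull could also be scheduled via a cron task like the following sketch; the repository URL and the schedule are assumptions:

- name: Schedule a weekly ansible-pull rollout
  ansible.builtin.cron:
    name: "ansible-pull rollout"
    special_time: weekly
    user: root
    job: "ansible-pull -U https://zivgitlab.uni-muenster.de/<group>/<repo>.git -i hosts site.yaml"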

- Domain join (in ITP): The root password is set manually when the base system is installed. The domain join is performed by the respective admin during the first manual rollout (which is necessary for setting up Ansible-Pull anyway). You can either pass the credentials to Ansible as variables during the rollout or the playbook simply stops at the join point and the join command is executed manually (see join_domain.yaml in the NWZ Domain Member Repository in the next section). If the computer is already in the domain, the step is skipped.

- Management of secret data: Ansible-Vault can be used to encrypt variables/files locally on the control host. The procedure is as follows: store the information in a file, encrypt the file with ansible-vault, use the file in a playbook, execute the playbook with --ask-vault-pass. (You can also encrypt strings directly.) Read more:

https://www.redhat.com/sysadmin/introduction-ansible-vault
https://docs.ansible.com/ansible/2.9/user_guide/playbooks_vault.html
https://docs.ansible.com/ansible/latest/cli/ansible-vault.html
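As a sketch of this procedure (file and variable names are hypothetical; secrets.yml would first be encrypted with ansible-vault encrypt secrets.yml, and the playbook then run with --ask-vault-pass):

- name: Use a vaulted variable
  hosts: all
  become: yes
  vars_files:
    - secrets.yml   # encrypted with ansible-vault
  tasks:
    - name: Set a local admin password (the hash comes from the vaulted file)
      ansible.builtin.user:
        name: localadmin
        password: "{{ vaulted_admin_password_hash }}"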

In ITP, the configuration for the workstations is deliberately without secrets so that no control server is needed, but the configuration can be rolled out directly from zivgitlab via ansible-pull. However, there are secrets in the ITP server repository (e.g. hashed admin passwords for services such as the APT cacher), for which ansible-vault is then used. In IVV5, ansible-vault is also used for the workstation configuration.

 

Tackling problems: Playbook collection

- SSSD configuration for laptops (...)

 

Tips & tricks:

- In IVV5, signed host SSH keys are provided via a separate “SSH CA” server. With the corresponding entry in ssh_known_hosts, you will not receive a “Trust on first use” request for managed hosts. As long as the host keys of a computer have been signed with the SSH key of the “SSH CA”, they are accepted. Signed host keys are bound to host names and have an expiration date. (Contact: alexander.preuss@uni-muenster.de)

Ansible AT NWZ - NWZ specific Information

At the University of MS, Ansible files can be managed via the University of MS Gitlab: https://zivgitlab.uni-muenster.de/, see also https://confluence.uni-muenster.de/pages/viewpage.action?pageId=26673453 (teaching materials on Git - and many more - can be found in the NWZ self-study course Research Software Engineering: https://sso.uni-muenster.de/LearnWeb/learnweb2/course/view.php?id=80361)

The admins from the Institute for Theoretical Physics (ITP) have made their Ansible setup available for reuse in the NWZ - many thanks especially to Julian Jeggle! The individual repositories can be integrated as sub-modules in the repository of your organizational unit; alternatively, you can pick out and adapt the files that you want to transfer to your repository. It probably does not make sense to simply take the whole repository and try to apply it to the computers outside the ITP, as a number of aspects are hardcoded (such as the institution's internal NFS server). Instead, you should pick out individual task files by topic and see how this was done at the ITP. The YAML structure of Ansible makes the whole thing relatively self-documenting.

Introductory presentation: https://zivgitlab.uni-muenster.de/itp-admins/unclassified/ansible/introduction_presentation

Workstation playbooks: https://zivgitlab.uni-muenster.de/itp-admins/unclassified/ansible/workstation-playbooks

- for the setup and maintenance of workstations at the Institute of Theoretical Physics:

Basic host configuration: https://zivgitlab.uni-muenster.de/itp-admins/unclassified/ansible/debian_host

- for Debian; this must be adapted accordingly for other distributions.

NWZ domain Join: https://zivgitlab.uni-muenster.de/itp-admins/unclassified/ansible/nwz_domain_member

- distribution-specific adjustments must be made in /tasks/main.yaml.

 

About the Workstation Playbooks repository:

- the file common_vars/main.yml can be created by you as shown at the bottom of the appendix (in the ITP, common_vars is a separate repository that is included as a sub-module; this way it can also be included in the ITP's server repository, so that certain variables do not drift apart between the configuration of the workstations and that of the servers)

- the hosts file lists the target hosts; it must be adapted for your group

- group_vars and host_vars contain files with variable definitions for groups and hosts; these must also be adapted for your group

- site.yml is the 'master playbook'

- workstations.yaml and nwz_domain_members.yaml are the main playbooks for workstations with NWZ domain join: nwz_domain_members.yaml is the playbook that realizes the NWZ domain join, workstations.yaml does the rest. (There is also devhosts.yml for the installation of some workgroup-specific development packages and debian_hosts.yml for the installation of basic packages and services, which the servers also receive.)

In general, the playbooks are always organized in such a way that <group>.yaml is a playbook for all hosts in the group <group>. This playbook has exactly one task: to execute the tasks in the role <role>. By convention, the roles use the singular, the groups the plural (unless there really is only one host in the group).
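Following this convention, such a group playbook might look like this sketch (group and role names are illustrative):

- name: Configure all workstations
  hosts: workstations
  become: yes
  roles:
    - workstation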

- monitoring_target.yaml is the client side of the monitoring system (in the case of the ITP, an installation of the Prometheus Node Exporter). The rest of the monitoring system happens in the server playbooks.

- in oneshots there are various playbooks for tasks that have to be performed once; the full titles are largely self-explanatory; here you have to see if and what you can use

- in roles you will find the references to the debian_host and nwz_domain_member repository; furthermore, the basic playbooks for workstations (as well as dev_hosts and monitoring) can be found here.

Note: Technically speaking, these are roles with plays, not playbooks. The Ansible terminology is unfortunately confusing: a 'play' is essentially a list of tasks; a playbook additionally contains the information on which computers and under which user these tasks are to be executed.

- in roles/workstations, defaults are set for the target hosts, files are provided (files), services are managed (handlers) and regular maintenance tasks are specified (tasks); as everything is processed here, you must adapt or delete all files for your purposes; the full titles of the individual tasks are, in turn, largely self-explanatory

Note 1: Handlers are not necessarily linked to services (even if they are often used there). Handlers are a simplification to express the logic "if this task has caused a change, then this other task must also run (otherwise not)". The classic example is writing a configuration file, which (if there really was a change) should result in a corresponding restart of a service. A handler can therefore be seen as a kind of on-demand task.
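A typical handler construction might look like this (a sketch; file and service names are hypothetical):

- hosts: all
  become: yes
  tasks:
    - name: Write the service configuration
      ansible.builtin.template:
        src: myservice.conf.j2
        dest: /etc/myservice.conf
      notify: Restart myservice   # runs only if this task reports a change
  handlers:
    - name: Restart myservice
      ansible.builtin.service:
        name: myservice
        state: restarted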

Note 2: From Ansible's point of view, it doesn't matter whether the tasks are executed regularly or not. The configuration of the computers takes place in the role tasks, which can then be rolled out either once, several times or regularly (automatically once a week in the ITP).

- You will most likely want to use the following tasks: ...

- you will probably not want to use the following tasks (and therefore want to delete them from your repository): ...

 

About your git repository:

 

- After you have set up your repository with the appropriate selection of directories and files (possibly included as sub-modules), you still need to set up the following Git configuration files, for which the files in the Workstation Playbooks repository can be used as templates:

.gitlab-ci.yml - ensures that a code review with pre-commit git hooks (task-specific scripts) is run, that an Ansible syntax check (ansible-playbook -i hosts --syntax-check <playbook>.yaml) is performed, and that typical Ansible configuration errors are searched for (with ansible-lint).

Note: The Gitlab pipeline should be deactivated at the beginning, especially for beginners. The linter is very strict and this can be a little frustrating at first.
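A minimal pipeline of this kind could look like the following sketch (the image and the playbook name are assumptions; the ITP's actual pipeline is in the repository):

# .gitlab-ci.yml (sketch)
stages:
  - lint

ansible-checks:
  stage: lint
  image: python:3.11
  before_script:
    - pip install ansible ansible-lint
  script:
    - ansible-playbook -i hosts --syntax-check site.yaml
    - ansible-lint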

.pre-commit-config.yaml - to detect problems that the pipeline would detect in Gitlab before the actual commit. This saves some time, as the pipeline always needs some time to start.

Note: pre-commit (like the pipeline) is completely optional and only helps with quality assurance in the team. Ideally, use the Ansible plugin in VS Code and detect errors in the configuration while writing.
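A minimal .pre-commit-config.yaml might look like this sketch (the pinned revision is an assumption and should be updated):

repos:
  - repo: https://github.com/ansible/ansible-lint
    rev: v6.22.2
    hooks:
      - id: ansible-lint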

.gitmodules - defines repositories that are integrated as sub-modules (here: debian_host and nwz_domain_member and common_vars)

 

For management with Ansible/Git:

 

- once you have committed a change to one of the YAML files, the above git pipeline runs through its stages with jobs/scripts, ensuring that the overall configuration is in a (preferably) sensible state - there is also an email for failed pipelines

- Output, error messages etc. can be viewed in the Uni MS Gitlab web interface under build -> pipelines/jobs

- new computers must be added to the hosts file (under all:/hosts:/ and to the appropriate group). Some roles require certain host variables to be set, e.g. vmhost_primary_physical_interface for the hosts in the vmhosts group. Most groups correspond to roles, but there are also more general groups such as servers.

- to manage the hosts, create a local copy of the git repository on the control host, or refresh it, and then use the ansible-playbook command (the following is from the how-to of the workstation playbook git repository):

ansible-playbook -i hosts --limit=myhost site.yaml

If only one special role is to be used:

ansible-playbook -i hosts --limit=myhost myrole.yaml

If the entire configuration except the NWZ domain join is to be applied:

ansible-playbook -i hosts --limit=myhost site.yaml --skip-tags domain_join

Remember to restrict the command to the target host, otherwise ALL hosts will be reconfigured!

You can also use “--tags=mytag” to apply only parts of the configuration (useful during development).

Oneshots are tasks that are not applied regularly. They can also be triggered with ansible-playbook:

ansible-playbook -i hosts --limit=myhost oneshots/systemd_reload/main.yaml -u root -k

(whether -u root -k is necessary depends on the use of sudo and SSH pubkeys)

With ansible-pull you can also set up the host management so that the target hosts regularly obtain and execute the current configuration: https://docs.ansible.com/ansible/latest/cli/ansible-pull.html

The details of how ansible-pull is configured in the ITP can be found in roles/workstation/tasks/ansible-pull.yaml

In principle, all computers can be used as control hosts in the ITP, as only Ansible (via pipx) needs to be installed and the Git repository cloned. In practice, these are usually simply the admin workstations. Authorization is carried out via individual SSH keys for each admin (portable and secured by using -sk keys on Nitrokeys).

As the computers are not always switched on (and WoL is not reliable enough), the computers in the ITP are configured so that they clone the Git repository automatically after booting and every 7 days and roll it out to themselves. Thanks to ansible-pull, this can be done with a single command and does not require any additional authorization, as the corresponding update service is already running as root and Ansible can also run a playbook locally in addition to the method via SSH.

Ultimately, the Zivgitlab server is the central configuration server, and every change that is uploaded to the central repository is automatically rolled out after some time. Time-critical issues (e.g. security, expiring Gitlab access tokens, etc.) are occasionally rolled out by hand.

For the automated base system installation (except for the hard disk partitioning), the Debian preseeding method is used in the ITP. The site.yaml playbook is then rolled out manually once, which then sets up the automatic rollout via ansible-pull. Ideally, the manual rollout should also be eliminated by integrating ansible-pull into the installer; however, this has not yet been implemented.

 

Appendix:

common_vars/all.yaml - group-specific information on subnets, hosts/IPs, users (groups), admins (incl. keys)

# Global variables for both workstations and server repo
monitoring_masters:
  - ""
_subnet: 
# A list of address spaces for computers under our administration
itp_administered_subnets:
  - "{{ _subnet }}"
# Subnets that we trust for administrative access (i.e. root login or IPMI access)
trusted_admin_subnets:
  - "{{ _subnet }}"
# A list of addresses for Gitlab (according to Confluence)
zivgitlab_addresses:
  - "128.176.196.104"
  - "128.176.11.80"
zivgitlab_addresses_ipv6:
  - "2001:4cf0:8:1::/64"
unims_gateway_addresses:
  - "128.176.11.120" # SSH Jumphost 1
  - "128.176.11.121" # SSH Jumphost 2
  - "10.68.0.0/16"     # VPN
  - "128.176.105.0/24" # pLANet.X
  - "128.176.106.0/24" # pLANet.X
  - "128.176.107.0/24" # pLANet.X
  - "128.176.193.68" # PALMA
  - "128.176.193.69" # PALMA
  - "128.176.193.70" # PALMA
  - "128.176.193.71" # PALMA
unims_gateway_addresses_ipv6:
  - "2001:4cf0:2:4020:250:56ff:feae:3904" # SSH Jumphost 1
  - "2001:4cf0:2:4020:250:56ff:feae:174b" # SSH Jumphost 2
  - "2001:4cf0:280:30::/64"               # VPN
# A list of user groups that should contain all proper users (but not students)
_user_group:
  - name: 
    id: 
# our group admins (normal user accounts for e.g. mailing)
itp_admins:
  - 
# our group admins (admin accounts)
_admins_yaccounts:
  - 
_admins_keys:
  <login>:
    - type: 
      key: 
# Hostname or address of the /home storage server (currently not managed via Ansible)
home_directory_server: 
_workstations:
  <hostname>:
    macaddress: