How to handle multi-host roles?

I am unfamiliar with how administrators structure their inventories and playbooks for roles that run against multiple hosts as a group. We have two new system roles, vpn (https://github.com/linux-system-roles/vpn) and ha_cluster (https://github.com/linux-system-roles/ha_cluster), that will be used to set up associations among several hosts; that is, hosts will have to know about and use information about some of the other hosts in the inventory.

Here is a proposal which assumes the user wants to set up everything in the inventory, and the playbooks are more or less static.

Note that this does not preclude the ability of users to set up everything by editing playbooks instead of inventory, and passing in parameters in the `roles` or `include_role` using `vars`.
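To make the playbook-driven alternative concrete, here is a minimal sketch of passing parameters with `vars` on `include_role`. The variable name `example_vpn_peers` is hypothetical and only stands in for whatever parameters the role ends up accepting.

```yaml
# Sketch only: `example_vpn_peers` is a hypothetical parameter name,
# not a documented variable of the vpn role.
- hosts: hosta,hostb
  tasks:
    - name: Apply the vpn role with parameters set in the playbook
      include_role:
        name: linux-system-roles.vpn
      vars:
        example_vpn_peers:
          - hosta
          - hostb
```

With this style the inventory only has to list the hosts; everything role-specific lives in the playbook.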

I would really like to get some feedback about how sysadmins will use these roles in a real production environment, using Ansible Tower or Satellite or ???

https://gist.github.com/richm/59d2dd6df7ae6760a7f06550696c9351

Typically speaking, I tell people that roles should never know about your environment. Instead anything that your role needs to know about should be passed in.

As such, I would avoid needing a special group to exist in your inventory, but instead maybe passing that in.
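A rough sketch of what "passing that in" could look like: the play decides which inventory group matters and hands the member list to the role. Both the group name `my_site_specific_group` and the parameter `example_vpn_hosts` are made up for illustration.

```yaml
# Sketch only: the role never references an inventory group by name.
# `my_site_specific_group` and `example_vpn_hosts` are hypothetical.
- hosts: my_site_specific_group
  roles:
    - role: linux-system-roles.vpn
      vars:
        # The play, not the role, resolves the group to a list of hosts.
        example_vpn_hosts: "{{ groups['my_site_specific_group'] }}"
```

The role then works unchanged with any group, or with a hand-written list of hosts.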

If they need to know host facts, that’s a different story.

Do note that a playbook file can contain multiple plays, and plays target hosts; that is how you apply roles to certain hosts. But if a play doesn’t target certain hosts and still needs to know about them, you’d usually want a multi play playbook, where the first play minimally just is there to gather facts that the 2nd play needs. You can then use `groups.whatever` and `hostvars` to access info about other hosts within a certain group.

You don't need a special group to exist in the inventory, but you do need some way to say "run the vpn role on this set of hosts". AFAIK, you can either define a group in your inventory, then use

  - hosts: some_arbitrary_name_of_my_group_of_vpn_hosts
    roles:
      - linux-system-roles.vpn

or you can specify the hosts manually:

  - hosts: hosta,hostb,hostc
    roles:
      - linux-system-roles.vpn

or some combination thereof. Or is there some other method?

As for roles needing to know host facts: the use case is that, e.g., some sort of per-host key could be defined as a host fact, and if the role is running on host b and needs to set up a tunnel with host a, it can look up the key in the host facts for host a.

Of course, all of these could be passed in as role vars/parameters, but I'm not sure how you would do that efficiently when you have multiple tunnels consisting of multiple hosts.
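For reference, the lookup I have in mind would be roughly the following sketch. It assumes every host defines a hypothetical `example_vpn_key` variable (in the inventory or as a fact); the role does not define such a variable today.

```yaml
# Sketch only: `example_vpn_key` and `example_peer_keys` are hypothetical names.
- hosts: hosta,hostb,hostc
  gather_facts: false
  tasks:
    - name: Collect the keys of all other hosts in this play
      set_fact:
        example_peer_keys: >-
          {{ example_peer_keys | default({})
             | combine({item: hostvars[item]['example_vpn_key']}) }}
      # Every host in the play except the one currently being processed
      loop: "{{ ansible_play_hosts_all | difference([inventory_hostname]) }}"

    - name: Show what this host knows about its peers
      debug:
        var: example_peer_keys
```

Each host ends up with a dictionary of its peers' keys without the playbook having to spell out every pair.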

Do you have examples of such a multi-play playbook, where the first play gathers facts and a later play reads them through `groups.whatever` and `hostvars`?

Yes. Enable vpn in the inventory and create the group dynamically.
For example

  > cat hosts
  hosta vpn=enabled
  hostb vpn=disabled
  hostc vpn=enabled

The playbook

  - hosts: all
    gather_facts: false
    tasks:
      - add_host:
          groups: vpn_hosts
          name: "{{ item }}"
        loop: "{{ hostvars|dict2items|
                  selectattr('value.vpn', 'eq', 'enabled')|
                  map(attribute='key')|list }}"
        run_once: true

  - hosts: vpn_hosts
    gather_facts: false
    tasks:
      - debug:
          msg: "Running linux-system-roles.vpn"

gives

  ok: [hosta] =>
    msg: Running linux-system-roles.vpn
  ok: [hostc] =>
    msg: Running linux-system-roles.vpn

This is an example of what @Matt Martz meant with "a multi play
playbook, where the first play minimally just is there to gather
facts that the 2nd play needs."
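The example above only builds the group. A sketch of the fact-gathering variant, where the second play reads facts of hosts it does not target, might look like this. It assumes a `vpn_hosts` group already exists (static or created with `add_host`) and that `hosta` is in the inventory.

```yaml
# Play 1: does nothing except gather facts from the hosts
# that the second play needs to know about.
- hosts: vpn_hosts
  gather_facts: true

# Play 2: targets a single host but reads the facts of every
# member of vpn_hosts through hostvars.
- hosts: hosta
  gather_facts: false
  tasks:
    - name: Print the default IPv4 address of each vpn host
      debug:
        msg: "{{ item }}: {{ hostvars[item]['ansible_facts']['default_ipv4']['address'] }}"
      loop: "{{ groups['vpn_hosts'] }}"
```

Facts gathered in the first play stay available through `hostvars` for the rest of the run, which is what makes the split into two plays work.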

See "Variable precedence: Where should I put a variable?"
https://docs.ansible.com/ansible/latest/user_guide/playbooks_variables.html#variable-precedence-where-should-i-put-a-variable

There are many options. For example, put common variables into the
*inventory group_vars/all* and per-host keys into the *inventory
host_vars/*.
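A sketch of how the two files could look. The variable names `example_vpn_port` and `example_vpn_key` are placeholders, not variables the role defines, and the two documents below stand for two separate files.

```yaml
---
# File: inventory/group_vars/all.yml
# Shared by every host in the inventory (hypothetical variable).
example_vpn_port: 500

---
# File: inventory/host_vars/hosta.yml
# Visible only to hosta (hypothetical variable, placeholder value).
example_vpn_key: "replace-me"
```

Secrets such as keys would normally be encrypted, for example with Ansible Vault, rather than stored in plain text.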

Another option is to build a common dictionary with the parameters. For
example

vpn_params:
  hosta:
    key1: val1
    ...
  hostb:
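Assuming `vpn_params` is defined somewhere every host can see it (for instance in group_vars/all), each host can pick out its own entry by its inventory name. A minimal sketch, with `vpn_hosts` as an assumed group name:

```yaml
# Sketch only: assumes vpn_params has one entry per host in vpn_hosts.
- hosts: vpn_hosts
  gather_facts: false
  tasks:
    - name: Show this host's entry in the shared dictionary
      debug:
        var: vpn_params[inventory_hostname]
```

The entries of other hosts are reachable the same way, e.g. `vpn_params['hosta']`.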

IMHO, the main question is where the data is coming from?

In the case of keys, the role can generate them for the user, which means that we will need two passes/plays: one pass to generate the missing keys, and another pass to apply the role with the new keys. We have a PR that does something similar, generating the pre-shared keys for each pair of hosts: https://github.com/linux-system-roles/vpn/pull/3/files#diff-3d0ff1709ca48add100327bb2a468e6c508fb92a159c64c4f99ad1df89d9bddeR53-R77
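To illustrate the two-pass idea only (this is not what the PR does; it is a simplified sketch with one key per host instead of one per pair), something like the following could work. `example_vpn_key` and the `vpn_hosts` group are hypothetical names.

```yaml
# Pass 1: generate a key for every host that does not define one yet.
- hosts: vpn_hosts
  gather_facts: false
  tasks:
    - name: Generate a random key when none is defined
      set_fact:
        # '/dev/null' tells the password lookup not to store the value on disk
        example_vpn_key: "{{ lookup('password', '/dev/null length=32 chars=ascii_letters,digits') }}"
      when: example_vpn_key is not defined

# Pass 2: by now every host has a key, so any host can read any other's.
- hosts: vpn_hosts
  gather_facts: false
  tasks:
    - name: Confirm that each host in the play has a key
      debug:
        msg: "{{ item }} has a key: {{ hostvars[item]['example_vpn_key'] is defined }}"
      loop: "{{ ansible_play_hosts_all }}"
```

Keys generated this way exist only for the duration of the run, so a real implementation would also have to persist them somewhere.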

I would really like to get examples of real use cases from sysadmins running production environments. For example: "As a sysadmin for a small lab, I like to have custom playbooks with custom groups, and I specify my vpn hosts with a combination of inventory and custom playbooks." Or: "As a sysadmin for a large corporate deployment using Ansible Tower, I like to have relatively static playbooks and use more complex inventory setups."

Is creating the group dynamically with `add_host` better (as in more performant, or more of a best practice) than just doing

> cat hosts
[vpn_hosts]
hosta
hostb
hostc

- hosts: vpn_hosts
  roles:
    - linux-system-roles.vpn

?

No, it isn't. Knowing where the data (vpn=enabled/disabled) comes
from would help in proposing a best practice.

The ha_cluster case is probably easier to conceptualize for the purposes of this discussion. For example, I want to manage a clustered database:

[db_cluster]
db1.example.com
db2.example.com
...
dbN.example.com

then

- hosts: db_cluster
  roles:
    - linux-system-roles.ha_cluster
    - postgresql

In this case I know from my inventory which hosts will go into the cluster.
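One way a role could find its peers in this setup without hard-coding the group name is to look at the hosts of the current play. A minimal sketch, using only a debug task in place of the real role:

```yaml
# Sketch only: shows how each node can see the other cluster members
# without referring to the db_cluster group by name.
- hosts: db_cluster
  gather_facts: false
  tasks:
    - name: List the other cluster members as seen from each node
      debug:
        msg: "{{ inventory_hostname }} peers: {{ ansible_play_hosts_all | difference([inventory_hostname]) }}"
```

`ansible_play_hosts_all` contains every host the play targets, so the same tasks would work for any group the play is pointed at.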