Evaluate group_vars directories and files recursively

Hey,

I remember trying this once and naturally assuming it would work.

I am now again at a point where this would be immensely useful.

Let’s take the following inventory:

[all]
abc-group1-1.envA
abc-group1-1.envB

[envA]
abc-group1-1.envA

[envB]
abc-group1-1.envB

[abc-group1]
abc-group1-1.envA
abc-group1-1.envB

I can now apply variables for my different environments and host groups:

group_vars/
    abc-group1
    envA
    envB

Now what if I wanted to change a variable for a host group in a specific environment only? My intuition tells me that I could expect the following to work:

group_vars/
    envA/
        abc-group1
        abc-group2

So, the abc-group1 file under envA would only be loaded when the host is:

  • in group envA and
  • in group abc-group1

However, it seems that Ansible does not evaluate group_vars directories recursively. That means that, in the above example, all hosts in envA receive the variables from ‘abc-group1’ and ‘abc-group2’, regardless of whether they are actually in those inventory groups.
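To make the observed behaviour concrete, here is a minimal Python sketch. It is a simplified model, not Ansible's actual code: it assumes that every file below group_vars/<group>/ is attributed to that group and that the file names play no role in selection. The helper names files_for_group and demo are made up for this illustration.

```python
import os
import tempfile


def files_for_group(group_vars_dir, group):
    """Collect every file below group_vars/<group>/, whatever it is called."""
    root = os.path.join(group_vars_dir, group)
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, group_vars_dir))
    return sorted(found)


def demo():
    """Build the group_vars/envA/ layout from the post in a temp directory."""
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = os.path.join(tmp, "group_vars", "envA")
        os.makedirs(env_dir)
        for name in ("abc-group1", "abc-group2"):
            with open(os.path.join(env_dir, name), "w") as handle:
                handle.write("example_var: " + name + "\n")
        return files_for_group(os.path.join(tmp, "group_vars"), "envA")


if __name__ == "__main__":
    # Both files are attributed to envA, regardless of their names.
    print(demo())
```

In this model a host in envA picks up both files, even if it is not a member of abc-group2.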

The other workaround would be to create further inventory groups, e.g.


[envA_abc-group1]

....

However, that would become unwieldy very quickly.

Have there ever been any considerations in this direction? How would you solve this?

First, you are defining abc-group1 BOTH as a host and as a group; that
can cause issues with selection not doing what you expect.

To confirm your results: yes, ALL files under group_vars/<group_name>
will be assigned to the group, there is no 'sub selection' made. This
is done via a vars_plugin and we don't have any plans to change the
default. You can try developing your own plugin that has a different
behaviour, but each vars plugin gets a single 'group' at a time, so
knowledge of other groups is not really available at that point.

The naming of hosts is done in a way so that one can sort them automatically into the correct inventory groups. There are two distinctions made from a host name:

  • environment
  • host group

So, for example, if there exists a play called “database” and there are the following hosts:
database-group1-1.envA
database-group1-2.envA
database-group1-1.envB
database-group1-2.envB

then the inventory would look like:

[database-group1]

database-group1-1.envA
database-group1-2.envA
database-group1-1.envB
database-group1-2.envB

[envA]

database-group1-1.envA
database-group1-2.envA

[envB]

database-group1-1.envB
database-group1-2.envB

So, one can then run ansible-playbook database.yml --limit 'database-group1:&envA'. Assuming that database.yml creates a clustered database from all the hosts it receives, it is vital to pass the --limit option.
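A note on the pattern syntax: in Ansible host patterns, a plain `:` between two groups is a union, while `:&` is an intersection. The following Python sketch only models the two operations with sets over the inventory above; it is not how Ansible parses patterns, and the function names are made up for illustration.

```python
# Group membership copied from the inventory above.
INVENTORY = {
    "database-group1": {
        "database-group1-1.envA",
        "database-group1-2.envA",
        "database-group1-1.envB",
        "database-group1-2.envB",
    },
    "envA": {"database-group1-1.envA", "database-group1-2.envA"},
    "envB": {"database-group1-1.envB", "database-group1-2.envB"},
}


def union(inventory, *groups):
    """Hosts that are in ANY of the groups, like the pattern 'a:b'."""
    hosts = set()
    for group in groups:
        hosts |= inventory[group]
    return sorted(hosts)


def intersection(inventory, first, *others):
    """Hosts that are in ALL of the groups, like the pattern 'a:&b'."""
    hosts = set(inventory[first])
    for group in others:
        hosts &= inventory[group]
    return sorted(hosts)


if __name__ == "__main__":
    print(union(INVENTORY, "database-group1", "envA"))
    print(intersection(INVENTORY, "database-group1", "envA"))
```

Only the intersection narrows the run down to the envA members of database-group1; the union still contains all four hosts.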

However, the issue with this model is that group_vars can be set EITHER per environment OR per host group. I don’t see a way to mix them and say “If host is in group 1 AND in environment A then use these variables” (apart from defining new inventory groups, which becomes ugly with many groups and environments).
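Since the host names already encode both distinctions, the extra inventory groups could in principle be generated instead of maintained by hand. The sketch below assumes the naming scheme <hostgroup>-<index>.<environment> from the examples. The combined group name <environment>_<hostgroup> and both function names are hypothetical choices for this illustration.

```python
from collections import defaultdict


def split_hostname(hostname):
    """Split 'database-group1-1.envA' into ('database-group1', 'envA')."""
    name, _, environment = hostname.rpartition(".")
    host_group, _, _index = name.rpartition("-")
    return host_group, environment


def build_groups(hostnames):
    """Map each group name to its hosts, including one combined group."""
    groups = defaultdict(list)
    for hostname in hostnames:
        host_group, environment = split_hostname(hostname)
        groups[host_group].append(hostname)
        groups[environment].append(hostname)
        # Hypothetical naming scheme for the "group AND environment" case.
        groups[environment + "_" + host_group].append(hostname)
    return dict(groups)


if __name__ == "__main__":
    hosts = [
        "database-group1-1.envA",
        "database-group1-2.envA",
        "database-group1-1.envB",
        "database-group1-2.envB",
    ]
    for group, members in sorted(build_groups(hosts).items()):
        print(group, members)
```

A file such as group_vars/envA_database-group1 would then apply only to hosts that are in both groups, at the cost of one more generated group per combination.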

Another approach would probably be to split the inventory into one inventory per environment, and then put the group_vars folder next to each environment-specific inventory file.
In my current setup, however, the group_vars are neatly placed next to the playbook, which I thought of as a good way to always know where play-specific variables live.

Perhaps the “one inventory per environment” approach would be better.

It actually seems like the vars_plugin receives a request for all the groups that a specific host is a member of. This seems to be the reason why the group_vars of the alphabetically last group name always end up being applied to a host.
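The "alphabetically last group wins" effect can be modelled in a few lines of Python. This is a simplified sketch, not Ansible's merge code: it assumes all groups sit at the same parent/child level and ignores ansible_group_priority. The variable names db_port and role are invented for the example.

```python
def merge_group_vars(group_vars, host_groups):
    """Merge per-group variables in ASCII order of group name; later wins."""
    merged = {}
    for group in sorted(host_groups):
        merged.update(group_vars.get(group, {}))
    return merged


if __name__ == "__main__":
    # Invented example variables for two groups of the same host.
    group_vars = {
        "abc-group1": {"db_port": 5432, "role": "db"},
        "envA": {"db_port": 6432},
    }
    # 'envA' sorts after 'abc-group1', so its db_port overrides the other.
    print(merge_group_vars(group_vars, ["envA", "abc-group1"]))
```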

I’ve finally had the time to get around to looking at the implementation.

I managed to implement a vars_plugin which recursively looks for directories with the name of groups and applies the deepest found variable.

For example:

[group1]
hostA
hostB
hostC

[group2]
hostA

[group3]
hostB
hostC

r_group_vars/
    group1/
        group2.yml
    group3/
        group1.yml
    group1.yml

For hostA, first group1.yml is loaded and then combined with variables from group1/group2.yml.
For hostB, only group3/group1.yml is loaded (since no matching variable file is found under group1/).
For hostC, first group1.yml is loaded and then combined with variables from group3/group1.yml.

This allows defining special cases when a host is in various groupings. So I am able to specify “If host is in these groups, then use these variables, if it is only in those groups, then use those”. It’s a great way to really apply the right variables for the right hosts.
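As a rough illustration of the lookup idea, here is a small Python sketch over an in-memory stand-in for the r_group_vars/ tree. It is one plausible reading of the description, not the plugin's code: at each level it loads <group>.yml for every group the host is in, then descends into directories named after those groups, with deeper files overriding shallower ones. The TREE structure, the resolve function and the variables inside the files are assumptions made for this example. Under this reading, hostB and hostC resolve identically because they belong to the same groups; the actual plugin may apply further rules.

```python
# Keys ending in ".yml" stand for variable files, other keys for directories.
TREE = {
    "group1": {"group2.yml": {"source": "group1/group2.yml"}},
    "group3": {"group1.yml": {"source": "group3/group1.yml"}},
    "group1.yml": {"source": "group1.yml", "base": True},
}


def resolve(tree, host_groups, prefix=""):
    """Return (merged_vars, loaded_paths) for a host with the given groups."""
    merged, loaded = {}, []
    # Files at this level come first so that deeper levels can override them.
    for group in sorted(host_groups):
        filename = group + ".yml"
        if filename in tree:
            merged.update(tree[filename])
            loaded.append(prefix + filename)
    # Then descend only into directories named after the host's own groups.
    for group in sorted(host_groups):
        if group in tree:
            sub_vars, sub_loaded = resolve(
                tree[group], host_groups, prefix + group + "/"
            )
            merged.update(sub_vars)
            loaded.extend(sub_loaded)
    return merged, loaded


if __name__ == "__main__":
    for host, groups in {
        "hostA": ["group1", "group2"],
        "hostB": ["group1", "group3"],
    }.items():
        print(host, resolve(TREE, groups))
```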

What do you think? Would this be something that could be contributed to the Ansible project?

Code (will try to optimize it a little later): https://pastebin.com/4vj8qrJh

A revised version of the plugin with proper documentation: https://gist.github.com/Timoses/17c39a100350eeedb10d77ab39b9eceb

PR created here: https://github.com/ansible/ansible/pull/60593