Variable precedence change/fix coming in 2.3

Hi all, just wanted to write about a fix/change I’ll be merging into the devel branch soon (i.e. as soon as the AWS outage quits impacting our testing infrastructure), which means it will be included in the 2.3.0 release.

Per our variable precedence docs (http://docs.ansible.com/ansible/playbooks_variables.html#variable-precedence-where-should-i-put-a-variable), the variables provided by the inventory script or INI file are supposed to be lower priority than those found in group_vars and host_vars files. However, this has not been the case in 2.0, and it has gone unnoticed thus far.

While fixing this to match our documentation, however, we realized that the precedence docs had a problem: variables defined on the host in the script/INI file would in fact have lower precedence than those in group_vars/all (or any group_vars files). We’ve decided to fix this based on the following rules for inventory:

  1. Host Vars > Group Vars
  2. Things relative to the playbook > Things relative to the inventory > Things defined in inventory

For those who are unfamiliar with the second point above, it is possible to have host/group vars files defined relative to both the playbook and to the inventory. For example, if you have:

playbook/site.yml
playbook/group_vars/all
inventory/hosts
inventory/group_vars/all

And you run:

$ ansible-playbook -i inventory/hosts playbook/site.yml

the variables in playbook/group_vars/all will “win” over those in inventory/group_vars/all.
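
As a rough illustration of rule 2, here is a minimal Python sketch. It assumes each variable source is a plain dictionary and that sources are merged from lowest to highest priority. The variable names and values are made up for the example, and this is not Ansible's actual implementation.

```python
# Hypothetical variables from each source; names and values are illustrative only.
inventory_defined = {"http_port": 80, "region": "set in inventory/hosts"}
inventory_group_vars_all = {"http_port": 8080, "stage": "set in inventory/group_vars/all"}
playbook_group_vars_all = {"stage": "set in playbook/group_vars/all"}


def merge_in_order(*sources):
    """Merge dicts from lowest to highest priority; later sources win."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged


# Rule 2: defined in inventory < relative to the inventory < relative to the playbook.
result = merge_in_order(
    inventory_defined,
    inventory_group_vars_all,
    playbook_group_vars_all,
)
print(result)
```

Here "stage" ends up with the playbook-relative value, "http_port" with the inventory-relative one, and "region" keeps the value defined in the inventory itself because nothing overrides it.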

Prior to this change, if you had something like:

ansible_connection: local

In any group_vars/all file, it would have “won” over anything defined in the inventory script, even if you did something like:

myhost ansible_connection=paramiko

the value in the group_vars would have been the value used, which is counter-intuitive.
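
The before/after behaviour for this case can be sketched in Python as follows. This is only a model of the ordering described above (a later layer overrides an earlier one), not the real variable manager code, and the helper name resolve is hypothetical.

```python
# Values taken from the example above.
group_vars_all = {"ansible_connection": "local"}          # from a group_vars/all file
inventory_host_vars = {"ansible_connection": "paramiko"}  # from: myhost ansible_connection=paramiko


def resolve(layers):
    """Hypothetical helper: apply layers lowest priority first; later layers win."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Prior to the change: group_vars/all effectively sat above the inventory host var.
before_change = resolve([inventory_host_vars, group_vars_all])

# With the change: Host Vars > Group Vars, so the value set on the host wins.
after_change = resolve([group_vars_all, inventory_host_vars])

print(before_change["ansible_connection"], after_change["ansible_connection"])
```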

If anyone has any questions regarding this change, please let us know.

Thanks!

Hi James,

I have a big concern about this bug fix that I would like to discuss. I understand that the code was doing something other than what was in the documentation; however, I think that the code was doing the right thing.

Let me describe how this bug (feature?) helped us use Ansible in a very efficient way and why we are in trouble with this change. At my company we are using Ansible for large scale deployment in AWS to provision some 140+ EC2 instances. Given the AWS environment we have resources in:

  • Different regions

  • Within each region in different Availability Zones (AZ)

  • Within each AZ in different stages (prod, staging, prototype, sandbox, etc.) that all reside in their own separated VPC.

This is the physical AWS layout. Atop all this we also have different “functional types” or “box types”. Something like:

  • “Application box”

  • “Cassandra box”

  • “Deployment box”

  • “Monitoring box”

  • etc.

Using Ansible 2.2.x we have come up with an environment that has nicely fit into this architecture. Even colleagues new to the project could easily comprehend the overall architecture just by looking at ansible files. The way we used Ansible was the following:

  1. Resource types (or we call them layers) are simply modelled with groups and their associated playbooks. E.g. the Application boxes are set up using the ‘layers/application.yml’ and group “application”, the Cassandra boxes the ‘layers/cassandra.yml’ playbook and group “cassandra”. They are generic enough to be usable in all stages and AWS regions or AZs. They are even independent from the actual project so e.g. Project A and B can set up a Cassandra box by using the same playbook.
  2. Next we use group vars (relative to inventory) to define project-specific defaults for each resource/box type. E.g. Project A can define the Cassandra version that all nodes should use in the Cassandra group vars, the Java version in the Application group vars, etc.: everything that should be the same regardless of the actual deployment stage or AWS region (e.g. the set of basic Linux tools).
  3. On the top level we use separate inventory files for each AWS region AND stage combination. So we have files like inventory/staging-eu-west-1, inventory/staging-eu-central-1, inventory/prod-us-east-1, etc. In these inventory files we don’t just list the hosts per group but also use them as the place to override all the properties for the given region and stage. We could even use all:vars to specify common properties across all groups in the given region/stage. Such properties include the name of the stage that is used for monitoring or starting applications, the DNS names, etc.

So as you can see, having things defined in the inventory files with higher priority than host and group vars allowed us to scale up our Ansible environment nicely without unnecessary duplication. We just took e.g. the Cassandra playbook and group with all the defaults for our projects (coming from group vars) and then overrode only the bits that differ for the given stage and region in the inventory file. Thus provisioning Cassandra nodes in two different regions could be done with just the following two self-describing commands (they show exactly what is provisioned where):

ansible-playbook -i inventory/staging-eu-west-1 layers/cassandra.yml
ansible-playbook -i inventory/staging-eu-central-1 layers/cassandra.yml
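
To make the layering concrete, here is a small Python sketch of the lookup order we relied on (the old behaviour, where values set in the inventory file win over group vars). All variable names and values are invented for illustration, and the function effective_vars is hypothetical.

```python
from collections import ChainMap

# Hypothetical project-wide defaults, as they might live in the cassandra group vars.
cassandra_defaults = {
    "cassandra_version": "example-version",
    "stage": "unset",
}

# Hypothetical per-inventory overrides, as they might be set in each inventory file.
inventory_overrides = {
    "staging-eu-west-1": {"stage": "staging", "dns_suffix": "eu-west-1.example.internal"},
    "staging-eu-central-1": {"stage": "staging", "dns_suffix": "eu-central-1.example.internal"},
}


def effective_vars(inventory_name):
    """Resolve variables for one inventory; ChainMap searches left to right,
    so the inventory overrides are consulted before the group defaults."""
    return dict(ChainMap(inventory_overrides[inventory_name], cassandra_defaults))


for name in inventory_overrides:
    print(name, effective_vars(name))
```

The same defaults are shared by both inventories, and only the stage and region specific bits differ per inventory file.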

This bug fix unfortunately ruined our clean setup, as inventory properties are now overridden by group vars. Even worse, at the moment we are struggling to find any alternative solution (posted in various blogs or in the documentation) that would be as clean as it was before. So I would like to ask what alternative you suggest now, given our complex, large-scale infrastructure. How can we set properties with the same separation of concerns (project → box type → stage → region) and such minimal duplication?

This blog explains a very similar problem and also tries to find a clean solution: https://www.digitalocean.com/community/tutorials/how-to-manage-multistage-environments-with-ansible. If you read it through, you can see that none of those solutions is as elegant as ours. Each of them is either more complex and does not reflect the actual architecture, needs considerably more duplication and copy-paste, or requires non-Ansible solutions like symlinking.

I understand that it was a bug that we used as a feature, but this “feature” could give an elegant solution to what seems to be an unsolved problem in Ansible.

Best regards,

Szabolcs

I might have a 'solution' ... configurable precedence, this is
something I've been toying with in
http://github.com/ansible/ansible/issues/23001.

config setting (need to add to examples/ansible.cfg) :
https://github.com/ansible/ansible/pull/23001/files#diff-b77962b6b54a830ec373de0602918318R249

draft implementation:
https://github.com/ansible/ansible/pull/23001/files#diff-473ef6db3f086739e3c053f001d731b5R257

Hi Brian,

Thank you for the answer. I have read through your inventory plugin proposal JIRA, PR and docs but cannot see where precedence can actually be configured. I see that inventory types are now nicely refactored to be handled by plugins and that they can be enabled or disabled but can’t see where precedence is set.

Could you please give me the necessary config option? (The above links do redirect me and the pages jump around so probably pasting here directly is better). What I would like is to have INI with higher precedence than group_vars relative to inventory files.

Thanks,

Szabolcs

in constants.py:

VARIABLE_PRECEDENCE = get_config(p, DEFAULTS, 'precedence', 'ANSIBLE_PRECEDENCE',
    ['all_inventory', 'groups_inventory', 'all_plugins_inventory',
     'all_plugins_play', 'groups_plugins_inventory', 'groups_plugins_play'],
    value_type='list')

This would allow you to switch the group vars precedence; the above
list is the default and matches the current docs.
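
A minimal sketch of how such a list could drive the merge, assuming later entries in the list override earlier ones. The function merge_group_vars and the sample data are hypothetical and only model the idea; they are not the code from the PR.

```python
# Default order as quoted above; in this sketch, later entries override earlier ones.
DEFAULT_PRECEDENCE = [
    "all_inventory", "groups_inventory", "all_plugins_inventory",
    "all_plugins_play", "groups_plugins_inventory", "groups_plugins_play",
]


def merge_group_vars(sources, precedence=DEFAULT_PRECEDENCE):
    """Hypothetical helper: merge the per-source dicts in the configured order."""
    merged = {}
    for name in precedence:
        merged.update(sources.get(name, {}))
    return merged


# Hypothetical data: the same variable set in the INI file and in a
# group_vars file relative to the inventory.
sources = {
    "groups_inventory": {"cassandra_version": "set in the INI file"},
    "groups_plugins_inventory": {"cassandra_version": "set in inventory/group_vars"},
}

default_result = merge_group_vars(sources)

# A custom order with the inventory-defined entries last makes the INI value win.
custom_order = [
    "all_plugins_inventory", "all_plugins_play", "groups_plugins_inventory",
    "groups_plugins_play", "all_inventory", "groups_inventory",
]
custom_result = merge_group_vars(sources, custom_order)

print(default_result, custom_result)
```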

Got it. I was focusing on the config and not the constants. Let me play with this and then I’ll get back to you.

Well, constants define in code what is acceptable in config; the
equivalent in ansible.cfg would be:

precedence = all_inventory, groups_inventory, all_plugins_inventory, all_plugins_play, groups_plugins_inventory, groups_plugins_play
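
For anyone who wants to sanity-check such a line outside of Ansible, here is a small Python sketch that reads it with the standard configparser module. It assumes the setting lives under a [defaults] section (inferred from the DEFAULTS argument in the constants.py line above), and read_precedence is a hypothetical helper, not part of Ansible.

```python
import configparser

# Sample config text; the [defaults] section name is an assumption.
CFG_TEXT = """
[defaults]
precedence = all_inventory, groups_inventory, all_plugins_inventory, all_plugins_play, groups_plugins_inventory, groups_plugins_play
"""


def read_precedence(cfg_text):
    """Hypothetical helper: return the precedence setting as a list of names."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser.get("defaults", "precedence")
    # Split on commas and drop surrounding whitespace (including any newlines
    # left over from indented continuation lines).
    return [item.strip() for item in raw.split(",") if item.strip()]


precedence = read_precedence(CFG_TEXT)
print(precedence)
```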

So I have given it a go but unfortunately got an error (without touching ansible.cfg yet). I have built and installed the version from the inv_plugins branch of your ansible repo. I ran a playbook, ‘layers/monitoring.yml’, that contains nothing but two includes, of the files ‘layers/infrastructure/monitoring.yml’ and ‘layers/configuration/monitoring.yml’. I also specified an inventory file via -i. Almost immediately after ansible-playbook starts, I get this error:

ERROR! the file_name ‘layers/infrastructure/vars’ does not exist, or is not readable

I’ve also run it in verbose mode, but no more information was revealed. That directory is indeed not there, but I have no idea why it should be. Do you have any idea what this could be? Should I build another branch or repo? Or do I just need to add something to ansible.cfg?

Just to add: the same playbook works with 2.2.2.0 and also with ansible/devel.

I’ve seen you’ve merged the PR. Now I get the same error with latest ansible:devel.

Can you supply us with a simple reproducer?

Also you can use ANSIBLE_DEBUG=1 to get very detailed information.

A reproducer will take more time, as our ops repo is pretty big and I need to figure out how to cut out the relevant parts.

In the meantime, I’ve run the playbook with the debug option and attached the log. (Note that I have deleted the parts where it iterates through all the hosts, roles and variables. They seemed unrelated, and that is also more sensitive data.)

Szabolcs

(attachments)

ansible_debug.log (5.26 KB)

you have a group or host named 'vars'?

In any case, this should be fixed in current devel, as a ticket with a
simple reproducer was opened (and now closed).

Good news. The build from the latest devel branch indeed fixed the “vars directory does not exist” error. Plus the precedence config worked like a charm. I moved all_inventory and groups_inventory to the end of the list and got just what I needed: variables in the INI with the highest priority.

Just a small regression that I have found: Vim put some swap files into the inventory directory, and it looks like the latest ansible picks them up and gives an error, while the older version did not care about them:
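
For reference, the reordering can be sketched in Python like this. It starts from the default list quoted earlier in the thread and moves the two inventory-defined entries to the end; move_to_end is a hypothetical helper used only to show the resulting order.

```python
# Default order as quoted earlier in the thread.
DEFAULT_PRECEDENCE = [
    "all_inventory", "groups_inventory", "all_plugins_inventory",
    "all_plugins_play", "groups_plugins_inventory", "groups_plugins_play",
]


def move_to_end(precedence, names):
    """Hypothetical helper: keep the relative order, but put `names` last."""
    kept = [entry for entry in precedence if entry not in names]
    moved = [entry for entry in precedence if entry in names]
    return kept + moved


reordered = move_to_end(DEFAULT_PRECEDENCE, {"all_inventory", "groups_inventory"})

# Render the value in the comma-separated form used in ansible.cfg.
print("precedence = " + ", ".join(reordered))
```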

32011 1496045937.69478: Loading data from /home/spota/ansible-ops/inventory/group_vars/all/vars/.all.yml.swp
32011 1496045937.69507: RUNNING CLEANUP
ERROR! ‘utf8’ codec can’t decode byte 0xac in position 16: invalid start byte

Anyway, removing swap files solved the problem.
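
In case it helps someone hitting the same thing, this is roughly the kind of filter I would have expected when vars directories are scanned. It is only a Python sketch: the ignore patterns and the is_loadable helper are my own assumptions, not what Ansible actually does.

```python
import fnmatch

# Hypothetical ignore patterns: Vim swap files, editor backups, hidden files.
IGNORE_PATTERNS = ["*.swp", "*~", ".*"]


def is_loadable(filename):
    """Return True if the file name matches none of the ignore patterns."""
    return not any(fnmatch.fnmatch(filename, pattern) for pattern in IGNORE_PATTERNS)


# File names as they might appear in a group_vars/all/vars directory.
names = [".all.yml.swp", "all.yml", "vault.yml", "all.yml~"]
loadable = [name for name in names if is_loadable(name)]
print(loadable)
```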

Thank you for your help.

Szabolcs

Did this make it into 2.3, and if so, where would I find constants.py to change the default precedence order? I would have thought something like this would be set in ansible.cfg.

I’ve been wading through the many posts on how variable precedence works, and many, including one from you in another post, state:

playbook/group_vars are meant to override the inventory as plays are more specific than inventory.

This doesn’t make sense to me (and a lot of other people it seems), I had assumed:

  1. A playbook applies a set of roles to one or more servers (i.e. what to do), so it would naturally follow that playbooks/group_vars/.yml apply to all servers in the playbook.
  2. An inventory file defines which servers the playbook will apply the roles to (i.e. where to do it), so it should naturally follow that we can use inventories//group_vars/.yml to override playbook group_vars, as inventories are more specific about where to run tasks than plays.

With the playbook taking precedence over inventories, there is no obvious place to have “global” variables for a group (as defined in the inventory file) that can be overridden as needed in each inventory. All.yml doesn’t make the cut, as it can’t be used to define group-level variables.

So, if the change that allows changing the default precedence order has not yet made it into Ansible, can you point me at any documentation that shows how to define global variables for groups that can be overridden? All the links I’ve followed jump through hoops to do what should be a simple exercise, and the Ansible documentation is rather quiet on the subject (unless I’ve missed something), which is a common complaint in the posts I’ve been reading.

Thanks
Andrew