Apologies if this has been asked and discussed before. I found it a tricky thing to search the archives for given the common words in the search string.
We have a large number of hosts with basic bare metal machine names of the form “blade-CC-NN.domain”, where CC is the chassis number and NN is the blade number within the chassis. We define these in our inventory files as below, so that we can apply common plays across all of the bare metal hosts, and more specific plays against specific chassis:
[blade-hosts]
blade-[1:50]-[1:16].domain

[blade-hosts-specialisation1]
blade-[1:3]-[1:16].domain
However, we also define a number of service host names, like “ns1.domain” or “fileserver8.domain”, which logically sit on the same bare metal hosts already defined above, but the service IP (and thus the service hostname) may move between bare metal hosts every now and then if we reinstall or repurpose a blade for something else.
[nameservers]
ns[1:4].domain
ns-test[1:2].domain
[fileservers]
fileserver[1:20].domain
In our playbooks, we tend to create plays that apply common or generic configuration across the large groups of bare metal hosts using the blade-hosts-* host-group names, and apply service-specific plays to the service hostnames and host-groups, roughly as sketched below.
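To make that concrete, a stripped-down sketch of the kind of playbook we run looks something like this (the role names here are just placeholders, not our real roles):

---
- hosts: blade-hosts
  roles:
    - baremetal-common          # generic config applied to every blade

- hosts: blade-hosts-specialisation1
  roles:
    - chassis-specialisation1   # things only that subset of chassis needs

- hosts: nameservers
  roles:
    - dns-server                # service-specific configuration

- hosts: fileservers
  roles:
    - file-server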
This all works fine for us if we run Ansible serially / sequentially, only one host at a time, but as soon as we fork ansible-playbook over multiple hosts at once, there is a risk that Ansible will be running against a service host and a bare metal hostname that are logically the same physical machine, and we hit locking problems.
For example, ns1.domain is a service IP and service hostname that currently resides on blade-1-1.domain. If we fork Ansible and run against all of the service and machine host names at once, there is a chance it will log in to both of those inventory names at the same time.
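To illustrate the overlap (the addresses and the playbook filename below are made up):

blade-1-1.domain  ->  192.0.2.11    (bare metal address, fixed to that blade)
ns1.domain        ->  192.0.2.111   (service address, currently up on blade-1-1)

ansible-playbook -i inventory -f 20 site.yml

Both names land on the same physical box, so with more than one fork nothing stops one fork connecting as blade-1-1.domain while another connects as ns1.domain, and two plays end up running concurrently on the same machine.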
We would like to continue to use both bare metal host names in combination with service host names because of the ease of understanding it gives us.
How have other people dealt with this problem?
Are we doing something fundamentally silly with the way we are using the inventory, and/or is there some special magic we can do in the fact gathering stage for Ansible to merge duplicate inventory items into one?
Thanks!
Nicola