Hey Folks,
I am writing to bring a topic that started in a pull request to a broader audience. That particular pull request is:
https://github.com/ansible/ansible/pull/9387
This change adds a gather_facts option named ‘force’ with which, for a particular play, a user may specify that they would like setup to run across all of the hosts specified in the inventory. This functionality, for us, has been a super lightweight mechanism for ensuring that a play has all of the facts that it needs to complete, regardless of any --limit placed on the Ansible run itself.
In particular, we currently maintain one of the largest OpenStack Ansible repositories used in a production setting (https://github.com/blueboxgroup/ursula). As OpenStack is deployed in a distributed fashion, there are many occasions where, for instance, a data plane node may need to understand facts about a control plane node. This, as expected, will work fine when a user wants to perform a complete end-to-end run across all of the nodes in the cluster. Where this breaks down is when a user would like to execute against a specific subset of nodes.
A real-world use case for this functionality happens when we add new data plane nodes to an existing cluster elastically. In this case we would like to leave all other nodes untouched and bring only the new data plane node into production at the same Ursula rev. We do so with the --limit option of ansible-playbook.
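To make the use case concrete, here is a minimal sketch of what such a play might look like. The "gather_facts: force" value is the syntax proposed in the pull request, not something stock Ansible accepts; the file name, the group names (compute, controller), the host name, and the fact being read are all assumptions chosen for illustration.

```yaml
# site.yml (hypothetical file name)
# "gather_facts: force" is the value proposed in the pull request; it is not
# part of stock Ansible. Group names and the fact used are illustrative only.
- name: Configure data plane nodes
  hosts: compute
  gather_facts: force   # proposed: run setup on every inventory host, regardless of --limit
  tasks:
    - name: Show the control plane address this node will use
      debug:
        msg: "Controller address: {{ hostvars[groups['controller'][0]]['ansible_default_ipv4']['address'] }}"

# Example invocation that touches only the new node (hypothetical host name):
#   ansible-playbook -i inventory site.yml --limit compute-42
```

Without facts for the controller host, the hostvars lookup above would fail under --limit, since setup would only have run on the limited host.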
While we understand that the current method of supporting this type of functionality in Ansible is via fact caching, we are looking for something a little bit different. There are a few reasons for this:
- We would like to give users of our Ursula repo an extremely simple on-ramp to deploying OpenStack. While the overhead of setting up Redis may be trivial for most, it is an additional piece of infrastructure to maintain for a sometimes-realized benefit.
- The method proposed in the pull request does not stand in conflict with the fact caching already implemented in Ansible. In fact, we think it can be quite complementary.
- An alternative would be to add the setup role in various places throughout our repo. We believe that a single “gather_facts: force” on the plays that need external facts is a much simpler way to achieve this.
- In the example above, where a node may be added to a cluster months after the initial deploy, we would have to either set an extraordinarily high fact_caching_timeout or disable the TTL altogether. This doesn’t quite fit the bill for the itch we are trying to scratch.
To that end, we have been using this change for some time with a good deal of success. As this has been quite useful to us and, as far as we can tell, does not stand in conflict with the current fact caching implementation, we would like to contribute this feature to Ansible itself. I am interested to hear what others think about this particular change.
Would this feature be useful to you?
Thanks for all the great work this community does…
Best,
Craig