Forced Fact Gathering

Hey Folks,

I am writing to bring a topic that started in an pull request to a broader audience. That particular pull request is:
https://github.com/ansible/ansible/pull/9387

This change intends to add a gather_facts functionality named ‘force’ in which, for a particular play, a user may specify that they would like setup to run across all of the hosts specified in the inventory. This functionality, for us, has been a super lightweight mechanism for ensuring that a play has all of the facts that it needs to complete, regardless of any --limit’s placed on the Ansible run itself.

In particular, we currently maintain one of the largest OpenStack Ansible repositories used in a production setting (https://github.com/blueboxgroup/ursula). As OpenStack is deployed in a distributed fashion, there are many occasions where, for instance, a data plane node may need to understand facts about a control plane node. This, as expected, will work fine when a user wants to perform a complete end-to-end run across all of the nodes in the cluster. Where this breaks down is when a user would like to execute against a specific subset of nodes.

A real-world use case for this functionality happens when we add new data plane nodes to an existing cluster elastically. In this case we would like to leave all other nodes untouched and bring only the new data plane node into production at the same Ursula rev. We do so with ansible-playbook --limit option.

While we understand that the current method of supporting this type of functionality in Ansible is via fact caching, we are looking for something a little bit different. There are a few reasons for this:

  1. We would like to give users of our Ursula repo an extremely simple on-ramp to deploying OpenStack. While the overhead of setting up Redis may be trivial for most, it is an additional piece of infrastructure to maintain for a sometimes-realized benefit.
  2. The method proposed in the pull request does not stand in conflict with the fact caching already implemented in Ansible. In fact, we think it can be quite complimentary.
  3. An alternative would be to add the setup role in various places throughout our repo. We believe that a simple “gather_facts: force” for the plays that need external facts is a pretty simple way to achieve this.
  4. The example above where a node may be added to a cluster months after the initial deploy would mean that we would either set an extraordinarily high fact_caching_timeout or disable the TTL altogether. This seems like it doesn’t quite fit the bill for the itch we are trying to scratch.

To that end, we have been using this change for quite some time with quite a bit of success. As this has been quite useful to us, and as far we can tell, does not stand in conflict with the current fact caching implementation, we would like to contribute this feature to Ansible itself. I am interested to hear what others would think about this particular change.

Would this feature be useful to you?

Thanks for all the great work this community does…

Best,
Craig

An alternate implementation I was considering was to allow to delegate
facts, currently gathering facts with delegate_to: applies the facts
to the 'current host' or 'delegated for host'. This would solve not
only the issue when you want all host facts but also when you want
just a small group or single host:

setup: update={{item}}
with items: "{{groups['dbservers']}}
delegate to: item

This would gather facts for the hosts in dbservers group and apply
them to those hosts (not current) and make them available to other
hosts through the hostvars dictionary.

The force option gives you all or nothing, which can apply to some
setups but will be very hard to use on large inventories or those with
disconnected/slow hosts.

What if instead of “gather_facts: force” we develop a mechanism that is simple in nature like:

gather_facts: hosts:all
gather_facts: hosts:compute,database

Where we trim the inventory based upon a set of groups? (I just made that syntax up now, so dont hate me for it…open to ideas) But would that help satisfy simplicity and your concerns about slow facts and large inventories?

If we went that route, we should use the exsiting host pattern syntax:
http://docs.ansible.com/intro_patterns.html

That is similar to what i'm proposing. The differences is that mine
adds a new option to fact plugins that allows to redirect update,
yours overloads the existing boolean directive to give it multiple
meanings.

how would it interact with --limits?
fact caching? which now depends on gather_facts being
implicit/explicit and boolean.