I have a very large playbook which configures our entire
infrastructure. Because of this, various steps are tagged so that only
specific parts of the playbook can be run, cutting down on runtime
when required.
Parts of this setup use facts/hostvars to automatically create correct
configuration files. For example, nginx config adding all the
application servers that are defined in the inventory to the correct
upstream definitions, and iptables on the appservers automatically
opening up the correct ports to the loadbalancers.
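As a rough illustration of the kind of task described above (not the poster's actual playbook), the sketch below opens an application port to each load balancer using facts read from hostvars. The group names `appservers` and `loadbalancers`, the port 8080, and the choice of the `iptables` module are all assumptions made for the example.

```yaml
# Sketch only: group names and the port number are hypothetical.
- hosts: appservers
  become: true
  tasks:
    - name: Allow each load balancer to reach the application port
      iptables:
        chain: INPUT
        protocol: tcp
        source: "{{ hostvars[item]['ansible_default_ipv4']['address'] }}"
        destination_port: "8080"
        jump: ACCEPT
      # Fails with an undefined-variable error if facts were never
      # gathered for a host in the loadbalancers group.
      with_items: "{{ groups['loadbalancers'] }}"
      tags:
        - iptables
```

The lookup into `hostvars[item]` is the part that depends on facts having been gathered for hosts other than the one currently being configured.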
However, when running the playbook with --limit, or --tags, not all
hosts are contacted, and as a result, facts aren't available on every
system in the infrastructure. This causes all kinds of problems for my
setup, obviously.
Is there any way to force gathering of facts on all hosts, even when
specifying one of these options? Or another way to deal with this
situation that I haven't thought of?
Right now, I'm solving it for the --tags case by having one task at
the start of the playbook, which simply calls the ping module and has
every tag that's used listed. This way, this task is kicked off no
matter which tag is specified, causing facts to be gathered on every
system in our inventory.
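The workaround just described might look roughly like the following. This is a sketch under assumptions: the tag names are hypothetical stand-ins for whatever tags the real playbook uses, and the play is assumed to come first in the playbook.

```yaml
# Sketch only: the tag names are hypothetical stand-ins for every tag
# used elsewhere in the playbook.
- hosts: all
  gather_facts: true
  tasks:
    - name: Touch every host so the play runs and facts are gathered
      ping:
      tags:
        - nginx
        - iptables
        - appserver
```

Because the ping task matches any of the listed tags, the play is not skipped when --tags is given, so its fact gathering still happens on every host it targets.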
Obviously this isn't a practical solution, nor does it solve
the case where --limit is used.
How about being able to specify which facts you need back (some sort of whitelisting/blacklisting), so that in a large-scale infrastructure this does not cause extra overhead on the network or on the Ansible management console where the data gets reported back?
It’s also true that there really isn’t a ginormous amount of data coming back from the facts modules, so it shouldn’t be a problem.
If you are writing some modules of your own, you’ll likely have a few facts modules that you might choose not to call on a case-by-case basis if they are intensive (walid_facts, etc).
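For the whitelisting idea raised above, one possible shape is to disable implicit gathering and call the setup module explicitly with a restricted subset. This is a minimal sketch, assuming an Ansible version whose setup module supports the `gather_subset` parameter; the choice of the `network` subset is only an example.

```yaml
# Sketch only: assumes a setup module that supports gather_subset.
- hosts: all
  gather_facts: false
  tasks:
    - name: Collect only the minimal and network facts
      setup:
        gather_subset:
          - "!all"
          - network
```

Note that this only reduces how much is collected per host. It does not by itself change which hosts are contacted when --limit or --tags is used.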
No, I haven't. In our case, we're simply not using --limit for the time
being, and have our tags set up to still allow pretty granular
targeting. Fact caching is going to help somewhat with this problem,
once it's implemented.
Sorry to dig up this thread again but it’s also an issue for me.
AIUI, Strahinja Kustudić needs (just as I do) a way to always gather facts on all hosts, regardless of tags/limits. What your proposal does is (again, AIUI) improve performance without the need of gather_facts: false in every play. It’s cool but different. We need different functionality, not better performance.
this gives people more fine grained control, target will populate
hostvars[target], vs current host.
or would it be better to use delegate_to?
I'd say delegate_to (with a group, which isn't allowed ATM, I think?) would be cleaner as it would transparently support other modules (like custom facts) without modifying them. Also, to support this AFAIK you need local action plugins, not just a module to run on the remote host.
So either this:
- setup:
  delegate_to: webservers
Or this, which probably isn't valid Ansible either:
it would require changing current behavior, now if you do setup +
delegate_to, you populate 'current' host with delegated host facts.
Right, my bad.
Still, I don't really like the target= as a parameter of the setup module. It complicates the implementation too much IMHO. It effectively duplicates the whole support of delegate_to with a minor change in functionality (which I forgot and you pointed out).
Another play-level option (delegate_facts: true?) might be more universal and would keep writing custom fact modules trivial.
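For comparison, here is a minimal sketch of how this can be written, assuming an Ansible version that supports `delegate_facts` as a task keyword (a task-level keyword rather than the play-level option suggested above). The group names `appservers` and `loadbalancers` are hypothetical.

```yaml
# Sketch only: assumes delegate_facts is available as a task keyword;
# group names are hypothetical.
- hosts: appservers
  gather_facts: false
  tasks:
    - name: Gather facts for every load balancer, stored under that host
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: true
      with_items: "{{ groups['loadbalancers'] }}"
```

Since the loop iterates over the inventory group rather than over the play's own hosts, this should still reach the load balancers when the run is limited to the app servers, and the gathered facts end up in hostvars for each delegated host instead of the current one.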
I know that I suggested this, but I really don’t see a downside in using gather_facts: force. It is simple, seems the easiest to implement, is easy to read and feels natural. The only downside I see is that gather_facts is a Boolean, and if we allowed “force” it would stop being one, which might feel a little strange.
On the other hand, adding additional options to the setup module like ignore_tags and ignore_limit is cool, but what I don’t like about it is that you would need to set gather_facts: False before that. The same goes for using setup with delegate_to.
Don’t get me wrong, those additional parameters for the setup module are great ideas and I think Ansible should have them as well, so that you can gather custom facts. But Ansible is always promoted as being extremely simple, and gather_facts: force/always is as simple as it can be.