Force gathering facts on all hosts when using --tags or --limit

Nick_Groenen · September 10, 2013, 1:22pm

I have a very large playbook which configures our entire
infrastructure. Because of this, various steps are tagged so that only
specific parts of the playbook can be run, cutting down on runtime
when required.

Parts of this setup use facts/hostvars to automatically create correct
configuration files. For example, nginx config adding all the
application servers that are defined in the inventory to the correct
upstream definitions, and iptables on the appservers automatically
opening up the correct ports to the loadbalancers.

However, when running the playbook with --limit, or --tags, not all
hosts are contacted, and as a result, facts aren't available on every
system in the infrastructure. This causes all kinds of problems for my
setup, obviously.

Is there any way to force gathering of facts on all hosts, even when
specifying one of these options? Or another way to deal with this
situation that I haven't thought of?

Right now, I'm solving it for the --tags case by having one task at
the start of the playbook, which simply calls the ping module and has
every tag that's used listed. This way, this task is kicked off no
matter which tag is specified, causing facts to be gathered on every
system in our inventory.

Obviously, this isn't a practical solution, nor does it solve
the case where --limit is used.
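
A minimal sketch of the workaround described above. The tag names (nginx, iptables) are hypothetical stand-ins, since the thread does not list the tags the real playbook uses.

```yaml
# First play in the playbook: it exists only so that facts get gathered.
- hosts: all
  gather_facts: yes
  tasks:
    - name: no-op task carrying every tag, so --tags never skips this play
      ping:
      tags:
        - nginx      # hypothetical tag
        - iptables   # hypothetical tag
```

As noted, this only covers --tags: with --limit, "hosts: all" is still narrowed to the limited hosts.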

Michael_DeHaan2 · September 10, 2013, 2:00pm

Fact caching is something we want to look into for the 1.4 release.

Walid · September 20, 2013, 9:05pm

Hi,

How about being able to specify which facts you need back (some sort of whitelisting/blacklisting), so that in a large-scale infrastructure this doesn't cause extra overhead on the network or on the Ansible management/console host where the data gets reported back?

kind regards

Walid

Brian_Coca1 · September 20, 2013, 9:17pm

@walid, the setup module already has this option: use its filter parameter. From the examples:

ansible all -m setup -a 'filter=ansible_eth[0-2]'
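
The same filter can presumably be used from a playbook as well. This is a sketch, assuming the play disables implicit gathering and calls setup as an ordinary task; the play layout is illustrative and not taken from the thread.

```yaml
- hosts: all
  gather_facts: no          # skip the implicit, unfiltered gathering step
  tasks:
    - name: return only the eth0..eth2 interface facts
      setup:
        filter: "ansible_eth[0-2]"
```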

Walid · September 21, 2013, 6:48am

Thank you Brian, my mistake, I should have read the documentation first. This is perfect, thanks.

Michael_DeHaan2 · September 21, 2013, 1:23pm

Hi Walid,

It’s also true that there really isn’t a ginormous amount of data coming back from the facts modules, so it shouldn’t be a problem.

If you are writing some modules of your own, you’ll likely have a few facts modules that you might choose not to call, on a case-by-case basis, if they are intensive (walid_facts, etc.).

–Michael

Kerry_Kurian · January 10, 2014, 7:20pm

Have you found a way to solve this for the --limit case?

Nick_Groenen · January 13, 2014, 4:52pm

No I haven't. In our case, we're simply not using --limit for the time
being, and have our tags set up to still allow pretty granular
targeting. Fact caching is going to help somewhat with this problem,
once it's implemented.

Brian_Coca1 · January 13, 2014, 4:59pm

Implementing the offline caches might solve this, as most host data should be available from previous runs.

Strahinja_Kustudic2 · March 14, 2014, 4:28pm

I know this is now an old topic, but it didn’t make sense to open a new one.

Having an offline cache would be nice, but wouldn’t it be easier to make something like:

gather_facts: force or gather_facts_force: True

or something similar, so that this overrides --limit / --tags and always gathers facts.

Michael_DeHaan1 · March 14, 2014, 4:42pm

We’ve discussed this and what we want to do with gather_facts is make a config setting

gather_facts_tendancy: always or lazy, default always

and then if you want to force when it’s lazy, you could just call the ‘- setup’ module in the tasks section.
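
The "call setup yourself" half of this can already be sketched with existing syntax. The proposed config setting does not exist at the time of the thread, so the sketch below uses the per-play gather_facts: no instead; the group name and the debug task are illustrative assumptions.

```yaml
- hosts: webservers
  gather_facts: no          # stand-in for the proposed "lazy" behaviour
  tasks:
    - name: gather facts explicitly, only where they are needed
      setup:

    - name: use a gathered fact
      debug:
        msg: "{{ ansible_hostname }}"
```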

Strahinja_Kustudic2 · March 14, 2014, 5:01pm

That is a cool solution. Do you have an estimate of when this feature will be added?

Michael_DeHaan1 · March 14, 2014, 5:11pm

I’m not actively working on it – I believe Brian Coca had expressed interest in getting this in, also open to someone going ahead and doing it.

–Michael

Grzegorz_Nosek · March 24, 2014, 5:43pm

Hi,

Sorry to dig up this thread again but it’s also an issue for me.

AIUI, Strahinja Kustudić needs (just as I do) a way to always gather facts on all hosts, regardless of tags/limits. What your proposal does is (again, AIUI) improve performance without the need of gather_facts: false in every play. It’s cool but different. We need different functionality, not better performance.

Consider:

Kerry_Kurian · March 24, 2014, 6:31pm

re: always gathering facts on all hosts, regardless of tags/limits.

What I’ve been doing is creating a facts.yml file that looks like this:

Brian_Coca1 · March 24, 2014, 6:54pm

I was thinking something like:

setup: target={{item}}
with_items: groups['webservers']

This gives people more fine-grained control: target would populate hostvars[target] rather than the current host.

Or would it be better to use delegate_to?

Grzegorz_Nosek · March 24, 2014, 7:00pm

On 24.03.2014 19:54, Brian Coca wrote:

> I was thinking something like:
> setup: target={{item}}
> with_items: groups['webservers']
> this gives people more fine grained control, target will populate
> hostvars[target], vs current host.
>
> or would it be better to use delegate_to?

I'd say delegate_to (with a group, which isn't allowed ATM, I think?) would be cleaner as it would transparently support other modules (like custom facts) without modifying them. Also, to support this AFAIK you need local action plugins, not just a module to run on the remote host.

So either this:

- setup:
  delegate_to: webservers

Or this, which probably isn't valid Ansible either:

- setup:
  delegate_to: "{{ item }}"
  with_inventory_hostnames: webservers

Any of these would be fine with me.

Best regards,
Grzegorz Nosek

Brian_Coca1 · March 24, 2014, 7:02pm

It would require changing current behavior: right now, if you do setup + delegate_to, you populate the ‘current’ host with the delegated host’s facts.

Grzegorz_Nosek · March 24, 2014, 7:20pm

On 24.03.2014 20:02, Brian Coca wrote:

> it would require changing current behavior, now if you do setup +
> delegate_to, you populate 'current' host with delegated host facts.

Right, my bad.

Still, I don't really like the target= as a parameter of the setup module. It complicates the implementation too much IMHO. It effectively duplicates the whole support of delegate_to with a minor change in functionality (which I forgot and you pointed out).

Another play-level option (delegate_facts: true?) might be more universal and would keep writing custom fact modules trivial.

Best regards,
Grzegorz Nosek
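
For reference, later Ansible releases (2.0 onward, as far as I know) accept a task-level delegate_facts keyword much like the one suggested above. A sketch of how it combines with delegate_to follows; the loadbalancers group is a hypothetical example, and none of this works on the 1.x versions this thread is about.

```yaml
- hosts: loadbalancers        # hypothetical group for illustration
  tasks:
    - name: gather facts from every webserver and store them under that webserver
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: true    # assign facts to the delegated host, not the current one
      with_items: "{{ groups['webservers'] }}"
```

Because the loop iterates over the inventory group rather than the play's hosts, the webservers are contacted even when they are excluded by --limit.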

Strahinja_Kustudic2 · March 24, 2014, 11:26pm

I know that I suggested this, but I really don’t see a downside in using gather_facts: force. It is simple, seems the easiest to implement, is easy to read, and feels natural. The only downside I see is that gather_facts is a Boolean, and if we allowed “force” it would stop being one, which might feel a little strange.

On the other hand, adding additional options to the setup module, like ignore_tags and ignore_limit, is cool, but what I don’t like about it is that you would need to set gather_facts: False before that. The same goes for using setup with delegate_to.

Don’t get me wrong, those additional parameters for the setup module are great ideas and I think Ansible should have them as well, so that you can gather custom facts. But Ansible is always promoted as being extremely simple, and gather_facts: force/always is as simple as it can be.
