playbooks hang forever when client is swapping or has (NFS) mount problems

Dear all,

While taking my first steps with Ansible, I noticed that playbook execution hangs completely on some clients. What these hosts have in common is that they are either swapping (even if only a very small amount of swap is in use) or have hanging/unresponsive NFS filesystems. In all cases the two problems appeared together, so I cannot say which one is the actual cause.

However, all local filesystems are fine and responsive, and the system as a whole is working normally.

Even the simplest playbooks hang completely on these hosts, even though they don't access the problematic filesystems. Running the same commands ad hoc works fine:

$ ansible buggyhost -m shell -a '/bin/ls' --key=id_rsa
buggyhost | SUCCESS | rc=0 >>
[... `/bin/ls` output here..]

$

... but ...

$ ansible-playbook ls.yml --extra-vars "target=buggyhost" --private-key=id_rsa

PLAY [buggyhost]

The first task Ansible runs is gathering facts. The facts include the
mounted filesystems, so the NFS one is queried too.

Regards,

Thanks a lot. Setting 'gather_facts: no' in the playbook solved the issue.
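
For reference, a minimal sketch of what such a playbook could look like. The original ls.yml was not posted, so the play and the tasks below are assumptions; only the `target` variable and the 'gather_facts: no' setting come from this thread.

```yaml
# ls.yml (hypothetical reconstruction; the original file was not posted)
- hosts: "{{ target }}"   # filled in via --extra-vars "target=buggyhost"
  gather_facts: no        # skip the implicit fact-gathering (setup) task
  tasks:
    - name: List the current directory
      command: /bin/ls
      register: ls_result

    - name: Show the listing
      debug:
        var: ls_result.stdout_lines
```

With fact gathering disabled, the play goes straight to the first task, so nothing should query the mount table on the target.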

However, I found this feature to be quite hidden in the documentation.
IMHO gather_facts should be off by default and only enabled on request.

Hi,

No, it is fine that it's on by default, and it should stay that way. A lot
of my playbooks (and other people's too) depend on facts
(ansible_distribution, ansible_distribution_version,
ansible_lsb.major_release, to name the most common ones...).

I see it the same way as with services, open ports, access permissions, etc.: minimum by default, more on request. But of course, once the maximum has been established as the default, changing it can break established, working mechanisms. It's probably too late now to change this initial design decision.

If you need to, you can filter facts:
http://stackoverflow.com/questions/34485286/ansible-gathering-facts-with-filter-inside-a-playbook
But from my point of view, if NFS is not responding, it's your server
that is broken... Perhaps automounting (and so unmounting) NFS is an
option for you.

Regards,
JYL
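
One possible sketch of limiting fact collection, assuming an Ansible version whose setup module supports 'gather_subset' (2.1 or later, as far as I know). As far as I can tell, the 'filter' option only trims the facts that are returned and does not stop the mount information from being collected, so a subset is used here instead. The play and task names are illustrative.

```yaml
- hosts: "{{ target }}"
  gather_facts: no            # disable the implicit full gathering
  tasks:
    - name: Collect only the minimal fact set (no hardware/mount facts)
      setup:
        gather_subset:
          - '!all'

    - name: Use a fact from the minimal set
      debug:
        var: ansible_distribution
```

The minimal set should still include the distribution and LSB facts mentioned above, while leaving out the hardware facts that cover mounted filesystems.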

Yes, there is a technical problem, but that's not the issue. The issue is that this shouldn't break my scripts. When (very simplified) I run a script that does an `ls` in my home directory, I don't want it to break (rather: it /must/ not break) just because some other, completely unrelated filesystem or service is not working. But that's what happened in our case until we disabled gather_facts.

Anyway: Our current problem is solved - thanks to your hint - and I will set "gathering = explicit" in the configuration file, which should also have the desired effect.
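
A sketch of how a play could still opt in once that setting is in place. 'gathering = explicit' goes in the [defaults] section of ansible.cfg; plays that depend on facts such as ansible_distribution then have to request them themselves. The play below is illustrative only.

```yaml
# With "gathering = explicit" under [defaults] in ansible.cfg,
# facts are collected only by plays that ask for them.
- hosts: "{{ target }}"
  gather_facts: yes           # explicit opt-in for this play
  tasks:
    - name: Print the distribution and version
      debug:
        msg: "{{ ansible_distribution }} {{ ansible_distribution_version }}"
```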

Cheers
frank