One server is slow to respond to Ansible gather_facts (but does respond)

I'm trying to track down what changed on a server. I run a playbook against 20-30 servers, and all of them (configured the same way with Ansible) respond quickly to gather_facts; however, one server takes its own sweet time to respond. It wasn’t this way at first, but it changed in the past couple of months. Any ideas on how to track down why it is doing that? All other ssh connections to the server are as responsive as expected…

Could I suggest executing an ad-hoc command using the “setup” module against this machine and doing all the debugging there?

Marcos,

I can execute an ad-hoc command, but I don’t know what you mean by “all the debug”, or how to do that.
If I run the following, it returns promptly:

```
root@ansible:/etc/ansible # ansible myserver -m debug
myserver | SUCCESS => {
    "msg": "Hello world!"
}
```

Kai,

I cannot find any IO issues on the server with regard to the things you specified. LVM isn’t involved, but NFS is. It is connected to the same NFS server as all of my other servers. In addition, I don’t see any stale files or mounts. Things look normal on the server, other than it taking longer to respond than my other servers… I am still digging into this to see if I can find the cause of the slow response.

Is normal ssh to the server slow or only with Ansible?

Normal SSH to the server is quick and responsive. If I ssh from the ansible server, it is even quicker than from my current location directly to the server (the ansible server is on the same network as the other server, so this makes sense). Once the facts are gathered, everything else in an ansible playbook is as speedy as on any other server; it is only slow to gather facts. If I exclude gathering facts, it is as responsive as any other server.

You can use the 'subset' option in fact gathering to narrow down what
is being so slow on that machine (likely culprits are hardware or
network). This is often a symptom of the server being slow or
unresponsive to some device query.
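
As a rough sketch of that suggestion, the helper below times the setup module once per subset so the slow one stands out. The function name `time_subsets` is made up for illustration, "myserver" is the host alias used earlier in this thread, and the subset names you pass in are an assumption, since the available subsets vary by Ansible version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: time fact gathering for one subset at a time.
# Usage: time_subsets HOST SUBSET [SUBSET...]
time_subsets() {
  local host="$1"
  shift
  local subset start end
  for subset in "$@"; do
    start=$(date +%s)
    # '!all,!min' drops the default facts so only the named subset is collected.
    ansible "$host" -m setup -a 'gather_subset=!all,!min,'"${subset}" >/dev/null 2>&1
    end=$(date +%s)
    # Elapsed wall-clock time in whole seconds, including SSH connection setup.
    echo "${subset}: $((end - start))s"
  done
}
```

For example, `time_subsets myserver hardware network virtual` prints one line per subset. The timings include connection overhead, so compare them against each other (or against a healthy server) rather than reading them as absolute numbers.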

ansible myserver -m setup -a 'gather_subset=hardware' is slow, so the hardware subset is definitely the culprit. Do you know of any way to break down the hardware subset to further narrow this down?

You can use ANSIBLE_DEBUG=1 and check the target's syslog to see which
commands are run and how long it takes until the 'next command'. Other
than that, it is just debugging the setup module while executing it.
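
A minimal sketch of that debug run, assuming the host alias "myserver" from earlier in the thread. The function name `run_setup_debug` and the default log file name are made up for illustration, and the `-vvv` flag is an extra not mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run only the hardware facts with controller-side debug
# output enabled, and save everything to a log file for later inspection.
# Usage: run_setup_debug HOST [LOGFILE]
run_setup_debug() {
  local host="$1"
  local log="${2:-setup-debug.log}"
  # ANSIBLE_DEBUG=1 enables debug output; -vvv adds connection-level detail.
  ANSIBLE_DEBUG=1 ansible "$host" -m setup \
    -a 'gather_subset=!all,!min,hardware' -vvv >"$log" 2>&1
  echo "debug output written to $log"
}
```

After running `run_setup_debug myserver`, look for large gaps between consecutive timestamps, both in the log file and in the target's syslog. Where the syslog lives is an assumption that depends on the distribution: it may be /var/log/syslog, /var/log/messages, or only reachable through journalctl.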