Ansible hangs and does not progress

Let me explain my problem.

I generate an inventory of 10,000 Linux servers in JSON format with a Python script. A playbook collects information from the servers, and at the end I assemble a final report with Java.

It is very common for a server to have a problem with a file system, or for an SSH login to never return the prompt.

The serious problem is that when I run my playbook and Ansible reaches one of those servers, it stops progressing and hangs indefinitely, without showing any trace of which server is causing the problem. Even a simple ping-pong check does not get past it.

My temporary workaround is to exclude the problem server in the Python script so that Ansible never sees it in the inventory and the run can finish, but that is not the best solution.
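
To make the workaround concrete, here is a minimal Python sketch of that kind of pre-filtering. It is not the actual script: the function names (`responds`, `ssh_probe_cmd`, `split_hosts`, `build_inventory`), the group name `linux`, the ssh options, and the timeout values are all assumptions chosen for illustration. Each host is probed with a command under a hard wall-clock timeout, and hosts that do not answer in time are left out of the inventory and reported separately.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def responds(cmd, timeout_s):
    """Return True if cmd exits with status 0 within timeout_s seconds."""
    try:
        result = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def ssh_probe_cmd(host):
    # Hypothetical probe command. BatchMode avoids interactive prompts and
    # ConnectTimeout bounds the TCP connect; the outer timeout in responds()
    # covers the case where the login succeeds but the shell never returns.
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def split_hosts(hosts, build_cmd=ssh_probe_cmd, timeout_s=15, workers=50):
    """Probe hosts in parallel and return (responsive, skipped) lists."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(lambda h: responds(build_cmd(h), timeout_s), hosts))
    responsive = [h for h, ok in zip(hosts, flags) if ok]
    skipped = [h for h, ok in zip(hosts, flags) if not ok]
    return responsive, skipped


def build_inventory(hosts, group="linux"):
    """Build a dynamic-inventory style dict (group name is an assumption)."""
    return {group: {"hosts": list(hosts)}, "_meta": {"hostvars": {}}}


if __name__ == "__main__":
    # Placeholder host names; a real script would load its own host list.
    ok, skipped = split_hosts(["host1.example.com", "host2.example.com"])
    print(json.dumps(build_inventory(ok)))
```

The skipped list at least records which servers were excluded, which is the trace that is missing when the playbook itself hangs.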

My question is whether Ansible has a way to set a timeout for servers that accept the connection but never respond, so that it skips them and finishes the run with the remaining servers.

I am using Ansible 2.1.2.0.

I’ve run into the same problem. For us it was caused by NFS mounts that had been dropped by the NFS server but never unmounted on the client side, so disk-related commands on the remote host would hang. Part of the problem is that some of the commands Ansible runs from the setup module have no timeout. As of right now I still owe one of the developers the exact line/command in the setup module where it gets stuck. This reply is mostly to say you’re not alone; I didn’t want to see this post go unanswered. If you want to raise awareness you can file a bug, but the more information you have about the line of code that is causing the issue, the more likely you are to get a quick response.
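
As an illustration of the stale-mount case, here is a rough Python sketch for spotting mounts where a disk-related command hangs. It is not what the setup module runs. The helper names (`mount_points`, `is_hung`, `hung_mounts`), the use of `/proc/mounts`, the `stat -f` probe, and the 5-second timeout are assumptions made for the example.

```python
import subprocess


def mount_points(mounts_file="/proc/mounts", fs_prefixes=("nfs",)):
    """Return mount points whose filesystem type starts with one of fs_prefixes."""
    points = []
    with open(mounts_file) as fh:
        for line in fh:
            fields = line.split()  # device, mount point, fs type, options, ...
            if len(fields) >= 3 and fields[2].startswith(fs_prefixes):
                points.append(fields[1])
    return points


def is_hung(cmd, timeout_s=5):
    """Return True if cmd has not exited after timeout_s seconds."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        proc.wait(timeout=timeout_s)
        return False
    except subprocess.TimeoutExpired:
        proc.kill()  # best effort; a child blocked in the kernel may linger
        try:
            proc.wait(timeout=1)  # reap it if it does exit, but never block long
        except subprocess.TimeoutExpired:
            pass
        return True


def hung_mounts(points, build_cmd=lambda p: ["stat", "-f", p], timeout_s=5):
    """Return the mount points for which the probe command did not return in time."""
    return [p for p in points if is_hung(build_cmd(p), timeout_s)]


if __name__ == "__main__":
    for point in hung_mounts(mount_points()):
        print("possibly stale mount:", point)
```

Running something like this on a suspect host, outside Ansible, could help narrow down which mount is involved before filing a bug.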

  • DiGi