I have found that when running ansible, with either playbooks or direct commands, forking is not working as I'd expect. Whether I add -f 20, -f 3, or --forks #, it appears I still end up with only 1 remote command running at a time. I verify this by watching the processes on the ansible server: I only ever see 1 ssh session open to my remote machines at a time.
I am running an older version of ansible (1.2.2) because it is the latest in the Ubuntu LTS repo. Has anyone else seen this? Is there a problem with my syntax?
Correct. I am running a single command (such as apt-get update) against a dozen or so servers and want it to happen in parallel. Unfortunately, even with --forks 20 or -f 20 I still only get a connection to 1 server at a time. They run consecutively, not concurrently.
It only returns the uptime for one server at a time. I have also tried a sleep statement, for example, and only a single server returns its result after each sleep interval. They are not running in parallel for some reason.
They don't appear to be, however, and that is my concern. For example, I assume a command like:
ansible MyServers -m shell -a "sleep 20" -f 20
when run against a list of fewer than 20 servers should return output from all of them roughly 20 seconds after it is initiated. It does not: 1 server replies, 20 seconds later the next server replies, 20 seconds later the next one, and so on.
They are not running in parallel. What am I missing?
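To make concrete what I mean by "in parallel," here is a plain-shell sketch (hypothetical host names, nothing ansible-specific) of the timing I'd expect from a forked run: every job sleeps 2 seconds, yet the whole batch finishes in about 2 seconds rather than 6.

```shell
# Three backgrounded "hosts" each sleep 2 seconds; because they run
# concurrently, total elapsed time is ~2s, not 6s.
start=$(date +%s)
for host in host1 host2 host3; do   # hypothetical host names
  ( sleep 2; echo "$host done" ) &  # run each "host" in the background
done
wait                                # block until every background job exits
end=$(date +%s)
echo "elapsed: $((end - start))s"   # ~2s if parallel, ~6s if serial
```

What I am seeing from ansible instead is the serial case: total time equal to the sleep interval multiplied by the number of hosts.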
I have not manually changed the connection method, so I assume it is still the default paramiko with shared SSH keys. Here is a test showing that it does not run in parallel. Notice that each response arrives 10 seconds after the previous one, even though -f 10 is specified.
ansible Dev -m shell -a "sleep 10 && date" -f 10
pdx-cass-d02 | success | rc=0 >>
Thu Nov 7 22:54:39 UTC 2013
pdx-extws-d02 | success | rc=0 >>
Thu Nov 7 22:54:49 UTC 2013
pdx-intws-d01 | success | rc=0 >>
Thu Nov 7 22:55:00 UTC 2013
pdx-extws-d01 | success | rc=0 >>
Thu Nov 7 22:55:10 UTC 2013
pdx-fep-d01 | success | rc=0 >>
Thu Nov 7 22:55:20 UTC 2013
pdx-cass-d01 | success | rc=0 >>
Thu Nov 7 22:55:30 UTC 2013
pdx-lb-d01 | success | rc=0 >>
Thu Nov 7 22:55:40 UTC 2013
pdx-mq-d01 | success | rc=0 >>
Thu Nov 7 22:55:51 UTC 2013
pdx-sql-d01 | success | rc=0 >>
Thu Nov 7 22:57:35 UTC 2013
pdx-job-d01 | success | rc=0 >>
Thu Nov 7 22:56:11 UTC 2013
pdx-listen-d01 | success | rc=0 >>
Thu Nov 7 22:56:21 UTC 2013
pdx-sql-d02 | success | rc=0 >>
Thu Nov 7 22:58:05 UTC 2013
So, using -vvvv, it appears that ansible first connects to ALL servers to copy out the tmp files for execution, the password files, etc. It then connects to the first server individually to execute the command; only once a response is returned does it connect to the next server.
The verbose output is quite extensive. I'm not sure what else I should be looking for...
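One way to see whether the forks are actually being spawned (independent of the verbose log) is to count the concurrent ssh worker processes on the control machine while the sleep command from above runs in another terminal. A sketch:

```shell
# While `ansible Dev -m shell -a "sleep 10" -f 10` runs in another
# terminal, count concurrent ssh processes. With working forks this
# should briefly reach ~10; with the serial behavior described above
# it never rises past 1. The [s] trick keeps grep from matching itself.
ps ax -o pid=,command= | grep '[s]sh' | wc -l
```

Repeating that count every second or so (e.g. under `watch`) during the run makes the serial-vs-parallel behavior obvious.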
Posting on an old thread, since I just hit this same issue with ansible 1.9.2.
In the end, I was able to solve it by removing my persisted ssh control sockets like this:
rm ~/.ansible/cp/*
I have no idea why exactly this solved the problem, but if you encounter the same issue, you can try it.
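For reference, the files removed above are OpenSSH ControlMaster/ControlPersist sockets, which Ansible's ssh connection plugin uses to multiplex sessions; a stale master socket can apparently wedge new connections. Their location is governed by the `control_path` setting in ansible.cfg. The value below is, to my knowledge, the 1.9-era default (shown only as an assumption for illustration; the doubled `%%` escapes a literal `%` for ssh's ControlPath expansion):

```ini
# ansible.cfg
[ssh_connection]
control_path = %(directory)s/ansible-ssh-%%h-%%p-%%r
```

So if `rm ~/.ansible/cp/*` helps, it is presumably because it forces fresh control masters to be established on the next run.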