I submitted this PR yesterday:
https://github.com/ansible/ansible/pull/12932
It seems like a no-brainer to me, but opinions differ and jimi-c
suggested I bring it to the list for discussion. First, let me
describe my motivation for this change:
I am using ansible to (a) start servers in OpenStack and then (b)
configure those servers. In order for (b) to happen, I need to wait
for cloud-init on the servers to complete the initial system
provisioning process, which depending on the environment may happen in
seconds or may take several minutes.
In this situation, the `wait_for` module simply isn't appropriate: ssh
will be available before cloud-init has configured the necessary ssh
keys.
I can of course simply set "retries" to "a very large number":
    - command: >
        ssh -o BatchMode=yes
        centos@{{myserver.server.public_v4}} true
      register: result
      until: result|success
      retries: 300
But I really hate arbitrary constants. I'd much rather be able to do
something like:
    - command: >
        ssh -o BatchMode=yes
        centos@{{myserver.server.public_v4}} true
      register: result
      until: result|success
      retries: -1
...and have ansible loop indefinitely until the task completes.
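To make the semantics I'm proposing concrete, here is a rough Python sketch. It is not the code from the PR and not Ansible's actual task executor; the function name `run_until` and its parameters are made up purely for illustration. The only point is that a negative `retries` removes the upper bound on attempts, while a non-negative value behaves as a cap.

```python
import time


def run_until(task, succeeded, retries, delay=0.0, sleep=time.sleep):
    """Call task() repeatedly until succeeded(result) is true.

    Hypothetical illustration only, not Ansible's implementation.
    retries >= 0: cap on the number of attempts (at least one is always made).
    retries < 0:  no cap, keep trying indefinitely.
    Returns a (last_result, attempts) tuple.
    """
    attempts = 0
    while True:
        result = task()
        attempts += 1
        if succeeded(result):
            return result, attempts
        # Only a non-negative retries value can end the loop on failure.
        if retries >= 0 and attempts >= retries:
            return result, attempts
        sleep(delay)
```

With `retries=-1` the loop ends only when the condition is met, which is the behaviour I want when I don't know how long cloud-init will take.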
jimi-c suggests that:
> I believe this would be confusing and potentially bad for users as they
> could have a play run indefinitely.
...but I'm writing these playbooks for myself (which I imagine is true
for many people), and in general I'm in favor of not introducing
artificial restrictions on the way in which a utility operates
(because someone else may want to use the tool in a way that I hadn't
even imagined).
Anyway, that's my spiel. Let me know if you think I'm a crazy person.
PS:
What I *really* wanted to do in this case was:
    - ping:
      register: result
      until: result|success
Because the question I am trying to answer is effectively, "is the
remote host configured such that ansible will run successfully", and
a successful result from the above task would authoritatively answer
that question. Unfortunately, the above will result in connection
errors before the remote ssh is configured appropriately, and
connection errors will always trigger a task failure regardless of
until: or ignore_errors:.