When I increase the connection timeout with Ansible’s --timeout parameter and try to reach an offline host, beyond a certain timeout value I get a “[Errno 110] Connection timed out” message instead of the usual “timed out” message. This seems to be generated by something like urllib2.urlopen() and occurs if the given timeout is larger than the current timeout of the urlopen() method (or whatever call raises “[Errno 110] Connection timed out”).
As a result, timeout values larger than, say, 20 seconds do not work as expected. Is this intentional? Is there a way to change that second timeout value via Ansible settings? I suppose I could find some system setting to control this, but it seems more intuitive to me to just set the Ansible timeout and have Ansible do the rest.
The use case is the following: push Ansible configurations to a number of hosts which might be offline when Ansible starts but will come online sometime during the day. I would need really long timeouts for this, if this is even possible; I don’t know the SSH details regarding timeouts.
As an alternative, would it be possible for Ansible to first start pinging hosts in the inventory and then, if a ping is successful, to start pushing to the host in question? I assume there is no way to easily modify Ansible’s behaviour in this respect without digging deep into the codebase?
Cheers
elektrokokke
If you are trying to push to hosts that might be offline, I'd possibly
suggest either
(A) delegate_to/local_action as a precursor, calling fence agents to
wake the server up and then waiting for the port to come online, and
then waiting for SSH to become accessible using the wait_for module.
(B) looking into using ansible-pull, so hosts remediate on their own
Without completely understanding your use case (more info welcome),
have you looked into something like that?
We are considering using Ansible to configure a lot of hosts that usually come online during the night, but we can’t be sure when exactly. The prerequisites for Ansible push are fulfilled (i.e., Python 2.6). ansible-pull would be a possibility, but we’d rather have it work via push mode.
The scenario looks like this: start ansible in push mode on a lot of hosts, some of which might be online at the moment, and others to come online in a few hours. We don’t really know in advance which hosts will be online at which time, so timing the ansible push by groups of hosts is not possible.
I saw the wait_for module, but didn’t consider the possibility of using it locally. I’ll have to look into the local_action directive, that looks promising, thx! Does the wait_for module continue to poll the given host/port if the host cannot be resolved? Then it would be possible to just call wait_for locally with the target host on port 22, which will finish when the host comes online, wouldn’t it?
Btw.: changing the host resolution timeout turns out not to be feasible, as it seems to be capped silently at like 30 seconds on our system.
Cheers,
elektrokokke
Yay, it seems to work with a simple wait_for action combined with the local_action directive. Is there a cap on the timeout argument of the wait_for module? I set it to 86400 seconds (24 hours) and it didn’t argue, but I didn’t test if it would really wait that long.
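For reference, a minimal sketch of such a task might look like the following. It is written in current Ansible YAML syntax, which is newer than this thread; the play layout and the task name are illustrative assumptions, while wait_for, its host/port/timeout arguments and delegate_to are the features discussed here.

```yaml
# Sketch only: play layout and task name are assumptions for illustration.
- hosts: all
  gather_facts: false          # skip fact gathering, which would need SSH before the wait
  tasks:
    - name: Wait until the target answers on port 22
      wait_for:
        host: "{{ inventory_hostname }}"
        port: 22
        timeout: 86400         # 24 hours, the value tried above
      delegate_to: localhost   # run the check on the control machine, not on the target
```

With the default play strategy, hosts that are already online wait at this task until the remaining hosts answer or time out, so follow-up tasks only start once the wait has finished for every host.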
Another question: the delay argument of the wait_for module, is that the polling frequency or is it just a delay before even starting to poll? The documentation suggests the latter, but then what is the polling frequency?
And yet another question: is there a variable similar to the inventory_hostname variable for the port as given in the inventory file? I.e., if there is a host which is accessible via ssh on a port other than 22, I can specify that in the inventory, but I didn’t find anything about accessing this information through a variable like inventory_hostname…
Cheers,
elektrokokke
> Yay, it seems to work with a simple wait_for action combined with the
> local_action directive. Is there a cap on the timeout argument of the
> wait_for module? I set it to 86400 seconds (24 hours) and it didn't argue,
> but I didn't test if it would really wait that long.
If you are running wait_for on the local host with local_action, it
will not time out.
If you are running it over a connection, the SSH timeout applies.
> Another question: the delay argument of the wait_for module, is that the
> polling frequency or is it just a delay before even starting to poll? The
> documentation suggests the latter, but then what is the polling frequency?
It is the delay before even starting to poll.
The polling interval is 1 second, ow. We really should make that
configurable. Fine for local use on a machine, not so good otherwise.
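For readers on newer releases: the wait_for module later gained a sleep argument (Ansible 2.3 and later, per the module documentation) that sets the interval between checks. A hedged sketch of how delay and sleep could be combined, with illustrative values that are not from this thread:

```yaml
# Sketch only: the numeric values are assumptions chosen for illustration.
- name: Poll the SSH port every 30 seconds instead of every second
  wait_for:
    host: "{{ inventory_hostname }}"
    port: 22
    delay: 10        # seconds to wait before the first check
    sleep: 30        # seconds between checks (needs Ansible 2.3 or later)
    timeout: 86400   # give up after 24 hours
  delegate_to: localhost
```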
> And yet another question: is there a variable similar to the
> inventory_hostname variable for the port as given in the inventory file?
> I.e., if there is a host which is accessible via ssh on a port other than
> 22, I can specify that in the inventory, but I didn't find anything about
> accessing this information through a variable like inventory_hostname...
The connection plugins can't make variables available to templates right now.
I am OK with exposing it as $inventory_port when ansible_ssh_port is set,
provided it also works for fireball.
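In current Ansible releases the port can be read from the host's connection variable, so a sketch along these lines may cover the non-standard port case. It assumes the port is set in the inventory as ansible_port (the present-day name of ansible_ssh_port); the fallback to 22 is an illustrative choice.

```yaml
# Sketch only: assumes the inventory sets ansible_port for hosts on a non-standard port.
- name: Wait for SSH on the port given in the inventory
  wait_for:
    host: "{{ inventory_hostname }}"
    port: "{{ ansible_port | default(22) }}"   # use 22 when no port is set for the host
    timeout: 86400
  delegate_to: localhost
```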
BTW, it was suggested on the list that you just put ansible on cron and let
it fail to talk to the hosts that are down. It will configure the ones that
*are* up each time; as long as your playbooks are idempotent, this will be good.
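One way the cron suggestion could be set up is with Ansible's own cron module on the control machine. This is only a sketch: the 30-minute schedule, the job name and both file paths are hypothetical placeholders, not values from this thread.

```yaml
# Sketch only: schedule, entry name and paths are hypothetical placeholders.
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Re-run the push playbook every 30 minutes
      cron:
        name: "ansible push run"   # label used to identify the crontab entry
        minute: "*/30"
        job: "ansible-playbook -i /etc/ansible/hosts /etc/ansible/site.yml"
```

Each run then skips the unreachable hosts and configures the reachable ones, which is only safe if the playbook is idempotent as noted above.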