version 0.9: wait_for stuck waiting until timeout; it does not wake up after the host becomes accessible by ssh

Hi,

I have the following wait_for statement after rebooting the host, but it stays stuck waiting until the timeout, even though the host can be reached manually when I ssh to it. Is this a bug, or what could I be missing?

  - name: Wait for restart
    local_action: wait_for host="{{ inventory_hostname }}" port=22 delay=5 timeout=3600 state=started

Thank you.

  • j

Sorry, the version is 1.9.

Hi,
Try:

  - name: wait for SSH
    local_action: wait_for port=22 host="{{ inventory_hostname }}" search_regex=OpenSSH delay=30
    sudo: false

It will start searching for the regex after 30 seconds.
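For anyone wondering what that task boils down to, here is a rough Python sketch of the same idea: wait for the delay, then keep connecting to the port and look for the regex in whatever the server sends first. This is not Ansible's implementation. `wait_for_banner` and `connect_timeout` are made-up names, and counting the delay as part of the timeout is an assumption.

```python
import re
import socket
import time


def wait_for_banner(host, port, pattern, delay=0.0, timeout=300.0, connect_timeout=5.0):
    """Return True once data read from host:port matches pattern, else False.

    Hypothetical helper for illustration only, not the wait_for module's code.
    """
    regex = re.compile(pattern)
    # Assumption: the overall timeout is measured from the start, delay included.
    deadline = time.monotonic() + timeout
    time.sleep(delay)  # like delay=30: no connection attempts at all before this
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=connect_timeout) as conn:
                conn.settimeout(connect_timeout)
                # An SSH server sends its version banner first, e.g. "SSH-2.0-OpenSSH_...".
                banner = conn.recv(1024).decode("utf-8", errors="replace")
                if regex.search(banner):
                    return True
        except OSError:
            pass  # refused, unreachable or timed out: retry until the deadline
        time.sleep(1)
    return False
```

The connection is opened from whichever machine runs the code, which with local_action is the control machine.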

Unless there’s some other problem in the mix, this is enough to work:

`
local_action: wait_for state=started host="{{ inventory_hostname }}" port=22

`

Thanks Yariv and Dan, but none of your suggestions work; wait_for does seem able to detect the host up after reboot. I reckon it is a bug. Anyway, is there any workaround or alternative module to wait for a reboot?

Sorry, I meant that it does not seem able to detect the host coming back after the reboot. Is there any workaround or alternative module to wait for a reboot?

Can you post the output when running with -vvvv?

I’m having this same issue on 1.8.2. We have plans to upgrade to 1.9, but we’re not there yet and from reading this thread it sounds like that’s not going to fix my issue.

The details… I’m spinning up EC2 instances in a VPC. I have hosts in the private subnet that are accessible through a Bastion host. ssh from the command line responds immediately without errors. The ansible/ssh configs are set up correctly (all other playbooks are working as they should against these hosts in the private subnet).

My play looks like this:

`

# NOTE: This may fail because there's a bug where ansible does not accurately determine
# For this reason, we're making an educated guess that 360 seconds is long enough and
# ignore any errors from this step
  - name: Wait for SSH to come up
    local_action: wait_for host={{inventory_hostname}} port=22 delay=30 timeout=360 state=started
    when: (('tag_type_mgmt' in group_names) or ('tag_type_data' in group_names))
    ignore_errors: yes

`

The output looks like this:

`
TASK: [Wait for SSH to come up] ***********************************************
skipping: [172.98.2.226]
skipping: [52.4.9.253]
skipping: [172.98.2.242]
<127.0.0.1> REMOTE_MODULE wait_for host=172.98.1.95 port=22 delay=30 timeout=360 state=started
<127.0.0.1> REMOTE_MODULE wait_for host=172.98.1.156 port=22 delay=30 timeout=360 state=started
skipping: [172.98.2.167]
<127.0.0.1> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1429283315.41-248461052526032 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1429283315.41-248461052526032 && echo $HOME/.ansible/tmp/ansible-tmp-1429283315.41-248461052526032']
<127.0.0.1> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1429283315.41-33469490390301 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1429283315.41-33469490390301 && echo $HOME/.ansible/tmp/ansible-tmp-1429283315.41-33469490390301']
<127.0.0.1> PUT /var/folders/wq/0sb44mp50y78qnmkljp9cn91kw35ch/T/tmpohfvzT TO /Users/bbell/.ansible/tmp/ansible-tmp-1429283315.41-248461052526032/wait_for
<127.0.0.1> PUT /var/folders/wq/0sb44mp50y78qnmkljp9cn91kw35ch/T/tmpnXveWg TO /Users/bbell/.ansible/tmp/ansible-tmp-1429283315.41-33469490390301/wait_for
<127.0.0.1> EXEC ['/bin/sh', '-c', u'LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /Users/bbell/.ansible/tmp/ansible-tmp-1429283315.41-248461052526032/wait_for; rm -rf /Users/bbell/.ansible/tmp/ansible-tmp-1429283315.41-248461052526032/ >/dev/null 2>&1']
<127.0.0.1> EXEC ['/bin/sh', '-c', u'LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /Users/bbell/.ansible/tmp/ansible-tmp-1429283315.41-33469490390301/wait_for; rm -rf /Users/bbell/.ansible/tmp/ansible-tmp-1429283315.41-33469490390301/ >/dev/null 2>&1']
failed: [172.98.1.156 -> 127.0.0.1] => {"elapsed": 360, "failed": true}
msg: Timeout when waiting for 172.98.1.156:22
...ignoring
failed: [172.98.1.95 -> 127.0.0.1] => {"elapsed": 360, "failed": true}
msg: Timeout when waiting for 172.98.1.95:22
...ignoring

`

The whole time Ansible is sitting there waiting, I’m able to ssh to both of these servers without any issues… so the wait and/or poll is definitely broken.

Brenda
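One thing that may be worth ruling out with a bastion setup like this: the `<127.0.0.1>` lines in the output show the module running on the control machine, so the port check is presumably a plain TCP connection from that machine straight to 172.98.1.x:22. Unlike the ssh command, it would not pick up any proxy settings from the ssh config. A minimal sketch for testing direct reachability, assuming Python 3 on the control machine (`port_reachable` is a hypothetical helper, not part of Ansible):

```python
import socket


def port_reachable(host, port, timeout=5.0):
    """Return True if a direct TCP connection to host:port succeeds.

    Hypothetical diagnostic helper: it bypasses ssh and any ssh config
    entirely and just tries to open a socket.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False
```

If `port_reachable("172.98.1.95", 22)` returns False on the control machine while ssh still works, the timeout would point to routing through the bastion rather than to a polling bug in wait_for.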