skip_unreachable ?

Adam_E · January 16, 2019, 12:53am

Hi there.

My scenario is that I have hundreds of locations on various networks that I want to run some plays against. In almost all circumstances, 2-3% of these sites will be unreachable due to network-related issues. I don’t want to create exceptions/errors in AWX for sites that just happen to be offline while we are running the play; they’ll get picked up again next time.

The behaviour I would like is that when a host is deemed unreachable:

- any future tasks in the play against that host will not be run (no point as they will all fail as host unreachable).
- the playbook will continue to run all tasks against the other hosts that are reachable
- the playbook run will report success (return code = 0)

I was thinking of using the ignore_unreachable flag, which seems to mostly do what I want. However, based on the documentation and the comments in an issue I just created, it seems the ignore_unreachable flag is not aligned with my goals; the opposite, in fact.

Is there another flag/option that might give me something closer to what I am looking for?

system · January 16, 2019, 11:21pm

What you are asking for is the 'default' way Ansible operates: it
removes 'unreachable' hosts from the rest of the play and then
continues with the rest of the hosts.

- any future tasks in the play against that host will not be run (no point as they will all fail as host unreachable).

this is the default

- the playbook will continue to run all tasks against the other hosts that are reachable

also the default

- the playbook run will report success (return code = 0)

this is not the default, but you can have a `meta: clear_host_errors`
as your last task, though this might be too big of a hammer.

There are some cases in which the above is not true. For example,
when using serial, if all hosts in a 'serial batch' fail (unreachable
counts), then the whole play fails. You also have max_fail_percentage
to manage how many failures you tolerate.

Adam_E · January 17, 2019, 12:10am

Thanks for the reply, Brian. Yes, I think clear_host_errors will be too big of a hammer; I just want to ignore unreachable hosts.

I’ll have to figure something else out then. I have a couple of other scenarios in mind anyway, and I’m also going to look at something like https://github.com/openstack/ara to get better logging/reporting on playbook runs, so I can more easily find the failed hosts and rerun.

Ultimately I’m just looking for a nice clean way to monitor and re-run failed executions, and to easily distinguish connection failures (where you know that running it again once the connection is up will work) from actual failures. I don’t find AWX gives me a good enough view into this and am looking for a better overall strategy.

Adam_E · February 27, 2019, 10:41pm

Hi Brian, I was wondering if there is anything else you can suggest. I want to report a successful Ansible run when the only failures were unreachable hosts, so that it returns success back to AWX.

Is there a callback plugin or some local customization that I can write in the meantime and then contribute back to Ansible core?

Or perhaps some sort of preprocessor that runs through the inventory and removes unreachable/down hosts?
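The preprocessor idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not an Ansible feature: `probe` and `reachable_hosts` are hypothetical helper names, the probe assumes an `nc` that supports `-z` and `-w`, and it assumes each inventory name is itself a resolvable hostname or address listening for SSH on port 22.

```shell
#!/bin/sh
# Hypothetical probe: succeed if a TCP connection to the SSH port opens.
# Assumes an nc(1) that supports -z (scan only) and -w (timeout in seconds).
probe() {
    nc -z -w 5 "$1" "${2:-22}" >/dev/null 2>&1
}

# Print, one per line, only those hosts for which the probe succeeds.
reachable_hosts() {
    for host in "$@"; do
        if probe "$host"; then
            printf '%s\n' "$host"
        fi
    done
}
```

The output could then be joined with commas and passed to `--limit`, for example `ansible-playbook -i inventories.unreachable/ unreachable2.yml --limit "$(reachable_hosts host_online host_unreachable | paste -sd, -)"`. A host can still drop off between the probe and the play, so this reduces rather than eliminates unreachable results, and an empty result would need to be handled before calling ansible-playbook.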

I am interested in two things:

- Any type of quick workaround. I tried the suggestion you mentioned regarding `meta: clear_host_errors`, but this didn’t work; I posted the output below. Even if it did work, I would be concerned that it would clear errors other than unreachable.
- The correct long-term solution (an enhancement to Ansible) to this problem. If you agree this would be useful and can provide some guidance on a solution you would support, I can work on a PR to add it to core if it’s not too complex.

`
$ ansible-playbook -i inventories.unreachable/ unreachable2.yml ; echo "return code from run is: $?"

PLAY [unreachable test] ********************************

TASK [Gathering Facts] *******************************
ok: [host_online]
fatal: [host_unreachable]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 10.20.3.21 port 22: Connection timed out\r\n", "unreachable": true}

TASK [success1] *****************************
changed: [host_online]

TASK [success1] **************************
fatal: [host_unreachable]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host 10.20.3.21 port 22: Connection timed out\r\n", "unreachable": true}
to retry, use: --limit @/homenfs/aedwards/unreachable2.retry

PLAY RECAP ******************
host_online : ok=2 changed=1 unreachable=0 failed=0
host_unreachable : ok=0 changed=0 unreachable=2 failed=0

return code from run is: 4
`

playbook:

`
$ cat unreachable2.yml
`

Adam_E · February 28, 2019, 12:26am

I’m thinking about going down this road. If anyone has any better ideas, please share. It seems as though it will work fine; I’m just not sure if there is a better, simpler, or more appropriate solution.

