Clear host errors not working as expected - cannot retry task on unreachable host

Hi all,

I am running a playbook on a set of hosts that I expect to be offline. I want to check every 10 seconds whether they are online, then output one message once all of them are running, or a different message if they are still not all running after, for instance, 10 retries.

I am getting 'host unreachable', after which the unreachable host is removed from the play. I can continue to run other tasks on the hosts that were online (by using 'meta: clear_host_errors'), but I can't retry the task on the host that was not online.

Does anybody know how to keep retrying a task (for instance 10 times) until it has succeeded on all hosts? Is this possible?

Have you checked out wait_for?

And with "any_errors_fatal: true" you can at least stop the play if one or more hosts don't come online.
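
A minimal sketch of how the two suggestions might be combined. It assumes the check runs from the control node (delegate_to: localhost), that "online" means SSH answers on port 22, and that fact gathering is turned off so the play does not try to connect to the offline hosts first. The port and the timeout of 100 seconds (roughly 10 checks, 10 seconds apart) are illustrative values, not something confirmed in this thread:

```yaml
- hosts: all
  gather_facts: false        # gathering facts would itself need a connection to the host
  any_errors_fatal: true     # stop the play for every host if one host never comes online
  tasks:
    - name: Wait until each host accepts connections (checked from the control node)
      wait_for:
        host: "{{ ansible_host | default(inventory_hostname) }}"
        port: 22             # assumption: SSH on the default port
        sleep: 10            # seconds between checks
        timeout: 100         # assumption: give up after about 10 checks
      delegate_to: localhost
```

Because the task is delegated, a host that stays offline shows up as a failed task rather than as an unreachable host.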

That works partially: the host is not being cleared, but the play keeps hanging on that unreachable host.

I managed to run ping -c 3 against all the hosts in combination with retries=10. I now only need to exclude certain host groups when I include all hostnames via: with_items: groups['all']
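
For readers following along, the approach described here might look roughly like the sketch below. The exact playbook was not posted, so this is an assumption: the play runs on localhost, the inventory names resolve from there, the 10-second delay comes from the original question, and ping_result is a made-up variable name:

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Ping every inventory host until it answers
      command: ping -c 3 {{ item }}
      register: ping_result            # hypothetical variable name
      until: ping_result.rc == 0       # ping exits with 0 once the host replies
      retries: 10
      delay: 10                        # seconds between attempts
      changed_when: false              # a ping never changes anything
      with_items: "{{ groups['all'] }}"
```

The until/retries pair is evaluated per item, so each host gets its own 10 attempts, and the task fails for any host that is still silent afterwards.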

with_items: groups['all:!excluded_group'] does not seem to work. Have you got any idea how to exclude here? :)

with_items: "{{ groups['all'] | difference(groups['excluded_group']) }}"
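
Some background on why this works: groups is a dictionary keyed by group name, so a host pattern such as 'all:!excluded_group' is not a valid key, whereas the difference filter subtracts one list of hosts from another. A small sketch to inspect the result (excluded_group is the placeholder name from the question). The second task uses the inventory_hostnames lookup, which does accept host patterns; it is offered as an alternative, not as something tested in this thread:

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Show the hosts left after subtracting the excluded group
      debug:
        msg: "{{ item }}"
      with_items: "{{ groups['all'] | difference(groups['excluded_group']) }}"

    - name: Alternative using a host pattern via the inventory_hostnames lookup
      debug:
        msg: "{{ item }}"
      with_items: "{{ query('inventory_hostnames', 'all:!excluded_group') }}"
```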

Great, everything is working as I want it to be, thank you for your help!