I am running a playbook against a set of hosts that I expect to be offline. I want to check every 10 seconds whether they are online, and then output one message once all of them are up, or a different message if they are still not all up after, say, 10 retries.
I am getting 'host unreachable', after which the unreachable host is removed from the play. I can continue to run other tasks on the hosts that were online (by using 'meta: clear_host_errors'), but I can't retry the task on the host that was not online.
Does anybody know how to keep retrying a task (for instance 10 times) until it has succeeded on all hosts? Is this possible?
That works partially: the host is not being cleared, but the play keeps hanging on that unreachable host.
I managed to run ping -c 3 against all the hosts in combination with retries=10. The only thing left is to exclude certain host groups, since I currently include all hostnames via: with_items: groups['all']
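For readers following along, the approach described above might look roughly like the sketch below. It is not the exact playbook from this thread. It assumes the play runs on the control machine (localhost), that every inventory hostname resolves from there, and that ICMP is allowed. The task name and the registered variable ping_result are made up for illustration.

```yaml
# Sketch only: runs on the control machine, so an offline target never
# becomes "unreachable" from Ansible's point of view.
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Ping every inventory host until it answers (illustrative name)
      command: ping -c 3 {{ item }}
      with_items: "{{ groups['all'] }}"
      register: ping_result          # hypothetical variable name
      until: ping_result.rc == 0     # retry while ping exits non-zero
      retries: 10                    # retry limit per host
      delay: 10                      # seconds to wait between attempts
      changed_when: false            # a ping never changes anything
```

Note that the items are processed one after another, so a host that stays down holds up the loop until its retries are used up, and the task then reports a failure for that item.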
with_items: groups['all:!excluded_group'] does not seem to work. Do you have any idea how to exclude groups here?
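A likely reason is that groups is a plain dictionary keyed by group name, so a host pattern such as all:!excluded_group is looked up as a literal key and finds nothing. Two possible ways around it are sketched below, assuming a group called excluded_group actually exists in the inventory: subtract the group's hosts with the difference filter, or let the inventory_hostnames lookup expand the pattern. The debug tasks are placeholders for the real ping task.

```yaml
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    # Option 1: set difference on the host lists.
    # Assumes groups['excluded_group'] is defined in the inventory.
    - name: Loop over all hosts except excluded_group (filter)
      debug:
        msg: "{{ item }}"
      with_items: "{{ groups['all'] | difference(groups['excluded_group']) }}"

    # Option 2: let Ansible expand the host pattern itself.
    - name: Loop over all hosts except excluded_group (pattern lookup)
      debug:
        msg: "{{ item }}"
      with_inventory_hostnames:
        - all:!excluded_group
```

Either with_items line or the with_inventory_hostnames block can replace the plain groups['all'] loop in the ping task.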