Batch size (serial) and playbook failing

Hi.
I have a list of hosts, some of which will always be unreachable or will otherwise fail the tasks in the playbook. I have no problem with that. What I cannot understand is why, if I add “serial: 4” to my playbook and the first four hosts fail, the playbook stops completely without even starting on the rest of the hosts.
If I remove the batch size restriction, the play runs OK: it reports failure for some hosts but works well for the rest. If I try to control the batch size so as not to overload my laptop, the whole play fails if the first batch fails.
This behaviour is not mentioned at https://docs.ansible.com/ansible/latest/user_guide/playbooks_strategies.html#setting-the-batch-size-with-serial
(or I’m completely blind ;-))
Any hints?

This is by design: if all hosts in a batch fail, the playbook is
failed. The logic is that in rolling releases it makes sense to stop
when a whole batch fails.
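
For illustration, here is a minimal sketch of the kind of play being discussed. The task and the use of `all` as the host pattern are assumptions; the original playbook was not posted.

```yaml
# Hypothetical play: hosts are processed in batches of 4.
# If every host in a batch fails or is unreachable, the play
# stops and the remaining batches are never started.
- hosts: all
  serial: 4
  gather_facts: false
  tasks:
    - name: Check that the host responds (placeholder task)
      ansible.builtin.ping:
```

Without the `serial` line, all hosts are handled as a single batch, so a few failing hosts do not stop the rest.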

Hmm. It would be great if the docs said so ;-)
But how can I work around it? I could try using a random order to reduce the probability of the same hosts failing every time, but is there any other way?

On Fri, 18 Jun 2021 at 16:56, Brian Coca <bcoca@redhat.com> wrote:

You could create a group of the hosts that you are able to connect to and exclude the rest. There are also certain notations you can use in the hosts and limit patterns to exclude the hosts you cannot connect to.

https://docs.ansible.com/ansible/latest/user_guide/intro_patterns.html
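
As a rough sketch of the exclusion notation described on that page, assuming the hosts that always fail have been collected into an inventory group (the group name `known_bad` and the file name `site.yml` are made up for this example):

```yaml
# Hypothetical play: target every host except those in the
# (assumed) inventory group "known_bad". The "!" in a host
# pattern excludes a group or host.
- hosts: "all:!known_bad"
  serial: 4
  gather_facts: false
  tasks:
    - name: Check that the host responds (placeholder task)
      ansible.builtin.ping:

# The same exclusion can be applied at run time instead:
#   ansible-playbook site.yml --limit 'all:!known_bad'
```

With the known-bad hosts excluded, a batch of four is less likely to consist entirely of failing hosts, although any batch that does fail completely will still stop the play.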