I have several clusters, each containing several nodes, and I want to run configuration jobs on all cluster nodes. Each configuration job causes a short downtime on the node it runs on, so I have to make sure that a minimum number of nodes in each cluster stays running at all times. What is the best practice to achieve this?
I found an option in Ansible that limits the total number of simultaneous connections. But this does not help, because I need the limit to apply per cluster, not across all hosts.
Another way is to group the nodes by their position within each cluster: one group with the first member of every cluster, a second group with the second member of every cluster, and so on. The configuration role could then be applied to one group at a time. But this is still not exactly what I am looking for, because it serializes the cluster nodes more than necessary. If I have 10 nodes per cluster and want to keep at least 50% of the nodes in each cluster running, Ansible could process 5 nodes per cluster in parallel.
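To make the requirement concrete, here is a minimal Python sketch of the batching I have in mind. It is not Ansible code and not a solution to the question. The function name `plan_waves`, the cluster names, and the node names are all hypothetical and only illustrate the arithmetic: for each cluster, compute how many nodes may be down at once, then merge the per-cluster batches into "waves" that could run in parallel.

```python
def plan_waves(clusters, min_running_percent):
    """Split every cluster into batches and merge them into parallel waves.

    clusters: dict mapping a (hypothetical) cluster name to its list of nodes.
    min_running_percent: integer share of each cluster that must stay up.
    Returns a list of waves; every node in one wave may be down at the same time.
    """
    waves = []
    for nodes in clusters.values():
        # Nodes that must stay up, rounded up (integer ceiling division).
        min_running = -(-len(nodes) * min_running_percent // 100)
        # Largest number of nodes that may be down at once in this cluster.
        max_down = len(nodes) - min_running
        if max_down < 1:
            raise ValueError("cluster is too small to take any node down")
        # Cut the cluster into batches of at most max_down nodes.
        batches = [nodes[i:i + max_down] for i in range(0, len(nodes), max_down)]
        # Batch k of every cluster goes into wave k.
        for index, batch in enumerate(batches):
            if index == len(waves):
                waves.append([])
            waves[index].extend(batch)
    return waves


# Example data (assumed): one cluster of 10 nodes, one cluster of 4 nodes.
example_clusters = {
    "cluster_a": [f"a{n}" for n in range(1, 11)],
    "cluster_b": [f"b{n}" for n in range(1, 5)],
}
example_waves = plan_waves(example_clusters, 50)
```

With 10 nodes and a 50% minimum, the first cluster is handled in two batches of 5, which is the degree of parallelism I would like Ansible to reach per cluster.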
What is the best practice for Ansible to handle cluster nodes?