Hi,
I currently use kubespray [1] (a set of Ansible playbooks for Kubernetes
installation and upgrades), which performs a "rolling upgrade" using the
"serial" keyword as one of its steps.
The problem I encounter is that it's not a true "rolling upgrade", but
batch processing: 20% of the hosts are upgraded at once, and **every host
in a batch waits for all the others in the batch to finish**.
This causes quite a bit of inefficiency, because the time hosts take to
upgrade frequently varies a lot. So hosts spend a lot of time idle,
waiting for the slowest host in their batch.
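To put a number on that waiting, here is a minimal sketch in Python. The
per-host durations are made up for illustration, `batch_makespan` is a
hypothetical helper (not anything from Ansible), and the batch size is
assumed to be 20% of the host count rounded down:

```python
def batch_makespan(durations, batch_size):
    """Return (wall time, idle host-time) when every batch waits for its slowest host."""
    total = 0
    idle = 0
    for i in range(0, len(durations), batch_size):
        batch = durations[i:i + batch_size]
        slowest = max(batch)
        total += slowest                         # the batch lasts as long as its slowest host
        idle += sum(slowest - d for d in batch)  # time the faster hosts spend waiting
    return total, idle


# Hypothetical per-host upgrade times, in minutes.
durations = [2, 9, 3, 8, 1, 7, 2, 10, 3, 5]
batch_size = max(1, len(durations) * 20 // 100)  # 20% of 10 hosts = 2

total, idle = batch_makespan(durations, batch_size)
print(f"wall time: {total} min, idle host-time: {idle} min")
```

With these made-up numbers the run takes 39 minutes, and the hosts
collectively spend 28 minutes waiting for their batch to finish.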
In that context, the upgrade I'm concerned about is a single play, and
there are no inter-host dependencies within it.
I looked at the host_pinned strategy, but unfortunately it only works
within a batch.
What I'd like to have instead is a rolling upgrade: there is a limited
number of upgrade slots (20% of the total number of hosts in that
example), and each host performs the whole play as fast as possible,
without waiting for any other host. Once it finishes, it frees a slot
for another host.
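The slot behavior described above can be sketched as a small simulation.
This is only an illustration of the scheduling idea, not the code from
the PR: `rolling_makespan` is a hypothetical helper, and the durations
are made-up numbers.

```python
import heapq


def rolling_makespan(durations, slots):
    """Wall time when each host starts as soon as any of `slots` slots is free."""
    free_at = [0] * slots  # min-heap of the times at which each slot becomes free
    heapq.heapify(free_at)
    finish = 0
    for d in durations:
        start = heapq.heappop(free_at)  # take the slot that frees up first
        end = start + d
        heapq.heappush(free_at, end)    # the slot is busy until this host finishes
        finish = max(finish, end)
    return finish


# Hypothetical per-host upgrade times, in minutes; 2 slots = 20% of 10 hosts.
durations = [2, 9, 3, 8, 1, 7, 2, 10, 3, 5]
print(rolling_makespan(durations, slots=2))
```

With these numbers the rolling schedule finishes in 25 minutes. The same
hosts processed in fixed batches of two, in the same order, would take
9 + 8 + 7 + 10 + 5 = 39 minutes, because each batch lasts as long as its
slowest host.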
I posted an issue [2] (and a related PR proposing a possible design)
with more details and examples. It got auto-closed. While I implemented
a new strategy plugin as a draft PR, IMO this should not live in an
external plugin, since it requires parametrizing the "free" strategy. (I
could copy it entirely into an external plugin, but that seems like a
waste.)
I'm apparently not the first to expect `host_pinned` to work in this
fashion [3].
Is there interest in supporting this use case inside ansible-core? Do
you consider the current behavior of "host_pinned" with regard to
"serial" intended (ditto for "throttle", see the issue)?
Should I just go ahead and create an external strategy plugin for this
use case anyway?
Thanks,
[1]: https://github.com/kubernetes-sigs/kubespray/issues
[2]: https://github.com/ansible/ansible/issues/81736
[3]: https://groups.google.com/g/ansible-devel/c/3AvfyEh2jIU/m/SydjZeCrBAAJ