```yaml
- name: Rolling restart when config files change
  ansible.builtin.include_tasks: change_config.yml
  when:
    - rke2_cluster_group_name in groups
    - rke2_restart_needed
  vars:
    _host_item: '{{ inventory_hostname }}'
```
This does not appear to be correct and seems to miss the limitation of roles, which can’t define plays, only tasks. The limitation is discussed in:
ansible/ansible#12170#issuecomment-423385055
ansible/ansible#42528
(I can only post two links, so these are not links, sorry)
I made a quick demonstration and tested the suggested alternative here:
This may seem like a small issue, but it means this role is essentially unusable on large clusters - which is why I’m trying to build a collection (making use of plays with parallel, batched, graceful changes) with the same user experience as the role.
This is really a bad way, as you are doing the same comparison as `_host_item == inventory_hostname`.
And the same goes for the loop, which will run once for each host in the group, when in the end you only want the current host if it is in the group. I made a mistake in my correction in the issue, though; the conditional should be:

```yaml
- rke2_cluster_group_name in group_names
```
In neither case did that code restrict execution to one host, only to 'the current host, if it's part of this group', just in a very awkward way. The only way it runs on a single host is if the group ONLY has one host.
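Stripped of the indirection, the construct reduces to something like this (same names as the original task, no extra var needed):

```yaml
# Equivalent form: run the include only when the current host is a member
# of the named group and a restart is needed; _host_item adds nothing,
# since it always equals inventory_hostname.
- name: Rolling restart when config files change
  ansible.builtin.include_tasks: change_config.yml
  when:
    - rke2_cluster_group_name in group_names
    - rke2_restart_needed
```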
I changed to this condition in my test case, but the results are the same: it executes tasks on all hosts in parallel, which is what that hack seeks to avoid. It's emulating `serial: 1` in a role.
This could likely use `throttle: 1`, but that's not actually solving the problem.
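As a sketch of what I mean (the service name is illustrative, not taken from the role):

```yaml
# throttle: 1 caps this single task at one host at a time, so the restart
# becomes sequential, but it neither orders the hosts nor batches the rest
# of the play the way serial does at the play level.
- name: Restart RKE2 server service
  ansible.builtin.service:
    name: rke2-server  # illustrative; the role's actual unit name may differ
    state: restarted
  throttle: 1
```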
These things do need to run with some parallelism to ever finish on a large cluster, but in a specific, configurable order (over an arbitrary number of pools / child groups). Maybe I could hack something together with loops, `throttle`, and `include_tasks`, but a play seems generally better suited to the problem.
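A minimal sketch of the play shape I have in mind, with hypothetical group names and batch sizes:

```yaml
# serial batches the hosts (one canary, then 25% at a time), order keeps
# the sequence deterministic, and max_fail_percentage aborts the rollout
# if any host in a batch fails.
- name: Graceful rolling config change
  hosts: rke2_servers  # hypothetical group
  serial:
    - 1
    - "25%"
  order: inventory
  max_fail_percentage: 0
  tasks:
    - name: Apply configuration changes
      ansible.builtin.include_tasks: change_config.yml
```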
Ok, that was very hard to infer just from the task itself, but in any case this seems more an issue pertinent to a strategy plugin, which handles when/how tasks are executed, than one about having another source of vars.
> that was very hard to infer just from the task itself
Which is why I provided examples and linked relevant issues.
Vars are how users typically configure role behavior. When switching to a playbook, the global vars that define a play can come from significantly fewer sources than other vars.
Extra vars aren't the end of the world, but they don't compose or provide self-documenting defaults - the defaults end up baked into plugin source code or into each play declaration.
The plugin I've written to control this rollout could be reduced to a few templated global vars - if a collection could ship default global vars.
> strategy plugin
This doesn't really change the situation - it's another way I could accomplish it, but just providing the desired play properties works well enough. It's the user-provided configuration with sane defaults that I'm concerned with.
I understand this may just not fit with your vision or be a significant priority; that's fine.
Even with the linked resource it was not clear that the intent was to restrict execution that way, especially with the task linked, since it is not really doing what you state; other tasks that try to execute ONLY on the first host in a group seem closer to the stated intent.
There is already a way to provide defaults via `vars_prompt` (which will not prompt in batch execution OR when extra vars are already provided). Another way is just adding a role with `defaults/`, which are the lowest priority. You can even use a role argument spec to document/require certain arguments.
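For example (variable and role names here are only illustrative):

```yaml
# vars_prompt supplies a documented default; per the above, it will not
# prompt in batch execution or when the var is already set via extra vars.
# Importing a role whose defaults/main.yml defines the same variables
# gives a lowest-priority fallback instead.
- hosts: all
  vars_prompt:
    - name: rke2_channel  # illustrative variable
      prompt: RKE2 release channel
      default: stable
      private: false
  roles:
    - role: my_ns.my_collection.defaults  # illustrative role shipping defaults/
```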
Yet vars do not seem to be the main issue here; rather, it's task execution parallelism and host selection. I think you are trying to solve 'the issue with the workaround' vs the actual original issue with the new 'global vars plugin'.
> other tasks that try to execute ONLY on the first host in a group seem closer to the stated intent
That is not what happens. Some operations are delegated to the first control plane instance, but not all of them. Specifically, `- name: Restart RKE2 service on {{ rke2_node_name }}`.
This is also just one example, and the least complex; there are multiple graceful rolling operations required - and I'm not just working on this particular role.
> Yet vars do not seem to be the main issue here
I guess, but I have the tools I need to solve the issue in a playbook - I already have it working. I could file an issue about roles shipping their own strategy plugins or similar, but just switching to a play seems pretty ideal to me - roles are limiting and collections aren’t much harder to build.
Role variables by default were imported into the play scope (in newer versions of core some options can change/disable this), but the availability depends on how you add the role: since includes are done at runtime, they won't work, but importing a role does.
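A minimal sketch of the difference, with illustrative names:

```yaml
- hosts: all
  tasks:
    # Static import: the role's defaults/ and vars/ are exported to the
    # play scope, so later tasks can read them.
    - ansible.builtin.import_role:
        name: my_ns.my_collection.defaults  # illustrative role

    - ansible.builtin.debug:
        var: rke2_channel  # illustrative var defined in the role's defaults/

    # Dynamic include: the role runs, but by default its vars are not
    # exported to the rest of the play (see include_role's public option).
    # - ansible.builtin.include_role:
    #     name: my_ns.my_collection.defaults
```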