Aha. Then I think this is the intended behavior of the community.general.merge_variables plugin: it does not deduplicate merged lists. @utoddl mentioned that his mergevars plugin can do that, so you can rely on that; see its dedup option.
My advice would be to organize your vars in such a way that merging does not result in duplicated list elements. It should be possible, just think of what is common for all hosts and what is specific.
Merging dicts is tricky. I prefer flat variables. More control over (de)duplication that way.
But the above does not work. I have a syntax error somewhere and can’t figure out the proper construction that could then be used as input for ansible.builtin.combine.
Now host polyglot is in both groups alpha and beta, so it ends up with [1,2,3, 6,8,10, 7,8,9]. Somewhere you have to deal with the doubled 8: in the merge plugin, in the Jinja that calls it, or in the task(s) that consume the list. None of those is inherently wrong, but that complexity must exist somewhere. I’d suggest the most right/least wrong answer is: “maintainer’s choice”.
And just for argument’s sake, I can imagine a situation where the doubled 8 should be preserved, so no blanket solutions (“merge plugin always dedupes”) allowed!
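To make the trade-off concrete, here is a minimal Python sketch of the polyglot case. It is only a model, not how any Ansible plugin is implemented; the variable names (`common`, `alpha`, `beta`) and the split of values between them are assumptions chosen to reproduce the list above.

```python
# Hypothetical per-group lists; the values are assumed so that the merged
# result matches the example in the post above.
common = [1, 2, 3]
alpha = [6, 8, 10]
beta = [7, 8, 9]


def merge_lists(*lists, dedup=False):
    """Concatenate lists in order; optionally drop repeated items."""
    merged = [item for lst in lists for item in lst]
    if not dedup:
        return merged
    # dict.fromkeys keeps the first occurrence of each item, in order.
    return list(dict.fromkeys(merged))


print(merge_lists(common, alpha, beta))              # doubled 8 preserved
print(merge_lists(common, alpha, beta, dedup=True))  # second 8 dropped
```

The `dedup` flag is the “maintainer’s choice” in miniature: the same data supports both outcomes, and somebody has to pick one.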
For what seems like such a simple problem, there sure are lots of ways to make it messy.
Good question. It’s part of why I selected that as an example. The merge order for mergevars is defined such that explicitly named variables and in-line values are merged in the order listed. After that, variables matching the optional regex are sorted by name excluding any already merged because they were listed explicitly, and then the remaining matching variables are merged. This allows you to ensure you start with, for example, role defaults which may be augmented or replaced by other data in the merge process.
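The ordering rule described above can be sketched in a few lines of Python. This is a simplified model, not the plugin’s code: the helper name `merge_order`, the sample variable names, and the use of `re.search` for matching are all assumptions made for illustration.

```python
import re


def merge_order(explicit_names, all_var_names, pattern=None):
    """Return variable names in the order they would be merged (model only).

    Explicitly listed names come first, in the order given. Names matching
    the optional regex follow, sorted by name, skipping any already listed.
    """
    order = list(explicit_names)
    if pattern is not None:
        regex = re.compile(pattern)
        matched = sorted(
            name
            for name in all_var_names
            if regex.search(name) and name not in order
        )
        order.extend(matched)
    return order


# Hypothetical variable names: role defaults are listed explicitly so they
# are merged first and can be augmented or replaced by later data.
names = ["zeta__pkgs", "alpha__pkgs", "role_defaults__pkgs", "unrelated"]
print(merge_order(["role_defaults__pkgs"], names, pattern=r"__pkgs$"))
```

Here `role_defaults__pkgs` also matches the regex, but it is not merged a second time because it was already listed explicitly.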
Note: this doesn’t address de-duplication inside merged structures, etc. (so functionality still lags a little behind hash_behaviour=merge :) ), but it’s already acceptable, and a non-standard hash_behaviour is not needed.
Also, how would one compare plugins from ansible.builtin and community.general in terms of performance, reliability, etc. (when there is a choice)?
(Here I could have used either community.general.merge_variables or ansible.builtin.vars in conjunction with ansible.builtin.combine. Which would be preferred? I don’t know; I just tend to lean toward ansible.builtin when I can.)
With regard to de-duplication, ansible.builtin.combine has an optional list_merge parameter where two of its allowed values remove duplicates created by the merge.
Behavior when encountering list elements.
Choices:
- “append”: append newer entries to the older ones
- “append_rp”: append newer entries to the older ones, overwrite duplicates
- “keep”: discard newer entries
- “prepend”: insert newer entries in front of the older ones
- “prepend_rp”: insert newer entries in front of the older ones, discard duplicates
- “replace” (default): overwrite older entries with newer ones
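For anyone who finds the option descriptions terse, here is a small Python model of what each choice does to a single pair of lists. It is a sketch based on the descriptions quoted above, not Ansible’s own code, and the function name `merge_list` is made up for illustration. Note that the `_rp` modes here only drop older entries that reappear in the newer list; repeats within one list are left alone.

```python
def merge_list(older, newer, list_merge="replace"):
    """Model of the list_merge choices for one pair of lists."""
    if list_merge == "replace":
        return list(newer)
    if list_merge == "keep":
        return list(older)
    if list_merge == "append":
        return older + newer
    if list_merge == "prepend":
        return newer + older
    # The *_rp modes drop older entries that also occur in the newer list.
    older_unique = [item for item in older if item not in newer]
    if list_merge == "append_rp":
        return older_unique + newer
    if list_merge == "prepend_rp":
        return newer + older_unique
    raise ValueError(f"unknown list_merge mode: {list_merge}")


older, newer = ["foo", "bar"], ["baz", "foo"]
for mode in ("append", "append_rp", "keep", "prepend", "prepend_rp", "replace"):
    print(mode, merge_list(older, newer, mode))
```

With these inputs, `append` keeps both copies of `foo`, while `append_rp` yields `bar, baz, foo`: the older `foo` is dropped and the newer one keeps its position.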
Okay, true enough, it is possible. I should amend my list:
to include “or modifying hosts/host groups.”
But in this admittedly contrived example, creating a host group to avoid deduping a list is going to raise some maintainer’s eyebrows some day.
You’re not wrong. The point remains though: the complexity will exist somewhere. I stick by the statement, “the most right/least wrong answer is: maintainer’s choice.”
Yes … Now you better understand how things work and you finally came to the exact code I’m using
There is probably no comprehensive benchmark, so there is no way to tell. Using community.general.merge_variables should be faster than chaining ansible.builtin.vars, ansible.builtin.varnames and ansible.builtin.combine, simply because you call one plugin instead of three. On the other hand, the performance difference should be negligible compared to the general “slowness” of Ansible itself: SSH-ing to the host, uploading module code, running it, etc.
In terms of code quality, the community.general collection should be on par with Ansible itself, because the same contribution guidelines and requirements are in place. Most of the code in the community.general collection was part of the Ansible package back in the days when collections didn’t exist.
I have introduced many duplicates for testing at various places between groups and hosts vars, and behavior is there! I see, I can control everything here.
Well, I don’t have anything to add to this discussion anymore, but it was/is a very interesting read and I am definitely stealing that for my cookbook of tips 'n tricks.
So, I’ve made a sample playbook with your version (and, for good measure also c.g.merge_variables), which will (hopefully) clearly illustrate when you should use which (assuming maintainability and readability is a factor)
---
- name: Example of merging dicts
  hosts: localhost
  gather_facts: false  # Speed up the example
  vars:  # Note that these could be defined anywhere in inventory
    group1_settings__to_merge:
      some_setting: "foo"
      some_list:
        - "foo"
        - "bar"
      another_setting: "bar"
    group2_settings__to_merge:
      some_setting: "foo"
      some_list:
        - "baz"
        - "foo"
      yet_another_setting: "baz"
    group3_settings__to_merge:
      some_setting: "ever"
      some_list:
        - "quxz"
        - "3242"
      a_totally_different_setting: "qux"
    group4_some_other_var:  # This won't get merged
      what: ever
  tasks:
    - name: 'varnames | combine -> Show result, deduped last wins'
      ansible.builtin.debug:
        msg: "{{ lookup('ansible.builtin.vars', *query('ansible.builtin.varnames', 'settings__to_merge$')) | ansible.builtin.combine(list_merge='append_rp', recursive=true) }}"
    - name: 'c.g.merge_variables -> Show result as is, must set override to ignore'
      ansible.builtin.debug:
        msg: "{{ lookup('community.general.merge_variables', 'settings__to_merge', pattern_type='suffix', override='ignore') }}"
It resolves to this:
PLAY [Example of merging dicts] *********************************************************************************

TASK [varnames | combine -> Show result, deduped last wins] *****************************************************
ok: [localhost] =>
  msg:
    a_totally_different_setting: qux
    another_setting: bar
    some_list:
    - bar
    - baz
    - foo
    - quxz
    - '3242'
    some_setting: ever
    yet_another_setting: baz

TASK [c.g.merge_variables -> Show result as is, must set override to ignore] ************************************
ok: [localhost] =>
  msg:
    a_totally_different_setting: qux
    another_setting: bar
    some_list:
    - foo
    - bar
    - baz
    - foo
    - quxz
    - '3242'
    some_setting: ever
    yet_another_setting: baz
So unless you want/need to have a deduped list of some form, it doesn’t matter. The last addition to the merged list wins (see some_setting), but some_list is the only thing that really differs.
Maybe (I don’t think my skills are up to snuff for this) it would be an option for c.g.merge_variables to have an additional override mode, borrowing a few (I’d say only append_rp) from a.b.combine. Because it is aware of the fact there is a list in there:
TASK [c.g.merge_variables -> Show result] ***********************************************************************
fatal: [localhost]: FAILED! =>
  msg: 'An unhandled exception occurred while running the lookup plugin ''community.general.merge_variables''. Error was a <class ''ansible.errors.AnsibleError''>, original message: The key ''some_setting'' with value ''foo'' will be overwritten with value ''ever'' from ''group3_settings__to_merge.some_setting''. The key ''some_setting'' with value ''foo'' will be overwritten with value ''ever'' from ''group3_settings__to_merge.some_setting'''