Cannot change a variable passed to a role from within a role - request for docs

ansible [core 2.16.3]

Today, from within a role, I tried to use set_fact to change a variable that had been passed to the role from a playbook using vars:.

Using -vv execution, the set_fact task to change the variable shows the correct value that I want set, but the value is not changed and changed: false is reported.

After some experimentation I think I understand this behavior: It seems that the calling playbook can set and modify the variable, but if you call a role and pass this variable using vars: then the role itself can only reference the variable but not change it. Assuming this is the intended behavior, we now accept it and appreciate it as a safe pattern.

If the role needs to manipulate the passed data, you just use set_fact to copy the passed value into a new variable and manipulate that variable from within the role. This way the modified variable is scoped within the role and doesn’t impact the playbook’s use of the original variable after the role is done executing. Right?
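A minimal sketch of that copy-then-modify approach, assuming the role receives my_role_config as in this thread. The name my_role_config_unique is hypothetical and only there for illustration:

```yaml
# roles/my_role/tasks/main.yml (sketch, not tested against a real role)
- name: Copy the passed-in list into a working variable, de-duplicated
  ansible.builtin.set_fact:
    # my_role_config_unique is a hypothetical name; the passed-in
    # my_role_config itself is left untouched.
    my_role_config_unique: "{{ my_role_config | unique }}"

- name: Work with the copy from here on
  ansible.builtin.debug:
    var: my_role_config_unique
```

One caveat: set_fact stores its result on the host, so the new name stays visible to later tasks for that host after the role finishes. Only the original my_role_config is protected from change.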

What I’m requesting is pointers to documentation so I can understand this authoritatively and set coding policy on more than just my current observations and interpretations. Thanks!

Simulated Playbook (code not tested):

- name: Set config value
  ansible.builtin.set_fact:
    my_role_config:
      - apple
      - banana
      - apple
      - blueberry
      - apple

- name: Include modular role
  ansible.builtin.include_role:
    name: my_role
    tasks_from: main.yml
  vars:
    my_role_config: "{{ my_role_config }}"

Role tasks/main.yml:

---
- name: Initial value of my_role_config
  ansible.builtin.debug:
    var: my_role_config

- name: Immediately change the incoming value
  ansible.builtin.set_fact:
    my_role_config: "{{ my_role_config | unique }}"

- name: Show my_role_config made unique
  ansible.builtin.debug:
    var: my_role_config

Simulated output looks something like this:

  • Task within role shows it would have set only unique list values
  • Debug statement immediately following shows the variable did not change at all

TASK [...: Initial value of my_role_config]

ok: [hostname.mycompany.com] =>
    ansible_facts:
        my_role_config:
        - apple
        - banana
        - apple
        - blueberry
        - apple

TASK [..: Immediately change the incoming value]

ok: [lnxdocstageftden01.pinnacol.com] =>
    ansible_facts:
        my_role_config:
        - apple
        - banana
        - blueberry
    changed: false

TASK [...: Show my_role_config made unique]

ok: [hostname.mycompany.com] =>
    ansible_facts:
        my_role_config:
        - apple
        - banana
        - apple
        - blueberry
        - apple

You cannot do that. Ansible has variable precedence rules; see “Using variables” in the Ansible Community Documentation.
Here is an important extract:

19. Registered vars and set_facts

20. Role (and include_role) params

This means that Ansible reads the variable’s value from the role include parameter first, if one is set there, and it does not matter whether it has also been set with set_fact.

This effectively does nothing: you set the variable my_role_config at level 19 (see above), but Ansible reads the value set at level 20 (see above), which is unchanged because it is set here (see ** - I’ve highlighted the important line):

My general recommendation is to “never” use set_fact. Ansible is functional; set_fact is an imperative directive.

This is the main thing you should know about variables:

The other thing you need to know is that there are 2 scopes, ‘playbook objects’ and ‘host’. The latter is the most flexible, as you have additional access via hostvars[<inventory_hostname>].

Only?
What about role, block, task, run? I might have forgotten some scopes.

playbook objects includes play, role, block and task, with the precedence being contained > container

Note: updated to plural so it is not as ambiguous.
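To illustrate “contained > container” among playbook objects, here is a small sketch. The variable name greeting and its values are hypothetical; the ordering follows the documented precedence list, where task vars rank above block vars, which rank above play vars:

```yaml
# scope_demo.yml (hypothetical file name)
- hosts: localhost
  gather_facts: false
  vars:
    greeting: from play          # outermost container
  tasks:
    - name: Block that overrides the play-level value
      vars:
        greeting: from block     # contained in the play, so it wins over play vars
      block:
        - name: Task vars win over block vars
          ansible.builtin.debug:
            var: greeting        # should show "from task"
          vars:
            greeting: from task

        - name: Without task vars, the block value applies
          ansible.builtin.debug:
            var: greeting        # should show "from block"
```

None of these values are stored on the host; each one disappears once its play, block, or task is finished.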

Thanks for the reply. I’m ready to accept that this is variable precedence at work.

I’m familiar with it from working with Chef, but haven’t run into it much with Ansible so far. It’s probably the smooth sailing with Ansible that caused me to discount this as variable precedence and wonder about other things.

I should have reviewed the list and seen those two together, but I think hearing it from others helps me know I’m not just interpreting it to fit my circumstances.

19. Registered vars and set_facts

20. Role (and include_role) params

So my interpretation of this is:

  1. Playbook creates a variable at level 20 when it calls the role
  2. The role attempts to use set_fact to assign a new value to it but this would be at level 19.[1]
  3. When my next debug statement attempts to show the value, it ‘looks down from the top’ and sees the level 20 value so it uses that.

I have a vague recollection that in Chef there could actually be multiple variables defined in memory, but because of the levels you’d only see the highest precedence instance of that variable and its value.

[1] Whether it creates another variable+value in memory at level 19 or does something else, effectively it won’t matter as long as a variable at level 20 exists.

I’m hoping others will benefit from this example. I know we were looking at our roles as if they’re modular functions in a programming language and expecting to be able to pass data to them and manipulate that data from within the function/role. This is a good reminder that that thinking is ‘ok’, but Ansible, Chef and others have these layers of precedence to consider that classic programming languages do not.

Cheers,

Programming languages also have precedence, but it is mostly related to scope. Scope also exists in Ansible, but as an additional layer on top of the main precedence rules.

Also, role params are not the same as vars:; they are inline variables passed to a role, which is a different level of precedence.

As I hinted before, you might want to look into hostvars as that will keep the value of the host’s scope (what set_fact and register create), only overridden by extra vars.
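A hedged sketch of what that hostvars lookup could look like from inside the role, assuming the role has already run its set_fact on my_role_config. It has not been tested, and the exact result may vary between Ansible versions:

```yaml
# roles/my_role/tasks/main.yml (sketch), placed after the set_fact task
- name: Compare the winning value with the host-scoped value
  ansible.builtin.debug:
    msg:
      # What normal variable resolution returns inside the role:
      # the value passed in when the role was included.
      effective: "{{ my_role_config }}"
      # What is stored in the host's scope, i.e. what set_fact wrote.
      host_scoped: "{{ hostvars[inventory_hostname]['my_role_config'] }}"
```

If the two values differ, that is the precedence rule at work: the set_fact result exists on the host but is shadowed by the higher-precedence value for as long as the role is running.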

You got everything right.

Right! I have this mental picture with regard to Ansible precedence levels.

Ansible is a programming language. It is mostly functional, with some elements of imperative programming, and that is why it can be confusing sometimes.

We avoid calling it a programming language; it is not really that versatile. We look at it as an automation engine, something that lets you express an algorithm and/or desired state. It does meet the Turing definition, but IMHO those definitions are outdated at this point and don’t really meet modern expectations of what a programming language (or AI) really should be.

I agree, that might have been an overstatement.

I write programs as a means to achieve a goal, so by extension I called Ansible a programming language. But the point about it being functional (or declarative), as opposed to imperative, still holds in my view.

I’m quoting your initial “Simulated Playbook (code not tested)” here because I want to point out something that I think is relevant to your situation. The preceding discussions of variable precedence and playbook vs. host scopes are all well and good, but you need a strategy to deal with this information.

That wouldn’t pass code review in our shop. You’ve got two, maybe three separate things here depending on how you count them, all with the same name – a recipe for confusion.

Now that you’ve become aware of how scoping and precedence interact (to over-generalize, hoping that things will “just work” is insufficient), you need some guidelines for when data goes through various “scope transitions”. For our purposes a scope transition includes data from outside of a role being passed to or used inside of roles.

Taking the example above, the set_fact is fine in and of itself, but it’s happening outside of the my_role role, so its name should reflect that out-of-role origin, or at least it shouldn’t have a name that makes it look like a role variable. Call it something else, then be explicit about the scope transition when you do the include_role:

- name: Set the list of relevant fruits
  ansible.builtin.set_fact:
    fruits_of_my_loom:
      - apple
      - banana
      - apple
      - blueberry
      - apple

- name: Include modular role
  ansible.builtin.include_role:
    name: my_role
    tasks_from: main.yml
  vars:
    my_role_config: "{{ fruits_of_my_loom }}"

After which my_role_config only exists in the context of this host’s invocation of my_role.

Likewise, any set_facts done within a role should (and these are our guidelines, not rules, so “should” rather than “must”) be prefixed with the role name. Therefore any data prefixed by my_role_ was either passed to the role explicitly, set_facted by the role (that’s how the role exposes values it wants to “export”), or came from role defaults or role vars.
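As an illustration of the “export” half of that guideline, here is a sketch in which the role publishes a derived value under its own prefix and the calling playbook reads it afterwards. The name my_role_fruit_count is hypothetical:

```yaml
# roles/my_role/tasks/main.yml (sketch)
- name: Export a derived value under the role's prefix
  ansible.builtin.set_fact:
    # Hypothetical exported name; the my_role_ prefix marks its origin.
    my_role_fruit_count: "{{ my_role_config | unique | length }}"

# Calling playbook, in a task that runs after the include_role task
- name: Use the value exported by my_role
  ansible.builtin.debug:
    var: my_role_fruit_count
```

Because the exported name is new rather than a reused parameter name, nothing at a higher precedence level shadows it.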

Here’s a case where the above was causing minor confusion and how we fixed it. We set a variable in group_vars/mw_itweb_non.yml (non-prod apache httpd servers) that was

mw_common_apache_module_override:
  - 'ssl'
  - ['mpm_prefork', false]
  - ['mpm_worker', true]
  - 'proxy_wstunnel'

This is used by our mw_common_apache role to override that role’s default list of which apache modules to load. The confusion arose because it wasn’t set where the role was invoked, nor in the role itself. We renamed the variable to mw_itweb_apache_module_override to match the related adjacent variables (still not great, but at least consistent), and assigned the role variable at role invocation:

- name: Common apache configuration
  ansible.builtin.include_role:
    name: mw_common_apache
  vars:
    mw_common_apache_version: "{{ mw_itweb_common_config.apache }}"
    mw_common_apache_server_doc_root: "{{ mw_itweb_common_config.docroot }}"
    mw_common_apache_instance: "{{ mw_itweb_apache_instance }}"
    mw_common_apache_service_type: "{{ mw_itweb_common_config.type | d('init') }}"
    mw_common_apache_port: "{{ mw_itweb_common_config.port }}"
    mw_common_apache_timeout: '120'
    mw_common_apache_shibboleth: true
    mw_common_apache_config_notify: "apache_config_change_{{ mw_itweb_common_config.type }}"
    mw_common_apache_module_override: "{{ mw_itweb_apache_module_override }}"
    mw_common_apache_mpm_worker_maxrequestworkers: "{{ mw_itweb_mpm_worker_maxrequestworkers | d(1000) }}"

We bend these guidelines when relevant data comes from multiple places – group_vars, host_vars, role defaults – and gets combined by name. For example, our splunk_forwarder role contains the default variable splunk_forwarder_inputs_conf_defaults, but various group_vars or host_vars may exist for the current play host that should be combined with it. They would have names like splunk_forwarder_inputs_conf_fwhostd0p or splunk_forwarder_inputs_conf_fwgrp4. A template task within the splunk_forwarder role combines these variables in its vars: section as shown below:

- name: Populate splunk inputs.conf
  become: true
  become_user: splunk
  ansible.builtin.template:
    src: '{{ splunk_forwarder_inputsconf }}'
    dest: '{{ splunk_forwarder_homedir }}/etc/system/local/inputs.conf'
    owner: splunk
    group: splunk
    mode: '0644'
  notify: restart splunkforwarder
  vars:
    # 'conf_var_names' contains a list of all the variables starting with
    # "splunk_forwarder_inputs_conf_" except "…defaults".
    conf_var_names: "{{ [ [lookup('ansible.builtin.varnames',
                                 '^splunk_forwarder_inputs_conf_(?!defaults$).*', default='')]
                        ] | flatten
                        | map('split', ',')
                        | flatten }}"
    # Combine the 'splunk_forwarder_inputs_conf_defaults' role default
    # and any host- or group-specific inputs_conf variables that might
    # apply to the current play host.
    conf_var_values: "{{ splunk_forwarder_inputs_conf_defaults
                        | combine(lookup('ansible.builtin.vars', *conf_var_names)) }}"

(There’s also the community.general.merge_variables lookup that could be used if your variable names let you get the defaults combined first. We did it this way because - say it with me - “We’ve always done it this way.”)
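For comparison, a rough sketch of what the merge_variables alternative might look like as a vars: entry. It assumes the community.general collection is installed, that every matching variable is a dictionary, and that no keys collide. The ordering and conflict handling should be checked against the plugin documentation before relying on it:

```yaml
vars:
  # Assumption: merges every variable whose name starts with the prefix,
  # including splunk_forwarder_inputs_conf_defaults. Whether the defaults
  # are merged first depends on how the plugin orders the matched names.
  conf_var_values: "{{ lookup('community.general.merge_variables',
                              'splunk_forwarder_inputs_conf_',
                              pattern_type='prefix') }}"
```

This trades the explicit “defaults first” step of the varnames/vars approach for a single lookup, which is why the variable naming matters more here.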

Is there a public repo with this role? From these snippets it is hard to understand what you are trying to achieve with this approach. What is the intent?

The roles themselves aren’t relevant to what I was trying to show, which was one possible set of naming conventions and practices employed when moving data across the playbook⇔role transition.

The mw_common_apache example was to show the assignment of the role variable mw_common_apache_module_override from the group variable mw_itweb_apache_module_override, rather than defining the former as a group variable with a name that’s spelled as if it were a role variable, a practice that some consider ill-advised.

The final template task demonstrates combining several variables by name in a particular order:

  • splunk_forwarder_inputs_conf_defaults (role default, first)
  • splunk_forwarder_inputs_conf_<host_or_group_name> (host or group var, which may not exist)
  • splunk_forwarder_inputs_conf_<host_or_other_group> (another, if it exists, which it may not)
  • …

The point is to demonstrate the naming convention whereby host and/or group vars (if they exist) can augment a role default variable, and as such demonstrate a slight bending of the “don’t name non-role variables like role variables” guideline espoused earlier.

I’m sorry if the extra detail of the examples detracted / distracted from those points.

Is my understanding correct that you want to “manually” recreate variable precedence for these variables, because they are nested (dictionaries, rather than simple types) and you want to combine them, rather than have one overwrite another?

Yes. Each contributes something host- or group-specific which augments rather than replaces the role defaults.

How do you ensure that the order of variables in the list conf_var_names is always the same?

The result of this statement depends on the order, doesn’t it?

That’s true, the order would matter if the various contributions (besides those from …_defaults which is always first) conflicted with each other. In our case that doesn’t matter as they don’t conflict.

The other reason we might care is if it resulted in the template output being trivially changed. In that case there should be a | sort in the conf_var_names pipeline, and perhaps another in the template itself. In practice, the order has been consistent, but that isn’t particularly reassuring. Thanks for pointing that out.
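One possible way to pin the order, sketched as a variation rather than the exact pipeline used in our role: query() always returns a list, so the split/flatten steps can be dropped and the result sorted directly.

```yaml
vars:
  # Sketch: same regex as the original task, but using query() so the
  # names arrive as a list, then sorted so the merge order is stable.
  conf_var_names: "{{ query('ansible.builtin.varnames',
                            '^splunk_forwarder_inputs_conf_(?!defaults$).*')
                      | sort }}"
```

With a stable name order, any conflicting keys resolve the same way on every run, and the rendered template does not change just because the names came back in a different sequence.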

Thanks for the additional comments, those are helpful.

We actually do our best to follow the Ansible best practices, enforced by the linter for things like variable naming with role prefixes. I just didn’t include it here because I was trying to rush this post out but also because I fear sharing real code will scare people away from putting attention on my post and sharing an answer. I just assume everyone’s busy so I dangle some really trivial code out. (I’m glad I put “simulated code” to hint nobody should actually try it.)

Back when I used Chef, I created some example ‘cookbooks’ and ‘recipes’ to teach myself variable precedence through practical examples. Last year while learning Ansible I did something along the same lines, but forgot about it. This thread reminds me that repeatedly seeing discussions and sample code reinforces our learning so we don’t overlook them in the future. AI won’t help with that, but good discussion does.

Cheers,