Merging files inside a custom library yields different values in vars and facts

I tried to create a custom module that merges vars:

#!/usr/bin/python
import os
import yaml
from ansible.module_utils.basic import AnsibleModule

DOCUMENTATION = r'''
---
module: merge_vars
short_description: This module loads variables from files based on the distribution
description:
    - This module loads variables from YAML files based on the distribution, major version, and version number.
    - The module will merge the variables from multiple files if they exist.
options:
    distribution:
        description: The distribution name (e.g., CentOS, Ubuntu).
        required: true
        type: str
    distribution_major_version:
        description: The major version of the distribution.
        required: true
        type: str
    distribution_version:
        description: The version of the distribution.
        required: true
        type: str
    role_path:
        description: The path to the Ansible role.
        required: true
        type: str
'''

def main():
    module = AnsibleModule(
        argument_spec=dict(
            distribution=dict(type='str', required=True),
            distribution_major_version=dict(type='str', required=True),
            distribution_version=dict(type='str', required=True),
            role_path=dict(type='str', required=True)
        ),
        supports_check_mode=True
    )

    distribution = module.params['distribution']
    major_version = module.params['distribution_major_version']
    version = module.params['distribution_version']
    role_path = module.params['role_path']

    paths = [
        os.path.join(role_path, 'vars', f'{distribution}/main.yml'),
        os.path.join(role_path, 'vars', f'{distribution}/{major_version}/main.yml'),
        os.path.join(role_path, 'vars', f'{distribution}/{major_version}/{version}.yml')
    ]

    merged_vars = {}

    for path in paths:
        if os.path.exists(path):
            with open(path, 'r') as file:
                # safe_load returns None for an empty file; fall back to an empty dict
                vars_dict = yaml.safe_load(file) or {}
                merged_vars.update(vars_dict)

    module.exit_json(changed=False, ansible_facts=merged_vars)

if __name__ == '__main__':
    main()

Questions:

  • Is it possible to use facts inside the module (distribution, major_version, …)?
  • Why do vars.toto and ansible_facts.toto differ?

I created a demo role whose roles/demo/tasks/main.yml uses my library, called merge_vars:

- name: Merge all file vars found
  merge_vars:
    distribution: "{{ ansible_distribution }}"
    distribution_major_version: "{{ ansible_distribution_major_version }}"
    distribution_version: "{{ ansible_distribution_version }}"
    role_path: "{{ role_path }}"

- name: debug toto vars
  debug:
    var: vars.toto

- name: debug toto facts
  debug:
    var: ansible_facts.toto

roles/demo/vars/main.yml file:

toto: main

roles/demo/vars/Ubuntu/main.yml file:

toto: ubuntu

playbook.yml file:

- name: Demo
  hosts: all
  roles:
    - role: demo

$ ansible-playbook -i "localhost," -c local playbook.yml
TASK [demo : debug toto vars] ****************************************************************************************************************************************************************
ok: [localhost] => 
  vars.toto: main

TASK [demo : debug toto facts] ***************************************************************************************************************************************************************
ok: [localhost] => 
  ansible_facts.toto: ubuntu

ansible_facts has the right value, but vars does not :frowning:

If I do the same directly with Ansible it works:

- name: Merge all file vars found
  ansible.builtin.include_vars: "{{ item }}"
  loop:
    - "{{ ansible_distribution }}/main.yml"
    - "{{ ansible_distribution }}/{{ ansible_distribution_major_version }}/main.yml"
    - "{{ ansible_distribution }}/{{ ansible_distribution_major_version }}/{{ ansible_distribution_version }}.yml"
  when:
    - ([path, item] | path_join) in query("community.general.filetree", path) | selectattr("state", "in", "file") | map(attribute="src") | list
  vars:
    path: "{{ [role_path, 'vars'] | path_join }}"

In your module.exit_json, you explicitly state ansible_facts=merged_vars. That is, your module is setting facts, not vars.

There seems to be confusion about the nature of variables, facts, and their relationships to one another.

  • variables are ephemeral; they are created afresh on each playbook run and disappear at the end. There are a half-zillion ways to create variables with various scopes and precedences.
  • facts are scoped to specific target hosts. They may optionally be discovered at playbook startup, may be created or changed along the way, and may disappear at the end or optionally be cached for subsequent playbook use.

You can always look at facts directly by examining the ansible_facts dictionary in a specific host context.

All of that is pretty unsurprising. The confusion starts when you attempt to use a “regular” variable. If you try to use {{ toto }} for example, you will get the value of the variable toto if it exists; otherwise you’ll get the value of the fact toto if it exists; and if neither exists you’ll get messages about “undefined blah blah” etc.

What you’re seeing is that you have a variable and a fact, both named toto, with two different values.


You can partly address your issue by changing roles/demo/vars/main.yml to this:

toto: '{{ ansible_facts.toto | default("main") }}'

With that change, after calling your role, the following debug steps…

- name: Show vars.toto
  ansible.builtin.debug:
    var: vars.toto

- name: Show toto
  ansible.builtin.debug:
    var: toto

- name: Show ansible_facts.toto
  ansible.builtin.debug:
    var: ansible_facts.toto

…produce this output (note I’m on Fedora, not Ubuntu):

TASK [demo : Show vars.toto] ***************************************************
task path: /home/utoddl/ansible/gigi206/roles/demo/tasks/main.yml:13
ok: [localhost] => 
  vars.toto: '{{ ansible_facts.toto | default("main") }}'

TASK [demo : Show toto] ********************************************************
task path: /home/utoddl/ansible/gigi206/roles/demo/tasks/main.yml:17
ok: [localhost] => 
  toto: Fedora

TASK [demo : Show ansible_facts.toto] ******************************************
task path: /home/utoddl/ansible/gigi206/roles/demo/tasks/main.yml:21
ok: [localhost] => 
  ansible_facts.toto: Fedora

We see that debug’s “var: vars.toto” produces the value of the toto variable before template evaluation, whereas “var: toto” evaluates the template. Other than that, it behaves pretty much as you expect.

vars is an old internal implementation detail and was never intended for public use. You should not use it for anything because it has surprising behaviour, and any unexpected results that you encounter could be a result of using this undocumented, unsupported variable (which will probably be removed entirely in the not-too-distant future.)

Instead, access the variable by name:

- name: debug toto vars
  debug:
    var: toto

This is not correct. You get the value of the variable toto; the highest precedence source of that variable could be a fact if you have INJECT_FACTS_AS_VARS enabled, but there is no special fallback to facts.


I stand corrected; it’s much more subtle than I appreciated. For @gigi206’s original post, the behavior is as I described, assuming you haven’t turned off INJECT_FACTS_AS_VARS, but for different reasons.

If you do turn off INJECT_FACTS_AS_VARS, then expressions like "{{ ansible_distribution }}" will also fail, so this

- name: Merge all file vars found
  merge_vars:
    distribution: "{{ ansible_distribution }}"
    distribution_major_version: "{{ ansible_distribution_major_version }}"
    distribution_version: "{{ ansible_distribution_version }}"
    role_path: "{{ role_path }}"

would need to be changed to

- name: Merge all file vars found
  merge_vars:
    distribution: "{{ ansible_facts.distribution }}"
    distribution_major_version: "{{ ansible_facts.distribution_major_version }}"
    distribution_version: "{{ ansible_facts.distribution_version }}"
    role_path: "{{ role_path }}"

Precedence for facts is either 11 or 19. Presumably facts set through a module like this would be equivalent to those set by set_fact (19) rather than host facts or cached set_facts (11), but be prepared to be surprised.
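
To make the precedence argument concrete, here is a toy model in Python. It is not Ansible's variable manager; it only assumes the documented ordering, in which role vars sit at level 15, between host facts (11) and set_fact (19). The resolve helper and the source tuples are made up for illustration.

```python
# Toy model of "highest precedence wins"; not Ansible's actual implementation.
HOST_FACTS = 11  # host facts / cached set_facts
ROLE_VARS = 15   # role vars (e.g. roles/demo/vars/main.yml)
SET_FACT = 19    # set_fact / registered vars


def resolve(name, sources, inject_facts_as_vars=True):
    """Return the value of `name` from the highest-precedence source.

    `sources` is a list of (precedence, is_fact, values) tuples.
    """
    best = None
    for precedence, is_fact, values in sources:
        if is_fact and not inject_facts_as_vars:
            continue  # in this model, such facts are only reachable via ansible_facts
        if name in values and (best is None or precedence > best[0]):
            best = (precedence, values[name])
    if best is None:
        raise KeyError(f"{name!r} is undefined")
    return best[1]


role_vars = (ROLE_VARS, False, {"toto": "main"})

# If the module's facts land at level 11, the role var shadows them.
as_host_fact = resolve("toto", [role_vars, (HOST_FACTS, True, {"toto": "ubuntu"})])

# If they landed at level 19 instead, the fact would win.
as_set_fact = resolve("toto", [role_vars, (SET_FACT, True, {"toto": "ubuntu"})])

print(as_host_fact, as_set_fact)
```

In this model, the output from the original post (toto resolving to main while ansible_facts.toto is ubuntu) matches the level-11 case. That is an inference from the toy model, not a statement about Ansible internals.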

The brief INJECT_FACTS_AS_VARS doc mentions

Unlike inside the ansible_facts dictionary, these [injected variables] will have an ansible_ prefix.

which is demonstrably incorrect. It probably applies to facts set through gather_facts, but not those set through modules such as set_fact or @gigi206’s merge_vars.


It means to convey that they will be injected with the exact name returned by the module, unlike in ansible_facts where any ansible_ prefix is stripped off. But while that prefix was a convention before the namespaced facts existed, there’s nothing that forces new facts modules to include a prefix. For example, ansible.builtin.package_facts returns a fact just named packages.
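
The naming rule described above can be sketched in a few lines of Python. This is only an illustration of the mapping (prefix stripped inside ansible_facts, name kept as returned when injected); the fact_names helper is hypothetical and is not how Ansible implements it.

```python
PREFIX = "ansible_"


def fact_names(returned_name):
    """Map a fact name as returned by a module to
    (key inside ansible_facts, injected variable name)."""
    if returned_name.startswith(PREFIX):
        namespaced = returned_name[len(PREFIX):]
    else:
        namespaced = returned_name
    # The injected variable keeps the exact name the module returned.
    return namespaced, returned_name


print(fact_names("ansible_distribution"))  # ('distribution', 'ansible_distribution')
print(fact_names("packages"))              # ('packages', 'packages')
print(fact_names("toto"))                  # ('toto', 'toto')
```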

I have proposed a documentation update in Clarify documentation for `INJECT_FACTS_AS_VARS` by flowerysong · Pull Request #83404 · ansible/ansible · GitHub


Thank you for your replies :slight_smile:

So, can my code be improved? And is it always better to use facts like facts.myvar instead of myvar?
Also, can I use facts directly in the source code without passing them in as arguments?

I’ve been looking over Creating an info or a facts module trying to decide whether your module is a “*_info” module, a “*_facts” module, or something else. “*_facts” modules return facts about the machine itself, whereas your module feels more like a “*_info” module because it returns information retrieved from selected files. Although the files are selected based on specific facts, their content could be anything.

If that’s true, then according to those guidelines it should be renamed to something along the lines of “merge_distro_var_file_info” (see link above), and it shouldn’t set facts at all. Instead, it should return something that can be registered so other tasks can use those values. (It should also support check mode, although in this case I’m not sure if it would behave any differently in check mode.)
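
A minimal sketch of what "return something that can be registered" might look like, assuming the YAML files have already been parsed into Python objects. The helper names and the merged_vars key are hypothetical, chosen only for this example; the point is that the result is handed to exit_json under an ordinary key rather than ansible_facts.

```python
def merge_var_dicts(loaded):
    """Merge dicts in order; later entries override earlier ones.

    None entries (what yaml.safe_load returns for an empty file) are skipped.
    """
    merged = {}
    for item in loaded:
        if item is None:
            continue
        if not isinstance(item, dict):
            raise TypeError("each vars file must contain a mapping at the top level")
        merged.update(item)
    return merged


def build_result(loaded):
    """Build keyword arguments for module.exit_json() without setting facts."""
    # "merged_vars" is a hypothetical key name chosen for this sketch.
    return {"changed": False, "merged_vars": merge_var_dicts(loaded)}


result = build_result([{"toto": "main", "a": 1}, None, {"toto": "ubuntu"}])
print(result)
# In the module itself this would end with: module.exit_json(**build_result(loaded))
```

A task using such a module would add register: to capture the result and then read the values from the registered variable's merged_vars key.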

One other thing seems somehow wrong to me. It’s a combination of the way you’re using the role_path, the way you’re testing it using “localhost,” as the target/inventory, and the statement, “If I do the same directly with Ansible it works” and the associated task in that example. Because that example is not doing the same thing at all. That final example task is reading files on the ansible controller, whereas your module reads files on the target hosts. You haven’t noticed the difference because you’ve been testing on – and therefore your target host has been – your ansible controller. However, for this module to work on other hosts, you’ll have to create the identical files on all your target hosts in the same place - the same role_path - as on your ansible controller. I’m reasonably certain that is not what you had in mind. Remember, modules run on the target hosts; lookups, filters, and pretty much everything that isn’t a module run on the ansible controller.


I’m reasonably certain that’s not what you had in mind. Remember, modules run on the target hosts; lookups, filters, and pretty much everything that isn’t a module run on the ansible controller.

You are quite right, but include_vars is a module, right? And it reads from the Ansible controller; that’s why I developed a module.

I would like to do the same as below in a module:

- name: Merge all file vars found
  ansible.builtin.include_vars: "{{ item }}"
  loop:
    - "{{ ansible_distribution }}/main.yml"
    - "{{ ansible_distribution }}/{{ ansible_distribution_major_version }}/main.yml"
    - "{{ ansible_distribution }}/{{ ansible_distribution_major_version }}/{{ ansible_distribution_version }}.yml"
  when:
    - ([path, item] | path_join) in query("community.general.filetree", path) | selectattr("state", "in", "file") | map(attribute="src") | list
  vars:
    path: "{{ [role_path, 'vars'] | path_join }}"

No, it’s an action plugin. And it’s an action plugin with special treatment from the executor, so it’s not possible to achieve exactly the same results it does in a normal action plugin.
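
On the earlier question about reading facts without passing them in as arguments: an action plugin runs on the controller, and its run() method receives the task variables as a dictionary. Below is a minimal sketch of the controller-side file selection, written as a plain function so it can be tried outside Ansible. It assumes task_vars carries ansible_facts and role_path keys; the function name is hypothetical, and the wiring into an ActionBase subclass (as well as loading the YAML) is left out.

```python
import os


def candidate_var_files(task_vars):
    """Return the distribution-specific vars files that exist, least specific first."""
    # Assumption: task_vars contains "ansible_facts" and "role_path".
    facts = task_vars.get("ansible_facts", {})
    distribution = facts["distribution"]
    major = facts["distribution_major_version"]
    version = facts["distribution_version"]
    vars_dir = os.path.join(task_vars["role_path"], "vars")

    candidates = [
        os.path.join(vars_dir, distribution, "main.yml"),
        os.path.join(vars_dir, distribution, major, "main.yml"),
        os.path.join(vars_dir, distribution, major, f"{version}.yml"),
    ]
    # os.path.isfile checks the filesystem of whichever machine runs this code.
    return [path for path in candidates if os.path.isfile(path)]
```

The same function inside a regular module would look at the target host's filesystem instead, which is the difference pointed out earlier in the thread.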


Wow, @flowerysong, keeping me honest is keeping you busy!
Every time I think I understand something, a closer look reveals a whole 'nuther onion…

I’ve read over ansible.builtin.include_vars module – Load variables from files, dynamically within a task — Ansible Community Documentation several times now, and at best it vaguely hints that the files it’s reading are on the controller. The casual reader could easily come away with the opposite conclusion.

The first bullet point under “Synopsis” says:

Loads YAML/JSON variables dynamically from a file or directory, recursively, during task runtime.

Would you entertain updating that thus:

Loads YAML/JSON variables dynamically from a file or directory on the Ansible controller, recursively, during task runtime.

[Edit: Oops. I hijacked a topic. My bad.]
