Lazy templating and global variables in Ansible 2.19+

Hi all,

We are in the process of upgrading from Ansible 2.18 to 2.20 and are aware that there were some significant changes to templating in 2.19.

What would be the best practice for handling “undefined” variables in templates in Ansible 2.19+? For example, we have a bunch of global variables defined in our group_vars/all.yml, one of them is:

restic_repository: "s3:s3.amazonaws.com/{{ backup.aws_bucket_name }}"

backup dictionaries are defined in each of the inventories’ group_vars/all.yml and restic_repository was being lazily evaluated when needed in Ansible 2.18.

However, we have a play that references another global variable earlier in the run, so I guess the whole global group_vars/all.yml is loaded and evaluated in 2.19, and we get an error in that play:

FAILED! => {"changed": false, "msg": "Task failed: 'backup' is undefined"}

What would be considered best practice here? Use default() in the template? Define a dummy backup dictionary in the global group_vars/all.yml?

Thank you!
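
To make the default() option concrete, here is a minimal sketch. The 'UNSET' placeholder and the assert task are assumptions for illustration, not something from our actual setup. Wrapping backup in default({}) avoids relying on attribute access against an undefined variable.

```yaml
# group_vars/all.yml (global)
# Falls back to a placeholder when the inventory defines no `backup` dictionary.
# 'UNSET' is a hypothetical sentinel value chosen for this example.
restic_repository: "s3:s3.amazonaws.com/{{ (backup | default({})).aws_bucket_name | default('UNSET') }}"

# In the role that actually consumes restic_repository (hypothetical task):
# fail early and clearly instead of running against the placeholder.
- name: Refuse to run backups against the placeholder repository
  ansible.builtin.assert:
    that:
      - "'UNSET' not in restic_repository"
    fail_msg: "backup.aws_bucket_name is not defined for this inventory"
```

The trade-off is that an undefined bucket no longer fails by itself, so the consuming role has to check for the placeholder explicitly.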

Variable templating is lazier in 2.19 than 2.18, and a task that uses one variable in group_vars/all.yml does not result in templating unrelated variables in group_vars/all.yml. Can you provide more context, like the full error (which shows the origin of the undefined variable) and failing task + surrounding context that previously succeeded in 2.18?

The failing task is the second stat below:

- name: "[{{ elastic_prefix }}] Set create certificates flag"
  run_once: true
  block:
    - name: "[{{ elastic_prefix }}] Stat CA cert file"
      stat:
        path: "{{ elastic_host_certificates_path }}/{{ elastic_ca_cert_filename }}"
      register: elastic_ca_cert_stat
    - name: "[{{ elastic_prefix }}] Stat node cert files"
      stat:
        path: "{{ elastic_host_certificates_path }}/{{ elastic_prefix }}-{{ elastic_node_host.inventory_hostname_short }}.crt"
      loop: "{{ groups['docker-swarm'] | map('extract', hostvars) | list }}"
      loop_control:
        loop_var: elastic_node_host
      register: elastic_node_certs_stat
    # more tasks ...

elastic_prefix and elastic_host_certificates_path are local variables, but elastic_host_certificates_path references a global variable set in group_vars/all.yml.

The full error text is:

[ERROR]: Task failed: 'backup' is undefined

Task failed.
Origin: <path>/roles/elasticsearch/tasks/elastic_instance.yml:50:7

48         path: "{{ elastic_host_certificates_path }}/{{ elastic_ca_cert_filename }}"
49       register: elastic_ca_cert_stat
50     - name: "[{{ elastic_prefix }}] Stat node cert files"
         ^ column 7

<<< caused by >>>

'backup' is undefined
Origin:  <path>/group_vars/all.yml:128:31

126   --prune
127 # Restic backup bucket name is set in cluster's all.yml file.
128 restic_repository:            "s3:s3.amazonaws.com/{{ backup.aws_bucket_name }}"
                                  ^ column 31

elastic_instance.yml is included (via include_tasks) from the role’s main.yml.

The restic_repository variable is referenced in a later play and in a different role; it is never used in the elasticsearch role.

This was run on Debian 13.3 WSL on Windows 10, in a Python 3.13.5 venv with the following pip packages installed:

Package            Version
------------------ -----------
ansible            13.4.0
ansible-core       2.20.3
argcomplete        3.6.3
bcrypt             5.0.0
boto               2.49.0
boto3              1.42.60
botocore           1.42.60
certifi            2026.2.25
cffi               2.0.0
charset-normalizer 3.4.4
cryptography       46.0.5
dnspython          2.8.0
hvac               2.4.0
idna               3.11
Jinja2             3.1.6
jmespath           1.1.0
MarkupSafe         3.0.3
packaging          26.0
pip                26.0.1
pycparser          3.0
python-dateutil    2.9.0.post0
PyYAML             6.0.3
requests           2.32.5
resolvelib         1.2.1
s3transfer         0.16.0
six                1.17.0
tomlkit            0.14.0
urllib3            2.6.3
xmltodict          1.0.4
yq                 3.4.3

This was also reproduced on macOS, in a Python 3.14 venv with the same set of packages. If we downgrade to Ansible 2.18.14, everything works.

Could you share more of group_vars/all.yml prior to the definition of restic_repository? It’s acting like that variable definition is being sucked up into something that comes before that.

If the backup definition from relevant group_vars wasn’t being pulled in (because of inventory differences?), then we still wouldn’t expect the error to be thrown in the elastic_instance.yml task file since it’s not used until later.

Does

grep -r restic_repository .

turn up any unexpected usages?

’Tis a puzzler.

I think the relevant usage has already been shown; the cause is

      loop: "{{ groups['docker-swarm'] | map('extract', hostvars) | list }}"

I was able to reproduce, but need to do some more research. In earlier versions of ansible-core, a hostvars entry whose template referenced an undefined variable was returned as the raw template string, which was a bug.
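
For anyone who wants to try this locally, here is a minimal sketch that should exercise the same code path. The names lazy_value and not_defined_anywhere are made up for illustration, and the group_vars entry shown in the comment is assumed to exist alongside the inventory or playbook.

```yaml
# Assumed to exist in group_vars/all.yml (hypothetical variable names):
#   lazy_value: "{{ not_defined_anywhere }}"
- hosts: all
  gather_facts: false
  tasks:
    # Touches only one unrelated variable, so lazy_value is never templated.
    - name: Use a single unrelated variable
      ansible.builtin.debug:
        var: inventory_hostname

    # Serialises every variable of every host, which forces lazy_value
    # to be resolved as part of the task argument.
    - name: Serialise the full hostvars of all hosts
      ansible.builtin.debug:
        msg: "{{ groups['all'] | map('extract', hostvars) | list | to_nice_yaml }}"
```

Going by the behaviour described above, the second task would be expected to print the raw template string on 2.18 and to fail with an undefined variable error on 2.19+.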

Oooh, nice catch! I can test tomorrow and use hostvars in a debug task, with and without a loop and map.

Ok, this is embarrassing… The one inventory I tested this with does not actually have a backup dictionary defined in its group_vars/all.yml file, and the restic_repository variable from the global group_vars/all.yml is not used for that inventory.

There is still a change in behaviour here. I added this task to dump hostvars into a local file:

- name: Dump dictionary as formatted YAML
  copy:
    content: "{{ groups['docker-swarm'] | map('extract', hostvars) | list | to_nice_yaml }}"
    dest: "./debug_output.yml"
  delegate_to: localhost

On 2.18, I get this for restic_repository in the file:

restic_repository: s3:s3.amazonaws.com/{{ backup.aws_bucket_name }}

The same task errors out on 2.20 with:

TASK [elasticsearch : Dump dictionary as formatted YAML] ***********************
[ERROR]: Task failed: Finalization of task args for 'ansible.builtin.copy' failed: Error while resolving value for 'content': 'backup' is undefined

Task failed.
Origin: <path>/roles/elasticsearch/tasks/main.yml:179:3

177
178
179 - name: Dump dictionary as formatted YAML
      ^ column 3

<<< caused by >>>

Finalization of task args for 'ansible.builtin.copy' failed.
Origin: <path>/roles/elasticsearch/tasks/main.yml:180:3

178
179 - name: Dump dictionary as formatted YAML
180   copy:
      ^ column 3

<<< caused by >>>

Error while resolving value for 'content'.
Origin: <path>/roles/elasticsearch/tasks/main.yml:181:14

179 - name: Dump dictionary as formatted YAML
180   copy:
181     content: "{{ groups['docker-swarm'] | map('extract', hostvars) | list | to_nice_yaml }}"
                 ^ column 14

<<< caused by >>>

'backup' is undefined
Origin: <path>/group_vars/all.yml:132:31

130 restic_aws_profile:           default
131 # Restic backup bucket AWS region is set in cluster's all.yml file.
132 restic_aws_default_region:    "{{ backup.aws_bucket_region }}"
                                  ^ column 31

fatal: [host01.example.com -> localhost]: FAILED! => {"changed": false, "msg": "Task failed: Finalization of task args for 'ansible.builtin.copy' failed: Error while resolving value for 'content': 'backup' is undefined"}
fatal: [host02.example.com -> localhost]: FAILED! => {"changed": false, "msg": "Task failed: Finalization of task args for 'ansible.builtin.copy' failed: Error while resolving value for 'content': 'backup' is undefined"}
fatal: [host03.example.com -> localhost]: FAILED! => {"changed": false, "msg": "Task failed: Finalization of task args for 'ansible.builtin.copy' failed: Error while resolving value for 'content': 'backup' is undefined"}

Which means that 2.20 behaves the way you would actually expect Ansible to behave when it encounters an undefined variable.

However…

Even when I define a dummy backup dictionary in my test inventory, I still get an error on the “dump hostvars to local file” task, because hostvars contains a number of templates that reference variables which cannot be resolved at that point.

For example, we have a docker-swarm role, executed in an earlier play using import_role.

docker-swarm’s vars/main.yml is:

docker_package_version: 5:28.1.1-1

The role’s tasks/main.yml file includes vars at the very beginning:

- name: Include variables based on distribution
  include_vars: "{{ item }}"
  with_first_found:
    - "{{ ansible_facts['distribution'] | lower }}-{{ ansible_facts['distribution_major_version'] }}.yml"
    - "{{ ansible_facts['distribution'] | lower }}.yml"

Since I’m testing on Ubuntu 24.04, we have vars/ubuntu-24.yml:

docker_package_version_string: "{{ docker_package_version }}~ubuntu.24.04~noble"

The “dump to file” task errors out on Ansible 2.20:

TASK [elasticsearch : Dump dictionary as formatted YAML] ***********************
[ERROR]: Task failed: Finalization of task args for 'ansible.builtin.copy' failed: Error while resolving value for 'content': 'docker_package_version' is undefined

Task failed.
Origin: <path>/roles/elasticsearch/tasks/main.yml:179:3

177
178
179 - name: Dump dictionary as formatted YAML
      ^ column 3

<<< caused by >>>

Finalization of task args for 'ansible.builtin.copy' failed.
Origin: <path>/roles/elasticsearch/tasks/main.yml:180:3

178
179 - name: Dump dictionary as formatted YAML
180   copy:
      ^ column 3

<<< caused by >>>

Error while resolving value for 'content'.
Origin: <path>/roles/elasticsearch/tasks/main.yml:181:14

179 - name: Dump dictionary as formatted YAML
180   copy:
181     content: "{{ groups['docker-swarm'] | map('extract', hostvars) | list | to_nice_yaml }}"
                 ^ column 14

<<< caused by >>>

'docker_package_version' is undefined
Origin: <path>/roles/docker-swarm/vars/ubuntu-24.yml:4:32

2 ---
3
4 docker_package_version_string: "{{ docker_package_version }}~ubuntu.24.04~noble"
                                 ^ column 32

fatal: [host01.example.com -> localhost]: FAILED! => {"changed": false, "msg": "Task failed: Finalization of task args for 'ansible.builtin.copy' failed: Error while resolving value for 'content': 'docker_package_version' is undefined"}
fatal: [host02.example.com -> localhost]: FAILED! => {"changed": false, "msg": "Task failed: Finalization of task args for 'ansible.builtin.copy' failed: Error while resolving value for 'content': 'docker_package_version' is undefined"}
fatal: [host03.example.com -> localhost]: FAILED! => {"changed": false, "msg": "Task failed: Finalization of task args for 'ansible.builtin.copy' failed: Error while resolving value for 'content': 'docker_package_version' is undefined"}

On Ansible 2.18, I see this in the dump file:

docker_package_version_string: '{{ docker_package_version }}~ubuntu.24.04~noble'

The three relevant roles in our playbook are: gluster, docker-swarm and elasticsearch, executed in that order, each in a separate play.

The hostvars variable is first accessed in the gluster role, but there we extract individual variables, something like this:

gluster_nodes: "{{ ansible_play_batch | map('extract', hostvars, 'inventory_hostname_short') | list }}"

The elasticsearch role is the only place where we access the entire hostvars variable, in this loop:

loop: "{{ groups['docker-swarm'] | map('extract', hostvars) | list }}"

If I change that loop to extract the single variable we actually use in that task, the loop works fine.
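
For reference, the changed loop looks roughly like the sketch below. It reuses the three-argument form of extract from the gluster role; the loop variable name elastic_node_short_name is a hypothetical name chosen for this example.

```yaml
- name: "[{{ elastic_prefix }}] Stat node cert files"
  ansible.builtin.stat:
    path: "{{ elastic_host_certificates_path }}/{{ elastic_prefix }}-{{ elastic_node_short_name }}.crt"
  # Extract only inventory_hostname_short from each host instead of the
  # host's entire variable set, so unrelated templates are never resolved.
  loop: "{{ groups['docker-swarm'] | map('extract', hostvars, 'inventory_hostname_short') | list }}"
  loop_control:
    loop_var: elastic_node_short_name
  register: elastic_node_certs_stat
```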

I do hit another templating issue further down the elasticsearch role and need to investigate that one, too. I suspect it’s something similar.

@shertel With the changes in 2.19, doesn’t that mean that you cannot use the “entire” hostvars variable (like I am/was with map('extract', hostvars)) until all variables have been initialised, which is hard to ensure in any slightly complex playbook?

I get what you’re saying, but the counterargument is that it’s no more broken under 2.19 than it was before; only now you know about it.

Either way, the options available are the same. This is off the top of my head. As such, I may be a little loose with some terms. We can work up a “real” list with some more thought.

  1. Avoid declaring/defining variables in terms of other variables from scopes which may never be reached. For example, a variable which can’t be expanded until a particular role is invoked should probably be a role variable rather than an “inventory” variable (host_vars and group_vars).
  2. Failing #1, variables whose “real” values will be (or may be) assigned later but are used early in other variable definitions may be given obvious sentinel values (e.g. “-TBD-”) to prevent inadvertent early evaluation from dropping a whole box of punch cards on the machine room floor. (I need to update my metaphors.)
  3. Failing #2, avoid expanding all of a host’s variables when you only need to examine a few particular ones.
  4. Failing #3, put blatant overzealous variable expansions in a block with a rescue section. (I thought this wouldn’t actually work for undefined variables, but it does, at least on core 2.18.12. Yay!)
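
As a rough illustration of #4, a block with a rescue section around the hostvars dump task from earlier in the thread might look like the sketch below. The task names and the rescue message are made up, and whether every templating failure is rescuable on 2.19+ is an assumption worth testing.

```yaml
- name: Guard an expansion that may hit undefined variables
  block:
    - name: Dump all docker-swarm hostvars (may fail on undefined names)
      ansible.builtin.copy:
        content: "{{ groups['docker-swarm'] | map('extract', hostvars) | list | to_nice_yaml }}"
        dest: "./debug_output.yml"
      delegate_to: localhost
  rescue:
    # ansible_failed_result holds the result of the task that failed in the block.
    - name: Report the failure instead of aborting the play
      ansible.builtin.debug:
        msg: "hostvars dump skipped: {{ ansible_failed_result.msg | default('unknown error') }}"
```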

In your case, I think #1 — moving complex variable definitions out of your inventory and into the roles before which they have no expandable values — is the right approach.
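
A sketch of what #1 could look like for the two restic variables quoted above. The role name restic and the file path are assumptions, since the thread only says the variable is used in a different role.

```yaml
# roles/restic/defaults/main.yml (hypothetical role name and path)
# Moved here from the global group_vars/all.yml so that the templates are
# scoped to the role that consumes them rather than to every host.
restic_repository: "s3:s3.amazonaws.com/{{ backup.aws_bucket_name }}"
restic_aws_default_region: "{{ backup.aws_bucket_region }}"
```

The intent is that an inventory without a backup dictionary never carries these templates in its inventory variables, so they are only resolved where the role runs.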

It depends on how you are using it. Loop variables and task arguments must be fully defined, but in general, no. To see that map('extract', hostvars) itself is not the problem, you could try:

- name: "[{{ elastic_prefix }}] Stat node cert files"
  stat:
    path: "{{ elastic_host_certificates_path }}/{{ elastic_prefix }}-{{ docker_swarm_vars[item].inventory_hostname_short }}.crt"
  loop: "{{ range(0, docker_swarm_vars | length) | list }}"
  vars:
    docker_swarm_vars: "{{ groups['docker-swarm'] | map('extract', hostvars) | list }}"
  register: elastic_node_certs_stat

That said, I think using map + extract is harder to read than just looping over the hostvars keys:

- name: "[{{ elastic_prefix }}] Stat node cert files"
  stat:
    path: "{{ elastic_host_certificates_path }}/{{ elastic_prefix }}-{{ hostvars[item].inventory_hostname_short }}.crt"
  loop: "{{ groups['docker-swarm'] }}"
  register: elastic_node_certs_stat

Allowing the loop to be partially undefined for the task (when the undefined subset is not accessed) seems technically possible, but it would reintroduce a similar bug for callback plugins, which also expect the loop variable to be fully defined.
