Combine ansible.builtin.set_fact and with_items loop

Okay, let me briefly explain what’s on my mind right now.

I have a directory where another system stores files that I need to continue working with. The following files are currently located there, for example:

/srv/orders/local/Ocustomername1.+2023+february+1.txt
/srv/orders/local/Ocustomername5.+2020+may+14.txt
/srv/orders/local/Ocustomername2.+2025+october+1923.txt
/srv/orders/local/Ocustomername7.+2026+january+29.txt

The contents of the /srv/orders/local/ directory change, and I don’t know what the individual files are actually called. The only clue is the customer’s name, e.g., “customername1.”

Now I need to extract the file name and the full path (including the file name) and keep both for later tasks. To do this, I wrote the following short task list for customername2:

---
   
- name: "Determine customer file name."
  ansible.builtin.command:
    cmd: find /srv/orders/local/ -name Ocustomername2.*.txt
  register: market_ordername
  # A read-only lookup never changes anything on the host.
  changed_when: false

- name: "Make a note of the file name and path of the order file."
  ansible.builtin.set_fact:
    market_order_path: "{{ market_ordername.stdout }}"
    market_order_filename: "{{ market_ordername.stdout[19:] }}"
    cacheable: true
 
- name: "Order filename"
  ansible.builtin.debug:
    msg: "The order filename is: {{ market_order_filename }}"
   
- name: "Order path."
  ansible.builtin.debug:
    msg: "The whole path of the order is:  {{ market_order_path }}"
   
...

So far so good. That works without any problems, as I only pick one name from the order list. However, there are around 50 different files in the directory, and I need the data for not just one customer, but for four or 21 selected customers!

I therefore have a corresponding array in the inventory:

customers:
  - info:                  'Huber'
    market_order_name:     'customername1'
    market_order_filename: ''
    market_order_path:     ''
  - info:                  'Mustermann'
    market_order_name:     'customername2'
    market_order_filename: ''
    market_order_path:     ''
  - info:                  'Maier'
    market_order_name:     'customername42'
    market_order_filename: ''
    market_order_path:     ''
  - info:                  'Schneider'
    market_order_name:     'customername69'
    market_order_filename: ''
    market_order_path:     ''
 

If I knew all the customer data in advance, I could put it into this array and access it later in a task using with_items. Unfortunately, I don’t, because the contents of the directory change frequently.
I have to laboriously extract the data from the directory myself and store it so that I can access it later.

My idea was as follows: I determine the file name and path using the tasks shown above and write them into the existing fields of the array using set_fact. This would allow me to access all data in the array later on.

- name: "Determine customer file name."
  ansible.builtin.command:
    cmd: find /srv/orders/local/ -name Ocustomername2.*.txt
  register: market_ordername
  # A read-only lookup never changes anything on the host.
  changed_when: false

- name: "Make a note of the file name and path of the order file."
  ansible.builtin.set_fact:
    market_order_path: "{{ market_ordername.stdout }}"
    market_order_filename: "{{ market_ordername.stdout[19:] }}"
    cacheable: true

So I have to run these tasks for every market_order_name in the array and then use set_fact to put the resulting market_order_filename and market_order_path into the respective fields of the array. That’s my naive approach.

My attempt to combine ansible.builtin.include_tasks with a with_items loop failed miserably.

But what’s the best way to do that? Maybe someone has an idea and can point me in the right direction. I’m grateful for any tips!


The find module and basename filter make this much easier IMO.

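I have a data directory, some_dir, next to my playbook, with files named like customer1_9hMptwdqVoxgo.txt.
[image: listing of the data directory]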

My playbook:

---
- hosts: localhost
  gather_facts: false
  vars:
    some_input_customer_id_to_name:
      customer1: Schneider
      customer2: Buzz
      customer3: Gates
  tasks:
    - ansible.builtin.find:
        path: "{{ playbook_dir }}/some_dir"
        patterns: customer1_.*\.txt
        use_regex: true
      register: _find_customers

    - name: Create some data dict
      ansible.builtin.set_fact:
        customer_files: >-
          {{ customer_files | default([]) + [
            {
                "info": some_input_customer_id_to_name[_market_order_name],
                "market_order_name": _market_order_name,
                "market_order_filename": _market_order_filename,
                "market_order_path": _customer_file.path
            }
          ] }}
      vars:
        _market_order_filename: "{{ _customer_file.path | basename }}"
        _market_order_name: '{{ _market_order_filename | regex_replace("(.*)_.*\.txt", "\1") }}'
      loop: "{{ _find_customers.files }}"
      loop_control:
        loop_var: _customer_file
        label: "{{ _customer_file.path | basename }}"

    - debug:
        var: customer_files

Yields the result:

ok: [localhost] => {
    "customer_files": [
        {
            "info": "Schneider",
            "market_order_filename": "customer1_9hMptwdqVoxgo.txt",
            "market_order_name": "customer1",
            "market_order_path": "/var/home/mike/git/test/some_dir/customer1_9hMptwdqVoxgo.txt"
        },
        {
            "info": "Schneider",
            "market_order_filename": "customer1_BlLXmJ52HMcG8.txt",
            "market_order_name": "customer1",
            "market_order_path": "/var/home/mike/git/test/some_dir/customer1_BlLXmJ52HMcG8.txt"
        },
        {
            "info": "Schneider",
            "market_order_filename": "customer1_kwbkRBEg82uFk.txt",
            "market_order_name": "customer1",
            "market_order_path": "/var/home/mike/git/test/some_dir/customer1_kwbkRBEg82uFk.txt"
        }
    ]
}

If you wanted to select multiple customers, I would put all of the tasks in my playbook into another file, like some_sub_tasks.yml, and then call that with include_tasks. You can loop over your customer ids and make that a variable in some_sub_tasks.yml instead of hardcoding customer#. Short example:

- hosts: localhost
  gather_facts: false
  vars:
    some_input_customer_id_to_name:
      customer1: Schneider
      customer2: Buzz
      customer3: Gates
  tasks:
    - ansible.builtin.include_tasks: some_sub_tasks.yml
      loop:
        - customer1
        - customer2
      loop_control:
        loop_var: customer_id

    # should have the full list of files for customer1 and customer2 now
    - debug:
        var: customer_files
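
For illustration, some_sub_tasks.yml could look roughly like the sketch below. It assumes the same some_dir directory and file naming as the first playbook, and it swaps the regex for a plain glob pattern built from customer_id. Treat it as a starting point, not a tested drop-in.

```yaml
# some_sub_tasks.yml (sketch): expects customer_id from the include_tasks loop
- name: Find the order files of one customer
  ansible.builtin.find:
    paths: "{{ playbook_dir }}/some_dir"
    # Glob pattern (find's default matching mode), e.g. customer1_*.txt
    patterns: "{{ customer_id }}_*.txt"
  register: _find_customers

- name: Append the matches to customer_files
  ansible.builtin.set_fact:
    customer_files: >-
      {{ customer_files | default([]) + [
        {
          "info": some_input_customer_id_to_name[customer_id],
          "market_order_name": customer_id,
          "market_order_filename": _customer_file.path | basename,
          "market_order_path": _customer_file.path
        }
      ] }}
  loop: "{{ _find_customers.files }}"
  loop_control:
    # A loop_var other than "item" avoids clashing with other loops
    loop_var: _customer_file
    label: "{{ _customer_file.path | basename }}"
```

Because set_fact appends to customer_files on every iteration, the list keeps growing across the included runs for each customer_id.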

There’s some performance tuning that could be done at that point by moving the find module back out of the sub tasks, but that comes with its own challenges.
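
One possible way to search only once is to hand find one pattern per customer. This is a sketch under assumptions: selected_customer_ids is a hypothetical variable, and glob patterns replace the regex used earlier.

```yaml
- name: Find the files of all selected customers in a single pass
  ansible.builtin.find:
    paths: "{{ playbook_dir }}/some_dir"
    # Builds one glob per ID, e.g. customer1_*.txt and customer2_*.txt
    patterns: "{{ selected_customer_ids | product(['_*.txt']) | map('join') | list }}"
  register: _find_customers
  vars:
    selected_customer_ids:  # hypothetical list, not part of the original post
      - customer1
      - customer2
```

Each entry in _find_customers.files still carries its path, so the customer ID can be derived from the file name the same way the first playbook does.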


Find the files and register the output. For example,

    - find:
        paths: "{{ playbook_dir }}/{{ mo_path }}"
      register: out
      vars:
        mo_path: srv/orders/local

Create a list of the files

    mo_filenames: "{{ out.files
                      | map(attribute='path')
                      | map('basename') }}"

gives

    mo_filenames:
    - Ocustomername7.+2026+january+29.txt
    - Ocustomername1.+2023+february+1.txt
    - Ocustomername5.+2020+may+14.txt
    - Ocustomername2.+2025+october+1923.txt

Create a list of the customers with at least one order

    mo_customers: "{{ mo_filenames
                      | map('split', '.')
                      | map('first')
                      | unique
                      | sort }}"

gives

    mo_customers:
    - Ocustomername1
    - Ocustomername2
    - Ocustomername5
    - Ocustomername7

Create a dictionary of customers with orders

    mo_dict: |
      {% filter from_yaml %}
      {% for c in mo_customers %}
      {{ c[1:] }}: {{ mo_filenames | select('match', (c ~ '.') | regex_escape) }}
      {% endfor %}
      {% endfilter %}

gives

    mo_dict:
        customername1:
        - Ocustomername1.+2023+february+1.txt
        customername2:
        - Ocustomername2.+2025+october+1923.txt
        customername5:
        - Ocustomername5.+2020+may+14.txt
        customername7:
        - Ocustomername7.+2026+january+29.txt

Now, use the dictionary. For example, iterate the customers

    - debug:
        msg: >
          Customer: {{ item.market_order_name }}
          Orders: {{ mo_dict[item.market_order_name] | d([]) }}
      loop: "{{ customers }}"

gives (abridged)

    msg: |-
        Customer: customername1 Orders: ['Ocustomername1.+2023+february+1.txt']

    msg: |-
        Customer: customername2 Orders: ['Ocustomername2.+2025+october+1923.txt']

    msg: |-
        Customer: customername42 Orders: []

    msg: |-
        Customer: customername69 Orders: []
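
To get back to the structure from the question, the dictionary can be merged into the customers list. The sketch below rests on several assumptions: the order directory is /srv/orders/local (held in a hypothetical mo_dir variable), only the first file per customer is needed, and customers_resolved is just an illustrative name for the result.

```yaml
- name: Fill in file name and path for every customer from the inventory
  ansible.builtin.set_fact:
    customers_resolved: >-
      {{ customers_resolved | default([]) + [
        item | combine({
          "market_order_filename": _filename,
          "market_order_path": (mo_dir ~ "/" ~ _filename) if _filename else ""
        })
      ] }}
  vars:
    mo_dir: /srv/orders/local  # assumption: directory taken from the question
    # First order file of this customer, or an empty string if there is none
    _filename: "{{ mo_dict[item.market_order_name] | default([]) | first | default('') }}"
  loop: "{{ customers }}"
  loop_control:
    label: "{{ item.market_order_name }}"
```

Customers without a matching file keep their empty market_order_filename and market_order_path fields, so later tasks can skip them with a simple when condition.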



Hey, what can I say, you guys are the best! Both solutions have already helped me take a decisive step forward! Thank you very much for your input!