Combine ansible.builtin.set_fact and with_items loop

Okay, let me briefly explain what’s on my mind right now.

I have a directory where another system stores files that I need to continue working with. The following files are currently located there, for example:

/srv/orders/local/Ocustomername1.+2023+february+1.txt
/srv/orders/local/Ocustomername5.+2020+may+14.txt
/srv/orders/local/Ocustomername2.+2025+october+1923.txt
/srv/orders/local/Ocustomername7.+2026+january+29.txt

The contents of the /srv/orders/local/ directory change, and I don’t know what the individual files are actually called. The only clue is the customer’s name, e.g., “customername1.”

Now I need to determine and store the file name and the full path (including the file name) for later use. To do this, I wrote the following short sequence of tasks for customername2:

---
   
- name: "Determine customer file name."
  ansible.builtin.command:
    cmd: find /srv/orders/local/ -name Ocustomername2.*.txt
  register: market_ordername
  changed_when: market_ordername.rc != 0

- name: "Make a note of the file name and path of the order file."
  ansible.builtin.set_fact:
    market_order_path: "{{ market_ordername.stdout }}"
    market_order_filename: "{{ market_ordername.stdout[19:] }}"
    cacheable: true
 
- name: "Order filename"
  ansible.builtin.debug:
    msg: "The order filename is: {{ market_order_filename }}"
   
- name: "Order path."
  ansible.builtin.debug:
    msg: "The whole path of the order is:  {{ market_order_path }}"
   
...

So far so good. That works without any problems, as I only pick one name from the order list. However, there are around 50 different files in the directory, and I need the data for not just one customer, but for four or 21 selected customers!

I therefore have a corresponding array in the inventory:

customers:
  - info :                 'Huber'
    market_order_name:     'customername1'
    market_order_filename: ''
    market_order_path:     ''
  - info :                 'Mustermann'
    market_order_name:     'customername2'
    market_order_filename: ''
    market_order_path:     ''
  - info :                 'Maier'
    market_order_name:     'customername42'
    market_order_filename: ''
    market_order_path:     ''
  - info :                 'Schneider'
    market_order_name:     'customername69'
    market_order_filename: ''
    market_order_path:     ''
 

If I knew all the customer data, I could put it in this array and later access it in a task using with_items. Unfortunately, I don’t have this data, because the contents of the directory in question change frequently.
I have to laboriously extract the data from the directory myself and store it so that I can access it later.

My idea was as follows. I extract the determined data (file and path name) using the task shown above and write it to the existing fields of the array using set_fact. This allows me to access all data in the array later on.

- name: "Determine customer file name."
  ansible.builtin.command:
    cmd: find /srv/orders/local/ -name Ocustomername2.*.txt
  register: market_ordername
  changed_when: market_ordername.rc != 0

- name: "Make a note of the file name and path of the order file."
  ansible.builtin.set_fact:
    market_order_path: "{{ market_ordername.stdout }}"
    market_order_filename: "{{ market_ordername.stdout[19:] }}"
    cacheable: true

So I have to run this task for every market_order_name from the array and then use set_fact to put the resulting market_order_filename and market_order_path into the respective fields of the array. That’s my naive approach.

My attempt to combine ansible.builtin.include_tasks with a with_items loop failed miserably.

But what’s the best way to do that? Maybe someone has an idea and can point me in the right direction. I’m grateful for any tips!

The find module and basename filter make this much easier IMO.

I have a data directory, some_dir, containing files such as customer1_9hMptwdqVoxgo.txt.

My playbook:

---
- hosts: localhost
  gather_facts: false
  vars:
    some_input_customer_id_to_name:
      customer1: Schneider
      customer2: Buzz
      customer3: Gates
  tasks:
    - ansible.builtin.find:
        path: "{{ playbook_dir }}/some_dir"
        patterns: customer1_.*\.txt
        use_regex: true
      register: _find_customers

    - name: Create some data dict
      ansible.builtin.set_fact:
        customer_files: >-
          {{ customer_files | default([]) + [
            {
                "info": some_input_customer_id_to_name[_market_order_name],
                "market_order_name": _market_order_name,
                "market_order_filename": _market_order_filename,
                "market_order_path": _customer_file.path
            }
          ] }}
      vars:
        _market_order_filename: "{{ _customer_file.path | basename }}"
        _market_order_name: '{{ _market_order_filename | regex_replace("(.*)_.*\.txt", "\1") }}'
      loop: "{{ _find_customers.files }}"
      loop_control:
        loop_var: _customer_file
        label: "{{ _customer_file.path | basename }}"

    - debug:
        var: customer_files

Yields the result:

ok: [localhost] => {
    "customer_files": [
        {
            "info": "Schneider",
            "market_order_filename": "customer1_9hMptwdqVoxgo.txt",
            "market_order_name": "customer1",
            "market_order_path": "/var/home/mike/git/test/some_dir/customer1_9hMptwdqVoxgo.txt"
        },
        {
            "info": "Schneider",
            "market_order_filename": "customer1_BlLXmJ52HMcG8.txt",
            "market_order_name": "customer1",
            "market_order_path": "/var/home/mike/git/test/some_dir/customer1_BlLXmJ52HMcG8.txt"
        },
        {
            "info": "Schneider",
            "market_order_filename": "customer1_kwbkRBEg82uFk.txt",
            "market_order_name": "customer1",
            "market_order_path": "/var/home/mike/git/test/some_dir/customer1_kwbkRBEg82uFk.txt"
        }
    ]
}

If you wanted to select multiple customers, I would put all of the tasks in my playbook into another file, like some_sub_tasks.yml, and then call that with include_tasks. You can loop over your customer ids and make that a variable in some_sub_tasks.yml instead of hardcoding customer#. Short example:

- hosts: localhost
  gather_facts: false
  vars:
    some_input_customer_id_to_name:
      customer1: Schneider
      customer2: Buzz
      customer3: Gates
  tasks:
    - ansible.builtin.include_tasks: some_sub_tasks.yml
      loop:
        - customer1
        - customer2
      loop_control:
        loop_var: customer_id

    # should have the full list of files for customer1 and customer2 now
    - debug:
        var: customer_files
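
For illustration, here is a minimal sketch of what some_sub_tasks.yml could look like. It assumes the same some_dir directory and customer<id>_<something>.txt naming as in the playbook above. The glob pattern, and taking the name directly from customer_id instead of extracting it with a regex, are my assumptions rather than something tested against your data:

```yaml
# some_sub_tasks.yml (hypothetical contents)
# customer_id is set by the include_tasks loop in the calling playbook.
- name: Find the files for one customer
  ansible.builtin.find:
    paths: "{{ playbook_dir }}/some_dir"
    # Shell-style glob, which is the find module's default pattern type.
    patterns: "{{ customer_id }}_*.txt"
  register: _find_customers

- name: Append this customer's files to customer_files
  ansible.builtin.set_fact:
    customer_files: >-
      {{ customer_files | default([]) + [
        {
          "info": some_input_customer_id_to_name[customer_id],
          "market_order_name": customer_id,
          "market_order_filename": _customer_file.path | basename,
          "market_order_path": _customer_file.path
        }
      ] }}
  loop: "{{ _find_customers.files }}"
  loop_control:
    # A loop_var other than item avoids clashing with the outer include loop.
    loop_var: _customer_file
    label: "{{ _customer_file.path | basename }}"
```

Because customer_files is appended to on every pass, the list keeps growing across the include_tasks iterations instead of being overwritten.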

There's some performance tuning that could be done at that point by moving the find module back out of the sub-tasks, but that comes with its own challenges.

Find the files and register the output. For example,

    - find:
        paths: "{{ playbook_dir }}/{{ mo_path }}"
      register: out
      vars:
        mo_path: srv/orders/local

Create a list of the files

    mo_filenames: "{{ out.files
                      | map(attribute='path')
                      | map('basename') }}"

gives

    mo_filenames:
    - Ocustomername7.+2026+january+29.txt
    - Ocustomername1.+2023+february+1.txt
    - Ocustomername5.+2020+may+14.txt
    - Ocustomername2.+2025+october+1923.txt

Create a list of the customers with one or more orders

    mo_customers: "{{ mo_filenames
                      | map('split', '.')
                      | map('first')
                      | unique
                      | sort }}"

gives

    mo_customers:
    - Ocustomername1
    - Ocustomername2
    - Ocustomername5
    - Ocustomername7

Create a dictionary of customers with orders

    mo_dict: |
      {% filter from_yaml %}
      {% for c in mo_customers %}
      {{ c[1:] }}: {{ mo_filenames | select('match', c) }}
      {% endfor %}
      {% endfilter %}

gives

    mo_dict:
        customername1:
        - Ocustomername1.+2023+february+1.txt
        customername2:
        - Ocustomername2.+2025+october+1923.txt
        customername5:
        - Ocustomername5.+2020+may+14.txt
        customername7:
        - Ocustomername7.+2026+january+29.txt
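
One caveat: the match test only anchors at the start of the string, so a prefix like Ocustomername1 would presumably also pick up files of a hypothetical customer Ocustomername10. A possible variant, assuming the customer name is always followed by a literal dot as in the listing above, appends that dot and escapes it with the regex_escape filter:

```yaml
mo_dict: |
  {% filter from_yaml %}
  {% for c in mo_customers %}
  {# (c ~ '.') | regex_escape yields e.g. Ocustomername1\. so the dot is matched literally #}
  {{ c[1:] }}: {{ mo_filenames | select('match', (c ~ '.') | regex_escape) }}
  {% endfor %}
  {% endfilter %}
```

With the four sample files the resulting dictionary should be the same as above; the difference only shows up when one customer name is a prefix of another.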

Now, use the dictionary. For example, iterate the customers

    - debug:
        msg: >
          Customer: {{ item.market_order_name }}
          Orders: {{ mo_dict[item.market_order_name] | d([]) }}
      loop: "{{ customers }}"

gives (abridged)

    msg: |-
        Customer: customername1 Orders: ['Ocustomername1.+2023+february+1.txt']

    msg: |-
        Customer: customername2 Orders: ['Ocustomername2.+2025+october+1923.txt']

    msg: |-
        Customer: customername42 Orders: []

    msg: |-
        Customer: customername69 Orders: []
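
If the goal is to get the values back into the fields of the customers list, as described in the question, one possible approach is to build an enriched copy with set_fact and the combine filter. This is only a sketch under a few assumptions: mo_dict is available as defined above, the files live in /srv/orders/local (taken from the question), only the first order per customer is kept, and the names customers_with_orders, _order_dir and _first_order are made up for this example:

```yaml
- name: Build a copy of the customers list with file name and path filled in
  ansible.builtin.set_fact:
    customers_with_orders: >-
      {{ customers_with_orders | default([]) + [
        item | combine({
          'market_order_filename': _first_order,
          'market_order_path': (_order_dir ~ '/' ~ _first_order) if _first_order else ''
        })
      ] }}
  vars:
    # Assumed order directory, taken from the question.
    _order_dir: /srv/orders/local
    # Empty string when the customer has no order file.
    _first_order: "{{ mo_dict[item.market_order_name] | default([]) | first | default('') }}"
  loop: "{{ customers }}"
  loop_control:
    label: "{{ item.market_order_name }}"
```

Later tasks can then loop over customers_with_orders instead of customers. Customers without a file keep empty strings in both fields, so those entries can be skipped with a when condition.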


Hey, what can I say, you guys are the best! Both solutions have already helped me take a decisive step forward! Thank you very much for your input!