select('search', x) returns multiple results, which breaks my code

I am using select('search') to pull results from a list of items that contain whitespace-separated strings. The search does not return only the exact string I'm searching for: it returns that string plus another one, and the extra string overrides the result, which is buggy or at least unexpected.

  • I am parsing results from nomad node status.
    Those results look like this:
  nomad_node_status_str: |-
    ID        DC   Name           Class      Drain  Eligibility  Status
    8a64d5e9  dc  be7   BE  false  eligible     ready
    7795cd1b  dc  be15  BE  false  ineligible   ready
    8db38b87  dc  be1   BE  false  eligible     ready
    ba29bbb1  dc  be2   BE  false  eligible     ready
    a9635b5c  dc  be9   BE  false  eligible     ready
    0d10c06a  dc  fe92  FE  false  eligible     ready
    ca965927  dc  fe46  FE  false  eligible     ready
  • I am then turning that into a list
  nomad_node_status_list:
  - ID        DC   Name           Class      Drain  Eligibility  Status
  - 8a64d5e9  dc  be7   BE  false  eligible     ready
  - 7795cd1b  dc  be15  BE  false  ineligible   ready
  - 8db38b87  dc  be1   BE  false  eligible     ready
  - ba29bbb1  dc  be2   BE  false  eligible     ready
  - a9635b5c  dc  be9   BE  false  eligible     ready
  - 0d10c06a  dc  fe92  FE  false  eligible     ready
  - ca965927  dc  fe46  FE  false  eligible     ready

I then refine those results here (only because I'm trying to debug this; otherwise I would have done it all in one action):

- name: restructure the output of nomad into a more usable object
  set_fact:
    # grab the id and the hostname, which is the first item in the new list of text
    # make it a list of dicts
    node_host_map: "{{ node_host_map | default ([]) + [{_id:_text.0}] }}"
  # remove the first item in the list which is the headers
  loop: "{{ nomad_node_status_list[1:] }}"
  vars:
    # get the 3rd and beyond items in the whitespace separated string
    _text: "{{ item.split()[2:] }}"
    # get the ID which is the first item in the whitespace separated string
    _id: "{{ item.split() | first }}"
- debug:
    var: node_host_map

Those results look like this now:

  node_host_map:
  - 8a64d5e9: be7
  - 7795cd1b: be15
  - 8db38b87: be1
  - ba29bbb1: be2
  - a9635b5c: be9
  - 0d10c06a: fe92
  - ca965927: fe46
  • I am then querying that variable using the select filter like this, in a loop over the nomad IPs (see _node_id in the vars section):
- name: set fact for all_node_host_map including all hostnames/ips including shutdown ones
  set_fact:
    line: "{{ line | default ([]) + _line }}"
    results: "{{ results | default ([]) + [ {idx:[_nomad_host_ip_item, _hostvars_hostname, _nomad_hostname_ip_map_hostname, _node_id]} ] }}"
  loop: "{{ groups.nomad_instances_all }}"
  loop_control:
    loop_var: nomad_host_ip_item
    index_var: idx
  vars:
    _nomad_host_ip_item: "{{ nomad_host_ip_item }}"
    _hostvars_hostname: "{{ hostvars[nomad_host_ip_item]['ansible_hostname'] }}"
    _nomad_hostname_ip_map_hostname: "{{ nomad_hostname_ip_map[nomad_host_ip_item] }}"
    # for one element only, the one for be1, it's returning two items, be1 and be15.  since be15 is last, that's where the bug was
    # example for fe46 which looks like the rest, as expected, a single result for the _node_id that was filtered by select()
    #
    #- 8:
    #  - 10.100.1.60
    #  - be1
    #  - be1
    #  - - 7795cd1b: be15
    #    - 8db38b87: be1
    #- 9:
    #  - 10.100.1.46
    #  - fe46
    #  - fe46
    #  - - ca965927: fe46
    _node_id: "{{ node_host_map | select('search',_hostvars_hostname) }}"

So the full results, including _node_id, come back like this:

ok: [10.100.1.48] =>
  results:
  - 1:
    - 10.100.1.48
    - be15
    - be15
    - - 7795cd1b: be15
  - 3:
    - 10.100.1.93
    - be7
    - be7
    - - 8a64d5e9: be7
  - 5:
    - 10.100.1.92
    - fe92
    - fe92
    - - 0d10c06a: fe92
  - 6:
    - 10.100.1.25
    - be9
    - be9
    - - a9635b5c: be9
  - 7:
    - 10.100.1.91
    - be2
    - be2
    - - ba29bbb1: be2
  - 8:
    - 10.100.1.60
    - be1
    - be1
    - - 7795cd1b: be15
      - 8db38b87: be1
  - 9:
    - 10.100.1.46
    - fe46
    - fe46
    - - ca965927: fe46

Expected:
When I select the _node_id line, I expect a single-item list containing the node ID. Notice that in the be1 results I get be15 first. So when I selected that line later in a more refined setup, I was only getting node ID 7795cd1b for be1, which is erroneous.
I demonstrated it this way so that I can at least see why the results were erroneous: the item for be1 returns two result items. Why is that?

Do I simply need to add another filter? E.g. I can add | first, but that could be disastrous. I would rather work out how to select/search properly, or maybe use map or match instead, if someone can show me how to better extract the correct line from nomad_node_status_list or node_host_map.

Is this a bug?

  • ansible -v 2.14.2
    python 3.11.2

Sorry, this doesn’t directly answer your question, but have you considered using the community.general.read_csv module, or the jc csv parser via the community.general.jc filter, to parse the data and potentially make it easier to extract the exact bits needed?

Also, rather than select('search', _hostvars_hostname), would select('regex', _hostvars_hostname) return the expected result?
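
A quick way to check that suggestion is to run both tests against a plain string. This is only a sketch: the play and the result keys are made up for illustration, and it relies on my understanding that the regex test uses search-style matching unless told otherwise, so an unanchored pattern should behave the same in both.

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Compare search and regex on a plain string
      debug:
        msg:
          # unanchored pattern, search test
          search_be1_in_be15: "{{ 'be15' is search('be1') }}"
          # unanchored pattern, regex test (search semantics by default)
          regex_be1_in_be15: "{{ 'be15' is regex('be1') }}"
          # anchored at both ends, so only the exact name can match
          regex_anchored: "{{ 'be15' is regex('^be1$') }}"
```

I would expect the first two to come back true and the anchored one false, which suggests that swapping search for regex alone would not change the result.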


This is not a bug. search matches using regex, and the regex be1 matches both be1 and be15. You may be able to make it work by adding an anchor or otherwise making the regex more exact, but personally I would just construct an actual map in the first place:

- hosts: localhost
  vars:
    nomad_node_status_str: |-
      ID        DC   Name           Class      Drain  Eligibility  Status
      8a64d5e9  dc  be7   BE  false  eligible     ready
      7795cd1b  dc  be15  BE  false  ineligible   ready
      8db38b87  dc  be1   BE  false  eligible     ready
      ba29bbb1  dc  be2   BE  false  eligible     ready
      a9635b5c  dc  be9   BE  false  eligible     ready
      0d10c06a  dc  fe92  FE  false  eligible     ready
      ca965927  dc  fe46  FE  false  eligible     ready
  tasks:
    - set_fact:
        node_host_map: "{{ node_host_map | default({}) | combine({_text.0: node_host_map[_text.0] | default([]) | union([_id])}) }}"
      loop: "{{ (nomad_node_status_str.splitlines())[1:] }}"
      vars:
        _text: "{{ item.split()[2:] }}"
        _id: "{{ item.split() | first }}"

    - debug:
        msg: "{{ node_host_map }}"

Output:

TASK [set_fact] ****************************************************************
ok: [localhost] => (item=8a64d5e9  dc  be7   BE  false  eligible     ready)
ok: [localhost] => (item=7795cd1b  dc  be15  BE  false  ineligible   ready)
ok: [localhost] => (item=8db38b87  dc  be1   BE  false  eligible     ready)
ok: [localhost] => (item=ba29bbb1  dc  be2   BE  false  eligible     ready)
ok: [localhost] => (item=a9635b5c  dc  be9   BE  false  eligible     ready)
ok: [localhost] => (item=0d10c06a  dc  fe92  FE  false  eligible     ready)
ok: [localhost] => (item=ca965927  dc  fe46  FE  false  eligible     ready)

TASK [debug] *******************************************************************
ok: [localhost] =>
    msg:
        be1:
        - 8db38b87
        be15:
        - 7795cd1b
        be2:
        - ba29bbb1
        be7:
        - 8a64d5e9
        be9:
        - a9635b5c
        fe46:
        - ca965927
        fe92:
        - 0d10c06a
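
For completeness, the "make the regex more exact" option could look roughly like the sketch below. It assumes you filter the raw status lines rather than the list of dicts, and that the name column is always surrounded by spaces, as in the nomad output above. wanted_name is a hypothetical variable standing in for _hostvars_hostname.

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    # Two rows from the nomad output in this thread
    nomad_node_status_list:
      - "7795cd1b  dc  be15  BE  false  ineligible   ready"
      - "8db38b87  dc  be1   BE  false  eligible     ready"
    wanted_name: be1   # hypothetical, stands in for _hostvars_hostname
  tasks:
    - name: Unanchored search matches both rows
      debug:
        msg: "{{ nomad_node_status_list | select('search', wanted_name) | list }}"

    - name: Requiring a space on each side of the name leaves only the be1 row
      debug:
        msg: "{{ nomad_node_status_list | select('search', ' ' ~ (wanted_name | regex_escape) ~ ' ') | list }}"
```

The regex_escape filter is there so that a hostname containing regex metacharacters (a dot, for example) is matched literally. This still depends on the column layout, which is why building the map is the more robust choice.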

Thanks!!

I went with the latter option

    - set_fact:
        node_host_map: "{{ node_host_map | default({}) | combine({_text.0: node_host_map[_text.0] | default([]) | union([_id])}) }}"
      loop: "{{ (nomad_node_status_str.splitlines())[1:] }}"
      vars:
        _text: "{{ item.split()[2:] }}"
        _id: "{{ item.split() | first }}"

I noticed that in place of union this also worked

node_host_map: "{{ node_host_map | default({}) | combine({_text.0: node_host_map[_text.0] | default([]) + [_id] } ) }}"

Is there an advantage to using union, which could be a bit more of a learning curve for those who need to manage the code later? I was thinking maybe there was some functionality that benefited, such as when the _id is empty or something?

Also, what's the definition of an "actual map"?

-Brian

I basically always use union because it deduplicates items and I find it easier to read. In this situation it probably has no practical difference.
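
A small sketch of the deduplication difference, using a made-up ids variable that already contains the ID being added:

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    ids: ["8db38b87"]   # illustrative starting list
  tasks:
    - name: Add an ID that is already present
      debug:
        msg:
          # union keeps a single copy of the ID
          via_union: "{{ ids | union(['8db38b87']) }}"
          # plain list concatenation keeps both copies
          via_plus: "{{ ids + ['8db38b87'] }}"
```

With the nomad data in this thread every ID appears only once, so both forms should give the same map.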

Map is another term for a dictionary/hash, because it maps values to keys (see the "Built-in Types" page of the Python 3.12.1 documentation). Despite the name of the variable, the structure you were creating was a list, which you then had to iterate over to find the item corresponding to the key you wanted.
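
As a rough illustration, with a map keyed by hostname the lookup becomes a direct index and no searching is involved. The two entries are copied from the output earlier in the thread; the play itself is only a sketch.

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    # Map keyed by hostname, in the shape produced by the combine task
    node_host_map:
      be15: ["7795cd1b"]
      be1: ["8db38b87"]
  tasks:
    - name: Look up the node ID for be1 by key
      debug:
        msg: "{{ node_host_map['be1'] | first }}"
```

Because be1 and be15 are distinct keys, the partial-match problem from the original select('search') cannot occur here.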
