Parsing XML where attribute has period in it

Hi all…I am trying to parse an XML file. The piece I need to grab is slightly different between versions of the XML.

Old Version that works:
XML:

<dataSource 
        <properties serverName="RDS-NAME"/>
</dataSource>

Ansible Code (after grabbing XML using Xpath):

ansible.builtin.set_fact:
  server_name:  "{{ db_engine.matches[0].properties['serverName'] }}"

So that properly returns RDS-NAME

New Version that I can’t get to work:
XML:

<dataSource 
        <properties.oracle serverName="RDS-NAME"/>
</dataSource>

Ansible Code (after grabbing XML using Xpath):

ansible.builtin.set_fact:
  server_name:  "{{ db_engine.matches[0].properties.oracle['serverName'] }}"

I get an error saying that it has no attribute 'properties' … so I think that because there is now a period in the XML (properties.oracle), my Ansible code is seeing properties and oracle as separate.

I have tried various escape characters (/, //, \, \\) before the period, and enclosing the name in quotes ("{{ db_engine.matches[0].'properties.oracle'['serverName'] }}"), yet I cannot get it to read properly.

Any help is greatly appreciated!!! Thanks!

This worked for me:

    - name: Get XML data
      community.general.xml:
        path: stuff.xml
        xpath: '/dataSource/properties.oracle'
        content: attribute
      register: db_engine

    - name: Show serverName
      ansible.builtin.debug:
        msg: "{{ db_engine.matches[0]['properties.oracle']['serverName'] }}"

@mcen1 This didn’t work for me …

I should have specified that I have my Xpath set as xpath: /server/dataSource/* so I can grab various other pieces.

So let me be more specific:

XML:

<server 
    <dataSource 
            <jdbcDriver id="oracle" libraryRef="Lib" />
            <properties.oracle serverName="RDS-NAME"/>
    </dataSource>
</server>

When I have this for Ansible Code, using my Xpath and your suggested format:

- name: Get database engine information for Tririga
  community.general.xml:
    path: "XML-FILE"
    xpath: /server/dataSource/*
    content: attribute
  register: db_engine

- name: Echo variables
  ansible.builtin.debug:
    msg:
      - "{{ db_engine.matches[0]['properties.oracle']['serverName'] }}"

I now get the error 'dict object' has no attribute 'properties.oracle'

Thanks!

I believe you’re receiving that error because there are multiple entries under the dataSource element in your XML, so you can’t rely on list index 0 to always include “properties.oracle”. Given the example you have above, db_engine.matches[0] probably only contains jdbcDriver, and properties.oracle is in db_engine.matches[1] (depending on if you’re still cutting out content that’s in the XML file). I’m basing this off my local copy of what you have here (I added another XML element called testItem for fun), which results in this:

TASK [show] *******************************************************************************************************************************************************************************************

ok: [localhost] => {

    "msg": "db_engine.matches[0] is {'jdbcDriver': {'id': 'oracle', 'libraryRef': 'Lib'}} db_engine.matches[1] is {'testItem': {'id': 'example'}} db_engine.matches[2] is {'properties.oracle': {'serverName': 'RDS-NAME'}}"

}

You’ll need to loop through the “db_engine.matches” list for the information you’re after, in some way. Sorry I can’t be more specific as I don’t know enough about your code to recommend what to do exactly, but hopefully this is helpful.
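
For what it's worth, one way to do that looping is sketched below. It assumes db_engine was registered by the community.general.xml task shown earlier and that each entry in matches is a single-key dict, as in the debug output above. The task name is made up for illustration.

```yaml
# Sketch only: assumes db_engine comes from the community.general.xml task above.
- name: Show serverName from whichever match holds properties.oracle
  ansible.builtin.debug:
    # Bracket notation keeps the dotted key in one piece
    msg: "{{ item['properties.oracle']['serverName'] }}"
  loop: "{{ db_engine.matches }}"
  # Skip entries (such as jdbcDriver) that do not have the key
  when: "'properties.oracle' in item"
```

The when condition is checked per item, so entries without the key are skipped rather than raising the 'dict object' has no attribute error.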

First, fix your XML. You are missing a closing “>” after both “<server” and “<dataSource”.

Once you fix that, you’ll find the error message is actually correct. The dict object “db_engine.matches[0]” indeed does not have a “properties.oracle” attribute. But dict object “db_engine.matches[1]” does.

You probably want a general expression that will work with multiples, like in this example:

---
# jrglynn2_01.yml
- name: XML parsing
  hosts: localhost
  gather_facts: false
  vars:
    xmlfilename: /tmp/jrglynn2_01.xml
  tasks:
    - name: Create XML file
      ansible.builtin.copy:
        content: |
          <server>
              <dataSource>
                      <jdbcDriver id="oracle" libraryRef="Lib" />
                      <properties.oracle serverName="RDS-NAME-One"/>
                      <properties.oracle serverName="RDS-NAME-Two"/>
                      <something  from="nowhere" without="meaning" />
              </dataSource>
          </server>
        dest: '{{ xmlfilename }}'

    - name: Get database engine information for Tririga
      community.general.xml:
        path: '{{ xmlfilename }}'
        xpath: /server/dataSource/*
        content: attribute
      register: db_engine

    - name: Echo db_engine
      ansible.builtin.debug:
        msg: '{{ db_engine }}'

    - name: Echo serverName attribute(s) from properties.oracle(s)
      ansible.builtin.debug:
        msg: "{{ db_engine.matches
                 | map('dict2items')
                 | map('selectattr','key','eq','properties.oracle')
                 | flatten(1)
                 | map(attribute='value.serverName')
              }}"

Hope that helps. Good luck.


That was it - looking at [1] worked. I didn’t realize that when I changed the Xpath, it changed the number references for matches.

Thanks all!

Be careful hard-coding that index. I don’t think the ordering is guaranteed, in which case an expression like

             "{{ db_engine.matches
                 | map('dict2items')
                 | map('selectattr','key','eq','properties.oracle')
                 | flatten(1)
                 | map(attribute='value.serverName')
                 | first
              }}"

will give you the serverName regardless of the order, but {{ db_engine.matches[1]['properties.oracle']['serverName'] }} will give you an error if the order ever changes.
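
If the filter chain feels heavy, a shorter alternative (not what was posted above, just a sketch) is to merge the list of single-key dicts into one dict with the combine filter and then look the key up with brackets. This assumes each element name appears only once under dataSource. With duplicate names, the later entry overwrites the earlier one, so the dict2items version is the safer choice there.

```yaml
# Sketch only: assumes db_engine.matches is a list of single-key dicts
# and that properties.oracle appears at most once.
- name: Set server_name without relying on list order
  ansible.builtin.set_fact:
    # combine merges the list of dicts into a single dict keyed by element name
    server_name: "{{ (db_engine.matches | combine)['properties.oracle']['serverName'] }}"
```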

It looks like the issue you’re facing stems from how Ansible handles attribute names with periods. In your new XML version, properties.oracle is interpreted as a nested dictionary, so you need to adjust your Ansible code accordingly. Try accessing the attribute like this:

ansible.builtin.set_fact:
  server_name: "{{ db_engine.matches[0].properties['oracle']['serverName'] }}"

Ensure that you correctly navigate through the nested dictionary to fetch the serverName attribute. If the issue persists, double-check that properties.oracle exists within your XML data and isn’t being overlooked.

That’s not what’s going on, though. “properties.oracle” really is a key, with the dot in it. If you dump out the registered variable, you get:

    actions:
      namespaces: {}
      state: present
      xpath: /server/dataSource/*
    changed: false
    count: 2
    failed: false
    matches:
    - jdbcDriver:
        id: oracle
        libraryRef: Lib
    - properties.oracle:
        serverName: RDS-NAME
    msg: 2

Oh that’s interesting. The ordering seems to be consistent as of now, but this does seem like a better way to do it. I will test it out. Thanks @utoddl !!

My problem with expressions like that (and I wrote that one!) is that as soon as I quit staring at it, I can’t explain how it works. A week later, I can’t remember why it works. A month later I won’t be sure if it works.

In order to remind myself what it’s doing, I have to do a series of ansible.builtin.debug tasks, the first one with just the first line or two in it, then adding single lines in subsequent debug tasks so I can see the changes in the data due to each line.
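
As a rough sketch of that step-by-step approach, applied to the expression from earlier in the thread (the task names are only illustrative, and the trailing list filter is there just to make sure the result prints as a list):

```yaml
# Sketch only: assumes db_engine was registered by the xml task earlier in the thread.
- name: Step 1 - each match as a list of key/value items
  ansible.builtin.debug:
    msg: "{{ db_engine.matches | map('dict2items') | list }}"

- name: Step 2 - keep only the items keyed properties.oracle
  ansible.builtin.debug:
    msg: "{{ db_engine.matches
             | map('dict2items')
             | map('selectattr', 'key', 'eq', 'properties.oracle')
             | list }}"

- name: Step 3 - flatten one level so the inner lists merge into one
  ansible.builtin.debug:
    msg: "{{ db_engine.matches
             | map('dict2items')
             | map('selectattr', 'key', 'eq', 'properties.oracle')
             | flatten(1) }}"

- name: Step 4 - pull serverName out of each remaining item
  ansible.builtin.debug:
    msg: "{{ db_engine.matches
             | map('dict2items')
             | map('selectattr', 'key', 'eq', 'properties.oracle')
             | flatten(1)
             | map(attribute='value.serverName')
             | list }}"
```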

Perhaps it’s just me, but powerful as these expressions can be, they can be equally as opaque. I have some - in production no less - that have “| map('map',[…])” in them, and they’d better be documented in the extreme. Don’t edit them without a buddy, and don’t commit/push changes to them unless both of you are convinced it’s solving more problems than it’s creating. And not on a Friday.

Yeah I am with you @utoddl . I try to make a LOT of comments in the code itself, plus I keep a “code examples/best practices” document for more detail. Right now I am pretty much the only one doing automation, but supposedly more on the team want to get into it …

I haven’t had a chance to play with this yet, hopefully tomorrow - do you know offhand if I can use wildcards? So instead of map('selectattr','key','eq','properties.oracle'), use map('selectattr','key','eq','properties*')?

Then I can use one expression for either version shown above … where the file might have properties or properties.oracle

If you don’t know that is fine, I can experiment myself.

Thanks again!

Not with “eq”, but you can use other tests, like “match” and “search”:

    - name: Echo serverName attribute(s) from properties.oracle(s)
      ansible.builtin.debug:
        msg: "{{ db_engine.matches
                 | map('dict2items')
                 | map('selectattr','key','eq','properties.oracle')
                 | flatten(1)
                 | map(attribute='value.serverName')
              }}"

    - name: Echo serverName attribute(s) from properties.*
      ansible.builtin.debug:
        msg: "{{ db_engine.matches
                 | map('dict2items')
                 | map('selectattr','key','match','properties.*')
                 | flatten(1)
                 | map(attribute='value.serverName')
              }}"

    - name: Echo serverName attribute(s) from .*orac.*
      ansible.builtin.debug:
        msg: "{{ db_engine.matches
                 | map('dict2items')
                 | map('selectattr','key','search','orac')
                 | flatten(1)
                 | map(attribute='value.serverName')
              }}"
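
One caveat, offered as a sketch rather than a recommendation: the match test takes a regular expression anchored at the start of the key, so 'properties.*' would also accept keys such as properties.db2 or propertiesFoo if they ever show up. If only properties and properties.oracle should qualify, a tighter pattern along these lines might do. The [.] character class is used only to avoid backslash escaping inside the quoted string.

```yaml
    - name: Echo serverName attribute(s) from properties or properties.oracle only
      ansible.builtin.debug:
        msg: "{{ db_engine.matches
                 | map('dict2items')
                 | map('selectattr','key','match','properties([.]oracle)?$')
                 | flatten(1)
                 | map(attribute='value.serverName')
              }}"
```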