Extracting parts of a string

dulhaver · April 14, 2024, 12:23pm

This is kind of a follow-up question to Unarchive - using wildcards in 'src' option.

So, I am getting a value like /opt/nexus/install/nexus-3.67.1-01-unix.tar.gz baked into a variable which I can address via nexus_tar.path.

Later down the line, this results in an unarchived directory named /opt/nexus/nexus-3.67.1-01.

Now I want to create a symlink to that directory, but only have /opt/nexus/install/nexus-3.67.1-01-unix.tar.gz to work with.

How do I extract nexus-3.67.1-01 from /opt/nexus/install/nexus-3.67.1-01-unix.tar.gz in order to use it in a new path? I think what is needed here is Python string manipulation knowledge (which I am apparently lacking).

I can figure out something like this

❯ python

>>> path = '/opt/nexus/install/nexus-3.67.1-01-unix.tar.gz'
>>> short_path = path.replace("/opt/nexus/install/","")
>>> version = short_path.replace("-unix.tar.gz" , "")
>>> version
'nexus-3.67.1-01'

or

>>> import re
>>> file_path = "/opt/nexus/install/nexus-3.67.1-01-unix.tar.gz"
>>> re.search(r'(nexus-.*?)-unix\.tar\.gz', file_path).group(1)
'nexus-3.67.1-01'

which, besides probably being horrible Python, makes me confident that it is something along those lines.

Can anybody kindly explain how to translate something along those Python lines into something like, e.g.,

- name: extract nexus-version
  set_fact:
     nexus_path: "/opt/nexus/{{ nexus_tar.path ..... }}"

or is there another approach to it?

chris · April 14, 2024, 1:30pm

I posted an answer for that in the last thread. You can use the basename filter to get just the filename and the regex_replace filter to remove the start and the end of the filename, for example, something like this:

- name: Set a fact for the Nexus version
  ansible.builtin.set_fact:
    nexus_version: >-
      {{ nexus_path |
      ansible.builtin.basename |
      ansible.builtin.regex_replace('^nexus-') |
      ansible.builtin.regex_replace('-unix[.]tar[.]gz$') }}
  vars:
    nexus_path: /opt/nexus/install/nexus-3.67.1-01-unix.tar.gz
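
For anyone who thinks in Python terms, as in the original question, a rough equivalent of that filter chain might look like the sketch below. The function name nexus_version is made up for illustration, and it assumes regex_replace with a single argument behaves like re.sub with an empty replacement string.

```python
import os.path
import re


def nexus_version(tar_path: str) -> str:
    """Rough Python counterpart of the filter chain:
    basename | regex_replace('^nexus-') | regex_replace('-unix[.]tar[.]gz$')
    """
    name = os.path.basename(tar_path)              # nexus-3.67.1-01-unix.tar.gz
    name = re.sub(r'^nexus-', '', name)            # drop the leading "nexus-"
    return re.sub(r'-unix[.]tar[.]gz$', '', name)  # drop the trailing suffix


version = nexus_version('/opt/nexus/install/nexus-3.67.1-01-unix.tar.gz')
print(version)                        # 3.67.1-01
print('/opt/nexus/nexus-' + version)  # /opt/nexus/nexus-3.67.1-01
```

Note that the result is the bare version (3.67.1-01), so the nexus- prefix has to be added back when building the directory path for the symlink.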

dulhaver · April 14, 2024, 7:51pm

chris:

- name: Set a fact for the Nexus version
  ansible.builtin.set_fact:
    nexus_version: >-
      {{ path |
      ansible.builtin.basename |
      ansible.builtin.regex_replace('^nexus-') |
      ansible.builtin.regex_replace('-unix[.]tar[.]gz$') }}
  vars:
    nexus_path: /opt/nexus/install/nexus-3.67.1-01-unix.tar.gz

It’s almost that.

path and nexus_path need to match for it to work. Would you mind adjusting that in your post, so I can mark it as the solution?

chris · April 15, 2024, 9:20am

Oops, thanks for spotting the typo.

One other thing that could optionally be added is a version test in case the file name format changes, e.g. something like nexus_version is ansible.builtin.version('1.0.0', '>') (version is a test, so it is applied with "is" rather than with a pipe).

vbotka · April 16, 2024, 2:46am

path: /opt/nexus/install/nexus-3.67.1-01-unix.tar.gz
nexus_split: "{{ path | basename | split('-') }}"
nexus_path: "{{ nexus_split[0:3] | join('-') }}"
nexus_version: "{{ nexus_split[1:3] | join('-') }}"
nexus_version_major: "{{ nexus_split[1] }}"
nexus_version_minor: "{{ nexus_split[2] }}"

gives

nexus_path: nexus-3.67.1-01
nexus_version: 3.67.1-01
nexus_version_major: 3.67.1
nexus_version_minor: '01'
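
In terms of the Python from the original question, the same split-and-join idea could be sketched as follows. This is only an illustration: it assumes the file name always has the form nexus-VERSION-BUILD-unix.tar.gz, so that splitting on '-' yields exactly four parts.

```python
import os.path

path = '/opt/nexus/install/nexus-3.67.1-01-unix.tar.gz'

# basename, then split on '-': ['nexus', '3.67.1', '01', 'unix.tar.gz']
nexus_split = os.path.basename(path).split('-')

nexus_path = '-'.join(nexus_split[0:3])     # 'nexus-3.67.1-01'
nexus_version = '-'.join(nexus_split[1:3])  # '3.67.1-01'

print(nexus_path)
print(nexus_version)
```

The slices depend on the position of each part, so a file name with an extra or missing dash would silently give a different result.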

utoddl · April 16, 2024, 9:44pm

We tackled a very similar problem just last week, but with a complication: the top-level directories of the .tar.gz or .tgz archives don’t match their files’ basenames. Not even close. There’s marketing involved.

So we have to examine the tar archive’s contents to get the correct directory name, like this:

    - name: Inspect archive contents for app directory name
      ansible.builtin.shell: |
        set -o pipefail
        tar tf {{ our_app_staging_dir }}/{{ survey_app_tar_file }} | sed -n -e '1s/\/.*//g' -e 1p
      changed_when: false
      register: app_dir_name
      run_once: true  # noqa run-once[task]

In the course of developing that solution, we hit another problem that someone will surely say is an example of why you should favor modules over shell scripts. And they aren’t exactly wrong.

Our first attempt had a “head -n 1” in our pipeline after the tar command, but (surprise!) that causes the task to fail! It returns the right data, but head quits reading its input stream once it receives enough data to satisfy its parameters. This causes the tar process to fail, which triggers the pipefail to fail the script. In contrast, sed reads the entire stream even though it’s only ever going to print the 1st line. That’s what the “-e 1p” does, obviating the need for a head process in the pipeline.

I hope someone can benefit from our mistakes. We learned something. (Again, probably.)

system · May 16, 2024, 9:44pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
