Hi - Does anyone (who understands how backslashes work in Ansible/YAML) know why both of the following tasks work:
(ansible2_15_8) rowagn@localhost:~#> cat d.yml
- hosts: all
  gather_facts: no
  vars:
    s: 'This is a string containing 1 and 2.'
    t:
      - p1_xyz
      - p2_xyz
      - p4_xyz
  tasks:
    - name: single backslash
      debug:
        msg: '{{ item }} is in s'
      loop: '{{ t }}'
      when: ( item | regex_replace('^p(\d+).*$', '\\1') ) in s
    - name: double backslash
      debug:
        msg: '{{ item }} is in s'
      loop: '{{ t }}'
      when: ( item | regex_replace('^p(\\d+).*$', '\\1') ) in s
(ansible2_15_8) rowagn@localhost:~#> ansible-playbook -i l d.yml
PLAY [all] ******************************************************************************************************************************************************
TASK [single backslash] *****************************************************************************************************************************************
ok: [localhost] => (item=p1_xyz) => {
    "msg": "p1_xyz is in s"
}
ok: [localhost] => (item=p2_xyz) => {
    "msg": "p2_xyz is in s"
}
skipping: [localhost] => (item=p4_xyz)
TASK [double backslash] *****************************************************************************************************************************************
ok: [localhost] => (item=p1_xyz) => {
    "msg": "p1_xyz is in s"
}
ok: [localhost] => (item=p2_xyz) => {
    "msg": "p2_xyz is in s"
}
skipping: [localhost] => (item=p4_xyz)
The tasks extract the number from each string in list t and then look for that number in string s. What is strange is that the second example at https://docs.ansible.com/ansible/latest/collections/ansible/builtin/regex_replace_filter.html#examples indicates the backslashes in both parameters need to be doubled, but the testing above shows double backslashes are not required in the first parameter (they are required in the second parameter).
'\1' in the second argument is a “backref” (back-reference) to the (\d+) group in the first argument. The pattern matches a “p” followed by digits and the replacement keeps only those digits.
Your list 't' has the names p1_xyz, p2_xyz and p4_xyz, so this regex extracts the digits 1, 2 and 4 from those strings.
Your string 's' contains the digits 1 and 2 but not 4. You are getting two lines of output per task, as expected.
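If it helps to see the extraction outside of Ansible, here is a minimal Python sketch of the same logic. It assumes regex_replace behaves like Python's re.sub, which this thread does not confirm. The helper name extract_number is made up for illustration; s and t are copied from the playbook.

```python
import re

# Values copied from the playbook's vars section.
s = "This is a string containing 1 and 2."
t = ["p1_xyz", "p2_xyz", "p4_xyz"]


def extract_number(item: str) -> str:
    """Replace the whole item with capture group 1 (the digits after 'p')."""
    # Raw strings keep the backslashes in \d and \1 literal for the regex engine.
    return re.sub(r"^p(\d+).*$", r"\1", item)


numbers = [extract_number(item) for item in t]
matching = [item for item in t if extract_number(item) in s]

print(numbers)   # ['1', '2', '4']
print(matching)  # ['p1_xyz', 'p2_xyz']
```

The raw strings sidestep the escaping question entirely, which is why the sketch needs no doubled backslashes.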
Thanks Matt, but I still don’t get why the first parameter (\d) MAY be double backslashed but the second parameter (\1) MUST be double backslashed. However, I’m starting to think it’s at the Python level. https://stackoverflow.com/a/33582215 says Python’s string parser causes both \d and \\d to become \d. But why? A little more searching takes me to https://docs.python.org/3/reference/lexical_analysis.html#escape-sequences, where I think I see why \\1 becomes \1 and \1 becomes a non-printable character (octal 1). But then, by analogy, \\d should become \d (it does), but why doesn’t \d become an error, since it’s not listed as a valid escape sequence?
Right, but why doesn’t the \d need to be double-backslashed? Backslash-d is regex for matching on a digit. I just don’t get why doubling the backslash is needed on the 1 but not on the d.
Part of the problem is also knowing which characters form escape sequences in Python.
\1 is an escape sequence (octal 1), equivalent to \x01, and not equivalent to the literal \1. As such, a literal \1 needs to be written in Python as \\1. \d is not an escape sequence, so the backslash is kept and it can be written as a literal \d without escaping the \.
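A small Python sketch of that difference, in case it is useful. It parses the four spellings as Python string literals and then uses them with re.sub. The helper parse_literal is a made-up name, and whether Ansible/Jinja2 goes through exactly the same parsing path is an assumption here, not something confirmed in this thread.

```python
import ast
import re
import warnings

BACKSLASH = chr(92)  # one literal backslash character


def parse_literal(source: str) -> str:
    """Parse Python string-literal source text, silencing the invalid-escape warning."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)


one_single = parse_literal(r"'\1'")   # octal escape: becomes the control character \x01
one_double = parse_literal(r"'\\1'")  # escaped backslash followed by '1'
d_single = parse_literal(r"'\d'")     # not a recognised escape: the backslash is kept
d_double = parse_literal(r"'\\d'")    # escaped backslash followed by 'd'

print(one_single == chr(1))                      # True
print(one_double == BACKSLASH + "1")             # True
print(d_single == d_double == BACKSLASH + "d")   # True

# As a replacement string, only the doubled form still reads as a back-reference.
pattern = r"^p(\d+).*$"
print(re.sub(pattern, one_double, "p1_xyz") == "1")     # True
print(re.sub(pattern, one_single, "p1_xyz") == chr(1))  # True
```

So the single and double forms of \d end up as the same two characters, while the single and double forms of \1 end up as different strings.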
There is also a difference with quoting in YAML as mentioned above, between single quotes and double quotes. But note that the behavior of YAML with quotes only applies to quotes that surround the entire YAML value. So the single quotes you have in the middle of your string do not affect the YAML quoting differences. When not using quotes surrounding the full value in YAML, you are using “Plain Style” which has different rules than both single and double quoted values.
YAML single quotes are basically equivalent to Python raw strings, where a backslash is always treated as literal. Double quotes require escaping backslashes. You can read more about the flow scalar styles of YAML at https://yaml.org/spec/1.2.2/#73-flow-scalar-styles
Thanks everyone. I’m going to chalk this up to a Python anomaly. IMO, since \d is not a valid escape sequence, Python should raise an error rather than transparently converting it into \\d.
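For what it's worth, recent Python versions do flag this case, just not as a hard error. As far as I know, an unrecognised escape such as \d in an ordinary string literal triggers a DeprecationWarning since Python 3.6 and a SyntaxWarning from 3.12. The sketch below compiles a one-line snippet and records whatever warning the running interpreter emits. The names source and pattern are illustrative only, and whether Ansible surfaces any such warning for a playbook expression is not something this checks.

```python
import warnings

# Source text of an assignment whose string literal contains the unrecognised escape \d.
source = r"pattern = '\d'"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record the warning even if it is hidden by default
    code = compile(source, "<example>", "exec")

namespace = {}
exec(code, namespace)

# The warning category depends on the Python version in use.
for w in caught:
    print(w.category.__name__, w.message)

# The backslash is kept, so the result is the two characters backslash and 'd'.
print(namespace["pattern"] == chr(92) + "d")  # True
```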