Core 2.19 and Data tagging playground

felixfontein · January 29, 2025, 7:02pm

In case you heard about data tagging in the past and are wondering when we’ll finally have it, there is some news: there is now a public PR in ansible/ansible with the current WIP implementation: Templating overhaul, implement Data Tagging by nitzmahone · Pull Request #84621 · ansible/ansible · GitHub

This is highly experimental and far from done (as far as I understand), and will take some time to complete. It’s also a massive change, and there are still quite a few known issues (see for example the changelog fragments included in the PR). For that reason, please refrain from commenting in that PR unless absolutely necessary; it’s easier to first ask somewhere else whether the comment is appropriate (in this thread, elsewhere in the forum, or on Matrix or IRC).

Anyway, I want to start this thread with a little test module that uses data tagging to mark a return value as deprecated:

#!/usr/bin/python
from ansible.module_utils.basic import AnsibleModule
from ansible.module_utils.datatag import AnsibleTagHelper
from ansible.module_utils.datatag.tags import Deprecated

def main():
    m = AnsibleModule(argument_spec=dict())
    m.exit_json(
        a='normal result',
        b=AnsibleTagHelper.tag('deprecated result', Deprecated(msg="Yo, this is deprecated!", removal_version="2.3.4")),
    )

if __name__ == '__main__':
    main()

When registering the result and using the deprecated return value, you get:

TASK [Output result B] ***********************************************************************************************************************************************************************
[WARNING]: Deprecation warnings can be disabled by setting `deprecation_warnings=False` in ansible.cfg.
[DEPRECATION WARNING]: Yo, this is deprecated! This feature will be removed in version 2.3.4.
Origin: /path/to/playbook.yml:13:14

11     - name: Output result B
12       debug:
13         msg: "{{ result.b }}"
                ^ column 14

ok: [localhost] => {
    "msg": "deprecated result"
}

If you don’t use the deprecated return value, no deprecation message will be shown. (See my gist for a playbook and full result.)

Please note that right now the Deprecated tag does not allow specifying the collection’s name (as opposed to the module.deprecated() call). I’m sure this will get added (it should actually be pretty simple to add; the hard work is all the other functionality and machinery that tagging needs).

It should also be pretty simple to add a module utils file to a collection that wraps the data tagging imports in try/except and provides convenience functions that tag data if the user runs ansible-core 2.19+, and leave it untagged on older versions. That will allow collections to use this feature once it’s released, without having to wait until all ansible-core versions supported by your collections have it (which tends to take a few years longer).
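
A minimal sketch of such a compatibility helper, assuming the import paths from the WIP snippet above (they may well change before release). The names HAS_DATA_TAGGING and tag_deprecated are made up for illustration; whenever the imports fail, the value is simply returned untagged.

```python
# Hypothetical plugins/module_utils/_datatag_compat.py in a collection.
try:
    # Import paths as used in the WIP example above; an assumption, not a stable API.
    from ansible.module_utils.datatag import AnsibleTagHelper
    from ansible.module_utils.datatag.tags import Deprecated
    HAS_DATA_TAGGING = True
except ImportError:
    # Older ansible-core (or a changed API): fall back to doing nothing.
    HAS_DATA_TAGGING = False


def tag_deprecated(value, msg, removal_version=None):
    """Return value tagged as deprecated if tagging is available, else unchanged."""
    if not HAS_DATA_TAGGING:
        return value
    return AnsibleTagHelper.tag(value, Deprecated(msg=msg, removal_version=removal_version))
```

A module could then call m.exit_json(b=tag_deprecated('deprecated result', 'Yo, this is deprecated!', removal_version='2.3.4')) regardless of which ansible-core version runs it.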

Please use this thread to discuss this, share ideas, tests, etc.!

felixfontein · January 29, 2025, 8:06pm

One observation: module parameters never seem to be tagged. (At least the Deprecated tag isn’t sent in.)

You can test whether a variable x is tagged with Deprecated using the expression Deprecated in AnsibleTagHelper.tag_types(x).

If you’re interested in the details of the Deprecated tag, you can do something like [(tag.msg, tag.removal_version, tag.removal_date) for tag in AnsibleTagHelper.tags(x) if isinstance(tag, Deprecated)].

In plugins you have access to more tags; see ansible.utils.datatag.tags for details. Currently there are AnsibleSourcePosition, VaultedValue, TrustedAsTemplate, NotATemplate, and _EncryptedSource (that one is for internal use only).

russoz · February 2, 2025, 6:03am

I tried to submit a PR for ansible core with the deps feature we use in community.general and I was told that it should wait for data tagging. Any idea how dependency management is going to be impacted by data tagging? I am not going to munch through that PR with 800+ files just to find out.

In fact, applying an agile mindset here, a PR that big is quite a risk. Why is a change that big being pushed forward? Wouldn’t it be easier to manage a number of smaller changes rather than one big one? Just wondering.

felixfontein · February 2, 2025, 8:41am

It likely isn’t, but if you look at how many parts of ansible-core are touched by the data tagging PR, I think they want to merge only absolutely necessary things until data tagging is merged. Apparently they already sometimes have to spend days rebasing to resolve conflicts when something is merged to devel.

“In fact, applying an agile mindset here, a PR that big is quite a risk. Why is a change that big being pushed forward? Wouldn’t it be easier to manage a number of smaller changes rather than one big one? Just wondering.”

No idea, I’ve wondered about that as well… My guess is that developing this feature took so much exploration that at some point it was easier to create one gigantic PR than to try to split it up.

felixfontein · February 2, 2025, 8:44am

(Data tagging has been announced for several years now and has always been delayed. I think I first heard the idea mentioned 5-6 years ago, back when everything was still in ansible/ansible. The context back then was deprecating a return value of a module.)

russoz · February 2, 2025, 8:50am

Well, that is frustrating to some degree - it tells me that Ansible development is largely frozen until this mammoth hatches out of its egg.

felixfontein · February 2, 2025, 9:51am

ansible-core, that is; Ansible is more than ansible-core… But yeah, that definitely seems to be the case.

felixfontein · February 2, 2025, 12:43pm

If you’re curious what communication from the module to the controller looks like: it’s still JSON, with special elements carrying the tags:

{
  "a": "normal result",
  "b": {
    "value": "deprecated result",
    "tags": [
      {
        "msg": "Yo, this is deprecated!",
        "removal_version": "2.3.4",
        "__ansible_type": "Deprecated"
      }
    ],
    "__ansible_type": "_AnsibleTaggedStr"
  },
  "invocation": {
    "module_args": {}
  }
}

So basically, if you get a dict with a __ansible_type key, then this dict encodes a value that can have tags, and you have to decode it according to the value of __ansible_type (in many cases that just means taking what’s in value, like in this example). The protocol also allows transporting some Python objects like dates, datetimes, or times via JSON.
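
To make the decoding step concrete, here is a rough sketch that handles only the shape shown in the JSON above: a dict carrying both __ansible_type and value is replaced by its plain value, and its tags are collected on the side. The function name decode and the tags_out list are made up for illustration; this is not how ansible-core implements it, and other __ansible_type values (dates and so on) would need their own handling.

```python
import json

TYPE_KEY = "__ansible_type"


def decode(node, tags_out):
    """Return node with tagged-value envelopes replaced by their plain value.

    Every tag found is appended to tags_out as a (value, tag_dict) pair.
    Only envelopes that carry a "value" key are understood (an assumption
    based on the single example in this thread).
    """
    if isinstance(node, dict):
        if TYPE_KEY in node and "value" in node:
            value = decode(node["value"], tags_out)
            for tag in node.get("tags", []):
                tags_out.append((value, tag))
            return value
        return {key: decode(item, tags_out) for key, item in node.items()}
    if isinstance(node, list):
        return [decode(item, tags_out) for item in node]
    return node


raw = """
{
  "a": "normal result",
  "b": {
    "value": "deprecated result",
    "tags": [
      {"msg": "Yo, this is deprecated!", "removal_version": "2.3.4",
       "__ansible_type": "Deprecated"}
    ],
    "__ansible_type": "_AnsibleTaggedStr"
  },
  "invocation": {"module_args": {}}
}
"""

tags = []
result = decode(json.loads(raw), tags)
print(result["b"])        # deprecated result
print(tags[0][1]["msg"])  # Yo, this is deprecated!
```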

This information also allows non-Python modules to return tags, or receive tags (if that will ever happen; I hope there will be some no_log tag that’s passed to modules).

felixfontein · March 4, 2025, 5:30am

See Data Tagging preview and testing.

apollo13 · March 10, 2025, 7:47pm

This does indeed look interesting. Looking through the PR I didn’t find any other (useful) tags that could be used. I was hoping for a tag to mark a value as sensitive to prevent it from ever showing up in logs – is something like this planned?

felixfontein · March 10, 2025, 8:32pm

I would hope so, but so far: no idea… Right now the public API doesn’t even allow querying whether a tag is set; you can only add tags. So far there’s only “trusted” (on controller) and “deprecated” (controller and modules).

My guess is (still) that the core team first wants to get the feature out before starting to add more tags. But I don’t know if anything is planned so far… Maybe @nitzmahone has some insights to share?

nitzmahone · March 10, 2025, 10:23pm

Yes, a SensitiveData tag was part of the original PoC that I demoed at a contrib summit a few years ago, but with the huge swath of Ansible’s surface area that the feature touches (and the raft of other changes it exposed a need for), we had to drop that one from the initial release.

I really hope we’ll have time to get back to it, because IMO it’s the most compelling use case for data tagging. It’s not hard to make it work in the happy path, but (as with most things in Ansible due to its “organically grown” nature) there are lots of weird corner cases where it can’t work securely/correctly/quickly without more significant rework of the guts.

Our concern with keeping it in the initial release was that, given the inherent security implications of such a feature, it’ll be an endless source of data disclosure CVEs if we can’t work most of those problems out before it ships. Since data tagging’s template trust model inversion just got rid of one of those, you can probably understand our reticence to write a new one.

This project has also been delayed way too many times already; we really want folks to start reaping some of the other benefits! Plus, if you haven’t had the pleasure of regularly rebasing a > 850 file divergent branch of ansible-core for over two years, well, you just don’t know what a good time looks like.

apollo13 · March 11, 2025, 7:19am

“This project has also been delayed way too many times already; we really want folks to start reaping some of the other benefits!”

Makes sense, I was mainly curious. I think updating the existing preview forum post to include a bit more information about potential planned tags might help to gather more interest as well.

“Plus, if you haven’t had the pleasure of regularly rebasing a > 850 file divergent branch of ansible-core for over two years, well, you just don’t know what a good time looks like.”

Well, obviously not for ansible-core, but I had my fair share of running large side-branches for multiple years in other projects. Nothing that I plan to do again anytime soon.

felixfontein · March 11, 2025, 7:42pm

The interface changed BTW compared to the example in my original post. On the controller side (plugins), you can use

from ansible.template import trust_as_template, is_trusted_as_template

to mark strings as trusted (my_string_variable = trust_as_template(my_string_variable)) or test whether a string variable is trusted (if is_trusted_as_template(my_string_variable): ...).

In module_utils there’s

from ansible.module_utils.datatag import deprecate_value, native_type_name

where deprecate_value lets you deprecate return values (module.exit_json(value=deprecate_value(value, "This value is deprecated", removal_version="2.0.0"))) and native_type_name simplifies type names: type(my_variable) can have funky names like _AnsibleTaggedStr, while native_type_name(my_variable) or native_type_name(type(my_variable)) gives back things like str.
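
For collections that also support older ansible-core versions, a fallback for native_type_name could look like the sketch below. It assumes the import path quoted above; the fallback body is only an approximation (older versions have no tagged types, so the plain type name should already be what you want), and describe_type_error is a hypothetical helper used purely to show the call.

```python
try:
    # Import path as quoted in this post; may not exist on older ansible-core.
    from ansible.module_utils.datatag import native_type_name
except ImportError:
    def native_type_name(value):
        """Approximate fallback: name of the given type, or of the value's type."""
        kind = value if isinstance(value, type) else type(value)
        return kind.__name__


def describe_type_error(name, value, expected):
    """Build an error message that mentions plain type names only."""
    return "%s must be %s, got %s" % (
        name,
        native_type_name(expected),
        native_type_name(value),
    )
```

For example, describe_type_error("port", "80", int) yields a message that names int and str instead of any internal tagged type.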

Jeff_Pullen · April 7, 2025, 10:46pm

There are some examples here of how this looks from a module perspective, but I didn’t see how, or whether, it would be exposed to automation developers. Is this only an internal change, or does it change how you define and use variables in your playbooks and roles?

samccann · April 15, 2025, 1:13pm

@Jeff_Pullen - There is another conversation for developers starting at Making a collection compatible with data tagging - might be a place to get more info/advice.

mariolenz · April 23, 2025, 2:56pm

@Jeff_Pullen I guess it depends on what you’re doing. Maybe it would be a good idea to read the Ansible-core 2.19 Porting Guide to find out more.

It looks like there are some changes that might affect people who are only using but not developing modules. So I wouldn’t say this is an internal change only.
