Data tagging playground

If you have heard about data tagging in the past and are wondering when we’ll finally have it, there is some news: there is now a public PR in ansible/ansible with the current WIP implementation: [WIP] Templating overhaul, implement Data Tagging by nitzmahone · Pull Request #84621 · ansible/ansible

This is highly experimental and far from done (as far as I understand), and will take some time to be completed. It’s also a massive change, and there are still quite a few known issues (see for example the changelog fragments included in the PR). For that reason, please refrain from commenting in that PR unless absolutely necessary - it’s easier to first ask somewhere else whether the comment is appropriate (in this thread, elsewhere in the forum, or on Matrix or IRC).

Anyway, I want to start this thread with a little test module that uses data tagging to mark a return value as deprecated:

#!/usr/bin/python
from ansible.module_utils.basic import AnsibleModule
from ansible.module_utils.datatag import AnsibleTagHelper
from ansible.module_utils.datatag.tags import Deprecated

def main():
    m = AnsibleModule(argument_spec=dict())
    m.exit_json(
        a='normal result',
        b=AnsibleTagHelper.tag('deprecated result', Deprecated(msg="Yo, this is deprecated!", removal_version="2.3.4")),
    )

if __name__ == '__main__':
    main()

When registering the result and using the deprecated return value, you get:

TASK [Output result B] ***********************************************************************************************************************************************************************
[WARNING]: Deprecation warnings can be disabled by setting `deprecation_warnings=False` in ansible.cfg.
[DEPRECATION WARNING]: Yo, this is deprecated! This feature will be removed in version 2.3.4.
Origin: /path/to/playbook.yml:13:14

11     - name: Output result B
12       debug:
13         msg: "{{ result.b }}"
                ^ column 14

ok: [localhost] => {
    "msg": "deprecated result"
}

If you don’t use the deprecated return value, no deprecation message will be shown. (See my gist for a playbook and full result.)

Please note that right now the Deprecated tag does not allow specifying the collection’s name (unlike the module.deprecate() call). I’m sure this will get added (it should actually be pretty simple to add; the hard work is all the other functionality and machinery that tagging needs).

It should also be pretty simple to add a module util to a collection that wraps the data tagging imports in try/except and offers a convenience function: it tags data if the user runs ansible-core 2.19+, and returns the data untagged on older versions. That will allow collections to use this feature once it’s released, without having to wait until all ansible-core versions supported by the collection have it (which tends to take a few years longer).
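A minimal sketch of such a module util, to make the idea concrete. The import paths and the Deprecated(msg=..., removal_version=...) signature are taken from the test module above and reflect the WIP PR, so they may well change before release; the names HAS_DATATAGGING and tag_deprecated are made up for this example.

```python
# Hypothetical collection module util, e.g. plugins/module_utils/datatag_compat.py.
# Assumption: the import paths below are the ones used in the WIP PR and may change.
try:
    from ansible.module_utils.datatag import AnsibleTagHelper
    from ansible.module_utils.datatag.tags import Deprecated
    HAS_DATATAGGING = True
except ImportError:
    # Older ansible-core (or a different final API): fall back to no tagging.
    HAS_DATATAGGING = False


def tag_deprecated(value, msg, removal_version):
    """Return value tagged as deprecated if tagging is available, else value unchanged."""
    if not HAS_DATATAGGING:
        return value
    return AnsibleTagHelper.tag(
        value,
        Deprecated(msg=msg, removal_version=removal_version),
    )
```

A module would then call something like m.exit_json(b=tag_deprecated('deprecated result', msg='...', removal_version='2.3.4')) and get the deprecation warning only where the controller supports it.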

Please use this thread to discuss this, share ideas, tests, etc.!


One observation: module parameters never seem to be tagged. (At least the Deprecated tag isn’t sent in.)

You can test whether a variable x is tagged with Deprecated with Deprecated in AnsibleTagHelper.tag_types(x).

If you’re interested in the details of the Deprecated tag, you can do something like [(tag.msg, tag.removal_version, tag.removal_date) for tag in AnsibleTagHelper.tags(x) if isinstance(tag, Deprecated)].
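Put together, the two checks above could look roughly like this. It is only a sketch: the imports are the WIP ones from the test module, the helper names is_deprecated and deprecation_details are invented here, and the fallback branch simply treats everything as untagged when the imports are not available.

```python
# Assumption: these import paths come from the WIP PR and may change.
try:
    from ansible.module_utils.datatag import AnsibleTagHelper
    from ansible.module_utils.datatag.tags import Deprecated
except ImportError:
    AnsibleTagHelper = None
    Deprecated = None


def is_deprecated(x):
    """True if x carries a Deprecated tag (always False without data tagging)."""
    if AnsibleTagHelper is None:
        return False
    return Deprecated in AnsibleTagHelper.tag_types(x)


def deprecation_details(x):
    """List (msg, removal_version, removal_date) for every Deprecated tag on x."""
    if AnsibleTagHelper is None:
        return []
    return [
        (tag.msg, tag.removal_version, tag.removal_date)
        for tag in AnsibleTagHelper.tags(x)
        if isinstance(tag, Deprecated)
    ]
```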

In plugins you have access to more tags; see ansible.utils.datatag.tags for details. Currently there are AnsibleSourcePosition, VaultedValue, TrustedAsTemplate, NotATemplate, and _EncryptedSource (that one is for internal use only).

I tried to submit a PR for ansible-core with the deps feature we use in community.general, and I was told that it should wait for data tagging. Any idea how dependency management is going to be impacted by data tagging? I am not going to munch through that PR with 800+ files just to find out.

In fact, applying an agile mindset here, a PR that big is quite a risk. Why is a change that big being pushed forward? Wouldn’t it be easier to manage a number of smaller changes rather than one big one? Just wondering.

It likely isn’t, but if you look at how many parts of ansible-core are touched by the data tagging PR, I think they want to merge only absolutely necessary things until data tagging is merged. Apparently they already sometimes have to spend days rebasing to resolve conflicts when something is merged to devel.

In fact, applying an agile mindset here, a PR that big is quite a risk. Why is a change that big being pushed forward? Wouldn’t it be easier to manage a number of smaller changes rather than one big one? Just wondering.

No idea, I’ve wondered about that as well… My guess is that developing this feature took so much exploration that at some point it was easier to create one gigantic PR than to try to split it up. But :person_shrugging:

(Data tagging has been announced for several years now, and always got delayed. I think I heard first mentions of the idea for it 5-6 years ago, back when everything was still in ansible/ansible. Context back then was deprecation of a return value of a module.)

Well, that is frustrating to some degree - it tells me that Ansible development is largely frozen until this mammoth hatches out of its egg. :person_shrugging:

ansible-core :slight_smile: Ansible is more than ansible-core… But yeah, that definitely seems to be the case.

If you’re curious what communication from the module to the controller looks like: it’s still JSON, with special elements used to carry the tags:

{
  "a": "normal result",
  "b": {
    "value": "deprecated result",
    "tags": [
      {
        "msg": "Yo, this is deprecated!",
        "removal_version": "2.3.4",
        "__ansible_type": "Deprecated"
      }
    ],
    "__ansible_type": "_AnsibleTaggedStr"
  },
  "invocation": {
    "module_args": {}
  }
}

So basically, if you get a dict with a __ansible_type key, then this dict encodes a value that can have tags, and you have to decode it according to the value of __ansible_type (in many cases that just means taking what’s in value, like in this example). The protocol also allows transporting some Python objects such as dates, datetimes, or times via JSON.
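To illustrate the decoding rule, here is a simplified sketch that uses only the standard library and only the shape visible in the example above. It is not ansible-core's decoder: it unwraps any dict that has both __ansible_type and value, collects the attached tags, and leaves every other encoded type (dates and so on) untouched. The function name untag is made up.

```python
import json

TYPE_KEY = "__ansible_type"


def untag(obj, found_tags, path=""):
    """Return obj with tagged wrappers replaced by their plain value.

    Every tag encountered is appended to found_tags as (path, tag_dict).
    Dicts with a type marker but no "value" key are left as they are.
    """
    if isinstance(obj, dict):
        if TYPE_KEY in obj and "value" in obj:
            for tag in obj.get("tags", []):
                found_tags.append((path, tag))
            return untag(obj["value"], found_tags, path)
        return {
            key: untag(val, found_tags, f"{path}.{key}" if path else key)
            for key, val in obj.items()
        }
    if isinstance(obj, list):
        return [untag(val, found_tags, f"{path}[{i}]") for i, val in enumerate(obj)]
    return obj


raw = """
{
  "a": "normal result",
  "b": {
    "value": "deprecated result",
    "tags": [
      {"msg": "Yo, this is deprecated!", "removal_version": "2.3.4",
       "__ansible_type": "Deprecated"}
    ],
    "__ansible_type": "_AnsibleTaggedStr"
  },
  "invocation": {"module_args": {}}
}
"""

tags = []
plain = untag(json.loads(raw), tags)
print(plain)
for where, tag in tags:
    print(where, tag[TYPE_KEY], tag.get("msg"))
```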

This information also allows non-Python modules to return tags, or receive tags (if that will ever happen - I hope there will be some no_log tag that’s passed to modules, but :person_shrugging:).
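For the "return tags" direction, a module that cannot import the Python helpers would presumably just emit the same structure by hand. The sketch below builds it with plain dicts and the json module, standing in for whatever language the module is written in. The key names are copied from the output shown above; the helper names are invented, and whether the controller accepts hand-built output in every case is something I have not verified.

```python
import json


def deprecated_tag(msg, removal_version):
    """Build the JSON form of a Deprecated tag, as seen in the module output above."""
    return {
        "msg": msg,
        "removal_version": removal_version,
        "__ansible_type": "Deprecated",
    }


def tagged_str(value, tags):
    """Wrap a string and its tags in the _AnsibleTaggedStr shape shown above."""
    return {
        "value": value,
        "tags": list(tags),
        "__ansible_type": "_AnsibleTaggedStr",
    }


result = {
    "a": "normal result",
    "b": tagged_str(
        "deprecated result",
        [deprecated_tag("Yo, this is deprecated!", "2.3.4")],
    ),
}

# A module prints its result as JSON on stdout.
payload = json.dumps(result)
print(payload)
```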