LVM facts don't include all LVs and VGs

Hello all,

I'm sharing this here partly to have it recorded somewhere and partly to get some feedback from the community.

I’ve recently spent a lot of effort deep in Ansible trying to reduce the time it takes to gather facts on Linux systems.
More on that in a later post, once I have polished up a few things.

While testing some performance improvements, I noticed a bug in the way the lvs and vgs
entries are assembled into the lvm fact.

Ansible treats the names of both as globally unique, which is unfortunately not true.
They are only guaranteed to be unique relative to their parent vg and pv respectively.

In effect, the resulting fact does not contain all lvs or vgs,
because the last entry with a given name (as determined by the default sort order) always overwrites the previous one(s).
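
To make the overwrite concrete, here is a minimal Python sketch. It is not the actual Ansible fact-gathering code; the sample rows, the VG names, and the function names `collect_by_name` and `collect_by_vg_and_name` are made up for illustration. It only shows what happens when a dict is keyed by LV name alone.

```python
# Hypothetical rows of (lv_name, vg_name, size_g), in the order a report
# command might return them. The values are invented for this example.
rows = [
    ("data", "vg_system", "50.00"),
    ("root", "vg_backup", "10.00"),
    ("root", "vg_system", "20.00"),
]


def collect_by_name(rows):
    """Key the result by LV name only, so duplicate names collide."""
    lvs = {}
    for lv_name, vg_name, size_g in rows:
        # A later row with the same lv_name silently replaces the earlier one.
        lvs[lv_name] = {"vg": vg_name, "size_g": size_g}
    return lvs


def collect_by_vg_and_name(rows):
    """Key the result by (vg, lv), which is unique on a single host."""
    lvs = {}
    for lv_name, vg_name, size_g in rows:
        lvs[(vg_name, lv_name)] = {"size_g": size_g}
    return lvs


flat = collect_by_name(rows)
qualified = collect_by_vg_and_name(rows)

print(len(rows), len(flat), len(qualified))  # 3 2 3
print(flat["root"]["vg"])                    # vg_system
```

With three logical volumes on the host, the name-keyed dict only reports two, and the `root` volume in `vg_backup` is gone without any warning.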

Believing this to be a bug whose fix would also be a breaking change,
I headed over to the issue tracker, and apparently this issue was already reported quite a while ago, in 2020:
https://github.com/ansible/ansible/issues/71041

Unfortunately, no work happened on it back then, which means either that no one is really affected by this bug
or that everyone affected is working around it in private.

So I guess my real question is this:

Have you ever been bitten by this bug in your environment, and if so, are you maintaining a local workaround?

PS:
The code changes needed to make these facts match the underlying reality are not complex, and I’d be happy to provide them. The resulting breaking change in behavior is a very different story, though, and not one I feel comfortable making without upfront discussion and guidance from someone more experienced in Ansible core development.

I’ve been giving this a lot of thought, but I don’t have a good solution:

  • Make this part of a larger project to phase out setup.py and move to new modules, ansible_lvm_facts in this case, which would use a proper return schema that matches reality.
  • Change the current schema to allow lists. This is still backwards incompatible, but it only affects those who have this exact issue, and they can test for a list as a way to detect multiple matches.
  • Add a new root to the existing schema that holds a hierarchical version of the same data.
  • Create unique numbered entries with a ‘real_name’ key underneath.
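
To compare the shapes, here is a rough Python sketch of what the last three options could return for the same input. None of this is existing Ansible code or an agreed schema; the sample rows, the function names, and the `lv0`, `lv1`, ... key format are assumptions made up for illustration.

```python
# Hypothetical rows of (lv_name, vg_name, size_g); values are invented.
rows = [
    ("data", "vg_system", "50.00"),
    ("root", "vg_backup", "10.00"),
    ("root", "vg_system", "20.00"),
]


def list_on_collision(rows):
    """Lists option: keep a plain dict entry, switch to a list on duplicates."""
    out = {}
    for lv_name, vg_name, size_g in rows:
        entry = {"vg": vg_name, "size_g": size_g}
        if lv_name not in out:
            out[lv_name] = entry
        elif isinstance(out[lv_name], list):
            out[lv_name].append(entry)
        else:
            out[lv_name] = [out[lv_name], entry]
    return out


def hierarchical(rows):
    """New-root option: nest every LV under its parent VG."""
    tree = {}
    for lv_name, vg_name, size_g in rows:
        tree.setdefault(vg_name, {})[lv_name] = {"size_g": size_g}
    return tree


def numbered(rows):
    """Numbered option: synthetic unique keys, original name in 'real_name'."""
    out = {}
    for index, (lv_name, vg_name, size_g) in enumerate(rows):
        out[f"lv{index}"] = {"real_name": lv_name, "vg": vg_name, "size_g": size_g}
    return out


as_lists = list_on_collision(rows)
as_tree = hierarchical(rows)
as_numbered = numbered(rows)

print(type(as_lists["data"]).__name__)  # dict
print(type(as_lists["root"]).__name__)  # list
print(sorted(as_tree))                  # ['vg_backup', 'vg_system']
print(as_numbered["lv1"]["real_name"])  # root
```

The lists variant leaves hosts without duplicates untouched but makes the value type depend on the data, the hierarchical variant can sit next to the old keys without changing them, and the numbered variant keeps a flat dict at the cost of lookups by name.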

Every option has different pros and cons, and I’m not sure there is a clear winner.