Fixing architecture-specific gaps in ansible core facts?

I’ve previously used Ansible to manage x86 server hardware, so I’m used to having access to hardware make/model etc. info via Ansible facts.

I’m currently managing a fleet of hardware which includes several different models of small Linux ARM devices (think Raspberry Pi 3 class hardware).

The hardware facts which Ansible usually provides on Linux make use of x86-specific technologies like DMI (https://en.wikipedia.org/wiki/Desktop_Management_Interface) to expose hardware manufacturer information reported by the manufacturer’s firmware. These facts are collected by get_dmi_facts() in ansible/module_utils/facts/hardware/linux.py

To expedite development, I made a trivial (18-line) patch to ansible/module_utils/facts/hardware/linux.py to record the hardware board “model” and “compatible” properties. The device firmware passes these to Linux via the “Device Tree” hardware description standard, which is used on most ARM (and RISC-V, etc.) devices, including nearly all Android devices. Linux exposes them under /sys/firmware/devicetree/.
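For readers unfamiliar with the Device Tree files, here is a minimal sketch of the kind of read involved. It is not the patch itself. It assumes the properties live under /sys/firmware/devicetree/base/ and that string properties are NUL-terminated, with “compatible” holding a NUL-separated list. The function name and the fact keys are made up for illustration.

```python
import os


def read_devicetree_facts(base="/sys/firmware/devicetree/base"):
    """Return board 'model' and 'compatible' strings, if the files exist.

    Hypothetical helper: the name and the returned keys are illustrative,
    not part of Ansible.
    """
    facts = {}
    for name in ("model", "compatible"):
        path = os.path.join(base, name)
        try:
            with open(path, "rb") as f:
                raw = f.read()
        except OSError:
            # No device tree here (e.g. an x86 host using DMI), or unreadable.
            continue

        # Device Tree strings are NUL-terminated; drop the empty pieces.
        values = [
            v.decode("utf-8", errors="replace")
            for v in raw.split(b"\x00")
            if v
        ]
        if not values:
            continue

        if name == "model":
            facts["devicetree_model"] = values[0]
        else:
            # Most specific entry first, more generic ones after it.
            facts["devicetree_compatible"] = values
    return facts
```

On a host without a device tree the function returns an empty dict, so calling it unconditionally on x86 machines should be harmless.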

Before I contribute this patch, I was wondering whether it is appropriate to add this functionality to ansible/module_utils/facts/hardware/linux.py. Or would these facts be better collected by a module?

Thanks,

Tim.

For now, module_utils/facts/ is the best place to add these.

In the long run, I hope to be able to make a bunch of 'facts modules'
and eventually deprecate 'setup.py'.
Many of the changes to fact gathering have been moving in this direction:
  - gather_facts action is now responsible for the play's gather_facts
(normally still calls setup module)
  - gather_facts supports configurable set of modules (network_os
resolution already built in)
  - gather_facts can run multiple modules in parallel

Hi Brian,

If you are making changes in this area, then something that would be
really helpful is providing more fine-grained control over which facts
are collected (currently gather_subset), and which are stored
(filter). The gather_subset is quite coarse-grained, and filter is
only really usable if you want to refresh one or more similarly
named facts.

To provide some context, I have been spending a lot of time recently
trying to improve the performance of Ansible at a moderate (~200)
scale. As I'm sure you're aware, facts are a *huge* source of
slowness, taking us from a baseline of 2 seconds to execute a task to
30+ seconds.

In our particular case the target hosts are the controllers and
hypervisors in an OpenStack cloud. On these machines there can be many
virtual network interfaces, each creating a large fact. Our current
solution looks like this:

filter: "ansible_[!qt]*"

Which removes the virtual interfaces that begin with q or t. We are
lucky to not be filtering out any useful facts with this. A more
flexible system such as a list of regexes would be a big help.
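To make the limitation concrete, here is a small sketch comparing the glob above with a hypothetical exclude list of regexes. It assumes filter is matched as a shell-style glob against fact names. The fact names and regexes are made up for illustration, and neither function is Ansible code.

```python
import fnmatch
import re


def keep_by_glob(facts, pattern):
    """Keep only the facts whose name matches a single shell-style glob."""
    return {k: v for k, v in facts.items() if fnmatch.fnmatchcase(k, pattern)}


def drop_by_regexes(facts, patterns):
    """Hypothetical alternative: drop facts whose name matches any regex."""
    compiled = [re.compile(p) for p in patterns]
    return {
        k: v
        for k, v in facts.items()
        if not any(c.search(k) for c in compiled)
    }


# Illustrative fact names only; the values are irrelevant here.
facts = {
    "ansible_hostname": "ctl0",
    "ansible_eth0": {},
    "ansible_qbr1234": {},
    "ansible_tap5678": {},
    "ansible_timezone_demo": "UTC",  # made-up fact that starts with 't'
}

glob_kept = keep_by_glob(facts, "ansible_[!qt]*")
regex_kept = drop_by_regexes(facts, [r"^ansible_qbr", r"^ansible_tap"])
```

With these sample names, the glob drops the made-up ansible_timezone_demo along with the virtual interfaces, while the regex list keeps it and removes only the qbr and tap entries.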

Thanks,
Mark

I was not going to use 'filter' for that, since it is an 'after
gathering' feature. I would rather add a more 'proactive' ignore
capability for each subsystem. filter only stops the data being sent
back to the controller; it does not stop it being gathered.

The issue with ignore is that it is very dependent on the method/code
used to do the gathering, which can vary by platform, OS, distro,
distro version, etc. I have not found a good way to handle that yet.

Absolutely, avoiding gathering the facts in the first place is
preferable. However, in terms of performance I would say that the
overhead of gathering the facts is minimal compared with the effect
that those facts have in slowing down subsequent tasks.
Mark

Well, there are two different 'performance' considerations: target and
controller. Yes, filter and ignore will be the same on the controller
side, but this is also a problem on some targets, where accessing a
fact can hang or take a very long time (think of a badly performing
NFS mount). Even then, target performance can also affect the
controller by occupying the forks for too long and delaying the play
in general.