Best approach to filtering "hardware" / "mounts" facts for specific problematic file system types?

Hi all:

I was investigating why some of our playbooks run extremely slowly, and while some of this may be due to general Ansible performance problems, I suspected that the excessive slowness might be environmental in some way. This eventually led to an analysis of system facts.

We (still) use a system called ClearCase. The high-level impact is that many of our systems have hundreds of mounts that are not real file systems, but views of contextual data out of ClearCase that overlay other file systems. We don’t really care about these mounts, as ClearCase manages them: when ClearCase starts, they come into existence; when ClearCase stops, they disappear. This makes them somewhat transient, although most of these systems, while up, will have most of these mounts most of the time.

This generates very large ansible_mounts data structures for each host in the inventory, which seem to carry significant processing overhead. For about 200 machines, running a couple of tasks took about 12.5 minutes. With the following quick fix, adding another condition to get_mount_facts, it dropped to 4 minutes:

    for fields in mtab_entries:
        device, mount, fstype, options = fields[0], fields[1], fields[2], fields[3]

        if fstype == 'none':
            continue

        if fstype == 'mvfs':
            continue

For relative comparison, here are the sizes of the files output by the setup module before and after the above quick fix:

    -rw-r--r--. 1 root root 504946 Apr 30 19:11 facts.before
    -rw-r--r--. 1 root root 49933 Apr 30 19:13 facts.after

This is significantly less JSON to generate, process, and transfer. Nine-tenths of the JSON is due to these MVFS mounts!
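For comparison, a configurable variant of the quick fix might look something like the sketch below. The names (SKIPPED_FSTYPES, filter_mounts) and the contents of the skip set are purely illustrative, not anything that exists in Ansible today:

```python
# Sketch only: filter mtab entries against a configurable set of file
# system types, instead of hard-coding 'mvfs'. The default skip set
# here is an assumption for illustration.
SKIPPED_FSTYPES = {'none', 'mvfs', 'nfs'}

def filter_mounts(mtab_entries, skipped=SKIPPED_FSTYPES):
    """Yield only the mtab entries whose fstype is not in the skip set."""
    for fields in mtab_entries:
        device, mount, fstype, options = fields[0], fields[1], fields[2], fields[3]
        if fstype in skipped:
            continue
        yield (device, mount, fstype, options)

entries = [
    ('/dev/sda1', '/', 'ext4', 'rw'),
    (':', '/vobs/x', 'mvfs', 'rw'),
    ('srv:/a', '/mnt/a', 'nfs', 'rw'),
]
print(list(filter_mounts(entries)))  # only the ext4 entry survives
```

The skip set could then be populated from ansible.cfg or a host variable rather than from a constant.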

Here is an example of one of the entries we don’t care about, with some data stripped:

    {
        "block_available": 30807263,
        "block_size": 65536,
        "block_total": 58720256,
        "block_used": 27912993,
        "device": ":",
        "fstype": "mvfs",
        "inode_available": 28714298,
        "inode_total": 31876696,
        "inode_used": 3162398,
        "mount": "/vobs/…",
        "options": "rw,nosuid,relatime,uuid=ae21830d.e45211df.98a8.00:01:83:4d:ca:30",
        "size_available": 2018984787968,
        "size_total": 3848290697216,
        "uuid": "N/A"
    },

There are hundreds of these entries in the original facts. We also use NFS mounts, although they haven’t been as big a deal, since we use the automounter for many of them. In many scenarios, I would probably want to skip both "mvfs" and "nfs"; it’s just that "mvfs" is the one causing the biggest problem today.

I’m happy to make a code submission, but I’d like to know what people think about these ideas:

  1. Hard-code the exception, as I did above. A very simple change; however, if there are other file systems that should also be skipped, this may be a limiting design choice. Also, somebody somewhere might actually want to see MVFS mount points.

  2. Introduce a new variable that identifies the file system / mount types to be skipped. MVFS then wouldn’t need to be hard-coded in the Ansible code; we could configure it in ansible.cfg or as a host variable. Possible implementation: gather_mounts=!mvfs,!nfs

  3. Introduce a new variable that identifies a class of file systems to be skipped (or kept?). This one is tougher, because it would require categorizing all of the file systems, there could be conflicting opinions about that, and it still leaves MVFS hard-coded somewhere. Possible implementation: gather_mounts=local,!network

  4. I should disable the "hardware" facts entirely and implement my own facts for the checks I do require, such as seeing if /localdisk is mounted, leaving Ansible alone. This might be the easiest fix for today, but it disables some pretty useful built-in capabilities of Ansible.

  5. I should maintain the fix locally, if nobody else is interested in this, and nobody else cares about ClearCase or NFS or has problematic mounts like this.
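For what it’s worth, idea 4 could be as small as a standalone check for the one mount we care about, read straight from /proc/mounts. The function name below is made up for illustration; this is a sketch, not a proposed module:

```python
# Sketch of idea 4: instead of gathering all hardware facts, answer the
# single question "is this path mounted?" by scanning /proc/mounts,
# whose lines look like: device mount-point fstype options dump pass
def is_mounted(mount_point, mounts_text):
    """Return True if mount_point appears as a mount in mounts_text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False

if __name__ == '__main__':
    with open('/proc/mounts') as f:
        print(is_mounted('/localdisk', f.read()))
```

A check like this could be wrapped in a small custom facts module, so the playbooks keep the one fact they need without the hundreds of MVFS entries.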

Thanks!