Sar_facts module: work in progress

Hello everyone 🙂

Since my colleagues, friends, and I primarily work on Linux hosts, we often need to extract or verify the data collected by sar.

While exploring the existing Ansible modules in ansible.builtin and community.general, I noticed that there is currently no facts module capable of extracting this data.

To address this, I am developing a new module called sar_facts, which retrieves data collected by sar and generates a structured dictionary within ansible_facts.

Currently selectable data categories:

  • CPU
  • Load Average
  • Memory
  • Swap
  • Network
  • Disk

Available parameters:

| Parameter | Type | Required | Choices | Default | Description |
|---|---|---|---|---|---|
| type | str | true | CPU, Load, Memory, Swap, Network, Disk | ND | collection category |
| date_start | str | false | ND | None | collection start date |
| date_end | str | false | ND | None | collection end date |
| average | bool | false | true, false | false | get only average data |
| partition | bool | false | true, false | false | get Disk data by partition |
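
To make the parameter list concrete, here is a rough Python sketch of how these options could be declared in the dictionary style Ansible modules use for their argument spec. This is only an illustration built from the list above, not the module's actual code: `ARGUMENT_SPEC` and `resolve_params` are hypothetical names, and `resolve_params` is a simplified stand-in for the validation that Ansible performs itself.

```python
# Hypothetical argument spec mirroring the parameter list above; the real
# module may declare its options differently.
ARGUMENT_SPEC = {
    "type": {"type": "str", "required": True,
             "choices": ["CPU", "Load", "Memory", "Swap", "Network", "Disk"]},
    "date_start": {"type": "str", "required": False, "default": None},
    "date_end": {"type": "str", "required": False, "default": None},
    "average": {"type": "bool", "required": False, "default": False},
    "partition": {"type": "bool", "required": False, "default": False},
}


def resolve_params(params, spec=ARGUMENT_SPEC):
    """Apply defaults and check required/choices (simplified stand-in)."""
    resolved = {}
    for name, opts in spec.items():
        if name in params:
            value = params[name]
        elif opts.get("required"):
            raise ValueError(f"missing required parameter: {name}")
        else:
            value = opts.get("default")
        if "choices" in opts and value not in opts["choices"]:
            raise ValueError(f"{name} must be one of {opts['choices']}")
        resolved[name] = value
    return resolved


print(resolve_params({"type": "Disk", "partition": True}))
```

Only `type` is mandatory here; every other option falls back to its default.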

The module produces a dictionary with the following structure:

    "ansible_facts.sar_data": {
        "TYPE": {
            "DATE": {
                "TIME": {
                    "key": "value"
                }
            }
        }
    }

The DATE and TIME keys are repeated for each collected day and each sample time.

Here’s an example of a task that extracts disk data from 06/02/2025 to 07/02/2025 in partition mode:

    - name: collect disk data
      sar_facts:
        type: "Disk"
        partition: true
        date_start: "06/02/2025"
        date_end: "07/02/2025"

The data is easy to collect, but the nested structure makes it tedious to filter when you need specific values. 😅

For example, to retrieve the list of await values for the specific volume centos-root, you would need to do the following:

    - name: Extract all await values for centos-root
      set_fact:
        root_await: >-
          {{ ansible_facts.sar_data.Disk
            | dict2items
            | map(attribute='value')
            | map('dict2items')
            | list | sum(start=[])
            | selectattr('value', 'defined')
            | map(attribute='value')
            | list | sum(start=[])
            | selectattr('DEV', 'equalto', 'centos-root')
            | map(attribute='await')
            | list
          }}

This module is still a work in progress and has not yet been published on GitHub.

The question is: would it actually be useful to Ansible users?

Would it be worth adding to ansible-core or community.general?

UPDATE

I have managed to simplify the extraction process by modifying the structure of the generated dict.

As a result, the dictionary is now structured as follows:

    "ansible_facts.sar_data": {
        "TYPE": [
            {
                "date": "date value",
                "time": "time value",
                "key": "value"
            }
        ]
    }

This change makes it much easier to filter the desired values.
The previous example becomes:

    - name: Extract all await values for centos-root
      set_fact:
        root_await: >-
          {{ ansible_facts.sar_data.Disk
            | selectattr('DEV', 'equalto', 'centos-root')
            | map(attribute='await')
            | list
          }}

Or, to extract the rxpck/s values for the enp0s3 network interface:

    - name: Extract all rxpck values for enp0s3
      set_fact:
        enp0s3_rxpck: >-
          {{ ansible_facts.sar_data.Network
            | selectattr('IFACE', 'equalto', 'enp0s3')
            | map(attribute='rxpck/s')
            | list
          }}
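
To illustrate the restructuring described in this update, here is a minimal Python sketch that turns the old nested TYPE → DATE → TIME layout into a flat list in which every row carries its own `date` and `time`. It is not the module's actual code: `flatten_sar` is a made-up name, the sample values are invented, and it assumes each TIME entry holds a list of rows (one per device or interface).

```python
def flatten_sar(nested):
    """Turn {DATE: {TIME: [row, ...]}} into a flat list of rows,
    each carrying its own "date" and "time" keys."""
    flat = []
    for date, times in nested.items():
        for time, rows in times.items():
            # Assumption: each TIME holds a list of rows; a single dict
            # is also accepted and treated as a one-row list.
            if isinstance(rows, dict):
                rows = [rows]
            for row in rows:
                flat.append({"date": date, "time": time, **row})
    return flat


# Hypothetical sample in the old nested layout
nested_disk = {
    "2025-02-07": {
        "04:10:04": [
            {"DEV": "sda", "await": "1.05"},
            {"DEV": "centos-root", "await": "1.04"},
        ]
    }
}

flat_disk = flatten_sar(nested_disk)

# With the flat list, filtering is a single pass over the rows.
root_await = [r["await"] for r in flat_disk if r["DEV"] == "centos-root"]
print(root_await)
```

Once the rows are flat, the filter is a single comprehension, which is the same simplification the shorter Jinja2 pipelines above rely on.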

UPDATE

The module is now public on GitHub at NomakCooper/sar_info.

Since it returns ansible_facts you should name it sar_facts; _info is for modules that return information but not in the form of facts.

I would also suggest packaging it in a collection for easier use, but I understand if you don’t want to go to that effort.

@flowerysong I originally named the module sar_facts for this reason, but based on a user’s suggestion on Reddit, I have temporarily published it as sar_info.

Currently, the module can be used by downloading it from the repository or through my newly created personal collection, nomakcooper.collection.

Before considering opening a PR to community.general or ansible-core, I want to test it further and determine whether it would be truly useful to Ansible users.

If you don’t want to call it sar_facts, you should change it so that it doesn’t return facts. (Which is a valid approach, to be fair.)

Also, that’s a terrible choice of default date format. ISO8601 exists for a reason.

Thanks for the suggestions!

My original idea was actually to add the sar_data dictionary to ansible_facts, which is why I initially chose that name, but I was swayed by other suggestions.

Changing the format to ISO8601 is always an option; it’s not too difficult.
Moreover, most hosts already use this format.
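
As a rough illustration of how small that change is, here is a minimal Python sketch that converts a date string in the old style to ISO8601. It assumes the old format is day/month/year, which is not stated explicitly in this thread; `to_iso_date` and `source_format` are names made up for this example and are not part of the module.

```python
from datetime import datetime


def to_iso_date(value, source_format="%d/%m/%Y"):
    """Convert a date string such as "06/02/2025" to ISO 8601 (YYYY-MM-DD).

    Assumption: the input is day/month/year unless source_format says otherwise.
    """
    return datetime.strptime(value, source_format).date().isoformat()


print(to_iso_date("06/02/2025"))  # 2025-02-06 under the day/month/year assumption
```

An invalid or differently ordered date raises `ValueError`, which is one reason an unambiguous format is easier to validate.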

UPDATE!
I have updated the module locally, reverting the name back to sar_facts.

I also changed the date format to ISO8601 (YYYY-MM-DD).

Example:

    - name: collect sar data
      sar_facts:
        type: "disk"
        partition: true
        date_start: "2025-02-07"
        date_end: "2025-02-07"
        time_start: "04:00:00"
        time_end: "04:20:00"

Results:

ok: [localhost] => {
    "ansible_facts.sar_data.disk": [
        {
            "%util": "0.04",
            "AM": "AM",
            "DEV": "sda",
            "aqu-sz": "0.00",
            "areq-sz": "13.31",
            "await": "1.05",
            "date": "2025-02-07",
            "dkB/s": "0.00",
            "rkB/s": "0.26",
            "time": "04:10:04",
            "tps": "0.78",
            "wkB/s": "10.18"
        },
        {
            "%util": "0.04",
            "AM": "AM",
            "DEV": "cs-root",
            "aqu-sz": "0.00",
            "areq-sz": "11.20",
            "await": "1.04",
            "date": "2025-02-07",
            "dkB/s": "0.00",
            "rkB/s": "0.26",
            "time": "04:10:04",
            "tps": "0.93",
            "wkB/s": "10.18"
        },
        {
            "%util": "0.00",
            "AM": "AM",
            "DEV": "cs-swap",
            "aqu-sz": "0.00",
            "areq-sz": "0.00",
            "await": "0.00",
            "date": "2025-02-07",
            "dkB/s": "0.00",
            "rkB/s": "0.00",
            "time": "04:10:04",
            "tps": "0.00",
            "wkB/s": "0.00"
        }
    ]
}
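
Note that every value in the output above is a string, so numeric comparisons need a conversion first. Here is a small Python sketch of that step, using a trimmed copy of the records shown above; `metric_values` is a hypothetical helper, not something the module provides.

```python
def metric_values(records, dev, metric):
    """Return `metric` as floats for every record whose DEV equals `dev`."""
    return [float(r[metric]) for r in records if r.get("DEV") == dev]


# Trimmed copy of the records from the output above
disk = [
    {"DEV": "sda", "await": "1.05", "%util": "0.04", "time": "04:10:04"},
    {"DEV": "cs-root", "await": "1.04", "%util": "0.04", "time": "04:10:04"},
    {"DEV": "cs-swap", "await": "0.00", "%util": "0.00", "time": "04:10:04"},
]

cs_root_await = metric_values(disk, "cs-root", "await")
print(max(cs_root_await))
```

Inside a playbook, the same conversion could be done with `map('float')` at the end of the filter pipeline.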

ansible-core does not really accept new plugins/modules (except in some very specific cases), so you will likely have no chance of getting it added there. But community.general is accepting new modules.

Regarding the name: it seems to me that _info is likely better than _facts, since the returned data looks more like the result of a query (it is time-based, and the query includes timestamps) than actual (immutable) “facts” of the system.