ansible_os_family not defined

Hello Folks,

My AWX jobs are failing with "ansible_os_family" not defined for a particular host.
gather_facts is set to true.

Further triage via the CLI shows that only this particular host doesn't have it defined. All the rest (1000+) have it set.

All other hosts (for example, host1):
ansible -i inventory -l host1 all -m setup | grep ansible_os_family
"ansible_os_family": "Debian",

Problematic host:
ansible -i inventory -l host2 all -m setup | grep ansible_os_family
<>

Host2 is U22.
Could you share how Ansible captures this variable and how to troubleshoot this further?

Can you insert this task before the task that uses ansible_os_family:

- name: Debug ansible_facts
  debug:
    msg: "{{ ansible_facts }}"

This should show you what Ansible captured from host2 during the fact-gathering phase.

Also check your configuration: INJECT_FACTS_AS_VARS enables or prevents the creation of top-level variables. Note that ansible_facts['os_family'] would be present either way if facts were gathered or cached.

Thanks for the input, folks.
I believe debugging this directly via the CLI, outside AWX, will be easier since it is easily reproducible there.

Please see the output below.
I believe the problem lies in how Ansible populates these values; the underlying data seems to be missing on the host.

Working host
ansible -i inventory -l host1 all -m setup | grep ansible_os_family
"ansible_os_family": "Debian",
ansible -i inventory -l host1 all -m setup | grep ansible_facts['os_family']
<>

Problematic host
ansible -i inventory -l host2 all -m setup | grep ansible_os_family
<>
ansible -i inventory -l host2 all -m setup | grep ansible_facts['os_family']
<>

FYI, the output of the setup module has the ansible_ prefix on facts. It is only when facts are used in a playbook that the prefix is removed, and it is added back only when 'injection' is enabled. With ad hoc commands you do not get any of this processing.
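To make the prefix handling concrete, here is a rough Python emulation of what was just described. It is only a sketch and not Ansible's actual code: the function name `to_playbook_vars` and the `inject_facts_as_vars` parameter are made up for illustration, and the real implementation handles more cases.

```python
# Simplified emulation of how setup-module facts become playbook variables.
# NOT Ansible's code: to_playbook_vars and its parameter are hypothetical names.

def to_playbook_vars(setup_facts, inject_facts_as_vars=True):
    """Return the variables a play would see for one host."""
    # In a play, facts are namespaced under ansible_facts without the prefix.
    stripped = {}
    for key, value in setup_facts.items():
        name = key[len("ansible_"):] if key.startswith("ansible_") else key
        stripped[name] = value

    play_vars = {"ansible_facts": stripped}
    if inject_facts_as_vars:
        # With injection enabled, the prefixed names also exist at top level.
        play_vars.update(setup_facts)
    return play_vars


raw = {"ansible_os_family": "Debian", "ansible_distribution": "Ubuntu"}
with_injection = to_playbook_vars(raw, inject_facts_as_vars=True)
without_injection = to_playbook_vars(raw, inject_facts_as_vars=False)

print(sorted(with_injection))
print(sorted(without_injection))
```

Either way the value is reachable as `ansible_facts['os_family']`; only the top-level `ansible_os_family` depends on injection.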

I would try ansible -i inventory -l host2 all -m setup | grep os_family since this will catch the fact if it exists with or without the ansible_ prefix. I’m also curious if ansible -i inventory -l host2 all -m setup | grep distribution returns anything.

If only a single host among thousands is failing to report these facts (prefix or not), then I would think that perhaps something is wrong with that host’s /etc/os-release file. Modifications to this file (or missing it entirely) can cause Ansible to misclassify or fail to gather OS related facts altogether.
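If you want to sanity-check the file itself, a small Python sketch like the one below can parse os-release style content and flag missing keys. The sample text and the list of keys checked (`NAME`, `ID`, `VERSION_ID`) are my own assumptions for illustration, not what Ansible requires; on a real host you would read `/etc/os-release` instead of the inline sample.

```python
import shlex


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex undoes the shell-style quoting used for os-release values.
        parts = shlex.split(value)
        fields[key] = parts[0] if parts else ""
    return fields


# Illustrative sample only; replace with open("/etc/os-release").read() on a host.
sample = 'NAME="Ubuntu"\nID=ubuntu\nID_LIKE=debian\nVERSION_ID="20.04"\n'
fields = parse_os_release(sample)

# Assumed set of keys worth checking; adjust as needed.
missing = [key for key in ("NAME", "ID", "VERSION_ID") if key not in fields]
print(fields)
print("missing keys:", missing)
```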

The output is pretty much the same with the modified greps.

Working host:
ansible -i inventory -l host1 all -m setup | grep os_family
"ansible_os_family": "Debian",
ansible -i inventory -l host1 all -m setup | grep distribution
"ansible_distribution": "Ubuntu",
"ansible_distribution_file_parsed": true,
"ansible_distribution_file_path": "/etc/os-release",
"ansible_distribution_file_variety": "Debian",
"ansible_distribution_major_version": "20",
"ansible_distribution_release": "focal",
"ansible_distribution_version": "20.04",

Problematic host:
ansible -i inventory -l host2 all -m setup | grep os_family
<>
ansible -i inventory -l host2 all -m setup | grep distribution
<>

If only a single host among thousands is failing to report these facts (prefix or not), then I would think that perhaps something is wrong with that host’s /etc/os-release file.

Yes, only one host has the problem.
/etc/os-release is identical to the one on the other hosts.

This is the function that defines ansible_os_family: ansible/lib/ansible/module_utils/facts/system/distribution.py at v2.18.1 (ansible/ansible on GitHub). What's the value of platform.system() (using the same Python interpreter as the setup module) on the problematic host?

ansible -i inventory host2 -m command -a '{{ ansible_python_interpreter }} -c "import platform; print(platform.system())"'

If you don’t have ansible_python_interpreter defined in inventory, you can check the discovered interpreter to use in that command:

ansible -i inventory host2 -m setup -a 'gather_subset="!all,!min,python"'

I get the same, correct output on both systems.

ansible -i inventory host1 -m command -a '/usr/bin/python -c "import platform; print(platform.system())"'
host1 | CHANGED | rc=0 >>
Linux

ansible -i inventory host2 -m command -a '/usr/bin/python -c "import platform; print(platform.system())"'
host2 | CHANGED | rc=0 >>
Linux

If collecting any of the fact subsets (including distribution) fails with an Exception, that subset of facts would just be an empty dictionary due to how it's handled in ansible/lib/ansible/module_utils/facts/ansible_collector.py at v2.18.1 (ansible/ansible on GitHub).
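As a rough illustration of that behaviour, the Python sketch below emulates a collector loop that tolerates a failing subset. It is not the code from ansible_collector.py; the class and function names are hypothetical, and errors are gathered in a list here purely so they are easy to inspect.

```python
# Simplified emulation of the behaviour described above. Not Ansible's code;
# every class and function name here is hypothetical.

class FailingDistributionCollector:
    def collect(self):
        raise TypeError("emulate issue collecting distribution facts")


class PlatformCollector:
    def collect(self):
        return {"system": "Linux"}


def collect_all(collectors):
    """Merge facts from all collectors, tolerating failures in any of them."""
    facts, errors = {}, []
    for collector in collectors:
        subset = {}
        try:
            subset = collector.collect()
        except Exception as exc:
            # The error is recorded, the run continues, and this subset
            # contributes an empty dictionary.
            errors.append(repr(exc))
        facts.update(subset)
    return facts, errors


facts, errors = collect_all([FailingDistributionCollector(), PlatformCollector()])
print(facts)
print(errors)
```

The point of the sketch is that the module still succeeds overall, so the only visible symptom is that the distribution-related keys are absent.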

It might be easier to debug with -vvv than tracing the code (assuming you’re using the ssh connection plugin which includes the detail at that verbosity level): ansible -i inventory host2 -m setup -a 'gather_subset="!min,!all,distribution"' -vvv | grep '.*{"ansible_facts":.*"'

The gather_subset option limits the running code/output, and the grep matches the JSON returned by the setup module in -vvv since otherwise there’s a lot to sift through.

I’m just emulating the behavior by raising an arbitrary exception in the distribution facts logic, but the output could look similar to:

<host2> (0, b'TypeError("emulate issue collecting distribution facts")\r\n\r\n{"ansible_facts": {"gather_subset": ["!min", "!all", "distribution"], "module_setup": true}, "invocation": {"module_args": {"gather_subset": ["!min", "!all", "distribution"], "gather_timeout": 10, "filter": [], "fact_path": "/etc/ansible/facts.d"}}}\r\n', ...
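If the grep gets unwieldy, the same line can be picked apart with a few lines of Python. This is a minimal sketch that assumes the module's stdout has the shape shown in the sample above (some text, then the JSON result); `split_setup_output` is a hypothetical helper and the sample string is abbreviated.

```python
import json


def split_setup_output(stdout_text):
    """Split module stdout into (text before the JSON, parsed JSON result)."""
    start = stdout_text.find('{"ansible_facts"')
    if start == -1:
        return stdout_text.strip(), None
    preamble = stdout_text[:start].strip()
    # raw_decode parses one JSON document and ignores anything after it.
    result, _ = json.JSONDecoder().raw_decode(stdout_text[start:])
    return preamble, result


# Abbreviated version of the emulated output (module_args shortened).
sample = (
    'TypeError("emulate issue collecting distribution facts")\r\n\r\n'
    '{"ansible_facts": {"gather_subset": ["!min", "!all", "distribution"], '
    '"module_setup": true}, "invocation": {"module_args": {}}}\r\n'
)

preamble, result = split_setup_output(sample)
# Any key other than these two bookkeeping entries is an actual collected fact.
collected = sorted(
    key for key in result["ansible_facts"]
    if key not in ("gather_subset", "module_setup")
)
print("text before JSON:", preamble)
print("collected facts:", collected)
```

An empty list of collected facts together with a non-empty preamble is the pattern to look for.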
Bingo, was able to find the root cause.

Can you help me understand how the -vvv verbose option fits in here?
And how do I find which Ansible variables belong to which gather_subset group?

Here we go:

ansible -i inventory host2 -m setup -a 'gather_subset="!min,!all,distribution"' -vvv | grep '.*{"ansible_facts":.*"'
(0, b'CalledProcessError(1, ('lsb_release', '-a'))\r\n\r\n{"ansible_facts": {"gather_subset": ["!min", "!all", "distribution"], "module_setup": true}, "invocation": {"module_args": {"gather_subset": ["!min", "!all", "distribution"], "gather_timeout": 10, "filter": "*", "fact_path": "/etc/ansible/facts.d"}}}\r\n', b'')

The output above pointed to an lsb_release process error. When I ran the same command on the host, it failed there too.
Fixed it, and we are back to normal :grinning:.
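For anyone hitting something similar, a quick way to check whether a helper command such as `lsb_release -a` exits cleanly on a host is a small probe like this. It is only a sketch: `probe` is a hypothetical helper, and the stand-in command below simply exits with status 1 so the example runs anywhere.

```python
import subprocess
import sys


def probe(command):
    """Run a command and return (return code, stderr text) without raising."""
    try:
        done = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError as exc:
        # The executable does not exist on this host.
        return None, str(exc)
    return done.returncode, done.stderr.strip()


# On the affected host one would call: probe(["lsb_release", "-a"])
# Stand-in command that exits with status 1, so the sketch is portable.
rc, err = probe([sys.executable, "-c", "import sys; sys.exit(1)"])
print("return code:", rc)
```

A non-zero return code (or `None` for a missing executable) is the condition that surfaced here as the CalledProcessError.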
