Hi, we have a playbook that runs once a day on all our Linux hosts.
On each day, the same task fails on a different host with “‘ansible_system_vendor’ is undefined”.
I know that facts are gathered
because the tasks before the failing one already use facts.
This started after upgrading Ansible from 2.3.3 to 2.10.8.
I can’t find any warning.
Permissions are also not a problem because we gather facts as root.
The host was also not busy at the time of facts gathering.
Also, if I understand it correctly, facts should be “NA” if they can’t be gathered, and not undefined.
What is also strange is that it started after moving to a newer Ansible version (2.10.8).
We never saw anything like this with our old Ansible version (2.3.3).
Sadly not all the facts gathering code consistently uses N/A or
warnings. But in this case, for system_vendor, it can be populated by
either VM detection (hardcoded), query of /sys/devices or executing
dmidecode (these all seem to use the N/A standard). So afaict you
should not be getting undefined unless you are using the `subset`
option.
That is the point: I’m getting “undefined”.
So I’m looking for someone who has an idea why I get “undefined” here.
This happens only sometimes when ansible-playbook is run in one of our Jenkins pipelines.
When I try to reproduce this with the ansible-playbook run from my workstation, no facts are undefined.
I can’t find anything in the target hosts logs, Jenkins logs, or Jenkins host logs.
From the code, it looks like you don't have access to stat
/sys/devices/virtual/dmi/id/product_name (or if you do, you cannot
access /sys/devices/virtual/dmi/id/sys_vendor)
and executing dmidecode does not provide this info either (not installed,
lack of permissions, etc).
As stated before we gather facts as root user, so we have access to /sys/devices/virtual/dmi/id/{product_name,sys_vendor}.
Also, dmidecode is installed, but it is not used because the aforementioned paths are accessible.
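
To double-check the points above on a suspect host, something like the following could be run as the same user that gathers facts. It is only a minimal sketch: it reads the two sysfs files named above and looks for dmidecode on PATH. The function name `check_dmi_sources` is made up for illustration, and this is not the code Ansible itself uses.

```python
import os
import shutil

# Paths taken from the discussion above.
DMI_DIR = "/sys/devices/virtual/dmi/id"
DMI_FILES = ("product_name", "sys_vendor")


def check_dmi_sources(dmi_dir=DMI_DIR, names=DMI_FILES):
    """Report, per DMI file, its content or the reason it could not be read."""
    report = {}
    for name in names:
        path = os.path.join(dmi_dir, name)
        try:
            with open(path) as handle:
                # An empty file is reported as "NA" here (illustrative choice).
                report[name] = handle.read().strip() or "NA"
        except OSError as exc:
            report[name] = "unreadable: %s" % exc
    # Only checks that a dmidecode binary is on PATH; it is not executed.
    report["dmidecode"] = shutil.which("dmidecode") or "not found"
    return report


if __name__ == "__main__":
    for key, value in check_dmi_sources().items():
        print("%s: %s" % (key, value))
```

If both files are readable on a host that later shows the undefined fact, file access on the target is unlikely to be the cause.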
Also this happens only from time to time on one or two of our hosts (we have 1300 hosts).
On each ansible-playbook run, a different 1 to 2 hosts appear with an undefined ansible_system_vendor fact,
sometimes also ansible_product_name is undefined.
Sometimes an ansible-playbook run finishes with no undefined Ansible facts.
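
With 1300 hosts, one way to find the affected hosts after a run is to dump the facts to one file per host and scan them. The sketch below assumes a directory of JSON files named after the hosts, each with a top-level "ansible_facts" mapping (the layout I would expect from the ad-hoc `--tree` option, but treat that as an assumption). The helper names `missing_facts` and `scan_tree` are hypothetical.

```python
import json
import os

# The two facts reported as undefined in this thread.
WATCHED_FACTS = ("ansible_system_vendor", "ansible_product_name")


def missing_facts(facts, wanted=WATCHED_FACTS):
    """Return the names in `wanted` that are absent from one host's facts."""
    return [name for name in wanted if name not in facts]


def scan_tree(directory, wanted=WATCHED_FACTS):
    """Map each host file in `directory` to the watched facts it lacks."""
    problems = {}
    for host in sorted(os.listdir(directory)):
        with open(os.path.join(directory, host)) as handle:
            result = json.load(handle)
        # Assumed layout: facts live under the "ansible_facts" key.
        absent = missing_facts(result.get("ansible_facts", {}), wanted)
        if absent:
            problems[host] = absent
    return problems
```

Calling `scan_tree("/tmp/facts")` on such a directory returns only the hosts that lack at least one watched fact. A host whose file has no "ansible_facts" key at all (for example a failed run) is reported as missing every watched fact.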
And as also stated before, this happened after updating our Ansible from 2.3.3 to 2.10.8.
We never ever saw this problem with Ansible 2.3.3, which was running fine for years.
Created a /usr/local/bin/lsblk bash script containing “sleep 10”.
Checked that my faked lsblk command is used: “which lsblk” returns /usr/local/bin/lsblk.
Set “gather_timeout = 1” in ansible.cfg.
Ran “time ansible testhost -b -m setup”: this took 12 seconds, no warning shown.
Ran “time ansible-playbook facts.yaml -b -l testhost” (facts.yaml is a playbook which just gathers facts): this took 12 seconds, no warning shown.
I’m sure that my faked lsblk command is used, because when I change the sleep from 10 to 20,
the ansible and ansible-playbook runs take 22 instead of the previous 12 seconds.
I would expect a warning from the above ansible and ansible-playbook runs, but nothing is shown.
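
The shim part of the steps above can be sketched in Python, as a stand-in for the bash script, to confirm that a slow fake command is the one resolved first and that it adds the expected delay. It does not run Ansible. The helper names `make_slow_shim` and `time_command` are made up for illustration, and the shim is written to a temporary directory rather than /usr/local/bin.

```python
import os
import shutil
import stat
import subprocess
import tempfile
import time


def make_slow_shim(directory, name="lsblk", delay=10):
    """Write an executable shell script called `name` that only sleeps."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\nsleep %d\n" % delay)
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path


def time_command(name, search_path):
    """Resolve `name` on `search_path`, run it, return (resolved path, seconds)."""
    resolved = shutil.which(name, path=search_path)
    start = time.monotonic()
    subprocess.run([resolved], check=True)
    return resolved, time.monotonic() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_slow_shim(tmp, delay=1)
        # Put the shim directory first so it shadows any real lsblk.
        search_path = tmp + os.pathsep + os.environ.get("PATH", "")
        resolved, elapsed = time_command("lsblk", search_path)
        print(resolved, round(elapsed, 1))
```

If the resolved path is the shim and the elapsed time matches the sleep, the shim is in effect, which is the same check as comparing the 12 and 22 second runs above.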
Expected, the 2 first are meant to show better errors and allow for
debugging, while the last one fixes concurrency issues with threads
for modules that call run_command.
One last question regarding this: https://github.com/ansible/ansible/pull/74791 is currently labeled with affects_2.12, so do we have to wait for 2.12, or will this fix be backported to 2.11 and 2.10?