Gathering facts and using them in configuration files later

Hi,

I'm not quite sure about this, but I think there is a bug in Ansible 1.2.2 related to gathering facts and using them for config generation.

We have the following configuration:
PLAY1: Set up munin-node configuration on all hosts belonging to this group.
PLAY2: Use facts to generate nagios / munin-server files on 1 monitoring system.

Everything seemed to be working well, but yesterday, for some unknown reason, the connection to one host was slower and Ansible reported an error while gathering facts.

  • This host was taken off the host list - which is OK.
  • But when PLAY2 started and the nagios configuration files were to be generated using facts from the servers, Ansible reported an error about the fact ipv4.ipaddress for the missing node.
    The monitoring host was then taken off the host list as well - as this is the 1 main host, Ansible finished with an error that the task can’t be done on this host.

I think it is worth checking whether Ansible removes a host from the list only for a specific play or in general, so that even if a "for host in group" loop is used in some template file, the failed host is removed from that list as well and the files can be generated as they should be.

I'm not sure whether anybody else has had the same issue.

I hope this helps.

Best regards,
Marcin Praczko

I’m having a little bit of difficulty understanding what kind of output you are seeing. Can you please share it?

Thanks!

Hi,

This is how it can be reproduced - at least in our environment:

  • The inventory has 8 hosts managed by Ansible
  • There is one monitoring host (the 9th host)

Sections in the inventory:
[webservers]
srv01
srv02

srv03

[monitoring]
monitor-host

When I added an additional host (let’s say srv09) to the ‘webservers’ section, with a domain name that doesn’t exist, and then ran Ansible, I got an error during fact gathering:
fatal: [srv09] => {'msg': 'FAILED: [Errno -2] Name or service not known', 'failed': True}

Then Ansible ran everything as it should on all 8 servers (srv01…srv08) - srv09 was excluded from the task list.

The next play is the configuration for the [monitoring] group, and now Ansible fails with this error:
TASK: [create the nagios webserver object files] ******************************
fatal: [monitor-host] => {'msg': "'dict object' has no attribute 'ansible_default_ipv4'", 'failed': True}
fatal: [monitor-host] => {'msg': "'dict object' has no attribute 'ansible_default_ipv4'", 'failed': True}
FATAL: all hosts have already failed -- aborting

Summary:

  • When Ansible can’t reach a server while gathering facts, it reports an error - expected.
  • The host is taken off the task list - expected.
  • When the next play runs only on the monitoring system, Ansible reports an error and stops doing anything - NOT expected.
  • Removing the host that can’t be reached during fact gathering solves the problem (no error during fact gathering, no error when generating the configuration on the monitoring server).

“When I added an additional host (let’s say srv09) to the ‘webservers’ section, with a domain name that doesn’t exist, and then ran Ansible, I got an error during fact gathering:”

Ok, that’s expected.

Seeing as the host wasn’t there, Ansible has no facts for it and can’t really make a template for it.

You may wish to check whether the variable is defined in hostvars before trying to reference it, as that will solve the problem.

Hi,

Can I have some hints on how to do that?
I have the following template (.j2) for the ‘webservers’ section:

{% for host in groups['webservers'] %}
{% if 'nagios' in hostvars[host]['monitoring'] %}

# {{ host }} - Checks

define host {
use linux-server
host_name {{ host }}
alias {{ hostvars[host]['ansible_hostname'] }}
address {{ hostvars[host].ansible_default_ipv4.address }}
hostgroups webservers,region-{{ hostvars[host]['region'] }}
}
{% endif %}
{% endfor %}

Is this the right command for the check?
{% if hostvars[host].ansible_default_ipv4.address defined %}

{% endif %}

Best regards,
Marcin Praczko

Sure thing, it would look something like this:

{% for host in groups['webservers'] %}
{% if 'ansible_default_ipv4' in hostvars[host] %}
{# nagios host definition for this host goes here #}
{% endif %}
{% endfor %}

If the host hasn’t been “talked to” yet, the hostvars will just contain the inventory specific variables for that host, as it will still be missing facts.
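
Applied to the template posted earlier in this thread, the check could look roughly like the sketch below. It assumes the same variables as that template (‘monitoring’ and ‘region’ set in the inventory for every webserver). It uses Jinja2's "is defined" test on the parent fact ansible_default_ipv4 rather than on .address, because looking up an attribute on an undefined value can itself raise an error. The else branch and the comment it writes are additions for illustration only, and the exact behaviour may differ between Ansible versions.

```jinja
{% for host in groups['webservers'] %}
{# Skip hosts whose facts were never gathered, e.g. unreachable in the earlier play. #}
{% if hostvars[host].ansible_default_ipv4 is defined and 'nagios' in hostvars[host]['monitoring'] %}
define host {
    use         linux-server
    host_name   {{ host }}
    alias       {{ hostvars[host]['ansible_hostname'] }}
    address     {{ hostvars[host].ansible_default_ipv4.address }}
    hostgroups  webservers,region-{{ hostvars[host]['region'] }}
}
{% else %}
# {{ host }}: skipped (no facts available, or not monitored by nagios)
{% endif %}
{% endfor %}
```

With a guard like this, a host that failed during fact gathering is left out of the generated file, so the play against the monitoring host should no longer fail on the missing fact.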

I suspect we’ll cache facts (optionally) in the 1.4 timeframe, so I wouldn’t rely on this as an indicator of whether the host is up or not for the long term.

--Michael