lineinfile transiently results in empty file

Greetings,

I’m using the devel branch from the Ansible GitHub repository (i.e. version 2.3) and have run into a curious problem over the past few weeks with lineinfile, which was seemingly bulletproof in prior versions (e.g. 2.0, 2.1, 2.2). We’ve moved to 2.3 for needed enhancements.

I say the problem is transient because it doesn’t always manifest; that is, I cannot reliably reproduce it. When it does manifest, though, it results in an empty file. I’ll attempt to describe my scenario here, but the actual system we use is Internet-disconnected, so I cannot simply copy/paste results.

I’m using lineinfile to ensure entries exist in /etc/hosts.

My play looks like this:

- name: Ensure all machines have /etc/hosts entries
  become: true
  hosts: machines
  tasks:

    - name: ensure /etc/hosts has entries
      lineinfile:
        dest: /etc/hosts
        owner: root
        group: root
        mode: u=rw,g=r,o=r
        line: "{{ hostvars[item]['vm_ip'] }} {{ hostvars[item]['vm_hostname'] }} {{ hostvars[item]['vm_alias'] }}"
        create: yes
        state: present
        backup: yes
      with_items: "{{ groups['all'] }}"

In other words, I’m looping over my inventory and adding known facts about each host as lines in /etc/hosts on [machines]. The known facts are all stored in aptly named files under the host_vars folder. This seemed pretty trivial and worked well in the past. Now, however, some subset of [machines] ends up with a zero-byte /etc/hosts file, and it is not consistent from run to run. On the majority of runs of the playbook everything is as expected, but on some runs certain hosts end up with this empty file.
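For reference, each host_vars file just defines those facts for its host. A minimal sketch of what one might look like (the file name and values here are made up since I can’t paste from the real system; only the variable names vm_ip, vm_hostname and vm_alias come from the play above):

# host_vars/db01.example.com (hypothetical example values)
vm_ip: 192.168.10.21
vm_hostname: db01.example.com
vm_alias: db01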

I added "backup: yes" as a troubleshooting aid, and when I do encounter one of these machines with an empty /etc/hosts file, a directory listing shows a backup of the file from each time lineinfile modified it. I can even see the file size growing across those backups until, at some point, the size drops to 0 bytes and stays there. For example:

ls -la /etc/hosts*

-rw-r--r--. 1 root root   0 Feb 14 18:02 /etc/hosts
-rw-r--r--. 1 root root 225 Sep 16 18:21 /etc/hosts.2753.2017-02-14@18:02:06
-rw-r--r--. 1 root root 278 Feb 14 18:02 /etc/hosts.2774.2017-02-14@18:02:06
-rw-r--r--. 1 root root 331 Feb 14 18:02 /etc/hosts.2795.2017-02-14@18:02:06
-rw-r--r--. 1 root root   0 Feb 14 18:02 /etc/hosts.2816.2017-02-14@18:02:07
-rw-r--r--. 1 root root   0 Feb 14 18:02 /etc/hosts.2837.2017-02-14@18:02:07
-rw-r--r--. 1 root root   0 Feb 14 18:02 /etc/hosts.2858.2017-02-14@18:02:07
-rw-r--r--. 1 root root   0 Feb 14 18:02 /etc/hosts.2879.2017-02-14@18:02:07

When actually executing ansible-playbook, I see yellow “changed” output and assume everything is getting updated and is OK in the end, yet I end up in the situation above. Is this a known bug in Ansible 2.3?

Bump. I’m still having this random problem. Has anyone else seen anything like this, or have any recommendations? I tried a more recent pull of Ansible ‘devel’, but it broke something else (the filesystem module).

It sounds like corruption after writing; once you get the first zero-length file, it makes sense that it stays that way if lineinfile cannot match the line to change.
What is your filesystem? Does anything show up in the logs? Device errors? Write errors?

Haven’t checked device logs just yet. These are RHEL7 systems with XFS filesystems.

I’ve internally chalked this up to a “race condition”, where multiple Ansible process forks are trying to update the same file at nearly the same time. As an experiment, I added the “serial” directive to the play and set it to “1”:

Example:

- name: Ensure all machines have /etc/hosts entries
  become: true
  hosts: machines
  serial: 1
  tasks:

I can see the difference in Ansible’s console output: it runs the lineinfile task against one host at a time instead of all at once. I’ve only run it a few times since this change, but I have yet to see the empty file problem. I’ll continue running tests to confirm this is a good fix.
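If serial: 1 turns out to be too slow, another idea I may try is collapsing the loop into a single write per host, so /etc/hosts is only rewritten once per run rather than once per inventory item, e.g. with blockinfile. This is only an untested sketch of that idea (module and parameter names as I understand them from the docs), not something I’ve run on the affected systems:

# untested sketch: replaces the per-item lineinfile loop with one managed block
- name: ensure /etc/hosts has entries (single write per host)
  blockinfile:
    dest: /etc/hosts
    create: yes
    backup: yes
    marker: "# {mark} ANSIBLE MANAGED HOSTS ENTRIES"
    block: |
      {% for item in groups['all'] %}
      {{ hostvars[item]['vm_ip'] }} {{ hostvars[item]['vm_hostname'] }} {{ hostvars[item]['vm_alias'] }}
      {% endfor %}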

v/r

Ben

Ansible updates the file atomically, so 'concurrent' updates will clobber each other, but they should not create a corrupt file. IIRC, XFS by default zeroes out files when there is an I/O error.