lineinfile transiently results in empty file

Greetings,

I’m using the devel branch from the Ansible GitHub repository (i.e. version 2.3) and have run into a curious problem over the past few weeks with lineinfile, which was seemingly bulletproof in prior versions (e.g. 2.0, 2.1, 2.2). We’ve moved to 2.3 for needed enhancements.

I say the problem is transient because it doesn’t always manifest; that is, I cannot reliably reproduce it. When it does manifest, though, it results in an empty file. I’ll attempt to describe my scenario here, but the actual system we use is Internet-disconnected, so I cannot simply copy/paste results.

I’m using lineinfile to ensure entries exist in /etc/hosts.

My play looks like this:

- name: Ensure all machines have /etc/hosts entries
  become: true
  hosts: machines
  tasks:

    - name: ensure /etc/hosts has entries
      lineinfile:
        dest: /etc/hosts
        owner: root
        group: root
        mode: u=rw,g=r,o=r
        line: "{{ hostvars[item]['vm_ip'] }} {{ hostvars[item]['vm_hostname'] }} {{ hostvars[item]['vm_alias'] }}"
        create: yes
        state: present
        backup: yes
      with_items: "{{ groups['all'] }}"

In other words, I’m looping over my inventory and adding known facts about each host as lines in /etc/hosts on [machines]. The known facts are all stored in aptly named files under the host_vars folder. This seemed pretty trivial and worked well in the past. Now, however, some subset of [machines] ends up with a zero-byte /etc/hosts file, and it is not consistent from run to run. On the majority of runs of the playbook everything is as expected, but on some runs certain hosts end up with this empty file.
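For reference, each host_vars file just defines those facts for its host. A minimal sketch of what one might look like (the file name and values here are made up since I can’t paste from the real system; only the variable names vm_ip, vm_hostname and vm_alias come from the play above):

# host_vars/db01.example.com (hypothetical example values)
vm_ip: 192.168.10.21
vm_hostname: db01.example.com
vm_alias: db01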

I added "backup: yes" as a troubleshooting aid, and when I do encounter one of these machines with an empty /etc/hosts file, a directory listing shows a backup of the file from each time lineinfile modified it. I can even see the file size growing across those backups until, at some point, the size drops to 0 bytes and stays there. For example:

ls -la /etc/hosts*

-rw-r--r--. 1 root root   0 Feb 14 18:02 /etc/hosts
-rw-r--r--. 1 root root 225 Sep 16 18:21 /etc/hosts.2753.2017-02-14@18:02:06
-rw-r--r--. 1 root root 278 Feb 14 18:02 /etc/hosts.2774.2017-02-14@18:02:06
-rw-r--r--. 1 root root 331 Feb 14 18:02 /etc/hosts.2795.2017-02-14@18:02:06
-rw-r--r--. 1 root root   0 Feb 14 18:02 /etc/hosts.2816.2017-02-14@18:02:07
-rw-r--r--. 1 root root   0 Feb 14 18:02 /etc/hosts.2837.2017-02-14@18:02:07
-rw-r--r--. 1 root root   0 Feb 14 18:02 /etc/hosts.2858.2017-02-14@18:02:07
-rw-r--r--. 1 root root   0 Feb 14 18:02 /etc/hosts.2879.2017-02-14@18:02:07

When actually executing ansible-playbook, I see yellow “changed” output and assume everything is getting updated and is OK in the end, yet I end up in the situation above. Is this a known bug in Ansible 2.3?

Bump. I’m still having this random problem. Has anyone else seen anything like this, or have any recommendations? I tried a more recent pull of Ansible ‘devel’, but it broke something else (the filesystem module).

It sounds like corruption after writing; once you get the first zero-length file, it makes sense that it stays that way if lineinfile cannot match the line to change.
What is your filesystem? Does anything show up in the logs? Device errors? Write errors?

Haven’t checked device logs just yet. These are RHEL7 systems with XFS filesystems.

I’ve internally chalked this up to a “race condition”, where multiple Ansible process forks are trying to update the same file at nearly the same time. As an experiment, I added the “serial” directive to the play and set it to “1”:

Example:

- name: Ensure all machines have /etc/hosts entries
  become: true
  hosts: machines
  serial: 1
  tasks:

I can see the difference in Ansible’s console output: it runs the lineinfile task against one host at a time instead of all at once. I’ve only run it a few times since this change, but I have yet to see the empty file problem. I’ll continue running tests to confirm this is a good fix.
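If serial: 1 turns out to be too slow, another idea I may try is collapsing the loop into a single write per host, so /etc/hosts is only rewritten once per run rather than once per inventory item, e.g. with blockinfile. This is only an untested sketch of that idea (module and parameter names as I understand them from the docs), not something I’ve run on the affected systems:

# untested sketch: replaces the per-item lineinfile loop with one managed block
- name: ensure /etc/hosts has entries (single write per host)
  blockinfile:
    dest: /etc/hosts
    create: yes
    backup: yes
    marker: "# {mark} ANSIBLE MANAGED HOSTS ENTRIES"
    block: |
      {% for item in groups['all'] %}
      {{ hostvars[item]['vm_ip'] }} {{ hostvars[item]['vm_hostname'] }} {{ hostvars[item]['vm_alias'] }}
      {% endfor %}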

v/r

Ben

Ansible updates the file atomically, so 'concurrent' updates will clobber each other, but they should not create a corrupt file. IIRC, XFS by default zeroes out files when there is an I/O error.