Introduce a new module option ‘encoding’ to the lineinfile module

ketankelkar · April 15, 2025, 4:53pm

There is a draft PR for the lineinfile module which introduces a new module option, encoding, to support target files that are not encoded in UTF-8.

Currently, the lineinfile module reads the target file in binary mode and holds the contents in a buffer of bytes. The code assumes this buffer contains UTF-8 encoded bytes and performs its regex matching and write operations on it. If the target file is not UTF-8 encoded, regex matching does not work correctly, because a UTF-8 encoded pattern is compared against bytes in a different encoding. Writes are affected as well: new content is added to the buffer as UTF-8 bytes, so when the buffer is written back, the resulting file contains characters from more than one encoding.
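The failure mode described above can be reproduced outside Ansible. The snippet below is only an illustration, not code from the module: the sample content, the Latin-1 encoding, and the helper name decodes_as are assumptions chosen for the example.

```python
import re

# File body as stored on disk in Latin-1 ("\u00e9" is e-acute, a single byte 0xE9).
latin1_buffer = "caf\u00e9=1\n".encode("latin-1")

# The pattern and the replacement line are encoded as UTF-8, where the same
# character becomes the two bytes 0xC3 0xA9.
pattern = re.compile("^caf\u00e9=".encode("utf-8"))
new_line = "caf\u00e9=2\n".encode("utf-8")

# The byte sequences differ, so the regex does not match the existing line.
matched = pattern.search(latin1_buffer) is not None

# With no match, the new line is appended, giving a buffer with two encodings.
mixed_buffer = latin1_buffer + new_line


def decodes_as(data, encoding):
    """Hypothetical helper: report whether data is valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


print(matched)                             # False
print(decodes_as(mixed_buffer, "utf-8"))   # False
print(repr(mixed_buffer.decode("latin-1")))
```

The mixed buffer is no longer valid UTF-8, and reading it back as Latin-1 shows the appended line as mojibake.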

The proposed change introduces a new module option, encoding. When it is specified, the file contents are read into a Unicode text buffer instead of bytes, so regex matching is done on Unicode text and writes add Unicode characters to the buffer instead of UTF-8 bytes. Since Python 3 strings are sequences of Unicode code points, all of these operations are ordinary Python string operations. File reads and writes are done in text mode so that the optional encoding parameter can be passed when opening the file (docs for Python open function).
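A minimal sketch of the text-mode idea, assuming a hypothetical helper named replace_line rather than the actual code in the PR. It replaces the last matching line or appends, which is a simplification of what the module does, and the file name and sample content are made up for the demonstration.

```python
import os
import re
import tempfile


def replace_line(path, regexp, line, encoding="utf-8"):
    """Hypothetical helper: replace the last line matching regexp, else append."""
    pattern = re.compile(regexp)

    # newline="" disables newline translation so existing line endings survive.
    with open(path, "r", encoding=encoding, newline="") as f:
        lines = f.readlines()

    # Matching happens on str objects, so the file's encoding no longer matters.
    index = -1
    for i, current in enumerate(lines):
        if pattern.search(current.rstrip("\r\n")):
            index = i

    new_line = line + "\n"
    if index == -1:
        if lines and not lines[-1].endswith(("\n", "\r")):
            lines[-1] += "\n"
        lines.append(new_line)
    elif lines[index] == new_line:
        return False  # nothing to change
    else:
        lines[index] = new_line

    # The text is encoded back into the file's own encoding on write.
    with open(path, "w", encoding=encoding, newline="") as f:
        f.writelines(lines)
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "app.conf")
    with open(path, "wb") as f:
        f.write("caf\u00e9=1\nother=x\n".encode("latin-1"))

    changed = replace_line(path, "^caf\u00e9=", "caf\u00e9=2", encoding="latin-1")
    changed_again = replace_line(path, "^caf\u00e9=", "caf\u00e9=2", encoding="latin-1")

    with open(path, "rb") as f:
        result_bytes = f.read()
```

After the call the file is still pure Latin-1, and a second identical call reports no change.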

An alternative approach was explored in which the initial file read was done in text mode and the contents were converted to UTF-8 bytes, so that the rest of the code could remain unchanged until the write operation at the very end, which required converting back. The Unicode approach seems cleaner overall, even though it requires more changes. See the code diff for the other approach here.
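For comparison, the alternative can be sketched as two conversions wrapped around unchanged bytes-based logic. This is an illustration under assumptions, not the code from the linked diff: the function names transcode_in and transcode_out and the sample data are hypothetical.

```python
import re


def transcode_in(raw, encoding):
    """Hypothetical helper: bytes in the file's encoding -> UTF-8 bytes."""
    return raw.decode(encoding).encode("utf-8")


def transcode_out(utf8_bytes, encoding):
    """Hypothetical helper: UTF-8 bytes -> bytes in the file's encoding."""
    return utf8_bytes.decode("utf-8").encode(encoding)


# Contents of a Latin-1 file as read from disk.
raw = "caf\u00e9=1\nother=x\n".encode("latin-1")

# Conversion at the start: everything after this can stay bytes-based.
b_lines = transcode_in(raw, "latin-1").splitlines(True)

# The existing style of logic: UTF-8 pattern against UTF-8 bytes.
pattern = re.compile("^caf\u00e9=".encode("utf-8"))
for i, b_line in enumerate(b_lines):
    if pattern.search(b_line):
        b_lines[i] = "caf\u00e9=2\n".encode("utf-8")

# Conversion at the end, just before writing.
output = transcode_out(b"".join(b_lines), "latin-1")
```

The middle section is untouched, but each read and write carries an extra encode/decode round trip, which is the trade-off weighed above.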

See the code in the PR here.

bcoca · April 15, 2025, 4:55pm

FYI, adding something like this just to lineinfile does not make as much sense as creating a common facility for all modules that alter files to use.

ketankelkar · April 15, 2025, 8:35pm

The initial thinking was to start with lineinfile, move on to blockinfile, and then to any other suitable modules in the ansible.builtin collection one by one.

A common facility sounds like a good idea, though I’m not sure what that would look like. Would it be something like a class in module_utils which each relevant module calls?

bcoca · April 15, 2025, 9:07pm

Mostly. We already have similar things, like ‘atomic_move’, permissions and other fs handling.
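One possible shape for such a shared helper is sketched below, purely as an assumption for discussion. The class EncodedTextFile and its methods do not exist in ansible module_utils, and os.replace is used here only as a stand-in for the existing atomic_move handling.

```python
import os
import tempfile


class EncodedTextFile:
    """Hypothetical shared helper; not an existing module_utils class."""

    def __init__(self, path, encoding="utf-8"):
        self.path = path
        self.encoding = encoding

    def read_lines(self):
        # newline="" keeps the file's own line endings intact.
        with open(self.path, "r", encoding=self.encoding, newline="") as f:
            return f.readlines()

    def write_lines(self, lines):
        # Write to a temporary file in the same directory, then swap it in.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "w", encoding=self.encoding, newline="") as f:
                f.writelines(lines)
            os.replace(tmp_path, self.path)
        except BaseException:
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "prices.txt")
    with open(path, "wb") as f:
        f.write("price=5\u20ac\n".encode("cp1252"))  # euro sign is byte 0x80

    target = EncodedTextFile(path, encoding="cp1252")
    lines = target.read_lines()
    lines.append("tax=1\u20ac\n")
    target.write_lines(lines)

    with open(path, "rb") as f:
        shared_result = f.read()
```

Each module would then keep its own matching logic and share only the encoding-aware read and write steps.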
