Several modules support a “backup:” parameter:
- ansible.builtin.copy
- ansible.builtin.template
- (others?)
- conspicuously absent: ansible.builtin.file
This is a very simple – dare I say, naive – mechanism which makes a backup copy of an existing file on a target host when making a change to that file.
Multiple raised issues, forum topics, and draft PRs address some aspect of handling these backup files. (Delete a file, create a backup and mark the task as changed when it is, make backup file names configurable #83536) Each of these addresses a valid issue, scratching someone’s particular itch.
However, fixing any one of these issues in isolation would be a mistake in my opinion. Such changes would only make a real fix messier to implement, since each of these issues is – I’m claiming – merely a symptom of the Real Problem™.
The Real Problem is that there’s no clear path to automation of the entire backup file life-cycle. The naive strategy of “Create a backup file including the timestamp information so you can get the original file back if you somehow clobbered it incorrectly” – to quote the copy module’s description of the backup: parameter – leads to an ever-increasing set of unmanaged files on each target host.
Sure, you can write additional tasks to “handle” such a backup file set. Every. Single. One. Even encapsulate it into a role that you can call – over and over and over. You could even make it flexible enough to deal with some of the issues alluded to above.
This is not something I want cluttering up my projects’ task files or my logs. I want it to work as transparently as the current modules’ “backup:” parameter works. What I really want is for the copy, template, or whatever module to follow a named “backup_policy” selectable at the task level with a parameter of the same name.
So what would such a named backup policy do? For starters, one named “default” would do exactly what specifying “backup: true” currently does. Other named policies would contain a subset of additional optional settings. For example:
backup_policies:
  - name: default          # Backwards compatible! Could be redefined though.
  - name: two_week_3_12    # naming things is hard
    retention:
      min: 3               # default is infinite
      max: 12              # default is undef/not specified
      span: 14d            # default is undef/not specified
    path: ../bkups         # default is '.'; could be relative or absolute.
Applying a policy at the task level might look like any of the following:
- name: Do the same old thing
  ansible.builtin.template:
    src: foo.j2
    dest: '{{ wherever }}/foo'
    backup: true           # backwards compatible - unless
                           # backup_policies.default has been set.
- name: Do a thing
  ansible.builtin.template:
    src: foo.j2
    dest: '{{ wherever }}/foo'
    backup: true
    backup_policy: two_week_3_12
- name: Do a different thing
  ansible.builtin.template:
    src: bar.j2
    dest: '{{ elsewhere }}/bar'
    backup: true
    backup_policy:
      name: two_week_3_12  # Apply this policy's parameters...
      max: 8               # but override 'max: 12' with this setting.
- name: Do an even more different thing
  ansible.builtin.template:
    src: baz.j2
    dest: '{{ undisclosed_location }}/baz'
    backup: true
    backup_policy:         # No 'name:'; we're augmenting 'default' policy.
      min: 4
      max: 8
      span: 30d
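To make the three task-level forms concrete, here is a rough Python sketch of how a module might turn a backup_policy value into effective settings. Everything in it is hypothetical: the function name, the policy table, and the choice to flatten the retention keys to the top level (so that a task-level 'max: 8' can override them directly) are my assumptions, not anything Ansible provides today.

```python
from typing import Any, Dict, Mapping, Optional, Union

# Hypothetical policy table mirroring the backup_policies example above.
# Retention keys are flattened to the top level for simplicity (an assumption).
BACKUP_POLICIES: Dict[str, Dict[str, Any]] = {
    "default": {},
    "two_week_3_12": {"min": 3, "max": 12, "span": "14d", "path": "../bkups"},
}


def resolve_backup_policy(
    param: Optional[Union[str, Mapping[str, Any]]],
    policies: Mapping[str, Mapping[str, Any]] = BACKUP_POLICIES,
) -> Dict[str, Any]:
    """Return the effective settings for a task-level backup_policy value."""
    if param is None:
        # No backup_policy given: plain 'backup: true' uses 'default'.
        name, overrides = "default", {}
    elif isinstance(param, str):
        # 'backup_policy: two_week_3_12' selects a named policy as-is.
        name, overrides = param, {}
    else:
        # A mapping: optional 'name' picks the base policy (else 'default'),
        # and every other key overrides that policy's setting.
        overrides = dict(param)
        name = overrides.pop("name", "default")
    if name not in policies:
        raise KeyError(f"unknown backup policy: {name!r}")
    return {**policies[name], **overrides}
```

With this reading, the "Do a different thing" task ends up with min 3, max 8, span 14d, and the "even more different thing" task ends up with whatever 'default' defines plus min 4, max 8, span 30d.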
I have not included any options for tweaking a backup’s filename template in these examples, but that could certainly be a part of it. (See the 2nd link above.) That complicates the bit of policy implementing deletion of old backup files, though. For instance, what if you configure the filename template such that ordering of backup files is not obvious, or that all backup files have the same name!? (That’s on you, I say!)
You have interesting resolution issues when either retention.min or retention.max conflicts with retention.span. My gut instinct is to go with whichever one retains the most backups. Feel free to change my mind.
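Here is one possible reading of "whichever one retains the most", sketched in Python: a backup is deleted only if every configured rule agrees it can go. The function and parameter names are made up for illustration, and I'm assuming the backups' timestamps have already been recovered somehow (from file names or mtimes).

```python
from datetime import datetime, timedelta
from typing import List, Optional, Sequence


def backups_to_keep(
    stamps: Sequence[datetime],
    now: datetime,
    min_keep: Optional[int] = None,
    max_keep: Optional[int] = None,
    span: Optional[timedelta] = None,
) -> List[datetime]:
    """Return the backup timestamps to retain, newest first."""
    newest_first = sorted(stamps, reverse=True)
    # Only 'max' and 'span' ever ask for deletion. With neither set we keep
    # everything, which matches today's unmanaged behavior.
    if max_keep is None and span is None:
        return newest_first
    keep = set()
    if min_keep is not None:
        keep.update(newest_first[:min_keep])   # floor: never drop these
    if max_keep is not None:
        keep.update(newest_first[:max_keep])   # 'max' would retain these
    if span is not None:
        cutoff = now - span
        keep.update(s for s in newest_first if s >= cutoff)  # inside the span
    return [s for s in newest_first if s in keep]
```

Under this sketch, daily backups with 'max: 12' and 'span: 14d' keep everything from the last 14 days (span wins), while a file changed only every few months with 'min: 3' and 'span: 14d' still keeps its three newest backups (min wins). A stricter reading, where max is a hard cap, is equally implementable; that is exactly the policy question being asked.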
And what about the path: setting, specifically when the specified path doesn’t exist?
Should the task fail outright, or is it a non-issue unless it tries to make a backup? Should it make an attempt to create the path, and fail the task if it can’t? My lean is to fail the task outright. Certainly not to make the behavior in this case configurable!
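For what it's worth, the "fail outright" option is the simplest one to implement. A minimal sketch, assuming a relative path is resolved against the directory containing dest (so the default '.' reproduces today's behavior); the function name is hypothetical:

```python
import os


def resolve_backup_dir(dest: str, policy_path: str = ".") -> str:
    """Resolve a policy's 'path' for a given dest, failing if it is missing."""
    base = os.path.dirname(os.path.abspath(dest))
    # os.path.join discards 'base' when policy_path is absolute, so both
    # relative and absolute settings go through the same code path.
    backup_dir = os.path.normpath(os.path.join(base, policy_path))
    if not os.path.isdir(backup_dir):
        # Checked up front, before we know whether a backup is even needed.
        raise FileNotFoundError(f"backup path does not exist: {backup_dir}")
    return backup_dir
```

Running the check before the module knows whether the file will change means a misconfigured policy shows up on the first run, not months later when a backup is finally needed.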
Making the backup file life-cycle manageable through named policies like this makes it simple for users to apply different policies to different tasks as appropriate. It gives a clean mechanism to address issues raised here in the Forum and on GitHub without breaking existing behavior. Users could simply define backup_policies.default to upgrade from legacy naive unmanaged backup files to rational life-cycle managed backup files across all tasks that currently set “backup: true”.
Thanks for reading. I’m looking forward to your thoughts and discussion.