Synchronizing directory structures

Hi all,

I am trying to figure out a nice way to synchronize a directory structure between my Ansible repository and my nodes, but it seems like there is no right tool for the job. My requirements are, on the one hand, clean interaction with password-authenticated become and control over owner, group, and file/directory mode and, on the other hand, deletion of files that are not present on the host. The latter seems unwanted inside builtin.copy, and the former is not supported with posix.synchronize (which seems not to have changed since Dan’s Cheat Sheet on the topic was written).

Is there a possibility I’m missing? Would it be worth trying to revisit the request for a delete flag for builtin.copy? I’d also be happy to take a stab at the implementation, if the answer is positive.

We are very unlikely to add delete to copy. I still recommend either using the existing synchronize module or having Ansible set up and trigger rsync/cdist/syncthing services.

If none of the above works for you, I would recommend creating a role that does the steps you need (mostly copy + find + file actions). You can use this as an example (Ansible Galaxy): while it does not copy/sync, it handles removing unwanted files. Note that a role or just the copy module will be much slower than the dedicated software I mention above.

I looked through the options, and it looks to me like none of them is up to the task:

  • rsync/cdist/syncthing requires additional services that I’d like to avoid for this seemingly simple task
  • the Ansible Galaxy example requires listing the files explicitly, which will become very hard to do in code once we move to a deeper directory layout
  • copy doesn’t support deletion
  • synchronize doesn’t work with become and can’t control permissions/owners

Despite these limitations, I think the right place to support this feature would be copy, since it’s closest to supporting all features. I looked at the code briefly, and I believe it should be possible with very little work thanks to filecmp.dircmp. So if you say “very unlikely to add”, do you mean it’s unlikely somebody will work on it, or it’s likely that a PR adding the feature would be rejected outright?
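To make the filecmp.dircmp idea concrete, here is a minimal sketch of how the files to delete could be found. It is not taken from the copy module; the function name extraneous_paths is hypothetical, and it assumes both trees are directories on the same machine (which is exactly why the source tree would first have to be transferred to the target).

```python
import filecmp
import os


def extraneous_paths(source_dir, target_dir):
    """Return paths (relative to target_dir) that exist only in target_dir.

    Hypothetical helper for illustration; a directory that exists only in
    the target is reported once, without listing its contents.
    """
    extras = []

    def walk(cmp, prefix):
        # right_only holds names present in the target but not in the source.
        for name in cmp.right_only:
            extras.append(os.path.join(prefix, name))
        # subdirs maps each common subdirectory name to a nested dircmp.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(source_dir, target_dir), "")
    return sorted(extras)
```

Calling extraneous_paths("files/app", "/etc/app") (both paths are placeholders) would give the list of entries a delete flag would have to remove.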

I cannot speak for the core team, but my guess is that it would likely be rejected since it increases the complexity of the copy action quite a bit (which is already pretty complex).

I think this would be a rather inefficient solution, since you first have to copy everything over to the target to a temp directory. It would be more efficient to first run a module which retrieves the file/directory list on the target with some information, maybe already passing some information in from the controller (like file / directory tree with file sizes so that the module can determine for which files a checksum should be computed), then decide which files to actually copy, and only copy these and then do the sync on the target.
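The decision step described here could look roughly like the sketch below. It is only an illustration under assumptions: the function name plan_transfer is made up, and both inputs are assumed to be plain mappings from relative path to file size, one built on the controller and one returned by a module run on the target.

```python
def plan_transfer(controller, target):
    """Split paths into copy / checksum-needed / delete lists.

    Both arguments map a relative path to a file size in bytes.
    Hypothetical helper; not part of Ansible.
    """
    to_copy = []        # missing on the target or size differs: must be sent
    need_checksum = []  # same size: only a checksum can show a content change
    for path, size in controller.items():
        if path not in target or target[path] != size:
            to_copy.append(path)
        else:
            need_checksum.append(path)
    # Anything the target has that the controller does not know about.
    to_delete = [path for path in target if path not in controller]
    return sorted(to_copy), sorted(need_checksum), sorted(to_delete)
```

Only the paths in the second list would need a checksum computed on the target, which keeps the number of module invocations and transfers fixed.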

This is a bit similar to how rsync works, except that you cannot work with a real RPC protocol (like the one rsync uses), but have to do it with a fixed number of transfers (the module invocations) and some file transfers in between.

It can control permissions/owners (because rsync can do so, and you have the rsync_opts option; it’s not very straightforward to use, though). Also it does work with some become methods (actually only with sudo).

The main disadvantage to ansible.posix.synchronize is that it only works with a fixed set of connections, with one specific become method (and it ignores most of the become method’s options), and it requires rsync to be set up.

Also it does work with some become methods (actually only with sudo).

I don’t think rsync supports sudo with password, does it? That’s the main issue I was facing.

It does not support that; it only supports prepending sudo or sudo -u user to the rsync command. (I’ve been successfully using it with password-less sudo in the past.)

That’s more a restriction of rsync though, I think, than one of ansible.posix.synchronize.

Edit: ah I thought you were talking about ansible.posix.synchronize, not about rsync. Yes, rsync does not support sudo with password (at least to my knowledge).
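For reference, the password-less case discussed above can be sketched as a plain rsync command line. This is an assumption about how the pieces map onto rsync flags, not a description of what ansible.posix.synchronize does internally: build_rsync_argv and its parameters are hypothetical, while --delete, --chown, --chmod and --rsync-path are standard rsync options (--chown needs a reasonably recent rsync).

```python
def build_rsync_argv(src, dest, owner=None, group=None, chmod=None,
                     delete=False, remote_sudo=False):
    """Build an rsync argument list (hypothetical helper for illustration)."""
    argv = ["rsync", "--archive"]
    if delete:
        # Remove files on the receiving side that the sender does not have.
        argv.append("--delete")
    if owner or group:
        # Force ownership on the receiving side; an empty part is left unmapped.
        argv.append("--chown={}:{}".format(owner or "", group or ""))
    if chmod:
        # e.g. "D750,F640" for directories and files respectively.
        argv.append("--chmod={}".format(chmod))
    if remote_sudo:
        # Prepend sudo to the remote rsync; this only works when sudo
        # does not prompt for a password.
        argv.append("--rsync-path=sudo rsync")
    argv += [src, dest]
    return argv
```

The resulting list could be passed to subprocess.run; the remote_sudo branch is where a sudo password prompt would have nowhere to go.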

We have rejected PRs that add this to copy … if it were possible I would even remove/revert the ‘recursive’ code in copy. Not only does it require 2 implementations, but the inherent complexity has been a source of many bugs over the years, and it is incredibly slow, as Ansible itself is not designed to handle replicating data efficiently.

The role I linked was an example. Yes, you need to pass the files to it, but using it as a basis and using the find action you could:

  • create the list and permissions of files to copy
  • create the list of existing files and permissions on target
  • use copy/file actions operating on the differences of both lists
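The third step, operating on the differences of both lists, can be sketched like this. The Entry type and diff_entries function are hypothetical names, and the sketch assumes each list has already been turned into a mapping from relative path to mode/owner/group. It deliberately ignores content changes; a checksum comparison would have to be added for that.

```python
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    """Attributes tracked per file (hypothetical structure)."""
    mode: str
    owner: str
    group: str


def diff_entries(wanted: Dict[str, Entry],
                 existing: Dict[str, Entry]) -> Dict[str, List[str]]:
    """Compare the desired file list with what is on the target."""
    # Missing on the target: candidates for the copy action.
    copy = sorted(p for p in wanted if p not in existing)
    # Present but with different mode/owner/group: candidates for file.
    fix_attrs = sorted(p for p in wanted
                       if p in existing and wanted[p] != existing[p])
    # Present on the target only: candidates for file with state=absent.
    remove = sorted(p for p in existing if p not in wanted)
    return {"copy": copy, "fix_attrs": fix_attrs, "remove": remove}
```

In a role, each of the three resulting lists would feed a loop over the matching copy or file task.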

Put this in a role and you probably have the solution you want; this basically recreates what the other tools do. But this will not perform well: rsync, cdist and solutions like them have been extremely optimized to handle this use case, for example by very efficiently detecting, computing and copying only the differences.

Thanks for the thorough explanations! Even if it is disappointing to hear, it’s understandable from a maintenance standpoint. Maybe getting sudo support with password into rsync could be a more realistic approach.

The synchronize action started as a naive implementation, complicated by a lot of updates that fixed ‘that case’ but not in good generic ways; that is why it only works with sudo/no password and has many ‘particularities’. You can have from 2 to 5 different servers involved, and the way rsync itself works makes it hard to use become on ‘both sides’ of the transfer.

I had thought of a solution, but I never had time to implement it. We can create 3 different actions to replace it: sync_to, sync_from, sync_between. This would both make the code for each MUCH simpler and make it behave much better with other plugins (connection/become).