RFC: rsync action plugin

With the introduction of action plugins, I decided to develop an rsync module to use with Ansible.

Users always have the option to run rsync directly with the command module and choose any of its many options. I wanted something that would keep things simple: options kept to a minimum, "smart" defaults, and as much of its operation as possible driven by the parameters and facts Ansible is already running with.

This is what I came up with so far and want to get some feedback from the list: https://github.com/tima/ansible-rsync/blob/master/lib/ansible/runner/action_plugins/rsync.py

I am assuming --archive operations only, using the temp directory created by Ansible and delaying updates (--delay-updates) until everything is complete so destinations are not left in some in-between state. All file transfers will be compressed and tunneled over SSH using the configured private key. The optional delete option will use --delete-after, again so that the destination is not left in some in-between state.
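As a rough sketch of those defaults (not the plugin's actual code; the function name and parameters are made up for illustration), the rsync argument list could be assembled like this. The temp directory handling is left out because it is not spelled out above how it is wired in.

```python
def build_rsync_args(src, dest, private_key=None, delete=False):
    """Assemble an rsync argument list using the defaults described above.

    Hypothetical helper for illustration; only standard rsync options are used.
    """
    # Archive mode, compression, and delayed updates are always on.
    args = ["rsync", "--archive", "--compress", "--delay-updates"]

    if delete:
        # Delete extraneous destination files only after the transfer finishes.
        args.append("--delete-after")

    # Tunnel over SSH, passing the private key when one is configured.
    ssh_cmd = "ssh"
    if private_key:
        ssh_cmd += " -i " + private_key
    args.append("--rsh=" + ssh_cmd)

    args.extend([src, dest])
    return args
```

The returned list can be handed to something like subprocess.call() without going through a shell, so paths with spaces do not need extra quoting.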

I implemented a mode attribute that specifies whether the files are being sent from the controller to a remote node (mode=copy, the default) or the other way around (mode=fetch). The mode attribute is only considered if the connection is not local. I used copy and fetch to match the existing modules in Ansible, though I would have preferred push and pull.
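To make the mode behaviour concrete, here is a minimal sketch of how src and dest might be rewritten depending on mode. It assumes that in fetch mode src refers to a path on the remote node and dest to a path on the controller; the function and its parameters are hypothetical, not taken from the plugin.

```python
def resolve_paths(src, dest, host, user=None, mode="copy", local=False):
    """Return (src, dest) with the remote side prefixed as user@host:path.

    Hypothetical helper: mode is ignored for local connections.
    """
    if local:
        # Local connection: both paths are on the same machine.
        return src, dest

    remote = host if user is None else user + "@" + host

    if mode == "copy":
        # Controller -> remote node: the destination is remote.
        return src, remote + ":" + dest
    if mode == "fetch":
        # Remote node -> controller: the source is remote.
        return remote + ":" + src, dest

    raise ValueError("mode must be 'copy' or 'fetch', got %r" % (mode,))
```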

Here is how you use it in its most basic form:

    action: rsync src=/path/to/src dest=/path/to/dest
  
To sync the local file system with a remote node:

    action: rsync mode=fetch src=/path/to/src dest=/path/to/dest

To clear out any files not in your source:

    action: rsync src=/path/to/src dest=/path/to/dest delete=true

By default rsync runs in quiet mode, but if you want feedback on what rsync did:

    action: rsync src=/path/to/src dest=/path/to/dest verbosity=1
  
The number given to verbosity corresponds to the number of 'v' flags passed to rsync. A value of 1 gives some output; 3 gives a lot of output that is only useful for debugging.
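The mapping from verbosity to rsync flags is simple enough to sketch. This is an illustration under the assumption that quiet mode means passing --quiet; the function name is hypothetical.

```python
def verbosity_flags(verbosity=0):
    """Translate a verbosity level into rsync flags (hypothetical helper)."""
    level = int(verbosity)  # playbook values arrive as strings
    if level <= 0:
        # Default: run rsync quietly.
        return ["--quiet"]
    # One 'v' per level, e.g. 3 -> "-vvv".
    return ["-" + "v" * level]
```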

One last feature I have implemented is one I needed at my day job: the ability to specify which rsync is run on the remote and where it lives. Similar to how Ansible deals with the location of the Python interpreter, the path to the rsync to use can be set with a host variable, ansible_rsync_path, in your inventory.
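One plausible way to honour that host variable is to turn it into rsync's --rsync-path option. The sketch below assumes the host variables are available as a plain dict; the function name is hypothetical and this is not necessarily how the plugin does it.

```python
def rsync_path_flags(host_vars):
    """Return the --rsync-path option if the host defines ansible_rsync_path.

    Hypothetical helper; host_vars is assumed to be a dict of inventory variables.
    """
    path = host_vars.get("ansible_rsync_path")
    if not path:
        # No override: let rsync find the remote binary on its own.
        return []
    return ["--rsync-path=" + path]
```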

I need to implement a way to declare the local version of rsync to use, but I'm not sure what's the right way for a plugin to implement a configuration param currently. (Michael?)

Other features I'd considered were logging and include/exclude file options, but I had questions on how to best implement those so I set them aside for now.

Thoughts?

<tim/>

Sent this yesterday but I didn’t see it go out to the list. Retrying.

Hi Tim, I won't be much help reviewing your plugin at this point
but I can tell you that having a simple, Ansible-integrated module
that covers basic, most commonly used rsync usage sounds like a great
idea to me. It seems this could be very useful for syncing large
directory trees of config files, etc.

Also, similar to you I'd need to be able to specify the path to rsync.

Thanks,

I very much like the idea.

I'm wondering whether it needs to be a server-side module, or whether
it should instead be a regular module?

I think this might be better as a regular module, using
local_action in the case where we want to transfer something to/from
the ansible-playbook machine to the remote. There are use cases for
wanting to rsync from other places.

We could include examples of how to do that in examples/playbooks

One caveat is you pretty much need to have your authorized_keys set up
when doing this, no matter how you do this. It's not going to work
with --ask-pass, etc, but we could just make a note of that in the
docs.

Hi, guys. Sorry for getting into the middle of discussion.

I have been playing with Ansible and the rsync plugin for a while. Here are
my 50 cents.

I believe rsync is not the right way to copy folders recursively. It is a
step away from compatibility and all the Ansible traditions.

I disagree.

It's true that if someone is using rsync to move config files, they're
missing the point and are doing config management wrong.

However, if you are transferring a large directory tree of static
content out to a webserver, because you didn't want to check it all
into git, that's not a bad idea at all.

If you are moving ISOs, git is not a good answer at all.

Rsync is very well optimized and ideal for this purpose.

It works only with ssh, rendering paramiko useless.

I think this is ok. Paramiko is a starter configuration. When
people need the connection reuse, kerberos, jump hosts, etc, they are
going to need to migrate to "-c ssh" anyway.

I'm a big fan of not reinventing the wheel and would like to avoid
maintaining a fileserver, and if you are transferring large directory
trees, rsync pretty much rocks. There's also rsync protocol, which
this could be made to support.

It requires the ssh private key to be provided as a readable file (possibly
even w/o password).
It is impossible to sudo rsync on the remote host. I've spent an hour trying
to do it, and in the end even adding --rsync-path 'sudo rsync' makes no
difference. I get "sudo: no tty present and no askpass program specified".

I agree that it won't work with sudo but this is *NOT* a reason to not
have it available.

Allowing rsync do things as root via sudo will be very, very useful.

Overall I find the project to be super-useful, but one thing stops me from
using it everywhere: I can't copy folders from my local computer.

Maybe we can somehow write a recursive 'file' module. This would be the right
way, I think.

If you do want this, and can't use rsync, using the git module is a
pretty good way to do this.

You could also use the mount module and NFS or Samba.

This isn't going to be the only way to do things, but it's a pretty
NICE way to do things.

And I think by doing this in a regular module you get some of the nice
syntax of modules, a bit better error checking, and some room for
future expansion.

--Michael

Thanks for the answer.

I agree that it won’t work with sudo but this is NOT a reason to not
have it available.

Okay, sorry, maybe I was too strict. Two ways are always better than no ways.

If you do want this, and can’t use rsync, using the git module is a
pretty good way to do this.

You could also use the mount module and NFS or Samba.

This isn’t going to be the only way to do things, but it’s a pretty
NICE way to do things.

Got it. Thanks.

On Sep 13, 2012, at 7:52, Michael DeHaan

And I think by doing this in a regular module you get some of the nice
syntax of modules, a bit better error checking, and some room for
future expansion.

I understand these benefits, but the flip side is that a regular module does not have access to the runner and its state as far as I know. I use that to grab the user name and private key to avoid passing redundant info.

<tim/>

I'd rather have a little redundancy than limit the capability.

We should just make the username available like we do things like
$inventory_hostname. $remote_user and $private_key_path seem
reasonable, and we can add them to the list of variables you get for
free (wherever that happens to be on the doc site).

OK I suppose. It diminishes one of my design goals to make it easier than using command with rsync. I suppose the ability to work node-to-node and avoid the admittedly awkward "mode" param is the value add.

I'm also utilizing the ansible temp directory that action plugins are passed. Any others?

<tim/>

You could probably still have something like a move_files module that
called rsync, we'd just indicate the limitations.

I think I was just taking issue with the idea that what we were
exposing was less than rsync and people would try different things
that might not actually work.

If naming is the issue I will change it. It doesn't have to be rsync. Perhaps just "sync" since "move" says something different.

If that's the issue then I need to...

* rename the module. keep it as an action plugin.
* create a patch to make remote user and private key accessible in playbooks for use with command rsync.

Anything else?

<tim/>

Let's call it 'sync_files' and we'll be set. It will be a little
confusing that you don't use it with local_action, but I think we can
just explain that with an example.

I'm still giving some thought to some of the things floated here (normal module implementation) etc.

Couldn't I implement the file sync function as an action plugin plus a normal module to get the best of both worlds? (The assemble module was my inspiration.) The action plugin would embellish the params passed in (adding the temp directory, adding user and host to dest or src) and then pass control to the normal module to do its thing either locally (local_action) or on a remote node. That would cut down on the number of params a user would have to provide while still getting the benefits of a normal module.
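A rough illustration of that split: the action-plugin side only fills in whatever the playbook left out and then hands the completed params to the normal module. Every name here (embellish_params and the keys of the state dict) is hypothetical and only stands in for whatever the runner actually exposes.

```python
def embellish_params(params, state):
    """Fill in defaults from runner state without overriding playbook values.

    params: dict of options given in the playbook (must contain src and dest).
    state:  dict standing in for runner state; keys are assumed for illustration.
    """
    merged = dict(params)  # never mutate the caller's dict

    # Only add values the user did not set explicitly.
    merged.setdefault("tmp_dir", state["tmp_dir"])
    if state.get("private_key"):
        merged.setdefault("private_key", state["private_key"])

    # Prefix dest with user@host unless one side already names a remote.
    if ":" not in merged["src"] and ":" not in merged["dest"]:
        remote = "%s@%s" % (state["remote_user"], state["host"])
        merged["dest"] = remote + ":" + merged["dest"]

    return merged
```

The normal module would then receive a fully specified set of params, whether it was invoked through the plugin or directly from a playbook.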

Sound right? What am I missing?

<tim/>

Sure.

They should probably be named differently so you can still access both maybe?

(assemble is actually going to be ported to be entirely a client
module, but that's beside the point)

I posted a pretty significant refactor of this.

There is an action plugin that does a lot of advanced param default processing based on Ansible's state. It figures out the host, remote user, and private key, and it detects an alternate path to rsync for the host, etc. That plugin is called synchronize.

I included a ("normal") module called rsync that is much more open-ended in that you set all the params in the playbooks yourself. There are few defaults, so you have more flexibility/power when working with rsync.

Take a look and let me know what you think.

http://github.com/tima/ansible-rsync

<tim/>

On Tuesday, September 18, 2012 at 5:41:56 AM UTC+8, Timothy Appnel wrote:

I posted a pretty significant refactor of this.

There is an action plugin that does a lot of advanced param default processing based on Ansible’s state. It figures out the host, remote user, and private key, and it detects an alternate path to rsync for the host, etc. That plugin is called synchronize.

I included a (“normal”) module called rsync that is much more open-ended in that you set all the params in the playbooks yourself. There are few defaults, so you have more flexibility/power when working with rsync.

Take a look and let me know what you think.

http://github.com/tima/ansible-rsync


Is this plugin recommended for use?
If so, how do I use it?