Copying a directory from one remote host to another

Hi all,

here’s a problem that’s been on my mind lately. I’m trying to set up a master-slave pair of Postgres servers using Ansible. Near the end of the process, I need to recursively copy a directory from the master to the slave. At the moment, while I still don’t have an automated solution, I log into the slave and run an rsync command, entering the password manually. This is the only step I’ve not been able to elegantly automate so far.

I’d like to pick the community’s brain for the most elegant and simple (not necessarily easy, but simple) solution. I’ve considered automating rsync with expect, and setting up passwordless SSH between the master and slave. The expect option feels really hacky: I’d have to ask the user for the password again (via vars_prompt, or can I access the password given to -k?), generate the script from a template, upload it, make sure expect is installed, execute it, and then remove it.

I’m not entirely sure how I’d go about setting up temporary SSH access from one machine to the other. I suppose it would have to go like this: generate (or use prepared) key pair, upload the public key to one machine (using the authorized_key module), upload the private key to the other machine, execute rsync, remove both keys (don’t think I’d like to leave the passwordless connection set up).

Am I missing a really simple alternative? Has anyone dealt with this issue? How did you approach it, and with what success?

Thanks in advance.

One possible other option, which may or may not work well depending on the
size of what you need to move, is rsyncing everything up to the ansible
server from the master, then rsyncing it all down to the slave.

Another alternative is to tar the directory, fetch to the ansible server and then copy back down to the other postgres server and untar it.

Depending on what kind of access/traffic you’re willing to allow from the slave to the master, you could set up an rsyncd.conf on the master and run an rsync daemon there (lock down access to the IPs of the slaves only, or however you want to restrict things; that way it works without a password). Then use the rsync protocol to pull the data down via the Ansible “command” module against the slave. You could even make it idempotent, assuming you know what files will be created, e.g.

  - name: get slave files from master
    command: rsync -a master::rsyncmodule/stuff /dir/on/slave creates=/dir/on/slave/somefile

If you’re uncomfortable giving that access from slave to master, you could do it in reverse (push from master → slave via rsyncd running on the slave), but I’m not sure how to make that idempotent in a “simple” way.

matt

Thanks for the suggestions, Romeo, Lester, Matt.

I think I’m leaning towards just tarring up the directory, fetching the tar and pushing the tar to the slave. Rsyncing master-ansible and then ansible-slave still runs into the issue of passwords if I’m using -k instead of ssh-agent/passwordless SSH.

The idea of temporarily popping up an rsync daemon that allows passwordless access on the master is interesting. I’d have to spawn a new rsync daemon process using a custom rsyncd.conf (also taking into consideration there might already be an rsync daemon running, for whatever reason, on the default port), do the sync, and stop the daemon. Is there an Ansible idiom for doing this sort of thing? (Starting something up in one task, then stopping it in a later task, with the thing not being a service but just a process started by shell/command.) I could use an inline shell command that would start the server up and echo the PID into a temp file, then later feed that to kill, but this is getting kind of elaborate.
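For what it’s worth, the start-then-stop bookkeeping described above can be done with two plain shell tasks. Here’s a rough sketch of what those tasks would run (the config and PID file paths are invented for illustration; --no-detach keeps rsync in the foreground so $! really is the daemon’s PID when we background it ourselves):

```shell
# Task 1: start a one-shot rsync daemon on a non-default port and
# remember its PID in a temp file.
rsync --daemon --no-detach --port=1873 --config=/tmp/rsyncd-oneshot.conf &
echo $! > /tmp/rsyncd-oneshot.pid

# ... the sync runs against port 1873 from the other host here ...

# Task 2 (later in the play): stop the daemon and clean up.
kill "$(cat /tmp/rsyncd-oneshot.pid)"
rm -f /tmp/rsyncd-oneshot.pid /tmp/rsyncd-oneshot.conf
```

Recording the PID explicitly sidesteps guessing with pkill, which matters if another rsync daemon happens to be running on the box.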

I'd be very tempted to just use the git module if it's not a lot of binary data.

What's the use case?

Another possibility that I thought of today while driving to the grocery
store :slight_smile: is (temporarily?) installing paramiko on the master (or slave)
server and having ansible push down a simple python script, using the
script module, which does the rsync through ssh for you. That way you don't
have to set up keys, etc. Just an idea.

Hmm, I wonder if there's a use case for a paramiko Ansible module?

I would find that a little weird.

I would find that a little weird.

Well, they can't all be winners, right? :wink: Though, I do think the paradigm
of being able to direct one minion to do something on another minion via
ssh, without keys, would be a useful ability. Maybe that's just installing
ansible on the minions and having ansible on the master call ansible on
the minion, though. I'll need to think about it some more.

Romeo

I had, at one point, considered creating a task-only ephemeral
fileserver that required access using certain tokens, but I thought
it was a bit of a distraction and I don't want to maintain bonus
security systems.

While I understand the idea of using --ask-pass and so on, this is
really what locked SSH keys and ssh-agent excel at.

Hi guys, thanks again for the comments.

It’s not a lot of data, just a freshly initialized Postgres database. (Almost no rows, just empty tables, indexes etc.) My particular use case, as mentioned in the OP, is setting up Postgres streaming replication (step 6: http://wiki.postgresql.org/wiki/Streaming_Replication), which involves jump starting the replication process by making a base backup.

How would the git module help here? I’m thinking, on the master:

ensure git is installed
git init
git add
git daemon upload-archive
punch a hole in the firewall

then on the slave:

ensure git is installed
git archive

then shut down the daemon on the master, remove the firewall hole and delete the .git directory. (Or set up a script that automatically shuts it down and cleans up after one minute.)
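In case it helps, the git-daemon route sketched above might look roughly like this in shell (the paths, host name, and repo name are all invented; note that upload-archive is disabled in git daemon by default, so it has to be enabled explicitly):

```shell
# On the master: turn the data directory into a throwaway repo and serve it.
cd /var/lib/pgsql/data            # hypothetical data directory
git init -q .
git add -A
git -c user.name=ansible -c user.email=ansible@localhost \
    commit -qm "base backup for replication"
git daemon --base-path=/var/lib/pgsql --export-all \
           --enable=upload-archive &
echo $! > /tmp/git-daemon.pid

# On the slave: stream a tarball of HEAD straight into the data directory.
git archive --remote=git://master/data HEAD | tar -x -C /var/lib/pgsql/data

# Back on the master: stop the daemon and remove the repo metadata.
kill "$(cat /tmp/git-daemon.pid)" && rm -f /tmp/git-daemon.pid
rm -rf /var/lib/pgsql/data/.git
```

git archive has the nice property that the slave never materializes a .git directory at all, so cleanup on that side is a no-op.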

That might work. Could also work with hg and the hg serve command.

If we’re brainstorming, wouldn’t the fireball semi-daemon be good for things like this? It seems to already be able to deal with time-based auth and auto shutdown. If the ephemeral daemon could serve rsync (or scp, or a custom protocol that supports recursive directory fetching with excludes) requests, that would be just great. (Hopefully all the dependencies are available on CentOS :slight_smile:)

FYI, fireball file transfer is, right now, very un-optimized for large
file transfer.

Really the super easiest way is still:

local_action: command rsync -avz /where/from user@$ansible_hostname:/where/to

of course using keys.

we could of course explore that fileserver option but I'd want to run
some benchmarks. It's hard to beat rsync!

--Michael

Here’s an idea: expand on the authorized key module, and find a way to set up passwordless SSH between two nodes temporarily (during a playbook, during an Ansible run, for a temporary time like the fireball daemon?). Instead of implementing a file server, just allow the full SSH arsenal to function between nodes unhampered during deployment.

This could probably be done via normal tasks (and a handler to clean up?), but I’m thinking a module (or modules) could make this sort of thing easier and more robust.

Linking two remote nodes (node 1 needs to have access to node 2):

  1. generate a temporary, throw-away SSH key pair, keep it in memory (or temp file?). No passphrase.
  2. using the authorized_key module, install the public key on node 2
  3. on node 1, back up the default identity if it exists, install the generated identity as the default identity
  4. let the playbook(s) do their thing
  5. clean up

The clean up step is the tricky one. What the clean up should do is: get rid of the authorized key installed in step 2, delete the installed identity, and reinstate the default identity backed up in step 3. This should be fairly robust. Going by the fireball process, two temporary processes could be spawned on the hosts, with a timeout and hardcoded rules for what to do when the timeout expires. The clean up should be performed even if the playbooks fail during step 4, Ansible loses connectivity during any of the steps, or the user terminates Ansible mid-run.
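A shell sketch of the linking steps (the key type, host names, and paths are illustrative only; using ssh -i for the transfer would also sidestep the backup/restore of the default identity in step 3 entirely):

```shell
# 1. Generate a throwaway, passphrase-less key pair in a temp dir.
tmpdir=$(mktemp -d)
ssh-keygen -t rsa -N '' -q -f "$tmpdir/throwaway_key"

# 2. Install the public key on node 2 (the authorized_key module would do
#    this in Ansible; shown as a plain append purely for illustration):
#    ssh node2 'cat >> ~/.ssh/authorized_keys' < "$tmpdir/throwaway_key.pub"

# 3./4. On node 1, point ssh/rsync at the key explicitly instead of
#    replacing the default identity:
#    rsync -a -e "ssh -i $tmpdir/throwaway_key" /data/ node2:/data/

# 5. Clean up: remove the key from node 2's authorized_keys
#    (authorized_key with state=absent), then discard the pair.
rm -rf "$tmpdir"
```

Passing the key via -i shrinks what cleanup has to get right: there is no default identity to reinstate, only one line to strip from authorized_keys and one temp dir to delete.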

Does this sound interesting or have I gone off the deep end here?

How about using tar cz | nc on one host, with a delegate_to for host #2 running nc | tar xz?

Having recently written a (non-Ansible) shell script to perform this kind of one-shot file transfer between remote hosts, I agree with Michael. A temporary rsync daemon on one end and an rsync command on the other is the simplest approach that offers speed, recovery from an interrupted transfer, and ease of setup/teardown. It’s simpler than I thought it would be when I started to write my script.

To illustrate, here are the main steps my script performs on Ubuntu 10.x/12.x:

  • On the sending machine (db master), create a temporary rsync config file with a suitable name (e.g., /tmp/send-to-slave.conf). The file has only 3 lines:
    [db-image]
    path = /path/to/master/image/dir
    use chroot = no

  • On the sending machine, start rsync in daemon mode using the above config file and a custom port. Note: no need for ‘&’ at the end of the command:
    rsync --daemon --port=1873 --config=/tmp/send-to-slave.conf

  • On the receiving machine (db slave), run rsync to pull the files:
    rsync -av rsync://sending-machine:1873/db-image/ /path/to/slave/image/dir

When the transfer is done, an ordinary kill or pkill command works to end the rsync daemon, and a pgrep command works to verify it exited.

Since the port number is not the usual rsync port (873), there’s little risk of conflict with a normal rsync service daemon. Since the port number is above 1024, the one-shot rsync daemon doesn’t have to run as root.
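The teardown step might look like this (the config path matches the example above; the pkill pattern is deliberately specific so an unrelated rsync service isn’t killed by accident):

```shell
# Stop only the one-shot daemon, matching on its unique config path.
pkill -f 'rsync --daemon.*send-to-slave.conf'

# pgrep exits nonzero when nothing matches, so this verifies it is gone.
if pgrep -f 'send-to-slave.conf' > /dev/null; then
    echo "rsync daemon still running" >&2
else
    echo "rsync daemon stopped"
    rm -f /tmp/send-to-slave.conf
fi
```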

-Greg

Rsync over SSH makes more sense -- it's secure.

This is what local_action rsync already does.

Thanks for the input, guys.

Since my needs are very simple, I’m leaning towards the netcat option (very clever!). On the server side, timeout 30 tar cz . | nc -lp 2222 (maybe wrap it with commands to open a firewall port and close it), and the reverse on the client side. I don’t mind it being insecure and inefficient in this case, and the amount of data is very low. I like that it’s reasonably clean (no temporary files, shuts down after one request or the timeout). If I needed anything more complex I’d have gone with the temp rsync daemon, also wrapped in a timeout command.
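For reference, a sketch of both ends of that pipeline (paths are invented; the flag syntax assumes traditional netcat, where OpenBSD's nc would want `nc -l 2222` instead). One subtlety: wrapping only tar in timeout leaves nc uncovered, so sh -c is used to put the whole pipeline under the timeout:

```shell
# Sender (master): serve one gzipped tarball on port 2222, giving up
# after 30 seconds if nobody connects. sh -c puts the *whole* pipeline
# under timeout; otherwise only tar would be covered and nc could hang.
timeout 30 sh -c 'tar cz -C /var/lib/pgsql/data . | nc -l -p 2222'

# Receiver (slave), run while the sender is listening:
nc master 2222 | tar xz -C /var/lib/pgsql/data
```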

I think this might be it, nice ideas.

Tin:

I might be missing something here, but can’t you use SSH agent forwarding to achieve this? You’d need to use the “ssh” connection instead of paramiko.

At the risk of hijacking a thread, another line of reasoning about the OP’s
use case is to provide a mechanism to supply passwords the way that
ssh-agent provides tokens. Yes, it is different in that the private key never
leaves the ssh-agent process.

The issue with ask-pass is really that it is manual.

There was some discussion of integrating a keyring here:

https://github.com/ansible/ansible/pull/2184

This is a bit like the situation with secure passwords in a browser
that leads to a solution like LastPass.

Just a thought.

ssh-agent and keys are great. Just saying.