I hope someone can offer some guidance on how to solve a scenario I’m trying to automate.
The objective is to take a back-up of a directory on the target (managed) machine and save it to the controller host (where ansible runs).
Now, the problems:
I feel I cannot use synchronize, as it has its own access rules (it requires ssh public keys, unlike the rest of the ansible modules).
The back-up needs to be done as a privileged account (but ansible must log in as an unprivileged user that is able to do sudo or su). That more or less prevents me from using a local_action/shell doing sshpass -p XXXX ssh privileged@host "cd directory_to_back_up && tar -czf - *" > local_backup_file.tar.gz
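To make it concrete, the kind of task I’m ruling out would look roughly like this (the password variable and paths are just placeholders); it streams the tar over ssh so nothing extra is written on the target:

    - name: stream a tar of the directory back to the controller
      delegate_to: localhost
      shell: >
        sshpass -p "{{ privileged_password }}" ssh privileged@{{ inventory_hostname }}
        "cd /directory_to_back_up && tar -czf - ." > /backups/{{ inventory_hostname }}.tar.gz

The problem is that it needs a direct ssh login as the privileged account, bypassing ansible’s own sudo/su handling.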
There’s not enough space on the target machine to create a tar file of the directory that could later be fetched.
The size of the back-up is huge, which precludes writing the tar file to stdout, storing it in a registered variable and then writing it locally.
Any ideas on how to handle this?
So far I suspect I might need to write my own module, perhaps looking at how fetch and command are implemented, so that I can write the output of a command to a local file without registering anything.
Why use Ansible? Something like rsync would seem to be the right tool, though you might need to use Ansible to install rsync on the target system.
rsync has the advantage (since you say the backup is huge) of being able to restart a failed transfer (look for “partial” in the man page) and of being able to transfer only changed items if the backup is going to be a regular event; this massively reduces the size and duration of backup operations after the first.
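As a sketch (user and paths are placeholders), a restartable pull would be something like:

    rsync -az --partial fred@target:/directory_to_back_up/ /backups/target/

-a preserves permissions and timestamps (and ownership, when the receiving side runs as root), --partial keeps partially transferred files so an interrupted run can resume, and repeat runs only copy what has changed.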
I’m not sure what “synchronize” is; if you mean rsync, then I suggest that the benefits to be gained will more than offset the investment in figuring out how to use it (though in my experience it is extremely simple to use).
Because rsync can use ssh, any user on the local system can access any user, privileged or otherwise, on the target system, provided the local system user’s public key is in the authorized_keys file for the relevant user on the target system.
synchronize is the module in ansible wrapping rsync.
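For reference, pulling the directory back to the controller with it would look roughly like this (paths are placeholders):

    - name: pull the directory back to the controller
      synchronize:
        mode: pull
        src: /directory_to_back_up/
        dest: /backups/{{ inventory_hostname }}/

but it still relies on rsync-over-ssh access between the two hosts, which is where my problem starts.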
The systems I’m dealing with are configured (due to security policies) to not allow SSH logins as the root user. That precludes connecting as root with either ssh or rsync.
Also, ideally I’d like to keep permissions and ownership (and the ansible controller does not really share the same users as the target nodes), so I would need to store it as an archive, because that archive might later be used to bootstrap another host. (but yes, I could do
Running this through ansible would bring some advantages. One is the internal knowledge of facts and introspection (e.g. I would run this backup only on hosts that meet some learned criteria). Another is the abstraction: there are different systems with different OS levels and restrictions (some have sudo, others can only do su), and ansible knows about them and how to connect the right way (most of the time).
Sorry, can’t figure out how to reply to your reply. Anyway: if you have the ability to do so, adjust the sshd config on the target to allow the root user to log in, but only from localhost (127.0.0.0/8 or ::1). This is exactly as secure as allowing su or sudo, so it does not breach policy.
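As a sketch (check the directives against your sshd version), that would be something like this at the end of sshd_config, followed by a reload of sshd:

    PermitRootLogin no
    Match Address 127.0.0.0/8,::1
        PermitRootLogin yes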
Then set up ssh port forwarding via a non-root account on the target host. It should forward local connections on the controller host on (say) port 2222, to port 22 on localhost on the target. Finally, do your rsync to the host by connecting to the local port 2222. That will end up connecting across the tunnel to a root login on localhost on the target. From the target host’s point of view, it will be a local connection and thus permissible.
I have no idea whether Ansible can cope with such a two-step, but it’s a relatively common pattern for ssh generally. Assuming you are an ordinary user on the controller and there is an ordinary user fred on the target system:
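Something along these lines (untested; check the exact options against the man pages):

    # on the controller: open the tunnel via fred and keep it in the background
    ssh -f -N -L 2222:localhost:22 fred@target

    # on the controller: pull the directory as root through the tunnel
    rsync -az --partial -e "ssh -p 2222" root@localhost:/directory_to_back_up/ /backups/target/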
Not sure about that rsync command; you’d have to read the man page for getting ssh to use a different port. The point is that if you connect to TCP port 2222 on the controller, you are actually connecting to TCP port 22 on the target.
Put your (the controller user’s) public key (not fred’s!) in the target root user’s ~/.ssh/authorized_keys and in fred’s ~/.ssh/authorized_keys, and the whole thing can be passwordless.
If you are unable to get ssh root logins to localhost on the target, then I’m stumped too.