Speeding up a task

Hi Guys,

I need advice on speeding up this task:

- name: "restore files to {{ location }}"
  ansible.builtin.copy:
    src: "/home/ian/backup/hobson42/var/www/ianhobson.com/"
    dest: "{{ location }}"

It copies 5022 files, totalling 109.2 MB, and even though both machines have NVMe SSDs and the network is quiet,
the task takes over 30 minutes! Note: some of the files are owned by root and have "600" permissions, which need to be preserved.

How do I go about this in a secure way?

I was thinking of setting up passwordless keyfile access for root, and then using command: sudo rsync, delegated to localhost, and tearing root access down again afterwards. How would I do this using Ansible?

Ideas and advice welcome.

Ian

Hi Ian,

Look at https://docs.ansible.com/ansible/latest/collections/ansible/posix/synchronize_module.html

This incorporates rsync and may well do what you need substantially faster.
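
For illustration, a minimal sketch of how the copy task from the original post might look with synchronize. The paths and the location variable are taken from that post. The become line is an assumption, on the basis that restoring root-owned files needs privilege on the target. By default src is read on the machine running Ansible, so that account must be able to read the root-only files there, which this sketch does not address.

```yaml
- name: "restore files to {{ location }} with rsync"
  ansible.posix.synchronize:
    # trailing slash: copy the directory's contents rather than the directory itself
    src: "/home/ian/backup/hobson42/var/www/ianhobson.com/"
    dest: "{{ location }}"
    # archive is the module default; shown here to make the intent explicit
    archive: true
  # assumption: privilege on the target so ownership and modes can be restored
  become: true
```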

Cheers

When in trouble, or in doubt
Run in circles, scream and shout

I'd recommend avoiding copying thousands of individual files.

Ideas to try:
- Trigger rsync [1] if most of the files are static
- Pack the files into a single tar.gz and transfer and extract that [2][3]

Ciao, Michael.

[1] https://docs.ansible.com/ansible/latest/collections/ansible/posix/synchronize_module.html

[2] https://docs.ansible.com/ansible/latest/collections/community/general/archive_module.html

[3] https://docs.ansible.com/ansible/latest/collections/ansible/builtin/unarchive_module.html#ansible-collections-ansible-builtin-unarchive-module

Use the *ansible.posix.synchronize* module if you can install *rsync*
on both the local and the remote host. See
https://docs.ansible.com/ansible/latest/collections/ansible/posix/synchronize_module.html
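
As a rough sketch of that prerequisite, the two tasks below install rsync on the managed host and on the machine running Ansible. The package name rsync is an assumption that holds on most distributions, and become assumes the connecting user may escalate on both machines.

```yaml
- name: make sure rsync is present on the managed host
  ansible.builtin.package:
    name: rsync          # assumption: the package is called rsync on this distribution
    state: present
  become: true

- name: make sure rsync is present on the control node
  ansible.builtin.package:
    name: rsync
    state: present
  become: true
  delegate_to: localhost # run on the machine where ansible is invoked
  run_once: true         # once is enough, however many hosts are in the play
```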

Hi Michael,

Your idea triggered the best solution I have found. Thanks.

This is my solution - for people who discover this later.

io.hcs holds the backup we are trying to restore.
sinope.hcs is where we are restoring to.
callisto.hcs is where ansible is run.

There is a (read only) samba share on io.hcs that is permanently open on callisto.hcs, but can't be used to copy the files over, because the permissions get altered. But it can be used to copy the archive over. tar maintains ownership and permissions as default behaviour.

Therefore the tasks are:

   - name: get IO to tar up the website
     community.general.archive:
       path: "/home/ian/BackupFiles/hobson42/var/www/ianhobson.com/"
       dest: /home/ian/BackupFiles/ianhobson.com.gz # will appear in share as /home/ian/backup/ianhobson.com.gz
     delegate_to: io.hcs

   - name: create {{ location }} directory
     ansible.builtin.file:
       path: "{{ location }}"
       state: directory

   - name: move archive over to target, and extract it.
     ansible.builtin.unarchive:
       src: /home/ian/backup/ianhobson.com.gz
       dest: "{{ location }}"

   - name: remove archive
     ansible.builtin.file:
       path: /home/ian/BackupFiles/ianhobson.com.gz
       state: absent
     delegate_to: io.hcs

Result is a speed up of between 60 and 100 times!
The files are restored with the correct permissions and ownership. I think it is necessary that all users are set up on all machines with the same user numbers in /etc/passwd.
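
On that last point, one possible way to keep the IDs aligned is to create the accounts from Ansible with a pinned UID on every machine. This is only a sketch: the account name and UID below are hypothetical placeholders, not values from my setup. Pinning both the name and the number should cover either way tar may match owners on extraction.

```yaml
- name: create the site owner with the same UID on every machine
  ansible.builtin.user:
    name: siteowner   # hypothetical account name
    uid: 1501         # hypothetical UID; use the value from the backup host
  become: true
```

Run against all of the hosts involved, this would give the account the same entry in /etc/passwd everywhere.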

Regards
Ian
