Stumped on how to restore AWX backup to minikube running on mac laptop.

Hi,

I have an AWX-operator deployment running and I am testing my disaster recovery procedures. I have made a backup of my AWX operator deployment and moved the backup files to my laptop:

[red@BP22006.local tower-openshift-backup-2023-04-14-16:06:26]$ pwd
/Users/russell.cecala/AWX/BACKUP_RESTORE/rocky/pvc-37ccc9bf-1ffb-4565-bd31-5c3eb1ad21cb_awx_awx-demo-backup-claim/tower-openshift-backup-2023-04-14-16:06:26
[red@BP22006.local tower-openshift-backup-2023-04-14-16:06:26]$ ls -l
total 164360
-rw-r--r-- 1 red staff 1118 Apr 17 09:57 awx_object
-rw-r--r-- 1 red staff 13164 Apr 17 09:57 secrets.yml
-rw-r----- 1 red staff 72039972 Apr 17 09:57 tower.db

I think now I need to create a PVC on my laptop’s minikube system like so …

$ cat laptop-pvc.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: awx-backup
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

$ kubectl apply -f laptop-pvc.yml
persistentvolumeclaim/awx-backup created

But when I do a describe on the PV I see a Path that does not exist!

$ kubectl describe pv pvc-3910cc84-e2bd-47cb-9830-8b13516cd56f | grep Path
Annotations: hostPathProvisionerIdentity: ced76979-2cc1-4903-891e-9a30e656cf5b
Type: HostPath (bare host directory volume)
Path: /tmp/hostpath-provisioner/default/awx-backup
HostPathType:

$ ls -l /tmp/hostpath-provisioner/default/awx-backup
ls: /tmp/hostpath-provisioner/default/awx-backup: No such file or directory

Do I need to create that path first?
Please help. I am getting very confused and cannot figure out how to do a simple restore from a backup.

I’m pretty sure that you need to create the path first.
As well as the PVC, you also need to define a PV.
Have a look at the pv.yaml and pvc.yaml files in https://github.com/kurokobo/awx-on-k3s/tree/main/base.
The storageClassName links the pvc to the pv.
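To illustrate how storageClassName ties the two objects together, here is a minimal sketch of a hostPath PV and a matching PVC, loosely following the layout of the files linked above. The names awx-backup-volume and awx-backup-claim, the 1Gi size, and the /data/backup path are placeholders chosen for this example and are not taken from that repository. Note that a hostPath is resolved on the Kubernetes node, so with minikube on a Mac the directory has to exist inside the minikube node rather than on the laptop's own filesystem.

```shell
# Sketch only: writes a PV and a PVC that share a storageClassName.
# All names, sizes and the hostPath below are hypothetical placeholders.
cat > awx-backup-pv-pvc.yaml <<'EOF'
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: awx-backup-volume
spec:
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 1Gi
  storageClassName: awx-backup-volume
  hostPath:
    path: /data/backup
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: awx-backup-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: awx-backup-volume
EOF
echo "wrote awx-backup-pv-pvc.yaml"
```

The file could then be applied with kubectl apply -f awx-backup-pv-pvc.yaml, after creating /data/backup inside the node (for example from a minikube ssh session). The PVC is namespaced, so it needs to be created in the namespace where it will be used.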

Thanks for the reply Micheal, and thanks for the link to your github. Lots of good info there.
Please let me show you here what I am doing and maybe you can see where I am going wrong.

Here are the files I need to restore.

bpadmin@minikube:~/RESTORE/FILES$ ls -l
total 70376
-rw-r--r-- 1 bpadmin bpadmin 1118 Apr 17 13:13 awx_object
-rw-r--r-- 1 bpadmin bpadmin 13164 Apr 17 13:13 secrets.yml
-rw-r----- 1 bpadmin bpadmin 72039972 Apr 17 13:13 tower.db

I have created the /data directories as you suggested:

sudo mkdir -p /data/postgres-13
sudo mkdir -p /data/projects
sudo chmod 755 /data/postgres-13
sudo chown 1000:0 /data/projects

I have created the PVC and PV.

bpadmin@minikube:~$ kubectl get pv
NAME                     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                        STORAGECLASS          REASON   AGE
awx-postgres-13-volume   8Gi        RWO            Retain           Available                                awx-postgres-volume            10m
awx-projects-volume      2Gi        RWO            Retain           Bound       default/awx-projects-claim   awx-projects-volume            10m
bpadmin@minikube:~$ kubectl get pvc
NAME                 STATUS   VOLUME                CAPACITY   ACCESS MODES   STORAGECLASS          AGE
awx-projects-claim   Bound    awx-projects-volume   2Gi        RWO            awx-projects-volume   10m

Now here is where I think I go wrong.

I copy my backup files to the /data dirs:

bpadmin@minikube:~$ tree /data
/data
├── postgres-13
│   └── tower.db
└── projects
    ├── awx_object
    └── secrets.yml

Then I do

bpadmin@minikube:~$ kubectl apply -k restore
error: must build at directory: not a valid directory: evalsymlink failure on 'restore' : lstat /home/bpadmin/restore: no such file or directory

But as you see I get an error message. I really don’t get what I am supposed to do. Any help would be great. :slight_smile:

How did you set up AWX and generate a backup?

First, thanks for the reply :slight_smile:
My source system is an awx-operator running on a 3-node k3s cluster. Basically it is the awx-demo deployment used in the tutorial for setting up awx-operator.
I created a backup of this AWX server by creating this backup-awx.yml file:

$ cat backup-awx.yml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWXBackup
metadata:
  name: awxbackup-2023-04-13
  namespace: awx
spec:
  deployment_name: awx-demo

And then applied that file like so:
kubectl apply -f backup-awx.yml

This created the backup files on one of my k3s nodes:

[root@rocky-k3-2 ~]# cd /var/lib/rancher/k3s/storage/pvc-37ccc9bf-1ffb-4565-bd31-5c3eb1ad21cb_awx_awx-demo-backup-claim/tower-openshift-backup-2023-04-14-16:06:26
[root@rocky-k3-2 tower-openshift-backup-2023-04-14-16:06:26]# ls -l
-rw-r--r-- 1 root root 1118 Apr 14 09:06 awx_object
-rw-r--r-- 1 root root 13164 Apr 14 09:06 secrets.yml
-rw-rw---- 1 root root 72039972 Apr 14 09:06 tower.db

I then set up a completely new system with minikube on it so I can test these backup files and my restore procedure.
I scp’ed the awx_object, secrets.yml and tower.db files to the new system. I want to load my backup files into my new minikube cluster so I can show the backup works.

Hi @RedCrick,

A bit of context first: The main use case for the AWXBackup and AWXRestore objects is to take a backup you can restore from on the same cluster, typically during upgrades. Before you do an upgrade, take a backup, and if the upgrade causes unanticipated issues, you can always delete the AWX CR and restore from the AWXBackup object (which has knowledge of the backup PVC and correct backup directory on it).

So a typical user would have an AWXRestore object like this, where the “backup_name” is the name of the AWXBackup object.


apiVersion: awx.ansible.com/v1beta1
kind: AWXRestore
metadata:
  name: restore1
  namespace: awx
spec:
  deployment_name: awx-new
  backup_name: awxbackup-2023-04-13

However, it seems like you want to migrate to a different cluster entirely. For this, you have a few options I know of:

  • Use the migration logic, which requires both clusters to be up simultaneously. Docs here.
  • OR do what you are attempting to do, which would allow you to restore your AWX even if your original cluster is not available (assuming you have the contents of the backup somewhere accessible, as you described).
  • Alternatively, you could achieve this by taking an infrastructure as code approach by: A) using the awx.awx import/export modules (downside is credentials won’t be exported), or B) define infrastructure as code using the awx-resource-operator, a k8s-native way to define the resources (projects, jobtemplate, credentials, etc.) in your AWX instance. The resource operator is not fully feature complete yet, but is rapidly approaching it.
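A rough sketch of what the second option might look like: as far as I recall from the operator's restore role documentation, an AWXRestore can point at a pre-created PVC and a backup directory on it instead of an AWXBackup object. The field names backup_source, backup_pvc and backup_dir are written from memory and should be confirmed against the CRD of the installed operator version (for example with kubectl explain awxrestore.spec). The restore name restore-from-pvc and the PVC name awx-backup-claim are hypothetical, and the /backups/ prefix is an assumption about where the role mounts the PVC; the directory name is the backup folder shown earlier in the thread.

```shell
# Sketch only: an AWXRestore that reads from a PVC rather than an AWXBackup object.
# backup_source / backup_pvc / backup_dir are assumed field names; verify them with
#   kubectl explain awxrestore.spec
cat > restore-from-pvc.yml <<'EOF'
---
apiVersion: awx.ansible.com/v1beta1
kind: AWXRestore
metadata:
  name: restore-from-pvc
  namespace: awx
spec:
  deployment_name: awx-demo
  backup_source: PVC
  backup_pvc: awx-backup-claim
  backup_dir: /backups/tower-openshift-backup-2023-04-14-16:06:26
EOF
echo "wrote restore-from-pvc.yml"
```

For this to have a chance of working, the PVC would need to live in the same namespace as the AWXRestore and already contain the awx_object, secrets.yml and tower.db files inside that directory.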

Thank you Christian,

I must have done something wrong when I tried applying the AWXRestore object YAML you suggested.
What do you think my chances are if, on my minikube cluster, I create an AWXBackup, overwrite the backup files it creates with the backup files from my old AWX server, and then do a restore, so as to imitate the main use case?

Christian,
thank you for clarifying the use of the AWXBackup and AWXRestore objects.

I had been thinking about how I might use the AWXBackup resource to implement scheduled backups but my knowledge of Kubernetes is not great, so I started looking for other options.

There are a few.
Two in particular stood out.

One is called Velero.
This is an open source project and ongoing development is very active: 37 open PRs, 400+ issues, and the most recent commit was yesterday.
Backups are stored in the cloud.

The other one is Kasten K10.
This is a commercial product but it does have a free version that you can use with up to 5 nodes.
However it comes with a fairly comprehensive EULA that must be accepted before you can start to use it.
I’m waiting on feedback from our legal eagle before proceeding.
One thing that I do like about it is that, in addition to using cloud storage, you can also store backups in NFS storage.

Hope this helps.

Hi All, I tried my approach of “faking out” the AWXBackup/AWXRestore use case.

I believe I deleted the postgres PVC and the awx-demo “CR’s”. I deleted the PVC with kubectl delete pvc postgres-13-awx-demo-postgres-13-0 -n awx, but it only ever went to “Terminating” status. I also deleted the awx-demo deployment. Then I copied the backup files onto my minikube host, at

/home/bpadmin/.local/share/docker/volumes/minikube/_data/hostpath-provisioner/awx/awx-demo-backup-claim/tower-openshift-backup-2023-04-19-02*

Then I created an AWXRestore like so:

bpadmin@minikube:~/RESTORE/KOBO$ kubectl get awxbackup -n awx
NAME                   AGE
awxbackup-2023-04-18   14h
bpadmin@minikube:~/RESTORE/KOBO$ cat restore-awx.yml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWXRestore
metadata:
  name: restore1
  namespace: awx
spec:
  deployment_name: awx-demo
  backup_name: awxbackup-2023-04-18

$ kubectl apply -f restore-awx.yml

Then I watched pods come and go …

bpadmin@minikube:~/RESTORE/KOBO$ date ; kubectl get pods -n awx
Wed 19 Apr 2023 09:43:04 AM PDT
NAME READY STATUS RESTARTS AGE
awx-demo-postgres-13-0 1/1 Running 0 36m
awx-operator-controller-manager-7d79f6f96d-r2gr7 2/2 Running 3 (70m ago) 15h

bpadmin@minikube:~/RESTORE/KOBO$ date ; kubectl get pods -n awx
Wed 19 Apr 2023 09:43:11 AM PDT
NAME READY STATUS RESTARTS AGE
awx-demo-postgres-13-0 1/1 Running 0 36m
awx-operator-controller-manager-7d79f6f96d-r2gr7 2/2 Running 3 (70m ago) 15h
restore1-db-management 0/1 ContainerCreating 0 4s

bpadmin@minikube:~/RESTORE/KOBO$ date ; kubectl get pods -n awx
Wed 19 Apr 2023 09:44:04 AM PDT
NAME READY STATUS RESTARTS AGE
awx-demo-757b674d65-gskgf 4/4 Running 0 35s
awx-demo-postgres-13-0 1/1 Running 0 37m
awx-operator-controller-manager-7d79f6f96d-r2gr7 2/2 Running 3 (71m ago) 15h
restore1-db-management 1/1 Running 0 56s

Then I try to see what’s going on on my EE container …

bpadmin@minikube:~/RESTORE/KOBO$ kubectl exec -it awx-demo-757b674d65-gskgf -n awx -c awx-demo-ee -- bash
bash-5.1$ ls -l /var/lib/awx/projects/
total 0
bash-5.1$ ls -la /var/lib/awx/projects/
total 8
drwxrwxrwx 2 root root 4096 Apr 19 16:43 .
drwxr-xr-x 3 root root 4096 Apr 19 16:43 ..

bash-5.1$ command terminated with exit code 137
bpadmin@minikube:~/RESTORE/KOBO$ date ; kubectl get pods -n awx
Wed 19 Apr 2023 09:44:52 AM PDT
NAME READY STATUS RESTARTS AGE
awx-demo-postgres-13-0 1/1 Running 0 38m
awx-operator-controller-manager-7d79f6f96d-r2gr7 2/2 Running 3 (71m ago) 15h
restore1-db-management 1/1 Running 0 104s
bpadmin@minikube:~/RESTORE/KOBO$

And this cycle keeps happening. Is there some way I can tell if the restore will work or if something has gone awry?

Hi RedCrick,

You could inspect the awx-operator logs to see the error. I bet the restore role has a failing task that is causing the management pod to keep coming up. Can you paste any errors here?
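Since the operator log is long, here is a small sketch of one way to narrow it down to the Ansible failure lines. The function name failed_tasks is made up for this example, and the awx-manager container name in the usage comment is an assumption about how the operator pod is laid out; kubectl describe pod will show the real container names.

```shell
# Sketch only: keep just the Ansible failure lines from operator log text on stdin.
failed_tasks() {
  grep -E 'FAILED!|fatal:' || true   # "|| true": finding no match is not an error here
}

# Hypothetical usage (pod name from this thread, container name assumed):
#   kubectl logs awx-operator-controller-manager-7d79f6f96d-r2gr7 -n awx -c awx-manager | failed_tasks
```

Matching lines can be very long, because the operator may log a whole Ansible run as a single record.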

Thanks,
AWX Team

I am not sure I am grabbing the correct info but here is what I see:

bpadmin@minikube:~$ kubectl logs awx-operator-controller-manager-7d79f6f96d-r2gr7 -n awx

… lots and lots of log output …

TASK [restore : Restore database dump to the new postgresql container] ********* task path: /opt/ansible/roles/restore/tasks/postgres.yml:84\nfatal: [localhost]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": true}\n PLAY RECAP ********************************************************************* localhost : ok=47 changed=10 unreachable=0 failed=1 skipped=13 rescued=0 ignored=0 \r\n\n", "name": "restore1", "namespace": "awx", "ts": 1681927674.66927 }

Do you think that has something to do with the fact the old postgres PVC delete job never did terminate?

Ok. I was finally able to figure out how to get rid of the old postgres PVC.
I had to delete the statefulset that was attached to it like so:

$ kubectl delete statefulset awx-demo-postgres-13 -n awx

But now I see a different failure message in the controller logs.

$ kubectl logs -f awx-operator-controller-manager-7d79f6f96d-r2gr7 -n awx


[localhost]: FAILED! => {"changed": true, "failed_when_result": true, "rc": 1, "return_code": 1, "stderr": "pg_restore: error: connection to database \"awx\" failed: connection to server at \"awx-demo-postgres-13.awx.svc.cluster.local\" (10.244.0.9), port 5432 failed: FATAL: password authentication failed for user \"awx\"\n", "stderr_lines": ["pg_restore: error: connection to database \"awx\" failed: connection to server at \"awx-demo-postgres-13.awx.svc.cluster.local\" (10.244.0.9), port 5432 failed: FATAL: password authentication failed for user \"awx\""], "stdout": "", "stdout_lines": []}\n\r\nPLAY RECAP *********************************************************************\r\nlocalhost : ok=47 changed=10 unreachable=0 failed=1 skipped=13 rescued=0 ignored=0 \r\n\n","job":"3619264603841225764","name":"restore1","namespace":"awx","error":"exit status 2"

Looks like ansible is not able to connect to the postgres DB. How can I fix that?

What does kubectl -n awx get statefulset return?

bpadmin@minikube:~$ kubectl -n awx get statefulset
NAME                   READY   AGE
awx-demo-postgres-13   1/1     23h

Can you connect to the container by doing
kubectl -n awx exec -it awx-demo-postgres-13 -c postgres -- bash

Not awx-demo-postgres-13 … but I can exec onto awx-demo-postgres-13-0.

admin@minikube:~$ kubectl -n awx exec -it awx-demo-postgres-13-0 -c postgres -- bash
root@awx-demo-postgres-13-0:/#

And I can get into the db shell …

root@awx-demo-postgres-13-0:/# psql -U awx
psql (13.10 (Debian 13.10-1.pgdg110+1))
Type "help" for help.
awx=#

My bad.
Can you connect to the database by doing
psql -d awx -U <DB_USERNAME> -W

Yep I can see these databases …

root@awx-demo-postgres-13-0:/# psql -U awx
psql (13.10 (Debian 13.10-1.pgdg110+1))
Type "help" for help.
awx=# \l
                              List of databases
   Name    | Owner | Encoding |  Collate   |   Ctype    | Access privileges
-----------+-------+----------+------------+------------+-------------------
 awx       | awx   | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres  | awx   | UTF8     | en_US.utf8 | en_US.utf8 |
 template0 | awx   | UTF8     | en_US.utf8 | en_US.utf8 | =c/awx           +
           |       |          |            |            | awx=CTc/awx
 template1 | awx   | UTF8     | en_US.utf8 | en_US.utf8 | =c/awx           +
           |       |          |            |            | awx=CTc/awx
(4 rows)
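One caveat about this check: in the stock postgres container image, connections made from inside the pod over the local socket are typically trusted without a password, so psql -U awx succeeding there does not show that the password is correct. The restore connects over the network, where the password is checked. Below is a sketch of reading the password the operator would use, assuming it is stored in a secret named awx-demo-postgres-configuration under the key password (the secret name follows the usual deployment-name convention and should be confirmed with kubectl -n awx get secrets). The function name pg_secret_password is made up for this example.

```shell
# Sketch only: print the postgres password stored in a Kubernetes secret.
# $1 = namespace, $2 = secret name (both supplied by the caller).
pg_secret_password() {
  kubectl -n "$1" get secret "$2" -o jsonpath='{.data.password}' | base64 -d
}

# Hypothetical usage, followed by a login from inside the postgres pod that goes
# through the service name, so the password is actually verified (-W prompts for it):
#   pg_secret_password awx awx-demo-postgres-configuration
#   psql -h awx-demo-postgres-13 -U awx -d awx -W
```

If that login is rejected with the password from the secret, the database on the volume was probably initialised with a different password than the one carried in the restored secrets.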

I just noticed that the error output above includes
failed: connection to server at \"awx-demo-postgres-13.awx.svc.cluster.local\" (10.244.0.9), port 5432 failed: FATAL
There’s no ‘-0’ in it, I’m not sure if that is relevant or not.