Increasing Replicas from 1 to 2 Fails

AWX version: 21.9.0
awx-operator version: 1.1.0

I have awx-operator deployed in a k3s multi-master cluster. I increased the number of replicas in the AWX custom resource from 1 to 2. The awx deployment and replicaset are scaled up correctly.
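
For reference, a change like this would look roughly as follows in the AWX custom resource. This is a minimal sketch, not the actual manifest from this cluster: the resource name and namespace are placeholders, and only the `replicas` field matters here.

```yaml
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx        # placeholder name (assumption, not from this thread)
  namespace: awx   # placeholder namespace (assumption, not from this thread)
spec:
  replicas: 2      # previously 1
```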

The new awx pod gets scheduled but fails to initialize.

Attached are the awx-manager logs. Those seem to indicate a problem with the “Apply deployment resources” TASK.

The kubectl events for the failed pod are in the same attachment.

My PVCs are also set to RWX.

(attachments)

troubleshooting_replica_Increase.txt (52.6 KB)

Hello,
It sounds like the project’s PVC may be the culprit. Project persistence is not required. When the pods come back up they will pull your projects anew from SCM. Perhaps try reconfiguring this to not require project persistence and see if that helps. Please let us know if it does.

This does sound like it may be a bug with the project’s persistence feature. Is there an issue associated with this report?

-AWX Team

Thank you for the suggestion! I will give this a go in the morning and will report back.

Changing project persistence to false did the trick. I now have two instances deployed, and both are reporting healthy when looking at them from within the AWX GUI.
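
A minimal sketch of that change, assuming the setting in question is the operator's `projects_persistence` field in the AWX spec. The resource name and namespace are placeholders rather than values from this thread.

```yaml
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx        # placeholder name (assumption)
  namespace: awx   # placeholder namespace (assumption)
spec:
  replicas: 2
  # With persistence off, no projects PVC is shared between the pods;
  # each pod pulls projects from SCM again after a restart.
  projects_persistence: false
```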

The only thing I am seeing now is that the PVC for the Postgres pod is complaining about a failed mount:

Warning FailedMount 60s (x487 over 16h) kubelet MountVolume.MountDevice failed for volume “pvc-790c4558-4718-4a5f-93ac-df7fa8383988” : rpc error: code = FailedPrecondition desc = volume pvc-790c4558-4718-4a5f-93ac-df7fa8383988 requires shared access but is not marked for shared use

I don’t seem to be seeing any adverse effects from the above. The GUI is still accessible, and control plane jobs have run on both instances. Normal jobs executed in the default ContainerGroup are working just fine.

Do I just ignore this or is something just waiting to leap out and bite me?

Thank you for the help!

Yeah, you can probably ignore the warning. I'm wondering what “kubectl describe pvc/pvc-790c4558-4718-4a5f-93ac-df7fa8383988” returns. What is this PVC intended for?

This is the postgres-13 PVC.

There are no events in the describe output for either the PV or the PVC.

PVC and PV output attached.

(attachments)

PVC_Troubleshooting.txt (2.01 KB)

Although benign, the warning might be something we wish to fix. Would you mind opening an issue in awx-operator for this? Thanks!

AWX Team