I’m trying to deploy AWX with awx-operator. When I apply my kustomization, everything starts up, but then all the pods are terminated and only postgres is restarted. The awx-task and awx-web deployments are left with replicas set to 0, and I have to manually scale them to 1 before their containers start. If I make any change to my config and reapply, the same thing happens. How can I get it to keep awx-task and awx-web running?
K8s 1.27 (k3s)
Kustomize.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Find the latest tag here: https://github.com/ansible/awx-operator/releases
  - github.com/ansible/awx-operator/config/default?ref=2.19.1
  # - secrets.yaml
  - tls.yaml
  - awx.yaml
# Set the image tags to match the git version from above
images:
  - name: quay.io/ansible/awx-operator
    newTag: 2.19.1
# Specify a custom namespace in which to install AWX
namespace: sea
This may be unrelated, but when I deleted my namespace and tried to recreate it, awx-task won’t start at all. It waits endlessly for database migrations, but no migration job is ever created, so it seems completely stuck.
{"level":"error","ts":"2024-09-17T14:21:20Z","msg":"Reconciler error","controller":"awx-controller","object":{"name":"awx","namespace":"sea"},"namespace":"sea","name":"awx","reconcileID":"1fa7a53a-ca3a-46db-9049-dcd78f6e1cbb","error":"event runner on failed","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227"}
I think this is the problem, but I don’t know what to do about it.
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'web_manage_replicas' is undefined. 'web_manage_replicas' is undefined. 'web_manage_replicas' is undefined. 'web_manage_replicas' is undefined\n\nThe error appears to be in '/opt/ansible/roles/installer/tasks/resources_configuration.yml': line 248, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Apply deployment resources\n ^ here\n"}
I saw an issue report on GitHub saying this was a problem when upgrading from 2.18 to 2.19, but I didn’t upgrade from 2.18; I installed 2.19.1 fresh. If I install version 2.18 instead, I don’t get this error, and the deployments get scaled back up after the initial “migration”.
Hi, I see the same error trying to go from 2.15.0 to 2.19.1. I tried adding web_manage_replicas: true to the spec in my awx-operator Helm deployment, but it still says the variable is not defined.
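For comparison, when AWX is defined as a plain custom resource (as in the awx.yaml from the kustomization earlier in the thread) rather than through Helm values, the field would sit directly under spec. This is only a sketch: the resource name awx and namespace sea are taken from the operator log above, and the rest of the spec is omitted. As the reply notes, setting the field explicitly may have no effect on the error.

```yaml
# Sketch of an AWX custom resource with the field set explicitly.
# Assumption: name and namespace match the "awx"/"sea" object in the log above.
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: sea
spec:
  # The variable named in the "is undefined" error message
  web_manage_replicas: true
```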
P.S. The awxs.awx.ansible.com CRD needs to be forcefully upgraded, or AWX completely redeployed.
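One way to force the CRD upgrade might be a small CRD-only kustomization applied server-side. This is a sketch, not a confirmed fix: it assumes the config/crd directory of the awx-operator repository at the same 2.19.1 ref used earlier in the thread, and the separate crd-only/ directory is hypothetical.

```yaml
# crd-only/kustomization.yaml (hypothetical location, kept apart from the main kustomization)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  # Assumption: pulls only the CRD manifests, at the same ref as the operator image
  - github.com/ansible/awx-operator/config/crd?ref=2.19.1
```

Applying it with `kubectl apply --server-side --force-conflicts -k crd-only/` should overwrite the CRD definitions stored in the cluster. Re-applying the main kustomization afterwards would then let the operator reconcile against the updated CRDs. No namespace is set here because CRDs are cluster-scoped.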