I’m running the AWX operator on OpenShift. It controls 5 instances and is tracking the current release. The operator upgraded itself, and my instances now report themselves as AWX 23.8.0. That part is fine: everything functions as expected except the follow function.
Running jobs no longer “follow” correctly. When I launch a job, the GUI switches to the job page and the status button reports running, as it should. I can watch the automation pod start, run, and stop in OpenShift. The whole time, though, the job page shows no output and the status button says the job is running, and it stays in that state indefinitely. If I switch to the Details tab, the status button still says running. If I then switch back to the Output tab, the accurate job output appears, showing success and closure, but the status button is still spinning and still says running.
If I go back to the Jobs tab, the job correctly shows as successful. If I open the job again, the status button reports successful. Everything works except the follow function. I’ve restarted the deployments and bounced the pods.
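For anyone who wants to confirm the same thing outside the GUI, here is a minimal sketch that asks the AWX REST API for the job's status directly. It assumes the standard `/api/v2/jobs/<id>/` endpoint and an OAuth2 token sent as a Bearer header; the host name, job ID, and token are placeholders, and the helper names are my own, not anything shipped with AWX.

```python
import json
import urllib.request

# Job states in which AWX is done with the job.
FINAL_STATES = {"successful", "failed", "error", "canceled"}


def job_url(base_url, job_id):
    """Build the job detail URL for a given AWX base URL and job ID."""
    return f"{base_url.rstrip('/')}/api/v2/jobs/{job_id}/"


def is_finished(job_payload):
    """Return True if the job payload reports a final status."""
    return job_payload.get("status") in FINAL_STATES


def fetch_job(base_url, job_id, token):
    """Fetch the job detail as a dict (token is a placeholder OAuth2 token)."""
    req = urllib.request.Request(
        job_url(base_url, job_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


# Example call (hypothetical host, job ID, and token):
# job = fetch_job("https://awx.example.com", 42, "MY_TOKEN")
# print(job["status"], is_finished(job))
```

If the API reports a final status while the job page is still spinning, that would point at the live update path in the GUI rather than at the job run itself.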
{"level":"error","ts":"2024-02-20T18:52:19Z","msg":"Reconciler error","controller":"awx-controller","object":{"name":"awx-i-sid","namespace":"awx-operator"},"namespace":"awx-operator","name":"awx-i-sid","reconcileID":"637da168-d0cb-47e0-ab9e-c50c339e4c5a","error":"event runner on failed","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.
We are trying a new method of installing AWX (as a tenant on OpenShift). We are starting with 2.12.1, so this will eliminate that variable. I am also going to use the more familiar way of creating secrets, which will remove another variable. I suspect the way the secrets were created is what is breaking the reconcile. Time will tell.