Hello all,
We did a big upgrade of AWX from 15.0.1 to 23.1.0.
To do so, I dumped the DB, then created a new Postgres cluster, database, and user, and granted that user access to the database.
Afterwards, I used the AWX Operator in OpenShift 4 to create an AWX instance.
After a lot of back and forth I was able to get AWX up and running. I can log into it the same way I do in my old environment (using creds stored in AD).
Sending curl requests to add a host to an inventory works, and moving through the UI seems to work, but my projects aren’t syncing reliably.
I’ve seen cases where a sync will succeed, but the majority of sync attempts fail.
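For anyone who wants to reproduce the API check mentioned above, a host-creation request can be sketched in Python as follows. This is not the exact curl command that was run: the base URL, inventory ID, host name, and token are all placeholders, and the helper name `build_add_host_request` is made up for illustration. The sketch assumes an OAuth2 token sent as a Bearer header against the `/api/v2/inventories/<id>/hosts/` endpoint.

```python
import json
import urllib.request


def build_add_host_request(base_url, inventory_id, host_name, token):
    """Build (but do not send) a POST that adds a host to an AWX inventory."""
    # Hypothetical helper: all arguments are placeholders supplied by the caller.
    url = f"{base_url.rstrip('/')}/api/v2/inventories/{inventory_id}/hosts/"
    body = json.dumps({"name": host_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


if __name__ == "__main__":
    req = build_add_host_request(
        "https://awx.example.com", 1, "host01.example.com", "PLACEHOLDER_TOKEN"
    )
    # Show what would be sent; call urllib.request.urlopen(req) to actually send it.
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes it easy to inspect the URL and payload before touching a live instance.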
The job output isn’t much use. Here it is in its entirety:
Enter passphrase for /var/tmp/awx_89945_1pdcl45c/artifacts/89945/ssh_key_data:
Identity added: /var/tmp/awx_89945_1pdcl45c/artifacts/89945/ssh_key_data (awx@awxpoc-web-8f97cb7fc-khxrp)
PLAY [Update source tree if necessary] *****************************************
TASK [Update project using git] ************************************************
The logs from the task pod:
2023-09-26 16:06:24,921 INFO [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 controller node chosen
2023-09-26 16:06:24,921 INFO [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 execution node chosen
2023-09-26 16:06:25,066 INFO [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 waiting
2023-09-26 16:06:25,421 INFO [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 pre run
2023-09-26 16:06:25,440 INFO [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 preparing playbook
2023-09-26 16:06:25,512 INFO [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 running playbook
2023-09-26 16:06:25,537 INFO [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 work unit id received
2023-09-26 16:06:25,577 INFO [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 work unit id assigned
2023-09-26 16:06:25,812 INFO [-] awx.main.wsrelay Producer 10.129.6.253-schedules-changed has no subscribers, shutting down.
2023-09-26 16:06:31,119 INFO [11066f637064479eb13a7e75a49b5e86] awx.main.commands.run_callback_receiver Starting EOF event processing for Job 89949
2023-09-26 16:06:31,123 INFO [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 post run
2023-09-26 16:06:31,417 INFO [11066f637064479eb13a7e75a49b5e86] awx.analytics.job_lifecycle projectupdate-89949 finalize run
2023-09-26 16:06:31,423 WARNING [11066f637064479eb13a7e75a49b5e86] awx.main.dispatch project_update 89949 (failed) encountered an error (rc=None), please see task stdout for details.
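To see how long each stage of the project update took, the `awx.analytics.job_lifecycle` lines above can be turned into a timeline. The following is a minimal sketch that assumes the log format is exactly as pasted (timestamp, level, bracketed ID, logger name, job, state); the function name `lifecycle_timeline` is hypothetical, and lines from other loggers are skipped.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2023-09-26 16:06:25,512 INFO [<id>] awx.analytics.job_lifecycle projectupdate-89949 running playbook
LIFECYCLE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \S+ \[[^\]]*\] "
    r"awx\.analytics\.job_lifecycle (?P<job>\S+) (?P<state>.+)$"
)


def lifecycle_timeline(lines):
    """Return (seconds since first event, job, state) for each job_lifecycle line."""
    events = []
    for line in lines:
        m = LIFECYCLE.match(line.strip())
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, m["job"], m["state"]))
    if not events:
        return []
    start = events[0][0]
    return [((ts - start).total_seconds(), job, state) for ts, job, state in events]


if __name__ == "__main__":
    import sys

    # Usage: pipe the task pod log into this script on stdin.
    for offset, job, state in lifecycle_timeline(sys.stdin):
        print(f"{offset:8.3f}s  {job}  {state}")
```

Run against the excerpt above, this makes it easy to see that the gap between "running playbook" and "post run" is only a few seconds, which helps narrow down where the update stops.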
This is what I see on the web pod:
2023-09-26 16:08:02,219 INFO [afd1ae5ddf624e02a06e123b25ad9b19] awx.analytics.job_lifecycle projectupdate-89950 created
10.130.5.200 - - [26/Sep/2023:16:08:02 +0000] "POST /api/v2/projects/527/update/ HTTP/1.1" 202 2265 "https://awx.apps.ocpazt001.csx.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36" "10.92.172.184"
[pid: 244|app: 0|req: 27/3147] 10.130.5.200 () {70 vars in 2450 bytes} [Tue Sep 26 16:08:01 2023] POST /api/v2/projects/527/update/ => generated 2265 bytes in 456 msecs (HTTP/1.1 202) 15 headers in 635 bytes (1 switches on core 0)
If I rsh into the task or web pod and run a git clone manually, it succeeds.
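A manual clone from an interactive shell can behave differently from a job that runs without a terminal, so a non-interactive check may be a closer comparison. The sketch below is only an assumption about what might be worth testing, not a diagnosis: it builds a `git ls-remote` call with ssh in batch mode, so any passphrase or credential prompt turns into an error instead of a hang. The function names, repository URL, and key path are hypothetical placeholders.

```python
import os
import shlex
import subprocess


def build_ls_remote(repo_url, key_path):
    """Return (argv, env) for a non-interactive `git ls-remote` over SSH."""
    env = dict(os.environ)
    # BatchMode makes ssh fail instead of prompting for a passphrase or password.
    env["GIT_SSH_COMMAND"] = (
        f"ssh -i {shlex.quote(key_path)} -o BatchMode=yes -o IdentitiesOnly=yes"
    )
    # Stops git itself from prompting for credentials on a terminal.
    env["GIT_TERMINAL_PROMPT"] = "0"
    return ["git", "ls-remote", "--heads", repo_url], env


def check_repo(repo_url, key_path, timeout=30):
    """Run the check and return (return code, stderr) for inspection."""
    argv, env = build_ls_remote(repo_url, key_path)
    result = subprocess.run(
        argv, env=env, capture_output=True, text=True, timeout=timeout
    )
    return result.returncode, result.stderr.strip()


if __name__ == "__main__":
    # Placeholder values; substitute the project's SCM URL and a readable key file.
    rc, err = check_repo("git@git.example.com:org/repo.git", "/tmp/example_key")
    print("rc:", rc)
    print("stderr:", err)
```

If this fails inside the pod while an interactive clone succeeds, the difference is likely in how the key or credentials are supplied rather than in network reachability.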
I’ve also tried creating a project with new creds, but the issue is the same.
Thanks,
Shawn