Hi All,
We have an AWX deployment with two web and task replicas. Several jobs are scheduled with the “Update revision on job launch” option enabled. For these scheduled jobs, some source control updates fail without any error message. On investigation, we found that the update's log output stops midway and the update ends in a failed status. Please refer to the attached screenshot for further details.
Additionally, the corresponding playbook job also fails, and its log shows “Failure Explanation: Previous Task Failed”. On closer inspection of the playbook logs, the issue appears to be related to a missing resource in the Execution Environment. Interestingly, the same job runs successfully at its next scheduled time.
This started happening suddenly, without any upgrades or modifications on our side.
AWX version: 24.2.0
Please refer to the attached screenshots for more information.
I am also attaching the pod logs from OCP for reference.