Hi
We are testing AWX's ability to handle concurrent Ansible jobs. We ran a single Ansible job against 1,000 hosts with various fork counts (100, 200, 500, etc.). The AWX server is well provisioned (8 CPUs and 64 GB of memory). The job is a simple one whose shell command is "echo start; sleep 20; echo".

In the results, AWX_web intermittently shows no response from some of the 1,000 hosts, so the whole job gets stuck in a running state. While it is in this state, the next job in the AWX queue does not get executed either. Since the job completes on almost all of the hosts and only a few never report back, there should be ample capacity for the next Ansible job to run. It appears that the way AWX calculates node capacity is preventing other AWX jobs from being executed.

Can you please tell us how the maximum capacity and the used capacity are calculated? Also, in the UI, the used capacity value that is calculated at the start of the job does not change at any point during the execution of the Ansible job.
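
For reference, the test job can be approximated outside AWX as an Ansible ad hoc command. This is only a rough sketch: the inventory path `hosts.ini` and the fork count are placeholder assumptions, and the actual test was launched as an AWX job, not from the command line. The script only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Shell payload of the test job, as quoted in this post.
PAYLOAD='echo start; sleep 20; echo'

# Hypothetical values: replace with your own inventory path and fork count.
INVENTORY="${INVENTORY:-hosts.ini}"
FORKS="${FORKS:-100}"

# Print the ad hoc command instead of running it.
# -i selects the inventory, -f sets the number of forks,
# -m picks the module and -a passes its arguments.
echo "ansible all -i $INVENTORY -f $FORKS -m shell -a \"$PAYLOAD\""
```

Changing `FORKS` to 200 or 500 corresponds to the other fork counts we tried.
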
Jae Kim