We recently moved from AWX v15 to AWX v21, and some of my users are complaining that jobs take too long to begin compared to the previous version.
For instance, when nothing else is being executed, a job takes about 20 seconds to begin in AWX v21, whereas the same job can take about 3 seconds to begin in AWX v15. Both environments use an external PostgreSQL database.
I understand that creating the execution environment takes some time (even with the pull policy set to “just pull when missing”), but I don’t know whether those times are expected or whether we should tune our resources to improve performance.
What are your opinions on this? Do you think those times are normal? Do you have any advice to try to improve the performance?
a job takes about 20 seconds to begin in AWX v21, whereas the same job can take about 3 seconds to begin in AWX v15
Do you have project updates running before the job begins?
Some overhead is expected when going from v15 to v21 because of the execution environment feature. The added benefits of execution environments are worth the overhead:
flexibility of unique environments to run jobs in
not having to run the task container in privileged mode, which is a security issue for many users
better horizontal scaling across multiple compute nodes within a k8s cluster
It does take time for the pod to spin up and enter the Running state. The event logs from this pod may help show whether the pod needed to pull the image, whether it was scheduled on a node, and so on.
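To make the pod events easier to read, here is a minimal Python sketch that turns the output of `kubectl get events -o json` into a timeline of seconds since the first event for one pod. It assumes the standard Kubernetes Event fields (`involvedObject.name`, `reason`, `eventTime`, `lastTimestamp`). The pod names and timestamps in the sample are made up for illustration and are not taken from a real cluster, and `startup_timeline` is just a name chosen for this sketch.

```python
import json
from datetime import datetime

# Hypothetical sample shaped like `kubectl get events -o json` output.
# Pod names and timestamps are invented for illustration only.
SAMPLE_EVENTS = """
{"items": [
  {"involvedObject": {"name": "automation-job-1234-abcde"}, "reason": "Started",
   "eventTime": null, "lastTimestamp": "2022-09-01T10:00:04Z"},
  {"involvedObject": {"name": "automation-job-1234-abcde"}, "reason": "Scheduled",
   "eventTime": "2022-09-01T10:00:00.000000Z", "lastTimestamp": null},
  {"involvedObject": {"name": "automation-job-1234-abcde"}, "reason": "Pulled",
   "eventTime": null, "lastTimestamp": "2022-09-01T10:00:02Z"},
  {"involvedObject": {"name": "automation-job-1234-abcde"}, "reason": "Created",
   "eventTime": null, "lastTimestamp": "2022-09-01T10:00:03Z"},
  {"involvedObject": {"name": "some-other-pod"}, "reason": "Started",
   "eventTime": null, "lastTimestamp": "2022-09-01T09:59:00Z"}
]}
"""


def parse_timestamp(value):
    """Parse an RFC 3339 UTC timestamp such as 2022-09-01T10:00:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def event_time(event):
    """Return the event's timestamp, whichever field the cluster filled in."""
    raw = (event.get("eventTime")
           or event.get("lastTimestamp")
           or event.get("firstTimestamp"))
    return parse_timestamp(raw) if raw else None


def startup_timeline(events_json, pod_name):
    """Return [(reason, seconds_since_first_event)] for one pod, oldest first."""
    items = json.loads(events_json).get("items", [])
    timeline = []
    for event in items:
        if event.get("involvedObject", {}).get("name") != pod_name:
            continue  # skip events that belong to other objects
        timestamp = event_time(event)
        if timestamp is not None:
            timeline.append((timestamp, event.get("reason", "")))
    timeline.sort(key=lambda pair: pair[0])
    if not timeline:
        return []
    first = timeline[0][0]
    return [(reason, (ts - first).total_seconds()) for ts, reason in timeline]


if __name__ == "__main__":
    for reason, offset in startup_timeline(SAMPLE_EVENTS, "automation-job-1234-abcde"):
        print(f"{offset:6.1f}s  {reason}")
```

To try it against a real cluster, save the output of `kubectl get events -n <namespace> -o json` to a file and pass its contents in place of `SAMPLE_EVENTS`. A large gap before `Scheduled` or around `Pulling`/`Pulled` would suggest where the startup time is going.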
No, there aren’t any project updates, and the image is already available, so there is no need to pull it. I have managed to help users understand the benefits of this new architecture, but the start time still seems too long for some cases.
I will check the logs while the job is executing to see whether they show any additional information.
In the meantime, do you have any recommendations about resources or execution environments that could help with performance? Is there any way to limit the resources that an execution environment uses when it is instantiated?