Hi,
After a recent upgrade to AWX 21.5.0, we have issues with jobs targeting more than one host.
For a few years we have run several AWX releases with a jumphost configured through inventory variables.
Everything was set up following an IBM post on configuring an SSH jump station with AWX:
https://github.com/IBM/IBMDeveloper-recipes/blob/main/multiple-jumphosts-in-ansible-tower-part-1/index.md
Until now it worked well.
Now we see several SSH timeouts, and I fixed most of them by forcing forks=1 in the global AWX job settings.
Is there something new in this release, compared to previous ones, regarding container isolation that could explain the different behaviour?
Regards,
Henri Chapelle
To better answer this question, it would help if you could tell us which version you were on previously.
-The AWX Team
Hi,
Indeed, I should have provided more information: we upgraded from 19.5 to 21.5.
We have used a jumphost since the start and have only had issues since 21.5.
Here is the procedure we used: https://github.com/IBM/IBMDeveloper-recipes/blob/main/multiple-jumphosts-in-ansible-tower-part-1/index.md
On previous AWX releases, the advice was to disable job isolation in order to use a jump station.
I don’t see that option anymore in the AWX configuration.
The issue we have had since 21.5 is that most hosts fail to connect, with a "banner exchange timeout" message.
Sometimes just restarting the failed job makes it run without issue, sometimes not.
I eventually set forks to 1, and most jobs now finish successfully.
I noticed yesterday that in a workflow template, if two branches start from the same node (and thus run in parallel), one branch exhibits the SSH timeout issue.
So in practice, we can only run one job against one host at a time when using the jumphost.
We tested running a job on multiple hosts against a few servers that are directly accessible, with the SSH jump disabled, and it worked.
So I don’t know whether this is expected AWX behaviour; our jump station is still used without issue by other applications.
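For reference, a jumphost defined through inventory variables generally looks something like the sketch below. This is an illustration only: `jumpuser` and `jumphost.example.com` are placeholders, and the linked recipe adds further options (credentials, ports) that are not reproduced here.

```yaml
# Inventory or group variables: illustrative sketch only.
# "jumpuser" and "jumphost.example.com" are hypothetical placeholders.
# ansible_ssh_common_args is appended to every ssh/scp/sftp call Ansible makes.
ansible_ssh_common_args: >-
  -o ProxyCommand="ssh -W %h:%p -q jumpuser@jumphost.example.com"
```

With a setup like this, each fork opens its own SSH connection through the jumphost, so a job with forks=N can create up to N simultaneous connections to it.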
Regards,
Henri Chapelle
With 21.x we are now using Receptor, which has the concept of a hop node. You would build a mesh between your AWX instance, the old “jumphost”, and a node in your managed network. The AWX instance would be a control node on the mesh, the jumphost would be a hop node, and the host inside the managed network would be an execution node. As a start, you may want to look at this blog: https://www.ansible.com/blog/peeling-back-the-layers-and-understanding-automation-mesh
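As a rough illustration of that topology (not taken from the blog post), the Receptor configuration on the hop and execution nodes might look something like the sketch below. The node names, host name, and peering direction are assumptions made for the example; in practice the configuration is normally generated by the AWX tooling rather than written by hand.

```yaml
# Sketch only: two separate receptor.conf files, one per host.
# Node IDs and host names are hypothetical placeholders.

# --- File 1: hop node (the former jumphost) ---
---
- node:
    id: hop01
- log-level: info
- tcp-listener:
    port: 27199          # port commonly used for Receptor mesh traffic

# --- File 2: execution node inside the managed network ---
---
- node:
    id: exec01
- log-level: info
- tcp-peer:
    address: hop01.example.com:27199   # assumed direction: execution node dials the hop node
- work-command:
    worktype: ansible-runner
    command: ansible-runner
    params: worker
    allowruntimeparams: true
```

The point of the layout is that playbooks run on the execution node, next to the managed hosts, so no per-host SSH session has to pass through the jumphost.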
-The AWX Team