I’m migrating my company’s Ansible playbooks from our current setup, a static VM running ansible-playbook against static flat files, to AWX. I’ve managed to get everything working, except that playbook execution is much slower on AWX.
For example, for a particular playbook running against 9 hosts, I’ve recorded these run times:
- CLI/ansible-playbook: 16m, 5m
- AWX: 41m, 25m
This will be a blocker for migration to AWX, as some of our deployments go to hundreds of hosts and take several hours on the command line already, and I can’t afford to spend 3x-5x as much time.
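For reference, the slowdown from the timings above works out as follows. This is a quick sketch that assumes the two runs pair up in the order listed (first CLI run with first AWX run, and so on); the variable names are only for illustration.

```python
# Run times in minutes, taken from the list above.
cli_minutes = [16, 5]
awx_minutes = [41, 25]

# Slowdown factor of AWX relative to the CLI for each paired run.
ratios = [awx / cli for awx, cli in zip(awx_minutes, cli_minutes)]

print([round(r, 1) for r in ratios])  # [2.6, 5.0]
```

So the observed slowdown is roughly 2.6x to 5x on this 9-host playbook.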
I’ve noticed a number of “fixed cost” drivers of this, e.g. in some cases, AWX has to spin up an automation-task on the k8s cluster, and sometimes a new k8s node is required to do so. But putting that aside, even execution within the tasks takes consistently longer.
My leading hypothesis is that AWX isn’t using persistent SSH connections correctly. We use a bastion host between Ansible and our devices. After a recent playbook run, I checked /var/log/auth.log on the end device and found:
- for CLI/ansible-playbook: 3 instances of “Accepted password for ”
- for AWX: 293 instances of “Accepted password for ”
- In both cases, on the bastion, I do see hundreds of “Accepted publickey for ” so it seems like the connection between the bastion and the end device is the difference.
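The counts above can be reproduced with a short script. This is a minimal sketch, assuming the usual sshd phrasing ("Accepted password for ...", "Accepted publickey for ..."); the function name `count_accepted` and the sample log lines are made up for illustration and are not taken from my real logs.

```python
from collections import Counter


def count_accepted(log_text: str) -> Counter:
    """Count sshd 'Accepted <method> for ...' lines, grouped by auth method."""
    counts = Counter()
    marker = "Accepted "
    for line in log_text.splitlines():
        idx = line.find(marker)
        if idx == -1:
            continue
        # The word right after 'Accepted ' is the auth method,
        # e.g. 'password' or 'publickey', followed by 'for'.
        rest = line[idx + len(marker):].split()
        if len(rest) >= 2 and rest[1] == "for":
            counts[rest[0]] += 1
    return counts


# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = """\
Jan  1 00:00:01 device sshd[100]: Accepted password for user from 10.0.0.1 port 50000 ssh2
Jan  1 00:00:02 device sshd[101]: Accepted publickey for user from 10.0.0.1 port 50001 ssh2
Jan  1 00:00:03 device sshd[102]: Accepted password for user from 10.0.0.1 port 50002 ssh2
Jan  1 00:00:04 device sshd[103]: Failed password for user from 10.0.0.1 port 50003 ssh2
"""

if __name__ == "__main__":
    print(count_accepted(SAMPLE_LOG))
```

Against a real log, the same function can be fed the file contents, e.g. `count_accepted(open("/var/log/auth.log").read())`, once after a CLI run and once after an AWX run.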
In my inventory, I have added the following:
ansible_ssh_common_args: '-o ProxyCommand="ssh -v -W %h:%p user@bastion.example.com -o StrictHostKeyChecking=no -o ControlMaster=auto -o ControlPersist=3600s"'
and when I run with verbose logs, I see this:
SSH: EXEC sshpass -d12 ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o 'User="user"' -o ConnectTimeout=10 -o 'ProxyCommand=ssh -v -W %h:%p user@bastion.example.com -o StrictHostKeyChecking=no -o ControlMaster=auto -o ControlPersist=3600s' -o 'ControlPath="/runner/cp/7fd6ab9754"' 10.0.0.999 '/bin/sh -c '"'"'echo ~user && sleep 0'"'"''
(IP, user, and hostname replaced with examples)
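To see which options end up on which hop, the logged command can be split with Python's `shlex`. This is only a sketch: `ssh_options` is a hypothetical helper that looks at `-o key=value` pairs and nothing else, and `EXEC_LINE` is the command from the log with plain ASCII quotes.

```python
import shlex

# The command from the verbose log, minus the "SSH: EXEC" prefix.
EXEC_LINE = (
    "sshpass -d12 ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s "
    "-o StrictHostKeyChecking=no -o 'User=\"user\"' -o ConnectTimeout=10 "
    "-o 'ProxyCommand=ssh -v -W %h:%p user@bastion.example.com "
    "-o StrictHostKeyChecking=no -o ControlMaster=auto -o ControlPersist=3600s' "
    "-o 'ControlPath=\"/runner/cp/7fd6ab9754\"' 10.0.0.999 "
    "'/bin/sh -c '\"'\"'echo ~user && sleep 0'\"'\"''"
)


def ssh_options(command: str) -> dict:
    """Return the top-level `-o key=value` options of an ssh command line."""
    tokens = shlex.split(command)
    options = {}
    # Walk adjacent token pairs and keep those of the form: -o key=value
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-o" and "=" in value:
            key, _, val = value.partition("=")
            options[key] = val.strip('"')
    return options


# Options on the outer ssh (Ansible -> end device, via the proxy).
outer = ssh_options(EXEC_LINE)
# Options on the inner ssh that ProxyCommand runs (Ansible -> bastion).
inner = ssh_options(outer["ProxyCommand"])

if __name__ == "__main__":
    print("outer ControlPersist:", outer.get("ControlPersist"))
    print("outer ControlPath:   ", outer.get("ControlPath"))
    print("inner ControlPersist:", inner.get("ControlPersist"))
    print("inner ControlPath:   ", inner.get("ControlPath"))
```

In this log, the `ControlPersist=3600s` from my inventory sits inside the ProxyCommand string, so it applies to the inner ssh to the bastion. The outer ssh to the device carries `ControlPersist=60s` and the ControlPath under /runner/cp.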
So it seems like AWX wants to persist the connection, but it doesn’t seem to be working. Are there other settings I need to tweak to get SSH persistence to work? Currently using AWX 22.3.0. I could upgrade if that helps, but didn’t see anything in the release notes about SSH or persistence.