Ansible SSH Proxyjump

Good day,

I’ve been using my MacBook Pro M1 2021 as my Ansible controller for a couple of years, with the SSH arguments specified like this:

ansible_ssh_common_args: '-o ProxyCommand="ssh -W %h:%p -q 192.168.10.10"'

And my ~/.ssh/config like this:

Host 192.168.10.10
  User myuser@domain.net
  IdentityFile ~/.ssh/id_rsa.pub

This way, the SSH key is used to authenticate to the jump host and ansible_password is used to authenticate to the Cisco network device. This setup has worked flawlessly.

But now, all of a sudden, it no longer seems to work.
With ansible-core 2.18.0, paramiko gives me this error:

fatal: [switch]: FAILED! => changed=false
  msg: Error reading SSH protocol banner[Errno 54] Connection reset by peer

Whilst libssh gives me this:

fatal: [switch]: FAILED! => changed=false
  msg: 'ssh connection failed: ssh connect failed: Timeout connecting to 10.10.13.37'

It seems that ansible_ssh_common_args is now ignored entirely for some reason. I’m also using a YubiKey, and it stays completely dead while I try to run something.

And yes, when I do the same with SSH directly, everything works as expected:

[user@MBP ~/Documents ]$ ssh -o 'ProxyCommand ssh -W %h:%p -q myuser@domain.net@192.168.10.10' ansible_user@"10.10.13.37"
(ansible_user@10.10.13.37) Password:
switch#

Has anyone experienced the same after upgrading either ansible-core or ansible.netcommon?
Unfortunately I haven’t kept good track of my versions, so I’m not sure exactly when this occurred :disguised_face:

Best regards.

Can you run Ansible with the -vvvv option? It will show you the exact SSH command Ansible uses, which gives you a way to reproduce the problem manually.

Does it work if you put all the configuration in your ~/.ssh/config, for example:

Host 192.168.10.10
  User myuser@domain.net
  IdentityFile ~/.ssh/id_rsa.pub

Host 10.10.13.37
  User ansible_user
  ProxyJump 192.168.10.10

I think my problem is that Ansible picks up the wrong ssh-agent.

Here’s what I have in my bashrc so that the right agent is used when I run ssh, but it seems that ansible/ansible.netcommon does not honour this:

export SSH_AUTH_SOCK=$(gpgconf --list-dirs agent-ssh-socket)
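One way to test the agent hypothesis is to compare the socket that the launching shell exports with the one gpg-agent reports, since Ansible inherits its environment from that shell. The following is only a sketch: agent_socket_status is a hypothetical helper name, not part of ssh, gpg, or Ansible, and the snippet assumes gpgconf is on the PATH.

```shell
# Classify the agent socket a child process such as ansible would inherit.
# agent_socket_status is a hypothetical helper, not part of any tool.
#   $1: value of SSH_AUTH_SOCK in this shell
#   $2: socket path reported by gpgconf
agent_socket_status() {
    if [ -z "$1" ]; then
        echo "unset"
    elif [ "$1" = "$2" ]; then
        echo "gpg-agent"
    else
        echo "other"
    fi
}

# Only query gpgconf when it is installed, so the snippet is safe to source anywhere.
if command -v gpgconf >/dev/null 2>&1; then
    agent_socket_status "${SSH_AUTH_SOCK:-}" "$(gpgconf --list-dirs agent-ssh-socket)"
fi
```

Run it in the same terminal that launches ansible. A result of "unset" or "other" would suggest that Ansible does not see the gpg-agent socket; running ssh-add -L in the same shell then lists the public keys the active agent offers.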

Thank you for your reply. I’ve tried this, but the result is still the same.
The thing is that my private key is stored on a YubiKey; when I ssh locally, it is invoked by using gpg as the ssh-agent. This has worked before, but it seems that at some point ansible.netcommon/paramiko stopped accepting this configuration.

I ran this command:

ansible switch -i inventory/ -m ios_ping -vvvv -a "dest=8.8.8.8"

This is the output:

The full traceback is:
  File "/Users/username/.ansible/collections/ansible_collections/cisco/ios/plugins/module_utils/network/ios/ios.py", line 60, in get_capabilities
    capabilities = Connection(module._socket_path).get_capabilities()
  File "/opt/homebrew/Cellar/ansible/11.0.0_1/libexec/lib/python3.13/site-packages/ansible/module_utils/connection.py", line 183, in __rpc__
    raise ConnectionError(to_text(msg, errors='surrogate_then_replace'), code=code)
switch | FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "afi": "ip",
            "count": null,
            "dest": "8.8.8.8",
            "df_bit": false,
            "egress": null,
            "ingress": null,
            "size": null,
            "source": null,
            "state": "present",
            "timeout": null,
            "vrf": null
        }
    },
    "msg": "Error reading SSH protocol banner[Errno 54] Connection reset by peer"
}

You could also have a look at assh, which is a really nice, compact toolkit for SSH that handles the connection setup for stuff like this (and a whole lot more)


Yes, but what precedes this line? I mean the blue lines where Ansible prints the exact SSH commands it is running. Try running some of them manually to understand what is happening.

Sorry, here is the entire thing:

ansible [core 2.18.1]
 config file = user/Documents/git/ansible-network-automation/ansible.cfg
 configured module search path = ['user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
 ansible python module location = /opt/homebrew/Cellar/ansible/11.1.0/libexec/lib/python3.13/site-packages/ansible
 ansible collection location = user/.ansible/collections:/usr/share/ansible/collections
 executable location = /opt/homebrew/bin/ansible
 python version = 3.13.1 (main, Dec  3 2024, 17:59:52) [Clang 16.0.0 (clang-1600.0.26.4)] (/opt/homebrew/Cellar/ansible/11.1.0/libexec/bin/python)
 jinja version = 3.1.4
 libyaml = True
Using user/Documents/git/ansible-network-automation/ansible.cfg as config file
setting up inventory plugins
Loading collection ansible.builtin from
host_list declined parsing {{ inventory_path }}/campus as it did not pass its verify_file() method
script declined parsing {{ inventory_path }}/campus as it did not pass its verify_file() method
auto declined parsing {{ inventory_path }}/campus as it did not pass its verify_file() method
Parsed {{ inventory_path }}/campus inventory source with ini plugin
setting up inventory plugins
host_list declined parsing {{ inventory_path }}/datacenter as it did not pass its verify_file() method
script declined parsing {{ inventory_path }}/datacenter as it did not pass its verify_file() method
auto declined parsing {{ inventory_path }}/datacenter as it did not pass its verify_file() method
Parsed {{ inventory_path }}/datacenter inventory source with ini plugin
setting up inventory plugins
host_list declined parsing {{ inventory_path }}/management as it did not pass its verify_file() method
script declined parsing {{ inventory_path }}/management as it did not pass its verify_file() method
auto declined parsing {{ inventory_path }}/management as it did not pass its verify_file() method
Parsed {{ inventory_path }}/management inventory source with ini plugin
setting up inventory plugins
host_list declined parsing {{ inventory_path }}/routers as it did not pass its verify_file() method
script declined parsing {{ inventory_path }}/routers as it did not pass its verify_file() method
auto declined parsing {{ inventory_path }}/routers as it did not pass its verify_file() method
Parsed {{ inventory_path }}/routers inventory source with ini plugin
redirecting (type: modules) ansible.builtin.ios_ping to cisco.ios.ios_ping
Loading collection cisco.ios from user/.ansible/collections/ansible_collections/cisco/ios
Loading callback plugin minimal of type stdout, v2.0 from /opt/homebrew/Cellar/ansible/11.1.0/libexec/lib/python3.13/site-packages/ansible/plugins/callback/minimal.py
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.
Trying secret FileVaultSecret(filename='user/Documents/git/ansible-network-automation/.secret/vault_pass.txt') for vault_id=default
Trying secret FileVaultSecret(filename='user/Documents/git/ansible-network-automation/.secret/vault_pass.txt') for vault_id=default
Trying secret FileVaultSecret(filename='user/Documents/git/ansible-network-automation/.secret/vault_pass.txt') for vault_id=default
Trying secret FileVaultSecret(filename='user/Documents/git/ansible-network-automation/.secret/vault_pass.txt') for vault_id=default
Loading collection ansible.netcommon from user/.ansible/collections/ansible_collections/ansible/netcommon
Loading collection ansible.utils from user/.ansible/collections/ansible_collections/ansible/utils
Trying secret FileVaultSecret(filename='user/Documents/git/ansible-network-automation/.secret/vault_pass.txt') for vault_id=default
Trying secret FileVaultSecret(filename='user/Documents/git/ansible-network-automation/.secret/vault_pass.txt') for vault_id=default
Trying secret FileVaultSecret(filename='user/Documents/git/ansible-network-automation/.secret/vault_pass.txt') for vault_id=default
redirecting (type: modules) ansible.builtin.ios_ping to cisco.ios.ios_ping
redirecting (type: action) ansible.builtin.ios to cisco.ios.ios
<192.168.10.10> Using network group action ios for ios_ping
redirecting (type: action) ansible.builtin.ios to cisco.ios.ios
<192.168.10.10> attempting to start connection
<192.168.10.10> using connection plugin ansible.netcommon.network_cli
<192.168.10.10> local domain socket does not exist, starting it
<192.168.10.10> control socket path is user/.ansible/pc/fee0104b9e
<192.168.10.10> Loading collection ansible.builtin from
<192.168.10.10> Loading collection ansible.netcommon from user/.ansible/collections/ansible_collections/ansible/netcommon
<192.168.10.10> Loading collection ansible.utils from user/.ansible/collections/ansible_collections/ansible/utils
<192.168.10.10> Loading collection cisco.ios from user/.ansible/collections/ansible_collections/cisco/ios
<192.168.10.10> local domain socket listeners started successfully
<192.168.10.10> loaded cliconf plugin ansible_collections.cisco.ios.plugins.cliconf.ios from path user/.ansible/collections/ansible_collections/cisco/ios/plugins/cliconf/ios.py for network_os cisco.ios.ios
<192.168.10.10> ssh type is set to paramiko
<192.168.10.10> Loading collection ansible.builtin from
<192.168.10.10> local domain socket path is user/.ansible/pc/fee0104b9e
redirecting (type: action) ansible.builtin.ios to cisco.ios.ios
<192.168.10.10> ANSIBLE_NETWORK_IMPORT_MODULES: enabled
redirecting (type: modules) ansible.builtin.ios_ping to cisco.ios.ios_ping
<192.168.10.10> ANSIBLE_NETWORK_IMPORT_MODULES: found ios_ping  at user/.ansible/collections/ansible_collections/cisco/ios/plugins/modules/ios_ping.py
<192.168.10.10> ANSIBLE_NETWORK_IMPORT_MODULES: running ios_ping
<192.168.10.10> ANSIBLE_NETWORK_IMPORT_MODULES: complete
The full traceback is:
 File "user/.ansible/collections/ansible_collections/cisco/ios/plugins/module_utils/network/ios/ios.py", line 60, in get_capabilities
   capabilities = Connection(module._socket_path).get_capabilities()
 File "/opt/homebrew/Cellar/ansible/11.1.0/libexec/lib/python3.13/site-packages/ansible/module_utils/connection.py", line 183, in __rpc__
   raise ConnectionError(to_text(msg, errors='surrogate_then_replace'), code=code)
switch | FAILED! => {
   "changed": false,
   "invocation": {
       "module_args": {
           "afi": "ip",
           "count": null,
           "dest": "8.8.8.8",
           "df_bit": false,
           "egress": null,
           "ingress": null,
           "size": null,
           "source": null,
           "state": "present",
           "timeout": null,
           "vrf": null
       }
   },
   "msg": "Error reading SSH protocol banner[Errno 54] Connection reset by peer"
}

Thanks!

Ah, this is using the network_cli connection plugin, not a standard SSH connection plugin, and it uses paramiko, so you don’t get any SSH client commands in the output. I’m not familiar with the network_cli connection plugin, so I’m not sure how to debug this. Maybe some network guy can help here :slight_smile: ?

This is an example of what I was hoping to see:

<192.168.10.10> ESTABLISH SSH CONNECTION FOR USER: root
<192.168.10.10> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o Port=22 -o 'IdentityFile="somekey"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="root"' -o ConnectTimeout=10 -o 'ControlPath="/home/bvitnik/.ansible/cp/ba7fe97e5a"' 192.168.10.10 '/bin/sh -c '"'"'/usr/bin/python3 && sleep 0'"'"''

Then it is easy to run the command manually to reproduce the problem.

Right, I’ve also tried ansible-pylibssh, as mentioned in the original post, but the problem is the same, just with different output.

I think the root cause is that I have defined gpg-agent as my ssh-agent locally; it looks up my private key, which is stored on a YubiKey. Ansible does not seem to honour this configuration, so everything times out while it tries to exchange keys with the proxy host.

That may be the case, but you said that this setup with the YubiKey worked for you before and is not working now with Ansible 2.18. What changed in the meantime? OS version? Ansible version? Some SSH and YubiKey related config?

That’s on me, but unfortunately I don’t really know exactly when this stopped working.

It was working one or two months ago at least; since then I’ve mainly run things from our AAP deployment and haven’t kept track of when this setup stopped working.

The ansible_ssh_common_args variable used to set the ssh_common_args option of the paramiko connection plugin, and that option was removed in 2.18. The same behaviour is still configurable through the proxy_command option; see the ansible.builtin.paramiko_ssh connection page (Run tasks via Python SSH (paramiko)) in the Ansible Community Documentation.


Adding this to ansible.cfg did the trick:

[paramiko_connection]
proxy_command = "ssh -W %h:%p -q 192.168.10.10"

I tried defining it in the inventory as a variable as well, but that does not seem to work.

Thanks for all the help!
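For a one-off test without editing ansible.cfg, the same option can probably be supplied through the environment. The sketch below assumes that ANSIBLE_PARAMIKO_PROXY_COMMAND, the environment variable documented for the paramiko_ssh plugin's proxy_command option, is also honoured when network_cli runs on top of paramiko. It reuses the jump host address and ad hoc command from this thread and has not been verified against this exact setup.

```shell
# Assumption: ANSIBLE_PARAMIKO_PROXY_COMMAND is the environment form of the
# [paramiko_connection] proxy_command setting shown above.
export ANSIBLE_PARAMIKO_PROXY_COMMAND='ssh -W %h:%p -q 192.168.10.10'

# Confirm what a child process such as ansible would inherit.
printenv ANSIBLE_PARAMIKO_PROXY_COMMAND

# Then rerun the ad hoc command from earlier in the thread in the same shell:
#   ansible switch -i inventory/ -m ios_ping -vvvv -a "dest=8.8.8.8"
```

The single quotes keep the shell from touching %h and %p, so the connection plugin receives them verbatim and can substitute the target host and port itself.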


You should also be able to configure this with the variable ansible_paramiko_proxy_command (I just tested it on devel and it works). If the variable contains a template, note that there was a bug that was recently fixed; the fix should be backported to 2.18 soon (ansible/ansible pull request #84439, "[2.18] fix reset_connection with templated connection variables (#84240)", by s-hertel). :crossed_fingers:
