ssh failure reporting between paramiko and ssh

I recently moved my ansible install from a CentOS 6 node to a CentOS 7 one. The ssh version there is pretty modern, so ansible in smart mode has switched from paramiko to ssh.

It's quicker now thanks to pipelined mode, but the reporting of ssh failures has lost useful information.

For example, with paramiko, you had:

FAILED => FAILED: Authentication failed.
… for nodes up, with ssh listening, but no valid auth for some reason

FAILED => FAILED: timed out
… for nodes down, no ssh response at all

FAILED => FAILED: [Errno -2] Name or service not known
… for nodes not even resolved in DNS

… and other possible messages.

Now with ssh, you only get:

FAILED => SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh

whatever the cause, the same message for all cases listed above.

I appreciate the increased speed, but sometimes I need to take different actions depending on the kind of failure. For example, if a node is up but someone has messed with the ssh keys, I must correct it (the first example). If it's simply down (the other cases), normally I can't do anything but wait for the user to power it up again.

Does anyone know whether something can be done about this, to get rich ssh error messages back instead of the generic one in ssh mode?
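To make the kind of filtering described above concrete, here is a minimal sketch that sorts the paramiko-style messages quoted earlier into categories. The category names and the functions `classify_failure` and `needs_attention` are hypothetical; only the message fragments come from the output shown above, and the sketch assumes one failure per line.

```python
import re

# Hypothetical category names; the fragments are the paramiko-era messages
# quoted above. Anything else (including the generic ssh-mode message)
# falls through to "unknown".
PATTERNS = [
    ("auth", re.compile(r"Authentication failed")),
    ("down", re.compile(r"timed out")),
    ("dns", re.compile(r"Name or service not known")),
]


def classify_failure(line):
    """Return a category for one 'FAILED => ...' line, or None if the
    line is not a failure report at all."""
    if "FAILED =>" not in line:
        return None
    for name, pattern in PATTERNS:
        if pattern.search(line):
            return name
    return "unknown"


def needs_attention(line):
    """Only auth failures need manual action; down or unresolved nodes
    are expected in this environment."""
    return classify_failure(line) == "auth"
```

With the generic ssh-mode message, every failure ends up as "unknown", which is exactly the loss of information described here.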

Run in -vvvv mode if you encounter a connectivity failure you do not understand and you’ll get more information/verbosity out of SSH.

For this I don't even need ansible: I can run ssh directly against the failing node, with as many -v flags as needed, to debug why it fails. If ssh works but ansible fails, then ansible's -vvvv flag helps, admittedly. But ansible's -vvvv output provides information only if there is some minimal communication with the remote sshd; if there is none, there is no output at all to debug.

But this is not the issue; I already understand the different kinds of failure I can find in my environment. What I need is for ansible to provide minimal usable feedback about them, as the paramiko-based connection did, so I can quickly filter out the failures that require attention in daily, routine operations. I think we can agree that -vvvv output, apart from being a lot more verbose, is not practically usable in scripts/filters to program actions depending on the kind of failure.

I deal with hundreds of nodes which dynamically go up and down (DHCP addressing, dynamic DNS self-update), so it's normal for some names to be on the network one day and missing the next. This does not require further action from my side. But if a rogue user messes with their node's configuration and takes away what allows my ansible station to connect to it via ssh (think authorized_keys), that node will go without updates until I correct this, no matter how long it is visible on the network.

Right now, the ssh connection mode does not distinguish in any way between ssh protocol failures (there is an sshd on the other side, but the connection fails to complete) and a total lack of ssh connection (you don't get any sshd answer at all, just a timeout, as when a node is down or is missing sshd). But paramiko did, in a way that a script or filter could easily detect and bring to my attention.

This is the point: ansible/ssh filters out relevant information about ssh connections that ansible/paramiko conveniently provided.
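As an illustration of what script-friendly feedback could look like outside ansible, here is a rough sketch that runs the plain ssh client non-interactively and classifies its stderr. The function names and categories are hypothetical, and the stderr fragments are the usual OpenSSH client messages as an assumption; they may differ between versions or locales. `BatchMode` and `ConnectTimeout` are standard ssh_config options.

```python
import subprocess

# Assumed OpenSSH client stderr fragments, mapped to hypothetical categories.
STDERR_HINTS = [
    ("Permission denied", "auth"),
    ("Host key verification failed", "hostkey"),
    ("Could not resolve hostname", "dns"),
    ("Connection timed out", "down"),
    ("Connection refused", "no_sshd"),
]


def classify_ssh_stderr(stderr):
    """Map ssh's stderr text to a coarse failure category."""
    for fragment, category in STDERR_HINTS:
        if fragment in stderr:
            return category
    return "unknown"


def check_host(host, timeout=10):
    """Run 'ssh host true' without prompts; return 'ok' or a category."""
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes",
         "-o", "ConnectTimeout=%d" % timeout, host, "true"],
        capture_output=True, text=True,
    )
    if result.returncode == 0:
        return "ok"
    return classify_ssh_stderr(result.stderr)
```

This runs one ssh process per node, so it gives up the speed advantage of ansible's ssh mode; it is only meant as a second pass over the nodes that ansible already reported as failed.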
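The distinction drawn above (an sshd answers but the connection fails, versus no answer at all) can be checked without authenticating, since an SSH server sends an identification string starting with "SSH-" right after the TCP connection opens. The following is a minimal sketch using only the Python standard library; `probe_ssh` and its return values are hypothetical names, and it looks only at the first resolved address.

```python
import socket


def probe_ssh(host, port=22, timeout=5.0):
    """Classify reachability without authenticating.

    Returns 'dns' if the name does not resolve, 'no_answer' if the TCP
    connection or the read fails (timeout, refused, unreachable),
    'sshd_present' if an SSH banner is received, 'not_ssh' otherwise.
    """
    try:
        addrinfo = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"
    family, socktype, proto, _, sockaddr = addrinfo[0]
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            banner = sock.recv(64)  # server identification string
    except OSError:  # includes socket.timeout and ConnectionRefusedError
        return "no_answer"
    return "sshd_present" if banner.startswith(b"SSH-") else "not_ssh"
```

A node that reports 'sshd_present' but still fails in ansible is a candidate for the "someone touched authorized_keys" case; 'dns' and 'no_answer' match the nodes that are simply off the network.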

On Friday, 18 July 2014 at 20:18:00 UTC+2, Michael DeHaan wrote: