Hanging on SSH connection

I have inherited the job of continuing with my organisation’s Ansible setup after our sys admin left. I am having a hell of a job adding new servers to the setup, as Ansible cannot seem to connect to any newly created servers.

In the below example, I have a brand new server running Ubuntu 19.04. Ansible asks me for the password as expected and I provide it, but nothing seems to happen.

ansible -vvv [IP] -m ping -u root --ask-pass

I get the following output from the command:

<[IP]> ESTABLISH SSH CONNECTION FOR USER: root
<[IP]> SSH: EXEC sshpass -d9 ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o User=root -o ConnectTimeout=10 -o ControlPath=/home/tom/.ansible/cp/2827bd34c0 [IP] '/bin/sh -c '"'"'echo ~root && sleep 0'"'"''

It will happily sit there for ages with nothing happening until I abort the command, and I am at a loss as to what to try next.

Are the servers using the same base setup as the old ones? If the other servers are not Ubuntu 19.04, try using the same OS as the older servers to see if there is an OS difference getting in your way.

Have you made any changes to the scripts? I.e., has anything changed in the setup? By the way - even if you are tempted to answer “no”, rest assured something has, or it would be working :)

Does a manual connection via ssh work?

The most common reason for an ssh connection hanging (rather than failing) is that the connection is to the wrong IP address. That can be the effect of a DNS issue, too. Second most common is a filter stopping ssh getting through. Third most common is that ssh is asking for input somewhere the user can’t see it.

ssh can ask for several kinds of input - not just passwords. It may ask you to confirm connection to a new host, confirm that you want to connect to a host whose key seems to have changed, or prompt you for a passphrase. The first two of those can be switched off with client-side options, as you have already done for one of them with StrictHostKeyChecking=no (it’s bad practice to do so, but in an automated setting perhaps necessary).

Regards, K.

1) It's best practice not to allow ssh to connect to root.

   admin.remote$ grep PermitRootLogin /etc/ssh/sshd_config
   PermitRootLogin no

2) Make sure admin@controller is able to connect to admin@remote. Best
   practice is to put public key into /home/admin/.ssh/authorized_keys.
   https://docs.ansible.com/ansible/latest/plugins/connection.html

   admin.controller$ ssh admin@remote

3) Try to ping remote

   admin.controller$ remote -m ping

4) Optionally use different user at remote

   admin.controller$ remote -m ping -u other-user

5) See how to escalate privilege
   https://docs.ansible.com/ansible/latest/user_guide/become.html#understanding-privilege-escalation

Cheers,

  -vlado

I have tried with brand new servers and a backup of one of the older servers. The only thing I haven’t done is take a fresh backup and try that, which I may try tomorrow. The OSs are different, as the old servers are running an out-of-date OS that I will have to upgrade at some point!

Ansible won’t even connect to the new servers outside of the script. All the old ones work fine. Unfortunately, if there is something I need to change to allow Ansible to work, the sys admin neglected to include it in the docs he wrote!

Manual connection via SSH does work.

I don’t think the IP is wrong as I have checked it. When the IP is wrong and it can’t actually connect, I get an error back saying it is unreachable. In this scenario it hangs and gives no feedback.

I’m aware it is not good practice but at this stage I am just trying to make it work at all. I will sort out security when it works!

Errata:

   admin.controller$ ansible remote -m ping

   admin.controller$ ansible remote -m ping -u other-user

Unreachability due to connection refused or unroutability will fail relatively quickly. Unreachability due to a filter dropping packets can hang for a very long time.
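One rough way to tell those two cases apart is to time a plain TCP connection attempt to the ssh port. The sketch below is only an illustration: the function names `classify_rc` and `probe_port` are made up for this post, and it assumes bash (for its /dev/tcp feature) and GNU coreutils `timeout` on the controller. A "filtered" result just means the attempt timed out, which can also happen if the host is down, and "closed" also covers DNS failures and routing errors.

```shell
# Map the exit status of the timed connection attempt to a label.
# 0   = the TCP connection was established
# 124 = GNU timeout killed the attempt (packets probably dropped)
# any other status = the attempt failed quickly (refused, unroutable, bad name)
classify_rc() {
    case "$1" in
        0)   echo "open" ;;
        124) echo "filtered" ;;
        *)   echo "closed" ;;
    esac
}

# probe_port HOST PORT [SECONDS]
# Tries to open a TCP connection using bash's /dev/tcp pseudo-device
# and gives up after SECONDS (default 5).
probe_port() {
    timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
    classify_rc "$?"
}
```

For example, `probe_port 192.0.2.10 22` (a placeholder address, substitute the new server's IP) prints a single word. Since a manual ssh already works here, I would expect "open", which would point away from a filter and towards the command Ansible itself runs.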

Since a manual ssh works, you know for a fact that the mechanics of it all are sound. Did you duplicate the command being used in Ansible? If not, that would be the next step.
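As a sketch of what duplicating it could look like, the function below rebuilds the command from the -vvv output earlier in the thread. Several things are my own assumptions: the name `run_like_ansible`, the `DRY_RUN` switch, adding ssh's own `-vvv` for more detail, and using `sshpass -e` (password taken from the SSHPASS environment variable) in place of the `-d9` file descriptor that Ansible sets up internally.

```shell
# run_like_ansible HOST CONTROL_PATH
# Rebuilds the ssh invocation shown in Ansible's -vvv output.
# With DRY_RUN=1 it prints one argument per line instead of running anything.
run_like_ansible() {
    host=$1
    control_path=$2
    # sshpass -e reads the password from $SSHPASS (assumption: stands in
    # for Ansible's "sshpass -d9"); ssh -vvv is added for extra debug output.
    set -- sshpass -e ssh -vvv -C \
        -o ControlMaster=auto -o ControlPersist=60s \
        -o StrictHostKeyChecking=no -o User=root -o ConnectTimeout=10 \
        -o "ControlPath=$control_path" \
        "$host" "/bin/sh -c 'echo ~root && sleep 0'"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$@"
    else
        "$@"
    fi
}
```

Something like `SSHPASS='...' run_like_ansible [IP] /home/tom/.ansible/cp/2827bd34c0` should then behave much as Ansible does. If it hangs too, drop the options one at a time (the ControlMaster/ControlPath pair and sshpass are the first things I would try removing) until it behaves like your working manual ssh.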

Is the command being used to access the new servers literally character for character identical in every respect with the command being used to access the old servers?

Any commonality between the older servers, not shared by the new ones, is a possible culprit.

Regards, K.

What's the problem with 3) ?