I have inherited the job of continuing with my organisation’s Ansible setup after our sysadmin left. I am having a hell of a job adding new servers to the setup, as Ansible cannot seem to connect to any newly created servers.
In the below example, I have a brand new server running Ubuntu 19.04. Ansible asks me for the password as expected and I provide it, but nothing seems to happen.
Are the servers using the same base setup as the old ones? If the other servers are not Ubuntu 19.04, try using the same OS as the older servers to see if there is an OS difference getting in your way.
Have you made any changes to the scripts? I.e., has anything changed in the setup? By the way - even if you are tempted to answer “no”, rest assured something has, or it would be working.
Does a manual connection via ssh work?
The most common reason for an ssh connection hanging (rather than failing) is that the connection is to the wrong IP address, which can itself be the result of a DNS issue. Second most common is a filter stopping ssh getting through. Third most common is that ssh is asking for input somewhere the user can’t see it.
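To rule out the wrong-address case, it can help to see what the name actually resolves to from the machine that runs Ansible. A minimal Python sketch using only the standard library; the function name is made up for illustration:

```python
import socket

def resolve_all(hostname, port=22):
    """Return the sorted, de-duplicated IP addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Call it with whatever name your inventory uses for the new server and compare the result with the address you expect. If they differ, the hang is probably a DNS or hosts-file problem rather than an ssh one.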
ssh can ask for several kinds of input - not just passwords. It may ask you to confirm connection to a new host, confirm that you want to connect to a host whose key seems to have changed, or prompt you for a passphrase. The first two of those can be switched off with client-side options, as you have done for one of them with StrictHostKeyChecking=no (it’s bad practice to do so, but in an automated setting perhaps necessary).
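One way to flush out a hidden prompt is to run ssh with BatchMode=yes, which makes it fail instead of waiting for interactive input. A rough Python sketch follows; the helper names and the default timeout are my own choices, not anything from your setup. If the server only accepts password logins, expect a "Permission denied" here. That is normal, and the useful part is whether a different message (such as a host key failure) shows up first.

```python
import subprocess

def batch_ssh_command(host, user=None, connect_timeout=10):
    """Build an ssh command line that fails rather than prompting."""
    target = f"{user}@{host}" if user else host
    return [
        "ssh",
        "-o", "BatchMode=yes",                      # never ask for passwords or confirmations
        "-o", f"ConnectTimeout={connect_timeout}",  # give up on a silent network after N seconds
        target,
        "true",                                     # trivial remote command
    ]

def probe(host, user=None):
    """Run the command and return (exit status, stderr text)."""
    result = subprocess.run(
        batch_ssh_command(host, user), capture_output=True, text=True
    )
    return result.returncode, result.stderr.strip()
```

Whatever ssh prints to stderr is the prompt or error it would otherwise have been sitting on.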
I have tried with brand new servers and a backup of one of the older servers. The only thing I haven’t done is take a fresh backup and try that, which I may try tomorrow. The OSs are different, as the old servers are running an out-of-date OS that I will have to upgrade at some point!
Ansible won’t even connect to the new servers outside of the script. All the old ones work fine. Unfortunately, if there is something I need to change to allow Ansible to work, the sysadmin neglected to include it in the docs he wrote!
Manual connection via SSH does work.
I don’t think the IP is wrong as I have checked it. When the IP is wrong and it can’t actually connect, I get an error back saying it is unreachable. In this scenario it hangs and gives no feedback.
Unreachability due to connection refused or unroutability will fail relatively quickly. Unreachability due to a filter dropping packets can hang for a very long time.
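To tell those cases apart from the Ansible machine, a plain TCP probe with a timeout is enough. A small Python sketch, with an arbitrary function name and default timeout:

```python
import socket

def probe_port(host, port=22, timeout=5.0):
    """Classify a TCP connection attempt as open, refused, timed out, or other error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: typical of a filter dropping packets.
        return "timed out"
    except OSError as exc:
        # Anything else, e.g. no route to host or a name that does not resolve.
        return f"error: {exc}"
```

"refused" and most "error" results come back almost immediately. "timed out" is the slow case, and it points at a firewall or security group rather than at ssh itself.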
Since a manual ssh works, you know for a fact that the mechanics of it all are sound. Did you duplicate the command being used in Ansible? If not, that would be the next step.
Is the command being used to access the new servers literally character for character identical in every respect with the command being used to access the old servers?
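If you run Ansible with -vvv it should print the ssh command it executes for each host, so you can capture one line for an old server and one for a new server and compare them. Doing that by eye is error-prone, so here is a small Python sketch that compares two command lines token by token. The function name is made up for illustration.

```python
import difflib
import shlex

def diff_commands(old_cmd, new_cmd):
    """Return the tokens that differ between two shell command lines.

    Lines starting with "- " appear only in old_cmd; "+ " only in new_cmd.
    """
    old_tokens = shlex.split(old_cmd)
    new_tokens = shlex.split(new_cmd)
    return [
        line
        for line in difflib.ndiff(old_tokens, new_tokens)
        if line.startswith(("- ", "+ "))
    ]
```

An empty list means the two commands are the same once split into arguments. Anything else is a lead: a different user, port, key file or option.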
Any commonality between the older servers, not shared by the new ones, is a possible culprit.