this is preventing me from deploying so it’s kind of a serious and time-sensitive issue…
i’m in China and the internet connection out of the country (to where our servers are) is unstable.
i’ve been trying to run a playbook with around 12 tasks for an hour or two now, and the whole thing seems to fail if it can’t connect even once:
UNREACHABLE! => {"changed": false, "msg": "SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh", "unreachable": true}
it looks like ansible is creating a new ssh connection before each task, and it seems to give up completely if it fails to open a connection at any time. this makes it essentially impossible to get through the playbook.
i can hold an ssh terminal open with the server no problem… what’s going on here? is ansible creating a new connection for every task? can i make it just hold one connection open and use that? can i make it retry a few times if it can’t connect?
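One way to approach both questions is through the SSH options Ansible passes to OpenSSH. The sketch below is a suggestion, not something from the original poster: it builds an environment that asks ssh to keep one multiplexed connection open (`ControlMaster`/`ControlPersist`), send keepalives, and retry failed connections. It assumes OpenSSH on the control machine and an Ansible version that honours the `ANSIBLE_SSH_ARGS` and `ANSIBLE_SSH_RETRIES` environment variables. The helper name `resilient_env` and the specific timeout values are hypothetical.

```python
import os


def resilient_env(base=None, persist="30m", retries=5):
    """Return a copy of the environment with SSH settings for a flaky link.

    All values here are illustrative guesses, not tuned recommendations.
    """
    env = dict(os.environ if base is None else base)
    # Reuse a single multiplexed SSH connection and keep it alive between tasks.
    env["ANSIBLE_SSH_ARGS"] = (
        "-o ControlMaster=auto "
        f"-o ControlPersist={persist} "
        "-o ServerAliveInterval=15 "
        "-o ServerAliveCountMax=4"
    )
    # Ask the ssh connection plugin to retry a failed connection attempt.
    env["ANSIBLE_SSH_RETRIES"] = str(retries)
    return env
```

It could then be passed to something like `subprocess.run(["ansible-playbook", "site.yml"], env=resilient_env())`, where `site.yml` stands in for the real playbook. Note that setting `ANSIBLE_SSH_ARGS` replaces whatever ssh arguments Ansible would otherwise use.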
that is basically what i ended up doing. it makes the workflow a pain in the ass because i have to make changes to the playbooks on my local machine (connection is too slow and unstable to do much editing on the server), commit, push, and pull on the server, but then running locally works just fine.
whether it’s reusing connections or not, the main issue seems to be that it fails very easily if the connection shakes or dies. the value of software like ansible in my opinion is the declarative nature (this should be in the state), which makes the playbooks idempotent and allows for retries.
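Since idempotent playbooks are safe to re-run, a crude workaround is to wrap the whole run in a retry loop. This is a minimal sketch under that assumption. The function name `run_with_retries` and its parameters are made up for illustration, and it treats any non-zero exit code as "try again" rather than inspecting why the run failed.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=5, delay=10.0,
                     runner=subprocess.call, sleep=time.sleep):
    """Re-run cmd until it exits 0 or attempts run out; return the last exit code."""
    code = 1
    for attempt in range(1, attempts + 1):
        code = runner(cmd)
        if code == 0:
            return 0
        if attempt < attempts:
            # Linear backoff: wait a little longer after each failure.
            sleep(delay * attempt)
    return code
```

For example, `run_with_retries(["ansible-playbook", "site.yml"])` would re-run a hypothetical `site.yml` up to five times. This only makes sense if every task really is idempotent, because earlier tasks get executed again on each attempt.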
mosh http://mosh.mit.edu/ is an excellent replacement for the ssh
connection from your workstation to the server, to enable editing in
situ. It uses UDP and a predictive transfer mode, so it works very well
over sluggish connections. I don't think it would be suitable as a
replacement for ssh within ansible, but it might cut down your
editing hops at least.
I’ve been toying with a ‘mosh connection plugin’ but that is still in vaporware stage.
For cases like these you might want to look into ansible-pull which will only require an initial access to the ‘git server’ from the machines and then all the tasks run locally, making it much more resilient when connectivity is an issue.
Step 1: Set up your Ansible Control Machine outside of the country closer to your target machines.
Step 2: Learn Tmux or Screen
Step 3: Run ‘ansible-playbook’ in Tmux or Screen on your new Control Machine.
yup, that’s where i’m at. which is pretty much the same place i was at when i admin’d a cluster eight years ago.
i like ansible as an idempotent library for scripting (way better than the shell scripts we used to use), but it was worth a try to be able to fiddle with something on my dev machine and run a single local command that would take care of all the hosts and ssh’ing and everything and just get the stuff where it needs to be.