We’re having a problem when accelerated mode unexpectedly disconnects for whatever reason. This happens often when running test playbooks from poor connections, e.g. in-flight wifi. It can also happen when running a big playbook and you just hit the dropped-connection lottery. In either case, the host fails out saying “unable to connect to port 5099”. In future runs, it will continue to refuse the accelerated port until the hanging python daemon has been killed manually.
Thus far the only way I’ve found to kill the daemon is to killall python (I haven’t discovered a way of definitively identifying its PID). I can run something like
ansible GROUP -m command -a "killall python"
but this has the unwanted side effect of killing all other python processes on the system.
Advice?
All the best,
~ Christopher
“This happens often when running test playbooks from poor connections, e.g. in-flight wifi.”
I’m pretty sure almost nothing works over in-flight WiFi.
“In future runs, it will continue to refuse the accelerated port until the hanging python daemon has been killed manually.”
This part seems more interesting and I haven’t seen this. If you can find a way to replicate the problem without an airplane, that would be helpful and we could take a look.
We used to see this as well on occasion before we ditched accelerate for
ssh_alt. Despite our best efforts we were unable to find a way to
reliably reproduce it. The only solution was logging onto the box and killing
python (or waiting for the timeout). I believe it had something to do with
not having the key from the previous accelerate session while the previous
session was still waiting for tasks.
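For anyone who wants to avoid a blanket killall, here is a minimal sketch of one way to pin down the PID: look up which process is still listening on the accelerate port. This is not part of Ansible; the function names are made up for illustration. It assumes a Linux host with /proc, that the hung daemon is still bound to port 5099 (the port from the error message), and that it is run as root so other processes' file descriptors are readable.

```python
import os

LISTEN_STATE = "0A"  # value of the "st" column for a listening TCP socket


def listening_inodes(proc_net_tcp_text, port):
    """Return the socket inodes of LISTEN sockets bound to `port`.

    `proc_net_tcp_text` is the content of /proc/net/tcp or /proc/net/tcp6.
    """
    inodes = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10:
            continue
        # local_address is HEXADDR:HEXPORT; only the port matters here
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if fields[3] == LISTEN_STATE and local_port == port:
            inodes.add(fields[9])
    return inodes


def pids_holding(inodes):
    """Return PIDs that have an open fd on any of the given socket inodes."""
    targets = {"socket:[%s]" % inode for inode in inodes}
    pids = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        fd_dir = "/proc/%s/fd" % entry
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) in targets:
                    pids.add(int(entry))
            except OSError:
                continue  # fd closed between listdir and readlink
    return pids


def find_listener_pids(port):
    """Return the PIDs listening on `port` over IPv4 or IPv6."""
    inodes = set()
    for table in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(table) as handle:
                inodes |= listening_inodes(handle.read(), port)
        except OSError:
            continue  # table missing, e.g. IPv6 disabled
    return pids_holding(inodes) if inodes else set()
```

Calling find_listener_pids(5099) on the affected host should return only the process or processes holding that port, which can then be killed individually while other Python processes are left alone.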
Yes, if you are in a rekey situation that could very well be it.
We have an open feature idea for making each new connection attempt “add” a new key, which would resolve that one particular issue better.
Pipelining is going to be better in most cases. (BTW, the new ssh_alt isn’t actually named ssh_alt; it is enabled by setting pipelining=True in ansible.cfg.)
I’m about to adapt the docs on accelerate to strongly emphasize this.
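For reference, a minimal sketch of that setting. The thread only says pipelining=True in ansible.cfg; placing it under the [ssh_connection] section is my assumption based on where the option is documented, so check the docs for your version.

```ini
[ssh_connection]
# Send modules over the existing SSH session instead of copying them to the host first
pipelining = True
```

If you use sudo on the managed hosts, pipelining generally needs requiretty to be disabled in their sudoers configuration.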