Long-Running Commands Fail

Art_Zemon · May 11, 2012, 12:59pm

I’m stuck on getting a long-running command to work.

Running a devel from yesterday morning

commit b90c2356c38685b969c6dc42e4f0cee583fdf431
Merge: 461ba57 c362a2e
Author: Michael DeHaan <michael.dehaan@gmail.com>
Date: Thu May 10 05:05:50 2012 -0700

I am trying to run the Plesk installer, which takes about 8 or 9 minutes to complete. I ran it with the shell module so I could redirect the roughly 300 KB of output to a file, and I even tried async:1200 with poll:5, but got nowhere. Run synchronously, the playbook gets stuck: it never returns an error and never moves on to the next task. Run with async:1200 and poll:5, the playbook counts down to 880 seconds remaining and then gets stuck.

Ideas?
– Art Z.

Michael_DeHaan · May 11, 2012, 1:17pm

Sounds like your plesk installer went interactive or is otherwise fouled up to me.

Art_Zemon · May 11, 2012, 2:14pm

Michael,

Nope; it does not go interactive. It just runs a long time. I even
tested once more just to be sure. Everything is running as expected (and
as we have been seeing it work for the last couple of years).

This is a 100% test machine. If you like, I can install your ssh public
key in root's .ssh/authorized_keys file and send you the playbook and
let you give it a try. Even better, it is a cloud server so I can
re-initialize it from an image in about 2 minutes, which makes repeated
testing reasonably time-efficient.

-- Art Z.

Michael_DeHaan · May 11, 2012, 2:21pm

>> Sounds like your plesk installer went interactive or is otherwise
>> fouled up to me.

> Michael,

> Nope; it does not go interactive. It just runs a long time. I even
> tested once more just to be sure. Everything is running as expected (and
> as we have been seeing it work for the last couple of years).

Hmm…. I’m still suspicious.

There isn’t a command timeout anymore.

Perhaps the lack of output for some time caused a general timeout in sshd?

I would be curious whether the same problems exist on the master branch (with sudo mode disabled, because the master branch's non-sudo code is a LOT different).

Anyway, async completely daemonizes the process, so it should not behave the way a non-async'd process does in this respect.
A lack of returned output shouldn't be an issue there.

> This is a 100% test machine. If you like, I can install your ssh public
> key in root’s .ssh/authorized_keys file and send you the playbook and
> let you give it a try. Even better, it is a cloud server so I can
> re-initialize it from an image in about 2 minutes, which makes repeated
> testing reasonably time-efficient.

Yeah, I am not going to do this for various (time/legal) reasons but thanks very much for the offer.

I would recommend trying to debug further.

John_Kleint · May 11, 2012, 2:57pm

I've seen this before, and I haven't fully grokked it, but I think the
problem is lots of output, not long-running commands. Try:

ansible machine -a 'cat /usr/share/dict/words'

which hangs, vs.

ansible machine -a 'cat /etc/issue'

which does not.

The hang is in runner._exec_command() reading the stderr "file" from
Paramiko. I think the problem is that there isn't really a stderr
(since we're using a pty), so lots of stdout fills up the channel's
buffer and blocks stderr waiting for you to read stdout. Not sure if
that's an issue with Paramiko or not, but I have a one-line fix: just
make "stderr" an empty string.

https://github.com/ansible/ansible/pull/365

The rest of the code seems prepared to handle this, and the tests
pass.
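
For anyone who wants to see the general failure mode without Ansible or Paramiko, here is a rough local analogue using only Python's standard subprocess module. It is a sketch of the same kind of deadlock (blocking on stderr while unread stdout backs up), not the code from the pull request, and it uses ordinary pipes rather than a pty. The names NOISY_CHILD, stderr_first_hangs, and drain_both are made up for this illustration, and it assumes the OS pipe buffer is much smaller than 1 MB.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for a chatty remote command: roughly 1 MB on stdout
# and nothing on stderr. Assumes the OS pipe buffer is much smaller than that.
NOISY_CHILD = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1000000)"]


def stderr_first_hangs(timeout=2.0):
    """Read stderr before stdout; return True if that read is still blocked
    after `timeout` seconds."""
    proc = subprocess.Popen(NOISY_CHILD, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # stderr only reaches EOF when the child exits, but the child cannot exit
    # while it is blocked writing to a full, unread stdout pipe.
    reader = threading.Thread(target=proc.stderr.read, daemon=True)
    reader.start()
    reader.join(timeout)
    hung = reader.is_alive()
    proc.kill()        # end the stuck child so its pipes close
    reader.join()      # the stderr read now sees EOF and returns
    proc.stdout.close()
    proc.stderr.close()
    proc.wait()
    return hung


def drain_both():
    """Let communicate() service stdout and stderr together so neither backs up."""
    proc = subprocess.Popen(NOISY_CHILD, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate(timeout=30)
    return len(out), len(err)


if __name__ == "__main__":
    print("stderr-first read blocked:", stderr_first_hangs())
    print("bytes via communicate():", drain_both())
```

The first function mirrors the 'cat /usr/share/dict/words' case: plenty of stdout, nobody reading it. The second shows one common way around it, which is to read both streams together instead of one after the other.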

-John

Matt_Coddington · May 11, 2012, 3:54pm

FWIW, I've seen this behavior as well (devel from yesterday). I had
mistakenly coded "tar xzvf" instead of "tar xzf". The verbose one
hung.

I just verified that jkleint's patch fixes this for me.

matt

Michael_DeHaan · May 11, 2012, 4:02pm

Excellent, thx guys, will merge shortly.

-- Michael

Art_Zemon · May 11, 2012, 4:11pm

Great news, guys. I will be patient and try again when you get the patch
merged and pushed instead of continuing to look for a workaround.

Cheers,
-- Art Z.

Michael_DeHaan · May 11, 2012, 4:52pm

merged!

Art_Zemon · May 11, 2012, 6:24pm

Tested. Success! Thank you, gentlemen!

-- Art Z.
