Debugging service module

I’m running ansible 1.5.5 against a CentOS 6 server, and trying to use the “service” module to manage an upstart job. However, I keep getting this error from ansible:

$ ansible bastion3.hadoop.ripe.net -i svn/gii/ansible_hosts -sK -m service -a 'name=hdfs-sync state=stopped'
sudo password:
bastion3.hadoop.ripe.net | FAILED => failed to parse:
SUDO-SUCCESS-mwcdqmpymyognbdmbrqkqbkccfdmzwqs
Traceback (most recent call last):
  File "/tmp/ansible-tmp-1399466909.49-7501828270786/service", line 2305, in <module>
    main()
  File "/tmp/ansible-tmp-1399466909.49-7501828270786/service", line 1170, in main
    service.get_service_status()
  File "/tmp/ansible-tmp-1399466909.49-7501828270786/service", line 480, in get_service_status
    rc, status_stdout, status_stderr = self.service_control()
  File "/tmp/ansible-tmp-1399466909.49-7501828270786/service", line 722, in service_control
    rc_state, stdout, stderr = self.execute_command("%s %s %s" % (self.action, self.name, arguments), daemonize=True)
  File "/tmp/ansible-tmp-1399466909.49-7501828270786/service", line 250, in execute_command
    return json.loads(data)
  File "/usr/lib64/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.6/json/decoder.py", line 338, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

If I log into the server and run “stop hdfs-sync”, “status hdfs-sync” or “start hdfs-sync”, it all works. My ansible setup also works against other CentOS 6 boxes, and it can start, stop and restart other upstart jobs just fine. So this is a weird case. How can I debug this further and find out why ansible is failing with this specific upstart job on this server?

Anand

You could set an environment variable:

ANSIBLE_KEEP_REMOTE_FILES=1

and run ansible again. Once it fails, see what files were executed, log into the remote host and run the failed python script with:

python -m trace --trace script.py
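
The first step above can also be scripted. This is a minimal sketch, assuming ansible is on the PATH of the control machine; the helper names `keep_remote_files_env` and `run_ansible` are made up for illustration and are not part of ansible.

```python
import os
import subprocess


def keep_remote_files_env(base_env=None):
    """Return a copy of the environment with ANSIBLE_KEEP_REMOTE_FILES=1 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["ANSIBLE_KEEP_REMOTE_FILES"] = "1"
    return env


def run_ansible(args, runner=subprocess.call):
    """Run `ansible` with the given arguments while keeping remote temp files.

    `runner` is injectable (hypothetical convenience) so the command line can
    be inspected without actually executing ansible.
    """
    return runner(["ansible"] + list(args), env=keep_remote_files_env())
```

Calling `run_ansible(["bastion3.hadoop.ripe.net", "-sK", "-m", "service", "-a", "name=hdfs-sync state=stopped"])` would then leave the generated module under the remote temporary directory, where it can be run with `python -m trace --trace` as described.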

Thanks for this tip. I did exactly as you describe, and ran the script on the host. To my surprise, it did what it was supposed to! The last few lines from the trace output show:

 --- modulename: encoder, funcname: _iterencode_dict
encoder.py(281): if markers is not None:
encoder.py(282): del markers[markerid]
encoder.py(368): return ''.join(chunks)
{"state": "stopped", "changed": true, "name": "hdfs-sync"}
service(2146): sys.exit(0)

The service stopped running, so the script did its work. I don’t understand, then, why I get an error from ansible on my laptop. Any ideas, developers?

Anand

Actually, this looks like a bug that occurred with script and raw in some earlier versions of Ansible, where the sudo success string was being leaked back. It was fixed, but I’m now wondering whether the fix was reverted in this case…

Can you upgrade to 1.6.1 and try it again?

Adam

Hi Adam, it still fails:

$ ansible --version
ansible 1.6.1
$ ansible bastion3.hadoop.ripe.net -sK -m service -a 'name=hdfs-sync state=restarted'
sudo password:
bastion3.hadoop.ripe.net | FAILED => failed to parse:
SUDO-SUCCESS-dawyrikofupcjoyftolafyddywphopwj
Traceback (most recent call last):
  File "/tmp/ansible-tmp-1399582444.03-10583586416099/service", line 2411, in <module>
    main()
  File "/tmp/ansible-tmp-1399582444.03-10583586416099/service", line 1198, in main
    service.get_service_status()
  File "/tmp/ansible-tmp-1399582444.03-10583586416099/service", line 509, in get_service_status
    rc, status_stdout, status_stderr = self.service_control()
  File "/tmp/ansible-tmp-1399582444.03-10583586416099/service", line 750, in service_control
    rc_state, stdout, stderr = self.execute_command("%s %s %s" % (self.action, self.name, arguments), daemonize=True)
  File "/tmp/ansible-tmp-1399582444.03-10583586416099/service", line 250, in execute_command
    return json.loads(data)
  File "/usr/lib64/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.6/json/decoder.py", line 338, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

There is an open bug report https://github.com/ansible/ansible/issues/7319 which sounds very similar…

Do you need to provide a password to use sudo on the remote host? Do you need to use sudo? I’m curious because that first line SUDO-SUCCESS … should be being eaten by part of ansible…

Adam
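
To make the point about the SUDO-SUCCESS line concrete, here is a rough sketch of what "eating" that marker before parsing could look like. It is an illustration only, not ansible's actual code; `SUCCESS_PREFIX`, `strip_sudo_marker` and `parse_module_output` are hypothetical names.

```python
import json

# Assumption: the marker always starts with this literal prefix, as in the
# output pasted in this thread.
SUCCESS_PREFIX = "SUDO-SUCCESS-"


def strip_sudo_marker(output):
    """Drop leading blank lines and SUDO-SUCCESS-<key> lines from the output."""
    lines = output.splitlines()
    while lines and (not lines[0].strip() or lines[0].startswith(SUCCESS_PREFIX)):
        lines.pop(0)
    return "\n".join(lines)


def parse_module_output(output):
    """Parse the JSON a module printed, ignoring a leading sudo marker."""
    return json.loads(strip_sudo_marker(output))
```

Note that in the output pasted in this thread, whatever follows the marker is a Python traceback rather than JSON, so parsing would still fail after stripping. That suggests the marker itself may not be the root cause here.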

Hi Adam,

> There is an open bug report https://github.com/ansible/ansible/issues/7319 which sounds very similar…

Yes, this bug report sounds a lot like what I’m experiencing.

> Do you need to provide a password to use sudo on the remote host? Do you need to use sudo? I’m curious because that first line SUDO-SUCCESS … should be being eaten by part of ansible…

On the remote host, I need to use sudo to run commands as root, and I need to provide a password.

This isn't an upstart script, is it?

I saw something very similar when I tried to set enabled=no on an upstart-managed service on CentOS a few weeks back. Removing that clause made it work.

This is a traceback in the service module, certainly.

Please be sure there’s a ticket open and if you don’t have the same issue, file a new ticket, and we’ll look into this promptly.

We take the position that a traceback is a bug in nearly all cases; modules should return reasonable errors if they ever have to fail.
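
As an illustration of "return a reasonable error instead of a traceback", the `json.loads(data)` call from the traceback could be guarded roughly as follows. This is a sketch only, not the service module's actual code or the eventual fix; `load_child_result` is a hypothetical name, and the `failed`/`msg` keys follow the shape ansible modules conventionally use to report failure.

```python
import json


def load_child_result(data):
    """Decode JSON read back from a child process, or describe the failure.

    Returns the decoded object on success. On undecodable input it returns a
    failure dictionary instead of letting ValueError escape as a traceback.
    """
    try:
        return json.loads(data)
    except ValueError as exc:
        return {
            "failed": True,
            "msg": "could not decode JSON from child process: %s" % exc,
            "raw_output": data,
        }
```

With this, empty or garbled output from the child produces a readable failure message that includes the raw data, which would make a case like this one easier to diagnose.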

Hi Dick,

> This isn’t an upstart script, is it?
>
> I saw something very very similar if I tried to set enabled=no on an upstart-managed service on CentOS a few weeks back. Removing that clause made it work.

Yes, as my original message said, this is an upstart script.

Anand

Hi Michael,

> This is a traceback in the service module, certainly.
>
> Please be sure there’s a ticket open and if you don’t have the same issue, file a new ticket, and we’ll look into this promptly.

I believe my issue is very similar to, and possibly the same as, issue #7319.