Debugging service module

Anand_Buddhdev2 · May 7, 2014, 12:52pm

I’m running ansible 1.5.5 against a CentOS 6 server, and trying to use the “service” module to manage an upstart job. However, I keep getting this error from ansible:

$ ansible bastion3.hadoop.ripe.net -i svn/gii/ansible_hosts -sK -m service -a 'name=hdfs-sync state=stopped'
sudo password:
bastion3.hadoop.ripe.net | FAILED => failed to parse:
SUDO-SUCCESS-mwcdqmpymyognbdmbrqkqbkccfdmzwqs
Traceback (most recent call last):
  File "/tmp/ansible-tmp-1399466909.49-7501828270786/service", line 2305, in <module>
    main()
  File "/tmp/ansible-tmp-1399466909.49-7501828270786/service", line 1170, in main
    service.get_service_status()
  File "/tmp/ansible-tmp-1399466909.49-7501828270786/service", line 480, in get_service_status
    rc, status_stdout, status_stderr = self.service_control()
  File "/tmp/ansible-tmp-1399466909.49-7501828270786/service", line 722, in service_control
    rc_state, stdout, stderr = self.execute_command("%s %s %s" % (self.action, self.name, arguments), daemonize=True)
  File "/tmp/ansible-tmp-1399466909.49-7501828270786/service", line 250, in execute_command
    return json.loads(data)
  File "/usr/lib64/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.6/json/decoder.py", line 338, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

If I log into the server and run “stop hdfs-sync”, “status hdfs-sync” or “start hdfs-sync”, it all works. My ansible setup also works against other CentOS 6 boxes, and can start, stop and restart other upstart jobs there just fine. So this is a weird case. How can I debug this further and find out why ansible is failing with this specific upstart job on this server?

Anand

Strahinja_Kustudic2 · May 7, 2014, 10:57pm

You could set an environment variable:

ANSIBLE_KEEP_REMOTE_FILES=1

and run ansible again. Once it fails, see what files were executed, log into the remote host and run the failed python script with:

python -m trace --trace script.py
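
For readers who have not used the trace module before, here is a minimal sketch of the same line-by-line tracing driven from Python instead of the command line. The function `stop_service` is a hypothetical stand-in for the module's `main()`; it is not taken from the service module.

```python
import trace


def stop_service(name):
    # Hypothetical stand-in for a module entry point; it only builds a result dict.
    result = {"name": name, "state": "stopped"}
    result["changed"] = True
    return result


# trace=True prints each executed line; count=False skips coverage counting.
tracer = trace.Trace(trace=True, count=False)
outcome = tracer.runfunc(stop_service, "hdfs-sync")
print(outcome)
```

The traced lines are written to standard output before the final result, which is the same kind of output that `python -m trace --trace` produces for a whole script.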

Anand_Buddhdev2 · May 8, 2014, 2:15pm

Thanks for this tip. I did exactly as you describe, and ran the script on the host. To my surprise, it did what it was supposed to! The last few lines from the trace output show:

 --- modulename: encoder, funcname: _iterencode_dict
encoder.py(281): if markers is not None:
encoder.py(282): del markers[markerid]
encoder.py(368): return ''.join(chunks)
{"state": "stopped", "changed": true, "name": "hdfs-sync"}
service(2146): sys.exit(0)

The service stopped running, so the script did its work. I don’t understand why I then get an error when running ansible from my laptop. Any ideas, developers?

Anand

Adam_Morris · May 8, 2014, 8:50pm

Actually, this looks like a bug that occurred with script and raw in some earlier versions of Ansible, where the sudo success string was being leaked back. It was fixed; I’m now wondering whether the fix was reverted in this case…

Can you upgrade to 1.6.1 and try it again?

Adam

Anand_Buddhdev2 · May 8, 2014, 8:56pm

Hi Adam, it still fails:

$ ansible --version
ansible 1.6.1
$ ansible bastion3.hadoop.ripe.net -sK -m service -a 'name=hdfs-sync state=restarted'
sudo password:
bastion3.hadoop.ripe.net | FAILED => failed to parse:
SUDO-SUCCESS-dawyrikofupcjoyftolafyddywphopwj
Traceback (most recent call last):
  File "/tmp/ansible-tmp-1399582444.03-10583586416099/service", line 2411, in <module>
    main()
  File "/tmp/ansible-tmp-1399582444.03-10583586416099/service", line 1198, in main
    service.get_service_status()
  File "/tmp/ansible-tmp-1399582444.03-10583586416099/service", line 509, in get_service_status
    rc, status_stdout, status_stderr = self.service_control()
  File "/tmp/ansible-tmp-1399582444.03-10583586416099/service", line 750, in service_control
    rc_state, stdout, stderr = self.execute_command("%s %s %s" % (self.action, self.name, arguments), daemonize=True)
  File "/tmp/ansible-tmp-1399582444.03-10583586416099/service", line 250, in execute_command
    return json.loads(data)
  File "/usr/lib64/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.6/json/decoder.py", line 338, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

Adam_Morris · May 8, 2014, 9:09pm

There is an open bug report https://github.com/ansible/ansible/issues/7319 which sounds very similar…

Do you need to provide a password to use sudo on the remote host? Do you need to use sudo? I’m curious because that first line SUDO-SUCCESS … should be being eaten by part of ansible…

Adam
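
To illustrate what "eating" the marker line could look like, here is a minimal sketch that drops any line starting with `SUDO-SUCCESS-` before decoding the rest as JSON. The function `parse_module_output` and the key `abc123` are hypothetical placeholders; this is not Ansible's actual parsing code.

```python
import json

SUDO_MARKER = "SUDO-SUCCESS-"  # prefix seen in the output posted in this thread


def parse_module_output(raw):
    # Drop every line that is the sudo success marker and decode what remains.
    kept = [line for line in raw.splitlines() if not line.startswith(SUDO_MARKER)]
    return json.loads("\n".join(kept))


# A reply where the marker precedes valid JSON parses fine once the marker is gone.
clean = 'SUDO-SUCCESS-abc123\n{"state": "stopped", "changed": true, "name": "hdfs-sync"}'
print(parse_module_output(clean))

# A reply where the marker precedes a traceback still fails to decode.
broken = (
    "SUDO-SUCCESS-abc123\n"
    "Traceback (most recent call last):\n"
    "ValueError: No JSON object could be decoded"
)
try:
    parse_module_output(broken)
except ValueError as exc:
    print("still not JSON:", exc)
```

Note that in the output posted in this thread, the text after the marker is a traceback rather than JSON, so a filter like this alone would still fail to parse it.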

Anand_Buddhdev2 · May 8, 2014, 9:28pm

Hi Adam,

> There is an open bug report https://github.com/ansible/ansible/issues/7319 which sounds very similar…

Yes, this bug report sounds a lot like what I’m experiencing.

> Do you need to provide a password to use sudo on the remote host? Do you need to use sudo? I’m curious because that first line SUDO-SUCCESS … should be being eaten by part of ansible…

On the remote host, I need to use sudo to run commands as root, and I need to provide a password.

Dick_Davies · May 9, 2014, 3:57pm

This isn't an upstart script, is it?

I saw something very similar a few weeks back when I tried to set enabled=no on an upstart-managed service on CentOS. Removing that clause made it work.

Michael_DeHaan1 · May 9, 2014, 11:44pm

This is a traceback in the service module, certainly.

Please be sure there’s a ticket open and if you don’t have the same issue, file a new ticket, and we’ll look into this promptly.

We take the position that a traceback is a bug in nearly all cases: modules should return reasonable errors if they ever have to fail.
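
As an illustration of that principle, here is a minimal sketch of decoding a JSON reply defensively, so that undecodable data becomes a structured failure instead of a traceback. The tracebacks posted above end in `json.loads(data)` inside `execute_command`; the helper `load_child_result` below is hypothetical and is not the service module's code.

```python
import json


def load_child_result(data):
    """Hypothetical helper: decode a JSON reply or return a structured failure."""
    try:
        return json.loads(data)
    except ValueError:
        # json.loads raises ValueError (or a subclass) for empty or non-JSON input.
        return {"failed": True, "msg": "could not decode child output: %r" % (data,)}


print(load_child_result('{"rc": 0}'))
print(load_child_result(""))
```

Including the raw data in the message would also show what was actually received, which is the piece of information missing from the traceback in this thread.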

Anand_Buddhdev2 · May 10, 2014, 9:21am

Hi Dick,

> This isn’t an upstart script, is it?
>
> I saw something very similar a few weeks back when I tried to set enabled=no on an upstart-managed service on CentOS. Removing that clause made it work.

Yes, as my original message said, this is an upstart script.

Anand

Anand_Buddhdev2 · May 10, 2014, 9:22am

Hi Michael,

> This is a traceback in the service module, certainly.
>
> Please be sure there’s a ticket open and if you don’t have the same issue, file a new ticket, and we’ll look into this promptly.

I believe my issue is very similar to, and possibly the same as, issue #7319.
