(Windows) Service Fails to Stop in Actual Play - But Actually Stops On Test Server

Weird thing happening with my Windows Playbook for Splunk.

I run the play, which includes stopping the Splunk service. When it gets to the part where the service is supposed to stop, the task seemingly fails. I checked before I ran the play and the Splunk service was running on my test box; although the play fails, the service is in a stopped status after the play runs.

TASK [Stopping Splunkd Service in Preperation for Splunk Directory Removal] ***********************************************************************************************
fatal: [computername]: FAILED! => {“can_pause_and_continue”: false, “changed”: false, “depended_by”: , “dependencies”: , “description”: “Splunkd is the indexing and searching engine for Splunk, a data platform for operational intelligence. It is required for Splunk instances acting as an indexer. If it is stopped, Splunk will not process data and will be unavailable for search. Splunkweb depends on Splunkd. Please see www.splunk.com for more information. Questions can be submitted to www.splunk.com/answers or for supported customers www.splunk.com/page/submit_issue”, “desktop_interact”: false, “display_name”: “Splunkd Service”, “exists”: true, “msg”: “Service ‘Splunkd Service (Splunkd)’ cannot be stopped due to the following error: Cannot stop Splunkd service on computer ‘.’.”, “name”: “Splunkd”, “path”: “"C:\Program Files\Splunk\bin\splunkd.exe" service”, “start_mode”: “auto”, “state”: “running”, “username”: “LocalSystem”}

UPDATE - I reverted to an older snapshot of the test box, but that leads me to other questions about how to handle tasks that don’t complete within the time Ansible allows. I was looking into the async feature, where you can set a polling interval and such: http://docs.ansible.com/ansible/latest/playbooks_async.html. When I asked my boss about it, he said this would be his last resort. Is this not the proper way to handle situations like this, where tasks fail for taking too long? I’m new so I don’t know :)

When I run this playbook against 60+ hosts, I’m not going to be able to manually troubleshoot the ones that are taking a long time because of x, y, or z.
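
For reference, a minimal sketch of what the async approach from the linked docs page could look like for this task. The original task was not posted, so the module arguments are assumptions, and the timeout and polling values are placeholders. Note that async helps with tasks that run long; it would not by itself change the outcome of a task that returns an error, as in the output above.

```yaml
# Sketch only: assumes the task uses win_service and that the
# service name is "Splunkd" (taken from the "name" field in the output).
- name: Stop Splunkd in the background and poll for completion
  win_service:
    name: Splunkd
    state: stopped
  async: 300   # assumed value: give the task up to 300 seconds
  poll: 10     # assumed value: check on it every 10 seconds
```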

Ansible is a declarative language: you describe the desired state of a resource, which in this case is a stopped service. If you were to run the play again, it would try to set the service to stopped, or report no change if it is already stopped.
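
As a rough illustration of that declarative style, a task like the following states only the desired end state. This is a sketch, not the poster's actual task (which was not shared); the task name is copied from the play output and the service name from its "name" field.

```yaml
# Sketch only: declares that Splunkd should be stopped.
# If the service is already stopped, the task reports "ok" with no change.
- name: Stopping Splunkd Service in Preperation for Splunk Directory Removal
  win_service:
    name: Splunkd
    state: stopped
```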

This situation is a bit more difficult because Ansible is failing to stop the service for some reason. Most likely the service is in use and is refusing to stop at that point in time. You would need to look into the logs to find out why it didn’t stop when Ansible asked it to.
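
If the refusal turns out to be transient, one option that may help across many hosts is to retry the stop a few times before giving up. The sketch below assumes a win_service task; the variable name, retry count, and delay are made up for illustration, and the `is succeeded` test syntax assumes Ansible 2.5 or later.

```yaml
# Sketch only: retries the stop until the task succeeds or retries run out.
- name: Stop Splunkd, retrying if the first attempt is refused
  win_service:
    name: Splunkd
    state: stopped
  register: splunkd_stop            # hypothetical variable name
  until: splunkd_stop is succeeded
  retries: 5                        # assumed value
  delay: 30                         # assumed value: seconds between attempts
```

This only papers over the symptom; the logs are still the place to find the actual cause.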

Thanks

Jordan

Thanks Jordan! (Again ;))

Are you talking about a specific Ansible log or just generic Windows system logs?

Heather

The Windows Event Log. I believe the service information is stored in either the System or Application event log, and it usually gives you more detail about why a service started or failed to stop. Unfortunately, Ansible is limited to the information PowerShell returns, which in this case was “Cannot stop Splunkd service on computer ‘.’.”, which is really not helpful at all.
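
To avoid logging in to each host, the event log could be pulled back through Ansible itself. A minimal sketch, assuming the relevant entries are written to the System log by the "Service Control Manager" provider; the variable name and the event count are arbitrary choices for illustration.

```yaml
# Sketch only: fetches recent Service Control Manager events and prints them.
- name: Collect recent Service Control Manager events from the System log
  win_shell: |
    Get-WinEvent -FilterHashtable @{ LogName = 'System'; ProviderName = 'Service Control Manager' } -MaxEvents 20 -ErrorAction SilentlyContinue |
      Select-Object TimeCreated, Id, LevelDisplayName, Message |
      Format-List
  register: scm_events              # hypothetical variable name

- name: Show the collected events
  debug:
    var: scm_events.stdout_lines
```

Running these two tasks right after a failed stop (for example in a `rescue` section) would keep the events close to the failure in the play output.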