Job hangs and I can't delete it either

Hi guys,

I’ve configured a template, and when I launch the job it hangs and I can't delete it either.

When I try to delete it I get back the following error:

Call to /api/v2/jobs/28/ failed. returned status: 500. A server error has occurred

Any idea how I can troubleshoot or fix this problem?

Thanks for your help in advance.


To troubleshoot:

If you are clustered, determine which host took the job (if not, you can skip this step):

curl -sL -H "Authorization: Bearer $TOKEN" https://<URL>/api/v2/jobs/<JOB_ID> | jq '."execution_node"'
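If the job is hanging, it can help to pull a few more fields from the same response. A minimal sketch, assuming jq is installed and that the job JSON carries `status` and `job_explanation` alongside `execution_node`; the function name `job_fields` is made up for illustration:

```shell
# Hypothetical helper: reads a job's JSON from stdin and keeps only
# the fields that are most useful when a job appears stuck.
job_fields() {
  jq -c '{status, execution_node, job_explanation}'
}
```

It takes the output of the curl call above on stdin, e.g. `curl -sL -H "Authorization: Bearer $TOKEN" https://<URL>/api/v2/jobs/<JOB_ID> | job_fields`.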

Try the delete again in the UI so the error is fresh and you don’t have to dig far back through the logs.

On the execution node, look at the task container logs for the last few minutes. If that doesn’t show you the traceback that led to the error, do the same for the web container.

docker logs --since=5m awx_task
docker logs --since=5m awx_web

That should at least yield a traceback that either points you toward the cause or gives you something to search for. Most errors I’ve seen have already been reported as issues in the AWX GitHub project.
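If the logs are noisy, something like the following can narrow them down to the traceback. This is only a sketch: `show_tracebacks` is a made-up name, and it assumes the error shows up as a standard Python traceback, which may not hold for every failure.

```shell
# Hypothetical helper: print each "Traceback" header found on stdin,
# plus the N lines that follow it (default 20).
show_tracebacks() {
  grep -A "${1:-20}" 'Traceback (most recent call last)'
}
```

Usage would be along the lines of `docker logs --since=5m awx_task 2>&1 | show_tracebacks`. The `2>&1` matters because docker logs writes the container's stderr stream to its own stderr, which a plain pipe would skip.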

–chris reisor

Thanks for your answer Chris, I wasn’t aware of the docker logs command for getting more information about the issue.

I resolved my problem by tearing down the containers and rebuilding them. Luckily this is just a test environment, but this will be very useful for troubleshooting any issues in the future.

In case this helps anyone:

1 - To delete the containers

https://stackoverflow.com/questions/38041261/how-do-i-delete-all-running-docker-containers
docker stop $(docker ps -a -q)  # Stops every container on the host. Do this only if AWX is the only thing you run in Docker.
docker rm $(docker ps -a -q)    # Removes every container on the host, running or stopped. Same caveat as above.

2 - Rebuild the containers using the Ansible-based installation you can find in the AWX repo
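If the host also runs other containers, step 1 can be narrowed so it only touches AWX. A rough sketch, assuming the AWX containers all have "awx" in their names (as awx_task and awx_web do); the function name is made up for illustration:

```shell
# Hypothetical helper: stop and remove only containers whose name
# contains "awx", leaving everything else on the host alone.
remove_awx_containers() {
  ids=$(docker ps -a -q --filter "name=awx")
  if [ -z "$ids" ]; then
    echo "no AWX containers found"
    return 0
  fi
  # Word splitting on $ids is intentional: one container ID per argument.
  docker stop $ids
  docker rm $ids
}
```

Running `docker ps -a --filter "name=awx"` first shows what would be affected before anything is removed.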