Job gets stuck in execution state when the awx_task container goes down

Hi,

My job gets stuck in the execution state when the awx_task container goes down. After I restart the awx_task container, it shows the message “Task was marked as running in Tower but was not present in the job queue, so it has been marked as failed”. But I want a timeout or exception message to pop up after a few seconds instead.
Can anyone please confirm whether this is valid behaviour or a bug?

What causes the task container to go down? It’s not supposed to.


Hello Wei-Yen-Tan,

Actually, we are going to use AWX in our production environment. That’s why we are running a few test cases to check how AWX behaves. This is one of those test cases: we are testing what happens to a running job if a container goes down.
This behaviour doesn’t seem good for a production environment, as we don’t get any notification when the awx_task container goes down, but the job stays stuck in the execution state.

Try restarting all the containers, or use docker-compose to stop and start the application. To monitor the containers, you will need an external script or solution; this mostly won’t be available in AWX.
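
As a rough illustration of the kind of external script meant here, the sketch below asks Docker whether a container is running by calling docker inspect. It assumes the docker CLI is on the PATH and that the container is named awx_task as in this thread; the function names and the injectable runner parameter are made up for this example and are not part of AWX.

```python
import subprocess


def container_is_running(name, runner=subprocess.run):
    """Return True if `docker inspect` reports the container as running.

    `runner` is a hypothetical hook so the check can be exercised
    without a Docker daemon; by default it is subprocess.run.
    """
    try:
        result = runner(
            ["docker", "inspect", "--format", "{{.State.Running}}", name],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # docker binary missing or not responding: treat as "not running"
        return False
    # A non-zero exit code usually means the container does not exist
    return result.returncode == 0 and result.stdout.strip().lower() == "true"


def containers_down(names=("awx_task",), runner=subprocess.run):
    """Return the names of containers that are not reported as running."""
    return [n for n in names if not container_is_running(n, runner=runner)]
```

A script like this could be run from cron and send a mail or chat notification whenever containers_down() returns a non-empty list. How the alert is delivered is left open here.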

Wei-Yen-Tan,

We have code to deal with this situation. However, at least one AWX container must be running to execute the code that resolves these edge-case situations and crashes. When it does find a job of this nature, the job will be given a status of failed or error.

-Chris
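
To make the idea concrete, here is a toy sketch of that kind of cleanup: any job still marked as running that the job queue no longer knows about gets marked as failed. This is only an illustration of the logic described above, not AWX's actual code; the Job class, the function name and the active_ids set are all invented for the example.

```python
from dataclasses import dataclass

# Message text as quoted earlier in this thread
ORPHAN_MESSAGE = (
    "Task was marked as running in Tower but was not present in "
    "the job queue, so it has been marked as failed"
)


@dataclass
class Job:
    """Hypothetical stand-in for a job record in the database."""
    id: int
    status: str
    explanation: str = ""


def reap_orphaned_jobs(jobs, active_ids):
    """Mark running jobs that are absent from the queue as failed.

    `active_ids` is the set of job ids the queue still knows about.
    Returns the ids of the jobs that were changed.
    """
    reaped = []
    for job in jobs:
        if job.status == "running" and job.id not in active_ids:
            job.status = "failed"
            job.explanation = ORPHAN_MESSAGE
            reaped.append(job.id)
    return reaped
```

The point of the sketch is that something has to be alive to run this check, which is why the job only changes state once a container is back up.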

Please run df -Th to check the disk space, especially for /var.
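
If you want to automate that check, a minimal sketch using Python's standard shutil.disk_usage could look like the one below. The function names and the 90 percent threshold are assumptions chosen for the example, and the percentage is computed roughly the way df reports Use%, so it may differ slightly from df's output.

```python
import shutil


def percent_used(path):
    """Return the percentage of space used on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    # used / (used + free) ignores blocks reserved for root,
    # which is approximately how df computes its Use% column
    denominator = usage.used + usage.free
    if denominator == 0:
        return 0.0
    return 100.0 * usage.used / denominator


def is_low_on_space(path, threshold_percent=90.0):
    """Return True when usage is at or above the (assumed) threshold."""
    return percent_used(path) >= threshold_percent
```

For example, is_low_on_space("/var") could be called from the same monitoring script that watches the containers.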