Hello,
I have observed a strange behavior in AWX 19.4 and AWX 19.5.1. AWX is running on a Kubernetes cluster. A job containing the following tasks changes to the status “error” after exactly 300 seconds, without any further output.
- ansible.windows.win_wait_for:
    path: C:\temp\logtxt
    timeout: 1500
  register: win_wait
  async: 25200
  poll: 0
- name: Check on an async task
  async_status:
    jid: "{{ win_wait.ansible_job_id }}"
  register: win_wait_result
  until: win_wait_result.finished
  retries: 84
  delay: 300
I can see that the executing container disappears after 300 seconds. If I change the delay value to 299, this doesn’t happen and the task works as expected. The problem is identical on Linux and Windows remote hosts. Back on AWX 15.0.1 on Docker, the task worked fine with delay 300.
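For reference, a minimal sketch of the polling task in the variant that works, with only the delay lowered to 299. The retries value of 85 is an assumption, not something tested in this thread; it is chosen so that 85 × 299 = 25415 seconds still covers the async limit of 25200 seconds.

```yaml
# Sketch only: same polling task as above, with delay lowered from 300 to 299.
# retries: 85 is an assumed value so that 85 * 299 = 25415 s >= 25200 s (the async limit).
- name: Check on an async task
  async_status:
    jid: "{{ win_wait.ansible_job_id }}"
  register: win_wait_result
  until: win_wait_result.finished
  retries: 85
  delay: 299
```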
Regards
Lars
Hey Lars,
We tried to reproduce this on 19.5.2 and were unable to. Can you provide us with more info? What are you getting from the job output?
Cheers
AWX team
I remember reading about this. IIRC, this has something to do with k8s log rotation.
Hello,
the output from the job is really short; it’s just “Error” on 19.4 and “Job terminated due to error” on 19.5.1.
I have tried it with two different setups of AWX and the mentioned versions. Both show the identical behavior.
What other information do you need?
Regards
Lars
You could be hitting a log rotation issue here, but I suggest you try this workaround as well: https://github.com/ansible/awx/issues/11521
Try setting:
vars:
  ansible_async_dir: '/tmp/.ansible_async'
Hello Phil,
the workaround does not work; the problem still exists. We can see that after 300 seconds a cancel status arrives from somewhere.
We have found a way to increase the log size. It fixes some issues, but this one still exists.
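The thread does not say how the log size was increased. As an illustration only, on clusters where the kubelet handles container log rotation, one common way is to raise the limit in the kubelet configuration. The sketch below assumes the nodes read a KubeletConfiguration file; the values are illustrative and not the ones used here.

```yaml
# Sketch only: assumes the kubelet on each node is configured through a
# KubeletConfiguration file. The sizes below are illustrative values.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: 500Mi   # kubelet default is 10Mi
containerLogMaxFiles: 5      # kubelet default is 5
```

Managed Kubernetes offerings may expose these settings differently, so the exact place to set them depends on the cluster.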