recurring problem: missing output on jobs

Hi all,

I recently migrated to AWX 2.1.0, but this problem also showed up intermittently before the migration (when running 1.7.something).

  • Occasionally, jobs run but show no output in the web interface. At one point, while such a job was running, I checked the process list in the task containers and could see that it was actually running.
  • At one point I scaled down to 1 pod (we usually run 3), then scaled to 0 and back to 1, and jobs started to show output again. Some time later, scaling back up to 3 pods triggered the issue again: no output shown.
  • When I checked the job output text files on disk, they were empty as well.
  • I was out of office for two days, and when checking our setup this morning, I noticed that the latest jobs show output again.
  • Looking online, I only find references to this kind of issue in old bug reports that AFAICS should be solved by now.
    I know this is not plain stdout, but the result of a special callback in AWX. But as this “seems” to happen intermittently, and I can’t pinpoint a specific trigger or the exact circumstances in which it happens, I’m not sure where to go from here.
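
To at least catch the problem while it is happening, one option would be to poll the REST API for finished jobs that have no job events at all. Below is a minimal sketch of that idea, not something taken from our setup: `AWX_URL`, `AWX_TOKEN` and the function names are placeholders, and it assumes that "zero job events on a finished job" is a good enough proxy for "no output shown".

```python
import json
import urllib.request

AWX_URL = "https://awx.example.org"  # hypothetical host, replace with your own
AWX_TOKEN = "changeme"               # hypothetical OAuth2 token


def fetch_json(path):
    """GET an AWX API path and return the decoded JSON body."""
    req = urllib.request.Request(
        AWX_URL + path,
        headers={"Authorization": "Bearer " + AWX_TOKEN},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def find_silent_jobs(fetch=fetch_json, limit=20):
    """Return the IDs of recently finished jobs that have no job events."""
    jobs = fetch(f"/api/v2/jobs/?order_by=-finished&page_size={limit}")
    silent = []
    for job in jobs["results"]:
        # Skip jobs that are still pending or running.
        if job["status"] not in ("successful", "failed"):
            continue
        # Only the total count is needed, so request a single event.
        events = fetch(f"/api/v2/jobs/{job['id']}/job_events/?page_size=1")
        if events["count"] == 0:
            silent.append(job["id"])
    return silent
```

Running `print(find_silent_jobs())` from cron would give a list of affected job IDs, which could then be matched against pod names and timestamps in the task container logs.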

How can I troubleshoot this when it happens? Which mechanisms might be failing here?

Thanks!

Serge van Ginderachter