I would like to better understand, and address, how workflow jobs are managed in AWX.
I commented on a GitHub issue where it seemed relevant, but I realize the community would have a hard time finding it there.
The GitHub issue has my information and questions, along with an example architecture, but I’m curious how others have handled maintenance of AWX on K8s while preventing unexpected job termination within workflows.
Even when a workflow's jobs are running on AWX K8s nodes that are not undergoing maintenance, bringing down an AWX controller node can terminate those jobs on the other nodes.
There is much more information in the relevant GitHub issue below:
Since I added the documentation tag, I figured I would include the relevant docs I have been reading that may apply here: