Two awx-task replicas and orphaned job reconnection

MonolithProjects · May 22, 2024, 8:01am

Hi all,
I am running AWX (24.3.1) in Kubernetes, managed by AWX Operator (2.16.1). awx-task runs with two replicas. We also have an additional instance group with an Execution Node (Receptor 1.4.7) outside of the Kubernetes cluster.

The idea was to run Kubespray (Kubernetes management using Ansible) to upgrade the Kubernetes cluster where AWX itself is deployed. The Kubespray job runs on the Execution Node, so it is outside the Kubernetes infrastructure and is not affected by the draining of Kubernetes nodes.

The problem occurs when Kubespray upgrades (and thus drains) the Kubernetes node hosting the awx-task pod replica that created the job on the Execution Node. Once the job running Kubespray becomes orphaned, Receptor kills it within the next few seconds without trying to migrate it to the other, still-running awx-task pod replica.

Is this behavior normal, or am I missing some settings? Setting the environment variable RECEPTOR_KUBE_SUPPORT_RECONNECT=enabled does not seem to resolve the issue.
