awx-web container crashing repeatedly

Hi Team,

I have come across an issue in which the awx-web container is crashing repeatedly, with the following error logs:

2023-01-24 15:44:09,624 DEBUG [-] awx.main.wsbroadcast Connection from awx-67cb856d46-qxlwq to 172.29.70.2 attempt number 10.
2023-01-24 15:44:14,630 WARNING [-] awx.main.wsbroadcast Connection from awx-67cb856d46-qxlwq to 172.29.70.2 failed: ‘Cannot connect to host 172.29.70.2:8052 ssl:False [Connect call failed (‘172.29.70.2’, 8052)]’.
2023-01-24 15:44:14 ERROR rsyslogd was unresponsive: FileNotFoundError: [Errno 2] No such file or directory
Connection from awx-67cb856d46-qxlwq to 172.29.70.2 failed: ‘Cannot connect to host 172.29.70.2:8052 ssl:False [Connect call failed (‘172.29.70.2’, 8052)]’.
2023-01-24 15:44:14,633 DEBUG [-] awx.main.wsbroadcast Connection from awx-67cb856d46-qxlwq to 172.29.70.2 attempt number 11.
2023-01-24 15:44:19,639 WARNING [-] awx.main.wsbroadcast Connection from awx-67cb856d46-qxlwq to 172.29.70.2 failed: ‘Cannot connect to host 172.29.70.2:8052 ssl:False [Connect call failed (‘172.29.70.2’, 8052)]’.
2023-01-24 15:44:19 ERROR rsyslogd was unresponsive: FileNotFoundError: [Errno 2] No such file or directory
Connection from awx-67cb856d46-qxlwq to 172.29.70.2 failed: ‘Cannot connect to host 172.29.70.2:8052 ssl:False [Connect call failed (‘172.29.70.2’, 8052)]’.
2023-01-24 15:44:19,642 DEBUG [-] awx.main.wsbroadcast Connection from awx-67cb856d46-qxlwq to 172.29.70.2 attempt number 12.
2023-01-24 15:44:24,649 WARNING [-] awx.main.wsbroadcast Connection from awx-67cb856d46-qxlwq to 172.29.70.2 failed: ‘Cannot connect to host 172.29.70.2:8052 ssl:False [Connect call failed (‘172.29.70.2’, 8052)]’.

Could you please help resolve this?

The awx-operator v0.24.0 was used here.

Regards,
Kaushik

Is the web container actually terminating (with a new one then starting up), or do you just see this stream of log messages in the output from the same running container? Are you experiencing issues running jobs? What symptoms are you experiencing exactly, and is the app working at all?

AWX Team

I can see multiple pod restarts:

NAME                   READY   STATUS    RESTARTS        AGE
awx-67cb856d46-qxlwq   4/4     Running   7 (6d23h ago)   6d23h

I also see the stream of error logs mentioned above in the same running container.

I haven’t tried running any jobs from the AWX UI yet. Currently, I am just trying to get the UI up.

Please note that this wasn’t an issue last week, when the UI was up. It only started occurring on 01/26. No changes were made to the source code.

So is the UI up at this point, or not?

awx-67cb856d46-qxlwq 4/4 Running 7 (6d23h ago) 6d23h

To me, this means the containers are up and running (including the awx-web container), and that they haven’t restarted in the last 6 days. Does that sound right?
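To make that reading of the status line concrete, here is a minimal Python sketch that splits one row of pod status output into its fields. The function name `parse_pod_line` and the returned keys are hypothetical, and the pattern only assumes the column layout shown in the line above (name, ready/total, status, restart count with an optional "(… ago)" note, age); it is not part of AWX or kubectl.

```python
import re

# Assumed layout, taken from the row quoted above:
# NAME  READY  STATUS  RESTARTS [(<time> ago)]  AGE
POD_ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<ready>\d+)/(?P<total>\d+)\s+(?P<status>\S+)\s+"
    r"(?P<restarts>\d+)(?:\s+\((?P<last_restart>\S+) ago\))?\s+(?P<age>\S+)$"
)


def parse_pod_line(line):
    """Parse one data row of pod status output into a dict (hypothetical helper)."""
    match = POD_ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized pod status row: {line!r}")
    ready, total = int(match["ready"]), int(match["total"])
    return {
        "name": match["name"],
        "all_ready": ready == total,           # 4/4 means every container is ready
        "status": match["status"],
        "restarts": int(match["restarts"]),    # cumulative restart count
        "last_restart": match["last_restart"],  # None if the pod never restarted
        "age": match["age"],
    }


row = parse_pod_line("awx-67cb856d46-qxlwq 4/4 Running 7 (6d23h ago) 6d23h")
print(row)
```

For the row above, this reports 7 restarts with the most recent one 6d23h ago, which is the same as the pod's age. That would suggest the restarts happened shortly after the pod was created rather than continuously since then.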

Those warning messages you posted earlier don’t stand out to me as problematic, at least in terms of the UI showing up.

AWX Team