Fresh Installation on AWX Operator 2.18.0 on K8s Fails

I have a running K8s setup and am trying to install the AWX Operator, but the web and task pods fail. When I check the logs, they report "Name or service not known":

  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/_conninfo_attempts.py", line 45, in conninfo_attempts
    raise e.OperationalError(str(last_exc))
psycopg.OperationalError: [Errno -2] Name or service not known
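
For reference, the traceback shows psycopg failing while resolving the database host name, before any connection is attempted. Below is a minimal sketch of the same lookup using only the Python standard library, which could be run inside a pod to test resolution on its own. The function name `can_resolve` is my own, and the host name `ansible-awx-postgres-15` is only an assumption based on the pod name in my output; the real value would be whatever host the operator wrote into the AWX database settings.

```python
import socket


def can_resolve(host: str, port: int = 5432) -> tuple[bool, str]:
    """Return (True, first_address) if host resolves, else (False, error text)."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # On Linux/glibc, errno -2 is EAI_NONAME: "Name or service not known".
        return False, f"[Errno {exc.errno}] {exc.strerror}"
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return True, infos[0][4][0]


if __name__ == "__main__":
    # Hypothetical service name (an assumption, not confirmed by the logs).
    host = "ansible-awx-postgres-15"
    print(host, can_resolve(host))
```

If this returns `False` with errno -2 for the database host but `True` for other names, that would point at the service name or cluster DNS rather than at PostgreSQL itself.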

I would appreciate some expertise here. My setup is below.

Runs on vSphere
Master - 1
Worker - 1

Both have 4 CPUs / 16 GB memory

NAME             STATUS   ROLES           AGE     VERSION
sesklk8maprd01   Ready    control-plane   2d21h   v1.28.15
sesklk8wkprd01   Ready    <none>          2d15h   v1.28.15


local-storage-class.yaml

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
  namespace: awx
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer



pv-volume.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: awx-static-data-pv
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 15Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/k8s"
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - sesklk8wkprd01



[root]# kubectl get pv
NAME                 CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                       STORAGECLASS    REASON   AGE
awx-static-data-pv   15Gi       RWO            Retain           Bound    awx/postgres-15-ansible-awx-postgres-15-0   local-storage            109m
[root]#


[root]# kubectl get pvc
NAME                                    STATUS   VOLUME               CAPACITY   ACCESS MODES   STORAGECLASS    AGE
postgres-15-ansible-awx-postgres-15-0   Bound    awx-static-data-pv   15Gi       RWO            local-storage   103m
[root]#


[root]# kubectl get pods
NAME                                               READY   STATUS             RESTARTS         AGE
ansible-awx-postgres-15-0                          1/1     Running            0                103m
ansible-awx-task-7fc885f7-sdh2g                    0/4     Init:0/2           0                103m
ansible-awx-web-5455586747-2j4dz                   2/3     CrashLoopBackOff   18 (3m53s ago)   103m
awx-operator-controller-manager-7d497b7874-hbkzc   2/2     Running            0                104m







[root]# kubectl describe pod ansible-awx-web-5455586747-2j4dz
Name:             ansible-awx-web-5455586747-2j4dz
Namespace:        awx
Priority:         0
Service Account:  ansible-awx
Node:             sesklk8wkprd01/10.x.x.x
Start Time:       Tue, 24 Dec 2024 10:23:33 -0500
Labels:           app.kubernetes.io/component=awx
                  app.kubernetes.io/managed-by=awx-operator
                  app.kubernetes.io/name=ansible-awx-web
                  app.kubernetes.io/operator-version=2.18.0
                  app.kubernetes.io/part-of=ansible-awx
                  app.kubernetes.io/version=24.5.0
                  pod-template-hash=5455586747
Annotations:      checksum-configmaps-config: 387a8a79b80def1e8b89f0acf9089497dc7f53e5
                  checksum-secret-bundle_cacert: da39a3ee5e6b4b0d3255bfef95601890afd80709
                  checksum-secret-ldap_cacert: da39a3ee5e6b4b0d3255bfef95601890afd80709
                  checksum-secret-receptor_ca: fc854dd41ac75afba442caf5376865d7195b91c0
                  checksum-secret-receptor_work_signing: 96ccc0b5ff54ecfc15e592a8174b46f3b8969b21
                  checksum-secret-route_tls: da39a3ee5e6b4b0d3255bfef95601890afd80709
                  checksum-secret-secret_key: 297ebfb5783cd2f38d6f71ad7cefeb25e146beff
                  checksum-secrets-app_credentials: ddb2f2e07f7afbbf1e93ad2fba2a4b06cd322dae
                  checksum-storage-persistent: adc83b19e793491b1c6ea0fd8b46cd9f32e592fc
                  kubectl.kubernetes.io/default-container: ansible-awx-web
Status:           Running
IP:               192.168.1.25
IPs:
  IP:           192.168.1.25
Controlled By:  ReplicaSet/ansible-awx-web-5455586747
Containers:
  redis:
    Container ID:  containerd://3bbb8bc3bf8f067d43c631d84268e28bf5b89c8265ccddb9abe59d8fa0aa10dc
    Image:         docker.io/redis:7
    Image ID:      docker.io/library/redis@sha256:ea96c435dc17b011f54c6a883c3c45e7726242b075de61c6fe40a10ae6ae0f83
    Port:          <none>
    Host Port:     <none>
    Args:
      redis-server
      /etc/redis.conf
    State:          Running
      Started:      Tue, 24 Dec 2024 10:23:33 -0500
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        50m
      memory:     64Mi
    Environment:  <none>
    Mounts:
      /data from ansible-awx-redis-data (rw)
      /etc/redis.conf from ansible-awx-redis-config (ro,path="redis.conf")
      /var/run/redis from ansible-awx-redis-socket (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jcrt2 (ro)
  ansible-awx-web:
    Container ID:  containerd://354b9b035b3f2a64b551848be3d774d3b601fb633d1977e700b75ae785486d79
    Image:         quay.io/ansible/awx:24.5.0
    Image ID:      quay.io/ansible/awx@sha256:08fc15effce5d9f911e0bc253c54d2da57cba466d2e23fc8359a6034ebb98bb2
    Port:          8052/TCP
    Host Port:     0/TCP
    Args:
      /usr/bin/launch_awx_web.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 24 Dec 2024 12:01:37 -0500
      Finished:     Tue, 24 Dec 2024 12:03:20 -0500
    Ready:          False
    Restart Count:  18
    Requests:
      cpu:     100m
      memory:  128Mi
    Environment:
      AWX_COMPONENT:           web
      SUPERVISOR_CONFIG_PATH:  /etc/supervisord_web.conf
      MY_POD_NAMESPACE:        awx (v1:metadata.namespace)
      MY_POD_IP:                (v1:status.podIP)
      UWSGI_MOUNT_PATH:        /
    Mounts:
      /etc/nginx/nginx.conf from ansible-awx-nginx-conf (ro,path="nginx.conf")
      /etc/receptor/tls/ca/mesh-CA.crt from ansible-awx-receptor-ca (ro,path="tls.crt")
      /etc/receptor/tls/ca/mesh-CA.key from ansible-awx-receptor-ca (ro,path="tls.key")
      /etc/receptor/work_public_key.pem from ansible-awx-receptor-work-signing (ro,path="work-public-key.pem")
      /etc/tower/SECRET_KEY from ansible-awx-secret-keyt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Warning  BackOff  4m9s (x311 over 100m)  kubelet  Back-off restarting failed container ansible-awx-web in pod ansible-awx-web-5455586747-2j4dz_awx(d7ee2dd6-f22b-4697-8b41-8165ce353cd3)


[root]# kubectl logs ansible-awx-web-5455586747-2j4dz
2024-12-24 17:08:22,273 INFO RPC interface 'supervisor' initialized
2024-12-24 17:08:22,273 INFO RPC interface 'supervisor' initialized
2024-12-24 17:08:22,273 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2024-12-24 17:08:22,273 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2024-12-24 17:08:22,274 INFO supervisord started with pid 7
2024-12-24 17:08:22,274 INFO supervisord started with pid 7
2024-12-24 17:08:23,277 INFO spawned: 'superwatcher' with pid 13
2024-12-24 17:08:23,277 INFO spawned: 'superwatcher' with pid 13
2024-12-24 17:08:23,280 INFO spawned: 'nginx' with pid 14
2024-12-24 17:08:23,280 INFO spawned: 'nginx' with pid 14
2024-12-24 17:08:23,282 INFO spawned: 'uwsgi' with pid 15
2024-12-24 17:08:23,282 INFO spawned: 'uwsgi' with pid 15
2024-12-24 17:08:23,284 INFO spawned: 'daphne' with pid 16
2024-12-24 17:08:23,284 INFO spawned: 'daphne' with pid 16
2024-12-24 17:08:23,286 INFO spawned: 'awx-cache-clear' with pid 17
2024-12-24 17:08:23,286 INFO spawned: 'awx-cache-clear' with pid 17
2024-12-24 17:08:23,288 INFO spawned: 'ws-heartbeat' with pid 19
2024-12-24 17:08:23,288 INFO spawned: 'ws-heartbeat' with pid 19
READY
[uWSGI] getting INI configuration from /etc/tower/uwsgi.ini
*** Starting uWSGI 2.0.24 (64bit) on [Tue Dec 24 17:08:23 2024] ***
compiled with version: 11.4.1 20231218 (Red Hat 11.4.1-3) on 04 June 2024 19:37:52
os: Linux-5.14.0-284.30.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Sep 16 09:55:41 UTC 2023
nodename: ansible-awx-web-5455586747-2j4dz
machine: x86_64
clock source: unix
detected number of CPU cores: 4
current working directory: /var/lib/awx
detected binary path: /var/lib/awx/venv/awx/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
your memory page size is 4096 bytes
detected max file descriptor number: 1073741816
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address 127.0.0.1:8050 fd 3
Python version: 3.11.7 (main, Jan 22 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)]
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x7f2f6e973558
your server socket listen backlog is limited to 128 connections
your mercy for graceful operations on workers is 60 seconds
mapped 609552 bytes (595 KB) for 5 cores
*** Operational MODE: preforking ***
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 15)
spawned uWSGI worker 1 (pid: 20, cores: 1)
spawned uWSGI worker 2 (pid: 21, cores: 1)
spawned uWSGI worker 3 (pid: 22, cores: 1)
spawned uWSGI worker 4 (pid: 23, cores: 1)
spawned uWSGI worker 5 (pid: 24, cores: 1)
mounting awx.wsgi:application on /
mounting awx.wsgi:application on /
mounting awx.wsgi:application on /
mounting awx.wsgi:application on /
mounting awx.wsgi:application on /
2024-12-24 17:08:24,385 INFO success: superwatcher entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-12-24 17:08:24,385 INFO success: superwatcher entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 289, in ensure_connection
    self.connect()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 270, in connect
    self.connection = self.get_new_connection(conn_params)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/postgresql/base.py", line 275, in get_new_connection
    connection = self.Database.connect(**conn_params)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/connection.py", line 728, in connect
    attempts = conninfo_attempts(params)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/_conninfo_attempts.py", line 45, in conninfo_attempts
    raise e.OperationalError(str(last_exc))
psycopg.OperationalError: [Errno -2] Name or service not known

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/bin/awx-manage", line 8, in <module>
    sys.exit(manage())
             ^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/__init__.py", line 161, in manage
    if (connection.pg_version // 10000) < 12:
        ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/connection.py", line 15, in __getattr__
    return getattr(self._connections[self._alias], item)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/functional.py", line 57, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
                                         ^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/postgresql/base.py", line 436, in pg_version
    with self.temporary_connection():
  File "/usr/lib64/python3.11/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 705, in temporary_connection
    with self.cursor() as cursor:
         ^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 330, in cursor
    return self._cursor()
           ^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 306, in _cursor
    self.ensure_connection()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 288, in ensure_connection
    with self.wrap_database_errors:
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 289, in ensure_connection
    self.connect()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 270, in connect
    self.connection = self.get_new_connection(conn_params)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/postgresql/base.py", line 275, in get_new_connection
    connection = self.Database.connect(**conn_params)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/connection.py", line 728, in connect
    attempts = conninfo_attempts(params)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/_conninfo_attempts.py", line 45, in conninfo_attempts
    raise e.OperationalError(str(last_exc))
django.db.utils.OperationalError: [Errno -2] Name or service not known
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 289, in ensure_connection
    self.connect()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 270, in connect
    self.connection = self.get_new_connection(conn_params)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django.db.utils.OperationalError: [Errno -2] Name or service not known
2024-12-24 17:09:07,274 WARN exited: ws-heartbeat (exit status 1; not expected)
2024-12-24 17:09:07,274 WARN exited: ws-heartbeat (exit status 1; not expected)
2024-12-24 17:09:07,284 WARN exited: awx-cache-clear (exit status 1; not expected)
2024-12-24 17:09:07,284 WARN exited: awx-cache-clear (exit status 1; not expected)
2024-12-24 17:09:09,288 INFO spawned: 'awx-cache-clear' with pid 77
2024-12-24 17:09:09,288 INFO spawned: 'awx-cache-clear' with pid 77
2024-12-24 17:09:09,290 INFO spawned: 'ws-heartbeat' with pid 78
2024-12-24 17:09:09,290 INFO spawned: 'ws-heartbeat' with pid 78
WSGI app 0 (mountpoint='/') ready in 63 seconds on interpreter 0x7f2f6e973558 pid: 21 (default app)
WSGI app 0 (mountpoint='/') ready in 64 seconds on interpreter 0x7f2f6e973558 pid: 23 (default app)
WSGI app 0 (mountpoint='/') ready in 64 seconds on interpreter 0x7f2f6e973558 pid: 22 (default app)
WSGI app 0 (mountpoint='/') ready in 64 seconds on interpreter 0x7f2f6e973558 pid: 20 (default app)
WSGI app 0 (mountpoint='/') ready in 64 seconds on interpreter 0x7f2f6e973558 pid: 24 (default app)
kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 270, in connect
    self.connection = self.get_new_connection(conn_params)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/postgresql/base.py", line 275, in get_new_connection
    connection = self.Database.connect(**conn_params)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/connection.py", line 728, in connect
    attempts = conninfo_attempts(params)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/_conninfo_attempts.py", line 45, in conninfo_attempts
    raise e.OperationalError(str(last_exc))
django.db.utils.OperationalError: [Errno -2] Name or service not known
2024-12-24 17:09:53,901 WARN exited: awx-cache-clear (exit status 1; not expected)
2024-12-24 17:09:53,901 WARN exited: awx-cache-clear (exit status 1; not expected)
2024-12-24 17:09:53,907 INFO gave up: awx-cache-clear entered FATAL state, too many start retries too quickly
2024-12-24 17:09:53,907 INFO gave up: awx-cache-clear entered FATAL state, too many start retries too quickly
2024-12-24 17:09:53,907 WARN exited: ws-heartbeat (exit status 1; not expected)
2024-12-24 17:09:53,907 WARN exited: ws-heartbeat (exit status 1; not expected)
2024-12-24 17:09:54,909 INFO gave up: ws-heartbeat entered FATAL state, too many start retries too quickly
2024-12-24 17:09:54,909 INFO gave up: ws-heartbeat entered FATAL state, too many start retries too quickly
Processing Event: ver:3.0 server:supervisor serial:0 pool:superwatcher poolserial:0 eventname:PROCESS_STATE_FATAL len:72
2024-12-24 17:09:54,909 WARN received SIGQUIT indicating exit request
2024-12-24 17:09:54,909 WARN received SIGQUIT indicating exit request
2024-12-24 17:09:54,918 INFO waiting for superwatcher, nginx, uwsgi, daphne to die
2024-12-24 17:09:54,918 INFO waiting for superwatcher, nginx, uwsgi, daphne to die
...brutally killing workers...
2024-12-24 17:09:54,928 INFO stopped: nginx (exit status 0)
2024-12-24 17:09:54,928 INFO stopped: nginx (exit status 0)
worker 1 buried after 1 seconds
worker 2 buried after 1 seconds
worker 3 buried after 1 seconds
worker 4 buried after 1 seconds
worker 5 buried after 1 seconds
binary reloading uWSGI...
chdir() to /var/lib/awx
closing all non-uwsgi socket fds > 2 (max_fd = 1073741816)...
found fd 3 mapped to socket 0 (127.0.0.1:8050)
2024-12-24 17:09:55,915 WARN stopped: daphne (terminated by SIGTERM)
2024-12-24 17:09:55,915 WARN stopped: daphne (terminated by SIGTERM)
2024-12-24 17:09:58,919 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:09:58,919 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:10:01,924 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:10:01,924 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:10:04,926 WARN killing 'uwsgi' (15) with SIGKILL
2024-12-24 17:10:04,926 WARN killing 'uwsgi' (15) with SIGKILL
2024-12-24 17:10:04,926 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:10:04,926 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:10:05,928 WARN stopped: uwsgi (terminated by SIGKILL)
2024-12-24 17:10:05,928 WARN stopped: uwsgi (terminated by SIGKILL)
2024-12-24 17:10:05,929 WARN stopped: superwatcher (terminated by SIGTERM)
2024-12-24 17:10:05,929 WARN stopped: superwatcher (terminated by SIGTERM)
[root]#