Fresh installation of AWX Operator 2.18.0 on K8s fails
I have a running K8s setup and am trying to install the AWX Operator, but the web and task pods fail. The web pod log says "Name or service not known":
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/_conninfo_attempts.py", line 45, in conninfo_attempts
raise e.OperationalError(str(last_exc))
psycopg.OperationalError: [Errno -2] Name or service not known
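For context, "[Errno -2] Name or service not known" is the text Python's resolver reports when a hostname cannot be resolved, so the failure happens before any PostgreSQL connection is attempted. Below is a minimal sketch for checking name resolution with any Python 3 interpreter that shares the pod's DNS configuration. The host names in CANDIDATES are assumptions: the service name `ansible-awx-postgres-15` is only inferred from the pod name `ansible-awx-postgres-15-0`, and `cluster.local` is assumed to be the cluster domain. The real host is whatever the AWX database settings point to.

```python
import socket

# Hypothetical host names; adjust to the host configured for the AWX database.
CANDIDATES = (
    "ansible-awx-postgres-15",
    "ansible-awx-postgres-15.awx",
    "ansible-awx-postgres-15.awx.svc.cluster.local",
)


def can_resolve(hostname: str, port: int = 5432) -> tuple[bool, str]:
    """Return (True, addresses) if hostname resolves, else (False, error text)."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        # errno -2 (EAI_NONAME) is the "Name or service not known" case.
        return False, f"[Errno {exc.errno}] {exc.strerror}"
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({info[4][0] for info in infos})
    return True, ", ".join(addresses)


def report(hostnames=CANDIDATES) -> None:
    """Print one line per host name showing whether it resolves."""
    for name in hostnames:
        ok, detail = can_resolve(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} {detail}")
```

Calling report() from inside the failing pod and again from a pod that works would show whether the short name, the namespace-qualified name, or none of them resolves, which helps separate a cluster DNS problem from a wrong host name in the AWX configuration.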
I would appreciate some expertise here. My setup is below.
Runs on vSphere
Master - 1
Worker - 1
Both have 4 CPUs / 16 GB memory
NAME             STATUS   ROLES           AGE     VERSION
sesklk8maprd01   Ready    control-plane   2d21h   v1.28.15
sesklk8wkprd01   Ready    <none>          2d15h   v1.28.15
local-storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
  namespace: awx
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
pv-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: awx-static-data-pv
  labels:
    type: local
spec:
  storageClassName: local-storage
  capacity:
    storage: 15Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/k8s"
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - sesklk8wkprd01
[root]# kubectl get pv
NAME                 CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                       STORAGECLASS    REASON   AGE
awx-static-data-pv   15Gi       RWO            Retain           Bound    awx/postgres-15-ansible-awx-postgres-15-0   local-storage            109m
[root]#
[root]# kubectl get pvc
NAME                                    STATUS   VOLUME               CAPACITY   ACCESS MODES   STORAGECLASS    AGE
postgres-15-ansible-awx-postgres-15-0   Bound    awx-static-data-pv   15Gi       RWO            local-storage   103m
[root]#
[root]# kubectl get pods
NAME                                               READY   STATUS             RESTARTS         AGE
ansible-awx-postgres-15-0                          1/1     Running            0                103m
ansible-awx-task-7fc885f7-sdh2g                    0/4     Init:0/2           0                103m
ansible-awx-web-5455586747-2j4dz                   2/3     CrashLoopBackOff   18 (3m53s ago)   103m
awx-operator-controller-manager-7d497b7874-hbkzc   2/2     Running            0                104m
[root]# kubectl describe pod ansible-awx-web-5455586747-2j4dz
Name:             ansible-awx-web-5455586747-2j4dz
Namespace:        awx
Priority:         0
Service Account:  ansible-awx
Node:             sesklk8wkprd01/10.x.x.x
Start Time:       Tue, 24 Dec 2024 10:23:33 -0500
Labels:           app.kubernetes.io/component=awx
                  app.kubernetes.io/managed-by=awx-operator
                  app.kubernetes.io/name=ansible-awx-web
                  app.kubernetes.io/operator-version=2.18.0
                  app.kubernetes.io/part-of=ansible-awx
                  app.kubernetes.io/version=24.5.0
                  pod-template-hash=5455586747
Annotations:      checksum-configmaps-config: 387a8a79b80def1e8b89f0acf9089497dc7f53e5
                  checksum-secret-bundle_cacert: da39a3ee5e6b4b0d3255bfef95601890afd80709
                  checksum-secret-ldap_cacert: da39a3ee5e6b4b0d3255bfef95601890afd80709
                  checksum-secret-receptor_ca: fc854dd41ac75afba442caf5376865d7195b91c0
                  checksum-secret-receptor_work_signing: 96ccc0b5ff54ecfc15e592a8174b46f3b8969b21
                  checksum-secret-route_tls: da39a3ee5e6b4b0d3255bfef95601890afd80709
                  checksum-secret-secret_key: 297ebfb5783cd2f38d6f71ad7cefeb25e146beff
                  checksum-secrets-app_credentials: ddb2f2e07f7afbbf1e93ad2fba2a4b06cd322dae
                  checksum-storage-persistent: adc83b19e793491b1c6ea0fd8b46cd9f32e592fc
                  kubectl.kubernetes.io/default-container: ansible-awx-web
Status:           Running
IP:               192.168.1.25
IPs:
  IP:           192.168.1.25
Controlled By:  ReplicaSet/ansible-awx-web-5455586747
Containers:
  redis:
    Container ID:  containerd://3bbb8bc3bf8f067d43c631d84268e28bf5b89c8265ccddb9abe59d8fa0aa10dc
    Image:         docker.io/redis:7
    Image ID:      docker.io/library/redis@sha256:ea96c435dc17b011f54c6a883c3c45e7726242b075de61c6fe40a10ae6ae0f83
    Port:          <none>
    Host Port:     <none>
    Args:
      redis-server
      /etc/redis.conf
    State:          Running
      Started:      Tue, 24 Dec 2024 10:23:33 -0500
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        50m
      memory:     64Mi
    Environment:  <none>
    Mounts:
      /data from ansible-awx-redis-data (rw)
      /etc/redis.conf from ansible-awx-redis-config (ro,path="redis.conf")
      /var/run/redis from ansible-awx-redis-socket (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jcrt2 (ro)
  ansible-awx-web:
    Container ID:  containerd://354b9b035b3f2a64b551848be3d774d3b601fb633d1977e700b75ae785486d79
    Image:         quay.io/ansible/awx:24.5.0
    Image ID:      quay.io/ansible/awx@sha256:08fc15effce5d9f911e0bc253c54d2da57cba466d2e23fc8359a6034ebb98bb2
    Port:          8052/TCP
    Host Port:     0/TCP
    Args:
      /usr/bin/launch_awx_web.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 24 Dec 2024 12:01:37 -0500
      Finished:     Tue, 24 Dec 2024 12:03:20 -0500
    Ready:          False
    Restart Count:  18
    Requests:
      cpu:     100m
      memory:  128Mi
    Environment:
      AWX_COMPONENT:           web
      SUPERVISOR_CONFIG_PATH:  /etc/supervisord_web.conf
      MY_POD_NAMESPACE:        awx (v1:metadata.namespace)
      MY_POD_IP:                (v1:status.podIP)
      UWSGI_MOUNT_PATH:        /
    Mounts:
      /etc/nginx/nginx.conf from ansible-awx-nginx-conf (ro,path="nginx.conf")
      /etc/receptor/tls/ca/mesh-CA.crt from ansible-awx-receptor-ca (ro,path="tls.crt")
      /etc/receptor/tls/ca/mesh-CA.key from ansible-awx-receptor-ca (ro,path="tls.key")
      /etc/receptor/work_public_key.pem from ansible-awx-receptor-work-signing (ro,path="work-public-key.pem")
/etc/tower/SECRET_KEY from ansible-awx-secret-keyt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Warning  BackOff  4m9s (x311 over 100m)  kubelet  Back-off restarting failed container ansible-awx-web in pod ansible-awx-web-5455586747-2j4dz_awx(d7ee2dd6-f22b-4697-8b41-8165ce353cd3)
[root]# kubectl logs ansible-awx-web-5455586747-2j4dz
2024-12-24 17:08:22,273 INFO RPC interface 'supervisor' initialized
2024-12-24 17:08:22,273 INFO RPC interface 'supervisor' initialized
2024-12-24 17:08:22,273 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2024-12-24 17:08:22,273 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2024-12-24 17:08:22,274 INFO supervisord started with pid 7
2024-12-24 17:08:22,274 INFO supervisord started with pid 7
2024-12-24 17:08:23,277 INFO spawned: 'superwatcher' with pid 13
2024-12-24 17:08:23,277 INFO spawned: 'superwatcher' with pid 13
2024-12-24 17:08:23,280 INFO spawned: 'nginx' with pid 14
2024-12-24 17:08:23,280 INFO spawned: 'nginx' with pid 14
2024-12-24 17:08:23,282 INFO spawned: 'uwsgi' with pid 15
2024-12-24 17:08:23,282 INFO spawned: 'uwsgi' with pid 15
2024-12-24 17:08:23,284 INFO spawned: 'daphne' with pid 16
2024-12-24 17:08:23,284 INFO spawned: 'daphne' with pid 16
2024-12-24 17:08:23,286 INFO spawned: 'awx-cache-clear' with pid 17
2024-12-24 17:08:23,286 INFO spawned: 'awx-cache-clear' with pid 17
2024-12-24 17:08:23,288 INFO spawned: 'ws-heartbeat' with pid 19
2024-12-24 17:08:23,288 INFO spawned: 'ws-heartbeat' with pid 19
READY
[uWSGI] getting INI configuration from /etc/tower/uwsgi.ini
*** Starting uWSGI 2.0.24 (64bit) on [Tue Dec 24 17:08:23 2024] ***
compiled with version: 11.4.1 20231218 (Red Hat 11.4.1-3) on 04 June 2024 19:37:52
os: Linux-5.14.0-284.30.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Sep 16 09:55:41 UTC 2023
nodename: ansible-awx-web-5455586747-2j4dz
machine: x86_64
clock source: unix
detected number of CPU cores: 4
current working directory: /var/lib/awx
detected binary path: /var/lib/awx/venv/awx/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
your memory page size is 4096 bytes
detected max file descriptor number: 1073741816
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address 127.0.0.1:8050 fd 3
Python version: 3.11.7 (main, Jan 22 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)]
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x7f2f6e973558
your server socket listen backlog is limited to 128 connections
your mercy for graceful operations on workers is 60 seconds
mapped 609552 bytes (595 KB) for 5 cores
*** Operational MODE: preforking ***
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 15)
spawned uWSGI worker 1 (pid: 20, cores: 1)
spawned uWSGI worker 2 (pid: 21, cores: 1)
spawned uWSGI worker 3 (pid: 22, cores: 1)
spawned uWSGI worker 4 (pid: 23, cores: 1)
spawned uWSGI worker 5 (pid: 24, cores: 1)
mounting awx.wsgi:application on /
mounting awx.wsgi:application on /
mounting awx.wsgi:application on /
mounting awx.wsgi:application on /
mounting awx.wsgi:application on /
2024-12-24 17:08:24,385 INFO success: superwatcher entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-12-24 17:08:24,385 INFO success: superwatcher entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Traceback (most recent call last):
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 289, in ensure_connection
self.connect()
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 270, in connect
self.connection = self.get_new_connection(conn_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/postgresql/base.py", line 275, in get_new_connection
connection = self.Database.connect(**conn_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/connection.py", line 728, in connect
attempts = conninfo_attempts(params)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/_conninfo_attempts.py", line 45, in conninfo_attempts
raise e.OperationalError(str(last_exc))
psycopg.OperationalError: [Errno -2] Name or service not known
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/bin/awx-manage", line 8, in <module>
sys.exit(manage())
^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/__init__.py", line 161, in manage
if (connection.pg_version // 10000) < 12:
^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/connection.py", line 15, in __getattr__
return getattr(self._connections[self._alias], item)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/functional.py", line 57, in __get__
res = instance.__dict__[self.name] = self.func(instance)
^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/postgresql/base.py", line 436, in pg_version
with self.temporary_connection():
File "/usr/lib64/python3.11/contextlib.py", line 137, in __enter__
return next(self.gen)
^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 705, in temporary_connection
with self.cursor() as cursor:
^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 330, in cursor
return self._cursor()
^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 306, in _cursor
self.ensure_connection()
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 288, in ensure_connection
with self.wrap_database_errors:
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/utils.py", line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 289, in ensure_connection
self.connect()
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 270, in connect
self.connection = self.get_new_connection(conn_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/postgresql/base.py", line 275, in get_new_connection
connection = self.Database.connect(**conn_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/connection.py", line 728, in connect
attempts = conninfo_attempts(params)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/_conninfo_attempts.py", line 45, in conninfo_attempts
raise e.OperationalError(str(last_exc))
django.db.utils.OperationalError: [Errno -2] Name or service not known
Traceback (most recent call last):
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 289, in ensure_connection
self.connect()
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 270, in connect
self.connection = self.get_new_connection(conn_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django.db.utils.OperationalError: [Errno -2] Name or service not known
2024-12-24 17:09:07,274 WARN exited: ws-heartbeat (exit status 1; not expected)
2024-12-24 17:09:07,274 WARN exited: ws-heartbeat (exit status 1; not expected)
2024-12-24 17:09:07,284 WARN exited: awx-cache-clear (exit status 1; not expected)
2024-12-24 17:09:07,284 WARN exited: awx-cache-clear (exit status 1; not expected)
2024-12-24 17:09:09,288 INFO spawned: 'awx-cache-clear' with pid 77
2024-12-24 17:09:09,288 INFO spawned: 'awx-cache-clear' with pid 77
2024-12-24 17:09:09,290 INFO spawned: 'ws-heartbeat' with pid 78
2024-12-24 17:09:09,290 INFO spawned: 'ws-heartbeat' with pid 78
WSGI app 0 (mountpoint='/') ready in 63 seconds on interpreter 0x7f2f6e973558 pid: 21 (default app)
WSGI app 0 (mountpoint='/') ready in 64 seconds on interpreter 0x7f2f6e973558 pid: 23 (default app)
WSGI app 0 (mountpoint='/') ready in 64 seconds on interpreter 0x7f2f6e973558 pid: 22 (default app)
WSGI app 0 (mountpoint='/') ready in 64 seconds on interpreter 0x7f2f6e973558 pid: 20 (default app)
WSGI app 0 (mountpoint='/') ready in 64 seconds on interpreter 0x7f2f6e973558 pid: 24 (default app)
kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/base/base.py", line 270, in connect
self.connection = self.get_new_connection(conn_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/postgresql/base.py", line 275, in get_new_connection
connection = self.Database.connect(**conn_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/connection.py", line 728, in connect
attempts = conninfo_attempts(params)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/_conninfo_attempts.py", line 45, in conninfo_attempts
raise e.OperationalError(str(last_exc))
django.db.utils.OperationalError: [Errno -2] Name or service not known
2024-12-24 17:09:53,901 WARN exited: awx-cache-clear (exit status 1; not expected)
2024-12-24 17:09:53,901 WARN exited: awx-cache-clear (exit status 1; not expected)
2024-12-24 17:09:53,907 INFO gave up: awx-cache-clear entered FATAL state, too many start retries too quickly
2024-12-24 17:09:53,907 INFO gave up: awx-cache-clear entered FATAL state, too many start retries too quickly
2024-12-24 17:09:53,907 WARN exited: ws-heartbeat (exit status 1; not expected)
2024-12-24 17:09:53,907 WARN exited: ws-heartbeat (exit status 1; not expected)
2024-12-24 17:09:54,909 INFO gave up: ws-heartbeat entered FATAL state, too many start retries too quickly
2024-12-24 17:09:54,909 INFO gave up: ws-heartbeat entered FATAL state, too many start retries too quickly
Processing Event: ver:3.0 server:supervisor serial:0 pool:superwatcher poolserial:0 eventname:PROCESS_STATE_FATAL len:72
2024-12-24 17:09:54,909 WARN received SIGQUIT indicating exit request
2024-12-24 17:09:54,909 WARN received SIGQUIT indicating exit request
2024-12-24 17:09:54,918 INFO waiting for superwatcher, nginx, uwsgi, daphne to die
2024-12-24 17:09:54,918 INFO waiting for superwatcher, nginx, uwsgi, daphne to die
...brutally killing workers...
2024-12-24 17:09:54,928 INFO stopped: nginx (exit status 0)
2024-12-24 17:09:54,928 INFO stopped: nginx (exit status 0)
worker 1 buried after 1 seconds
worker 2 buried after 1 seconds
worker 3 buried after 1 seconds
worker 4 buried after 1 seconds
worker 5 buried after 1 seconds
binary reloading uWSGI...
chdir() to /var/lib/awx
closing all non-uwsgi socket fds > 2 (max_fd = 1073741816)...
found fd 3 mapped to socket 0 (127.0.0.1:8050)
2024-12-24 17:09:55,915 WARN stopped: daphne (terminated by SIGTERM)
2024-12-24 17:09:55,915 WARN stopped: daphne (terminated by SIGTERM)
2024-12-24 17:09:58,919 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:09:58,919 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:10:01,924 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:10:01,924 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:10:04,926 WARN killing 'uwsgi' (15) with SIGKILL
2024-12-24 17:10:04,926 WARN killing 'uwsgi' (15) with SIGKILL
2024-12-24 17:10:04,926 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:10:04,926 INFO waiting for superwatcher, uwsgi to die
2024-12-24 17:10:05,928 WARN stopped: uwsgi (terminated by SIGKILL)
2024-12-24 17:10:05,928 WARN stopped: uwsgi (terminated by SIGKILL)
2024-12-24 17:10:05,929 WARN stopped: superwatcher (terminated by SIGTERM)
2024-12-24 17:10:05,929 WARN stopped: superwatcher (terminated by SIGTERM)
[root]#
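Because every supervisor line in this log is printed twice and the tracebacks are interleaved, it is hard to see at a glance which programs actually failed. The following is a minimal sketch that pulls out the programs supervisor gave up on, based only on the "gave up: ... entered FATAL state" lines shown above. The function name and the idea of saving the log to a file first are illustrative assumptions, not part of AWX.

```python
import re

# Matches supervisor lines such as:
#   "... INFO gave up: ws-heartbeat entered FATAL state, too many start retries too quickly"
FATAL_RE = re.compile(r"gave up: (?P<program>[\w-]+) entered FATAL state")


def fatal_programs(log_text: str) -> list[str]:
    """Return supervisor program names that entered FATAL, in first-seen order."""
    seen: list[str] = []
    for match in FATAL_RE.finditer(log_text):
        name = match.group("program")
        # The log repeats each line, so skip names already recorded.
        if name not in seen:
            seen.append(name)
    return seen
```

Run against the log above (for example after saving the kubectl logs output to a file and reading it into a string), this would list awx-cache-clear and ws-heartbeat, the two programs whose tracebacks end in the name resolution error. nginx, uwsgi and daphne only stop afterwards, once supervisord receives SIGQUIT.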