Upgrade issues 0.23.0 -> 0.30.0

Hi all,

I’m trying to upgrade our installation from 0.23.0 to 0.30.0, but I just can’t get it to work. I set up awx-operator and AWX according to:

https://computingforgeeks.com/how-to-install-ansible-awx-on-ubuntu-linux/

They have an upgrade guide, too: https://computingforgeeks.com/how-to-install-ansible-awx-on-ubuntu-linux/

I can upgrade the operator, and the new operator replaces the old postgres with postgres-13-0, but the rest of the containers fail:

Every 1.0s: kubectl -n awx get pods    awx-troubleshooting: Wed Oct 5 12:47:26 2022

NAME                                              READY   STATUS                  RESTARTS        AGE
awx-operator-controller-manager-fcf6db67c-zwd9c   2/2     Running                 2 (19m ago)     28m
awx-postgres-13-0                                 1/1     Running                 0               6m7s
awx-b7c7c79b8-4tnhj                               0/4     Init:CrashLoopBackOff   5 (2m33s ago)   5m35s

I’m somehow a bit lost, since I don’t know where to look for an error. Any hints highly appreciated!

Thanks,

Andreas

A describe of the pod gives me:

$ kubectl describe pod awx-b7c7c79b8-4tnhj

awx-b7c7c79b8-4tnhj
Name: awx-b7c7c79b8-4tnhj
Namespace: awx
Priority: 0
Node: awx-troubleshooting/10.0.121.9
Start Time: Wed, 05 Oct 2022 12:41:56 +0200
Labels: app.kubernetes.io/component=awx
app.kubernetes.io/managed-by=awx-operator
app.kubernetes.io/name=awx
app.kubernetes.io/operator-version=0.30.0
app.kubernetes.io/part-of=awx
app.kubernetes.io/version=21.7.0
pod-template-hash=b7c7c79b8
Annotations:
Status: Pending
IP: 10.42.0.65
IPs:
IP: 10.42.0.65
Controlled By: ReplicaSet/awx-b7c7c79b8
Init Containers:
init:
Container ID: containerd://722025590f7f703a7fa437daa25d8aa38ef243ae6d99d2cd0cd0e0f98b11e9ff
Image: quay.io/ansible/awx-ee:latest
Image ID: quay.io/ansible/awx-ee@sha256:1b9564e397a5059b53d304a697fcc48250fb805caf828a8006ef799d69a8dd21
Port:
Host Port:
Command:
/bin/sh
-c
hostname=$MY_POD_NAME
receptor --cert-makereq bits=2048 commonname=$hostname dnsname=$hostname nodeid=$hostname outreq=/etc/receptor/tls/receptor.req outkey=/etc/receptor/tls/receptor.key
receptor --cert-signreq req=/etc/receptor/tls/receptor.req cacert=/etc/receptor/tls/ca/receptor-ca.crt cakey=/etc/receptor/tls/ca/receptor-ca.key outcert=/etc/receptor/tls/receptor.crt verify=yes
chmod 775 /var/lib/awx/projects
chgrp 1000 /var/lib/awx/projects

State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Wed, 05 Oct 2022 12:47:44 +0200
Finished: Wed, 05 Oct 2022 12:47:44 +0200
Ready: False
Restart Count: 6
Environment:
MY_POD_NAME: awx-b7c7c79b8-4tnhj (v1:metadata.name)
Mounts:
/etc/receptor/tls/ from awx-receptor-tls (rw)
/etc/receptor/tls/ca/receptor-ca.crt from awx-receptor-ca (ro,path="tls.crt")
/etc/receptor/tls/ca/receptor-ca.key from awx-receptor-ca (ro,path="tls.key")
/var/lib/awx/projects from awx-projects (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ln7t4 (ro)
Containers:
redis:
Container ID:
Image: docker.io/redis:7
Image ID:
Port:
Host Port:
Args:
redis-server
/etc/redis.conf
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Requests:
cpu: 50m
memory: 64Mi
Environment:
Mounts:
/data from awx-redis-data (rw)
/etc/redis.conf from awx-redis-config (ro,path="redis.conf")
/var/run/redis from awx-redis-socket (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ln7t4 (ro)
awx-web:
Container ID:
Image: quay.io/ansible/awx:21.7.0
Image ID:
Port: 8052/TCP
Host Port: 0/TCP
Args:
/usr/bin/launch_awx.sh
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Requests:
cpu: 100m
memory: 128Mi
Environment:
MY_POD_NAMESPACE: awx (v1:metadata.namespace)
UWSGI_MOUNT_PATH: /
Mounts:
/etc/nginx/nginx.conf from awx-nginx-conf (ro,path="nginx.conf")
/etc/receptor/signing/work-public-key.pem from awx-receptor-work-signing (ro,path="work-public-key.pem")
/etc/receptor/tls/ca/receptor-ca.crt from awx-receptor-ca (ro,path="tls.crt")
/etc/receptor/tls/ca/receptor-ca.key from awx-receptor-ca (ro,path="tls.key")
/etc/tower/SECRET_KEY from awx-secret-key (ro,path="SECRET_KEY")
/etc/tower/conf.d/credentials.py from awx-application-credentials (ro,path="credentials.py")
/etc/tower/conf.d/execution_environments.py from awx-application-credentials (ro,path="execution_environments.py")
/etc/tower/conf.d/ldap.py from awx-application-credentials (ro,path="ldap.py")
/etc/tower/settings.py from awx-settings (ro,path="settings.py")
/var/lib/awx/projects from awx-projects (rw)
/var/lib/awx/rsyslog from rsyslog-dir (rw)
/var/lib/projects from static-data (rw)
/var/run/awx-rsyslog from rsyslog-socket (rw)
/var/run/redis from awx-redis-socket (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ln7t4 (ro)
/var/run/supervisor from supervisor-socket (rw)
awx-task:
Container ID:
Image: quay.io/ansible/awx:21.7.0
Image ID:
Port:
Host Port:
Args:
/usr/bin/launch_awx_task.sh
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Requests:
cpu: 100m
memory: 128Mi
Environment:
SUPERVISOR_WEB_CONFIG_PATH: /etc/supervisord.conf
AWX_SKIP_MIGRATIONS: 1
MY_POD_UID: (v1:metadata.uid)
MY_POD_IP: (v1:status.podIP)
MY_POD_NAMESPACE: awx (v1:metadata.namespace)
Mounts:
/etc/receptor/ from awx-receptor-config (rw)
/etc/receptor/signing/work-private-key.pem from awx-receptor-work-signing (ro,path="work-private-key.pem")
/etc/tower/SECRET_KEY from awx-secret-key (ro,path="SECRET_KEY")
/etc/tower/conf.d/credentials.py from awx-application-credentials (ro,path="credentials.py")
/etc/tower/conf.d/execution_environments.py from awx-application-credentials (ro,path="execution_environments.py")
/etc/tower/conf.d/ldap.py from awx-application-credentials (ro,path="ldap.py")
/etc/tower/settings.py from awx-settings (ro,path="settings.py")
/var/lib/awx/projects from awx-projects (rw)
/var/lib/awx/rsyslog from rsyslog-dir (rw)
/var/run/awx-rsyslog from rsyslog-socket (rw)
/var/run/receptor from receptor-socket (rw)
/var/run/redis from awx-redis-socket (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ln7t4 (ro)
/var/run/supervisor from supervisor-socket (rw)
awx-ee:
Container ID:
Image: quay.io/ansible/awx-ee:latest
Image ID:
Port:
Host Port:
Args:
/bin/sh
-c
if [ ! -f /etc/receptor/receptor.conf ]; then
cp /etc/receptor/receptor-default.conf /etc/receptor/receptor.conf
sed -i "s/HOSTNAME/$HOSTNAME/g" /etc/receptor/receptor.conf
fi
exec receptor --config /etc/receptor/receptor.conf

State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Requests:
cpu: 100m
memory: 64Mi
Environment:
Mounts:
/etc/receptor/ from awx-receptor-config (rw)
/etc/receptor/receptor-default.conf from awx-default-receptor-config (rw,path="receptor.conf")
/etc/receptor/signing/work-private-key.pem from awx-receptor-work-signing (ro,path="work-private-key.pem")
/etc/receptor/tls/ from awx-receptor-tls (rw)
/etc/receptor/tls/ca/receptor-ca.crt from awx-receptor-ca (ro,path="tls.crt")
/var/lib/awx/projects from awx-projects (rw)
/var/run/receptor from receptor-socket (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ln7t4 (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
awx-application-credentials:
Type: Secret (a volume populated by a Secret)
SecretName: awx-app-credentials
Optional: false
awx-receptor-tls:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
awx-receptor-ca:
Type: Secret (a volume populated by a Secret)
SecretName: awx-receptor-ca
Optional: false
awx-receptor-work-signing:
Type: Secret (a volume populated by a Secret)
SecretName: awx-receptor-work-signing
Optional: false
awx-secret-key:
Type: Secret (a volume populated by a Secret)
SecretName: awx-secret-key
Optional: false
awx-settings:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: awx-awx-configmap
Optional: false
awx-nginx-conf:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: awx-awx-configmap
Optional: false
awx-redis-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: awx-awx-configmap
Optional: false
awx-redis-socket:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
awx-redis-data:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
supervisor-socket:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
rsyslog-socket:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
receptor-socket:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
rsyslog-dir:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
awx-receptor-config:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
awx-default-receptor-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: awx-awx-configmap
Optional: false
awx-projects:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: awx-projects-claim
ReadOnly: false
static-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: static-data-pvc
ReadOnly: false
kube-api-access-ln7t4:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Burstable
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type     Reason     Age                    From               Message
Normal   Scheduled  6m6s                   default-scheduler  Successfully assigned awx/awx-b7c7c79b8-4tnhj to awx-troubleshooting
Normal   Pulled     4m33s (x5 over 6m5s)   kubelet            Container image "quay.io/ansible/awx-ee:latest" already present on machine
Normal   Created    4m33s (x5 over 6m5s)   kubelet            Created container init
Normal   Started    4m33s (x5 over 6m5s)   kubelet            Started container init
Warning  BackOff    54s (x26 over 6m4s)    kubelet            Back-off restarting failed container
root@awx-troubleshooting:~/awx-operator#
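
The describe output above shows the init container (named `init`) terminating with exit code 1, so its logs are probably the first place to look for the actual error. A minimal sketch, assuming the namespace `awx` and the pod name from the output above; the helper name `show_init_logs` is made up for illustration and is not part of kubectl or AWX:

```shell
#!/bin/sh
# Hypothetical helper: print the logs of a pod's init container named "init",
# first from the current attempt, then from the previous (crashed) attempt.
show_init_logs() {
    namespace="$1"
    pod="$2"
    # Logs of the current instance of the init container.
    kubectl -n "$namespace" logs "$pod" -c init
    # Logs of the previously terminated instance; useful during CrashLoopBackOff.
    kubectl -n "$namespace" logs "$pod" -c init --previous
}

# Usage, with the names taken from the describe output:
#   show_init_logs awx awx-b7c7c79b8-4tnhj
```

The `-c init` flag selects the container by the name listed under "Init Containers", and `--previous` asks for the logs of the last terminated run rather than the one currently waiting to restart.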

Hi,

You might be having this issue:

https://github.com/ansible/awx-operator/issues/1055

There is a workaround provided in the comments.

Hope that helps.

Regards,

Antuelle78

Hi,

Thanks a lot, I will try it and report back!

Thanks,

Andreas

Be careful with the latest version, though. There is this issue:

https://github.com/ansible/awx/issues/13002

Ha, it’s working! I have been trying to solve this myself for hours, if not days:

Every 1.0s: kubectl -n awx get pods    awx-troubleshooting: Wed Oct 5 14:49:36 2022

NAME                                              READY   STATUS    RESTARTS         AGE
awx-operator-controller-manager-fcf6db67c-zwd9c   2/2     Running   2 (141m ago)     151m
awx-postgres-13-0                                 1/1     Running   0                2m37s
awx-b7c7c79b8-chnsp                               4/4     Running   0                2m5s

Thanks so much!

Ah OK, thanks for the pointer. So it is better to stick with operator 0.29.0?

Yep, I would recommend operator 0.29.0 with AWX 21.6.0.

I haven’t found any issues in my testing so far.
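
For anyone who wants to pin the operator to that release, one option is the kustomize-based install described in the awx-operator README. The sketch below only prints a `kustomization.yaml` for a given tag. It assumes that install method (the setup in this thread may have used a different one), the helper name `write_kustomization` is made up, and the namespace `awx` is taken from the output earlier in the thread. It is meant as a starting point for a fresh deployment, not as a tested downgrade path from 0.30.0.

```shell
#!/bin/sh
# Hypothetical helper: print a kustomization.yaml that pins awx-operator
# to a single release tag (assumes the kustomize install method).
write_kustomization() {
    tag="$1"
    cat <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - github.com/ansible/awx-operator/config/default?ref=${tag}
images:
  - name: quay.io/ansible/awx-operator
    newTag: ${tag}
namespace: awx
EOF
}

# Usage:
#   write_kustomization 0.29.0 > kustomization.yaml
#   kubectl apply -k .
```

Keeping the git ref and the image tag in one variable avoids deploying manifests from one release with the operator image from another.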