postgres pod FailedScheduling: 1 node(s) didn't find available persistent volumes to bind

Hello AWX Team,

The environment is a Kubernetes cluster (not minikube) on Oracle Linux 8.
I have tried the same on Ubuntu and run into the same issue.

I've followed all the guides, and it seems the postgres pod fails scheduling while its PVC waits for the pod to be scheduled. The PVC, PV, and StorageClass are all created; I even have a second PV/PVC/StorageClass for the projects storage, and those bind correctly.

Here are all the details I could muster.
If I can provide more, I am happy to. I’ve obfuscated some sensitive details with ****

```
[sysadmin@dev-awx-01 k8awx]$ kubectl describe pod ****-postgres-13-0
Name:           ****-postgres-13-0
Namespace:      awx
Priority:       0
Node:           <none>
Labels:         app.kubernetes.io/component=database
                app.kubernetes.io/instance=postgres-13-****
                app.kubernetes.io/managed-by=awx-operator
                app.kubernetes.io/name=postgres-13
                app.kubernetes.io/part-of=****
                controller-revision-hash=****-postgres-13-8677ccdd5d
                statefulset.kubernetes.io/pod-name=****-postgres-13-0
Annotations:    <none>
Status:         Pending
IP:
IPs:            <none>
Controlled By:  StatefulSet/****-postgres-13
Containers:
  postgres:
    Image:      postgres:13
    Port:       5432/TCP
    Host Port:  0/TCP
    Requests:
      cpu:     10m
      memory:  64Mi
    Environment:
      POSTGRESQL_DATABASE:        <set to the key 'database' in secret '****-postgres-configuration'>  Optional: false
      POSTGRESQL_USER:            <set to the key 'username' in secret '****-postgres-configuration'>  Optional: false
      POSTGRESQL_PASSWORD:        <set to the key 'password' in secret '****-postgres-configuration'>  Optional: false
      POSTGRES_DB:                <set to the key 'database' in secret '****-postgres-configuration'>  Optional: false
      POSTGRES_USER:              <set to the key 'username' in secret '****-postgres-configuration'>  Optional: false
      POSTGRES_PASSWORD:          <set to the key 'password' in secret '****-postgres-configuration'>  Optional: false
      PGDATA:                     /var/lib/postgresql/data/pgdata
      POSTGRES_INITDB_ARGS:       --auth-host=scram-sha-256
      POSTGRES_HOST_AUTH_METHOD:  scram-sha-256
    Mounts:
      /var/lib/postgresql/data from postgres-13 (rw,path="data")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-glhch (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  postgres-13:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  postgres-13-****-postgres-13-0
    ReadOnly:   false
  kube-api-access-glhch:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age               From               Message
  ----     ------            ---               ----               -------
  Warning  FailedScheduling  9s (x2 over 96s)  default-scheduler  0/1 nodes are available: 1 node(s) didn't find available persistent volumes to bind.
```

I have two storage classes. They are identical except for the volume binding mode. If I set the binding mode for the postgres StorageClass to Immediate, the postgres pod instead fails scheduling with "pod has unbound immediate PersistentVolumeClaims".
The projects-storage PVC binds regardless of binding mode.

```
[sysadmin@dev-awx-01 k8awx]$ kubectl get sc
NAME               PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
postgres-storage   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  2m53s
projects-storage   kubernetes.io/no-provisioner   Delete          Immediate              false                  2m53s
```
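For reference, manifests along these lines would produce the two classes shown above (a sketch reconstructed from the `kubectl get sc` output, not my actual files):

```yaml
# Sketch: two kubernetes.io/no-provisioner StorageClasses for local PVs,
# differing only in volumeBindingMode.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: postgres-storage
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: projects-storage
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Delete
volumeBindingMode: Immediate
```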

The PVs are the same. They are just named differently and point to their own storage class.

```
[sysadmin@dev-awx-01 k8awx]$ cat projects-storage-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: projects-storage-pv
  namespace: awx
spec:
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 20Gi
  volumeMode: Filesystem
  storageClassName: projects-storage
  local:
    path: /data/awx
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - dev-awx-0
```

```
[sysadmin@dev-awx-01 k8awx]$ cat postgres-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-storage-pv
spec:
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 20Gi
  volumeMode: Filesystem
  storageClassName: postgres-storage
  hostPath:
    path: /data/postgress
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - dev-awx-0
```
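Since `kubernetes.io/no-provisioner` classes never provision anything dynamically, a WaitForFirstConsumer claim only binds when the scheduler finds an Available PV whose storageClassName, capacity, accessModes, and nodeAffinity all match the claim and the node. One easy thing to rule out is a mismatch between the PV's nodeAffinity hostname and the node's actual `kubernetes.io/hostname` label; a minimal sketch of that check (the hostname values below are illustrative placeholders, not read from a live cluster):

```shell
#!/bin/sh
# Sketch of the check, with placeholder values. On a real cluster the two
# inputs would come from:
#   kubectl get nodes -o jsonpath='{.items[*].metadata.labels.kubernetes\.io/hostname}'
#   kubectl get pv postgres-storage-pv -o jsonpath='{.spec.nodeAffinity}'
pv_hostname="dev-awx-0"     # placeholder: value listed in the PV's nodeAffinity
node_hostname="dev-awx-01"  # placeholder: the node's actual hostname label

if [ "$pv_hostname" = "$node_hostname" ]; then
  echo "ok: PV's nodeAffinity matches this node"
else
  echo "mismatch: scheduler cannot place a pod where this PV is usable"
fi
```

With WaitForFirstConsumer the scheduler enforces the PV's nodeAffinity at pod-scheduling time, which is exactly the step that produces "didn't find available persistent volumes to bind".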

The PVs get created properly.

We are using local storage for this:
```
[sysadmin@dev-awx-01 k8awx]$ df -h
/dev/mapper/ansible-awx--storage        25G  211M   25G   1% /data/awx
/dev/mapper/ansible-postgres--storage   25G  211M   25G   1% /data/postgress
```

```
[sysadmin@dev-awx-01 k8awx]$ kubectl get pv
NAME                  CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                      STORAGECLASS       REASON   AGE
postgres-storage-pv   20Gi       RWX            Retain           Available                              postgres-storage            3m
projects-storage-pv   20Gi       RWX            Retain           Bound       awx/projects-storage-pvc   projects-storage            3m
```

However, the PVC for postgres-13-****-postgres-13-0 will not bind.

```
[sysadmin@dev-awx-01 k8awx]$ kubectl get pvc
NAME                             STATUS    VOLUME                CAPACITY   ACCESS MODES   STORAGECLASS       AGE
postgres-13-****-postgres-13-0   Pending                                                  postgres-storage   2m40s
projects-storage-pvc             Bound     projects-storage-pv   20Gi       RWX            projects-storage   3m3s
```
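Worth noting: a claim only binds to a PV whose storageClassName matches, whose capacity covers the request, and whose accessModes include every mode the claim asks for. The claim above comes from the operator's StatefulSet volumeClaimTemplates, so it presumably looks something like this (a hypothetical reconstruction; the actual accessModes and size come from the operator and may differ):

```yaml
# Hypothetical sketch of the operator-generated claim. If it requests
# ReadWriteOnce while the PV only lists ReadWriteMany, the PV cannot
# satisfy it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-13-****-postgres-13-0
  namespace: awx
spec:
  storageClassName: postgres-storage
  accessModes:
    - ReadWriteOnce   # assumption, not read from the cluster
  resources:
    requests:
      storage: 8Gi    # assumption
```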

Describing the PVC shows it is waiting for the pod to be scheduled.

```
[sysadmin@dev-awx-01 k8awx]$ kubectl describe pvc postgres-13-****-postgres-13-0
Name:          postgres-13-****-postgres-13-0
Namespace:     awx
StorageClass:  postgres-storage
Status:        Pending
Volume:
Labels:        app.kubernetes.io/component=database
               app.kubernetes.io/instance=postgres-13-****
               app.kubernetes.io/managed-by=awx-operator
               app.kubernetes.io/name=postgres-13
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       ****-postgres-13-0
Events:
  Type    Reason                Age                     From                         Message
  ----    ------                ---                     ----                         -------
  Normal  WaitForFirstConsumer  9m29s                   persistentvolume-controller  waiting for first consumer to be created before binding
  Normal  WaitForPodScheduled   3m25s (x25 over 9m25s)  persistentvolume-controller  waiting for pod ****-postgres-13-0 to be scheduled
```

Below is the awx yaml and the kustomization.yaml.

I noticed that there is a typo in the hostPath of your postgres PV; perhaps that could be part of the issue (/data/postgres instead of /data/postgress).

I am a bit confused here. The StorageClass should dynamically provision the PVC as needed for you when AWX is deployed. Perhaps we could add postgres_pvc_claim as an option on the AWX spec so that users could pre-create their own PV and PVC for postgres. Please open an issue in the awx-operator repo if that is something you would find useful.

Thanks,
AWX Team

Hi there,

Thank you for getting back to me.

The typo should not matter; /data/postgress (with the extra s) is the actual name of the path. I accidentally typed another s when creating it, and the manifests all point to that path.

“The StorageClass should dynamically provision the PVC as needed for you when AWX is deployed.”

That’s the thing. It doesn’t. If you can tell me there is something wrong with my manifests, I’m more than willing to redo them.

“Perhaps we could add postgres_pvc_claim as an option on the AWX spec so that users could pre-create their own PV and PVC for postgres.”

I actually looked for that option, could not find it, and wondered why it does not exist. In my case, I am deploying on bare metal/VMs, where I don’t necessarily have access to cloud storage classes, and local storage makes a lot of sense performance-wise.

In any case, I was able to bypass this with an external PostgreSQL database. I scripted the setup, so whenever I deploy a new AWX instance, I also get a new PostgreSQL instance running with a new database, role, access permissions, and all that jazz.
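The external-database hookup is roughly this shape (a sketch, not my exact manifests; the secret name is made up, the key names follow the awx-operator postgres_configuration_secret convention as I understood it, so double-check them against the operator docs for your version):

```yaml
# Sketch: Secret consumed when the AWX spec points at it via
# postgres_configuration_secret. Key names are per my reading of the
# awx-operator docs; verify against the release you run.
apiVersion: v1
kind: Secret
metadata:
  name: awx-external-postgres-configuration   # hypothetical name
  namespace: awx
stringData:
  host: "****"          # external PostgreSQL host
  port: "5432"
  database: awx
  username: awx
  password: "****"
  sslmode: prefer
  type: unmanaged
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  postgres_configuration_secret: awx-external-postgres-configuration
```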

“Please open an issue in the awx-operator repo if that is something you would find useful.”

I will. 🙂

Cheers!