Error when installing AWX Operator

I am receiving an error when creating the AWX Operator. I’m getting the logs using:

kubectl logs deployments/awx-operator-controller-manager -c awx-manager -f -n awx

I get the following error:

TASK [installer : Set the resource pod name as a variable.] ********************
task path: /opt/ansible/roles/installer/tasks/migrate_data.yml:31
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: list object has no element 0. list object has no element 0

The error appears to be in '/opt/ansible/roles/installer/tasks/migrate_data.yml': line 31, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:


- name: Set the resource pod name as a variable.
  ^ here
"}

PLAY RECAP *********************************************************************
localhost : ok=54 changed=0 unreachable=0 failed=1 skipped=25 rescued=0 ignored=0

Here is the YAML I’m using:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  old_postgres_configuration_secret: awx-old-postgres-configuration
  postgres_configuration_secret: awx-old-postgres-configuration
  admin_user: admin
  admin_password_secret: awx-admin-password
  secret_key_secret: awx-secret-key
  projects_persistence: true
  projects_existing_claim: "awx-data"
  service_type: NodePort
  replicas: 3
  hostname: awx.nl.mdb-lab.com
  ingress_type: ingress
  ingress_class_name: avi-lb
  ingress_tls_secret: awx.nl.mdb-lab.com

Can anyone see what I’m missing?

Since you specified an old postgres configuration secret, the installer expects a postgres pod to be running, but none was found, so the data cannot be migrated. I suspect you should not define the new postgres configuration to use the old one. When you create the new instance to migrate from the old one, a new configuration will be created from the old one. You also need to specify the secret key that was used by the old instance or the data may be inaccessible.

Migration docs:
awx-operator/docs/migration/migration.md at devel · ansible/awx-operator (github.com)

Code where your error occurred:
awx-operator/roles/installer/tasks/migrate_data.yml#L31 at devel · ansible/awx-operator (github.com)

Edit: You also should only need to do this if you’re migrating from versions prior to 18 to versions afterwards. There is no need to specify an old configuration secret to do in-place upgrades, and migrations can re-use existing secrets.

In that case, you will need that line. You will also need to make sure your new instance has the secret_key_secret as shown in the migration doc. I would not define postgres_configuration_secret; let the operator create that.

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  old_postgres_configuration_secret: awx-old-postgres-configuration
  # postgres_configuration_secret: awx-old-postgres-configuration
  admin_user: admin
  admin_password_secret: awx-admin-password
  secret_key_secret: awx-secret-key # create this ahead of time
  projects_persistence: true
  projects_existing_claim: "awx-data"
  service_type: NodePort
  replicas: 3
  hostname: awx.nl.mdb-lab.com
  ingress_type: ingress
  ingress_class_name: avi-lb
  ingress_tls_secret: awx.nl.mdb-lab.com
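
For reference, here is a minimal sketch of the secret that secret_key_secret points to, following the layout I remember from the awx-operator docs. The name awx-secret-key matches the spec above; the value is only a placeholder and needs to be replaced with the secret key from your old instance.

```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: awx-secret-key        # must match secret_key_secret in the AWX spec
  namespace: awx
stringData:
  secret_key: "<old-secret-key>"   # placeholder: paste the old instance's key here
```

Apply it before creating the AWX resource so the operator finds it instead of generating a new key.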

Hi,

When I do this, it creates the awx-postgres-13-0 pods which I don’t want/need…

Are you using an external postgresql database?

Edit: if yes, what version are you running?

Yes, I am using an external Postgres database, v14

In that case, when you define old_postgres_configuration_secret, the operator appears to just run a pg_dump and pg_restore from the old instance to the new one, so no special database operations are occurring. AWX requires a minimum of PostgreSQL 13, so you’re good there as well.

Since your database is not being migrated as part of the upgrade, you do not need to define the old configuration secret. You will still need to pre-create the secret_key_secret using the old secret though.

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  postgres_configuration_secret: awx-old-postgres-configuration
  admin_user: admin
  admin_password_secret: awx-admin-password
  secret_key_secret: awx-old-secret-key
  projects_persistence: true
  projects_existing_claim: "awx-data"
  service_type: NodePort
  replicas: 3
  hostname: awx.nl.mdb-lab.com
  ingress_type: ingress
  ingress_class_name: avi-lb
  ingress_tls_secret: awx.nl.mdb-lab.com
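
For an external database, the secret named in postgres_configuration_secret would look roughly like the sketch below. This is based on my reading of the operator docs, where type: unmanaged is the field that tells the operator not to deploy its own postgres pod. Every value in angle brackets is a placeholder, and sslmode: prefer is just an assumed default; adjust it to whatever your server expects.

```yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: awx-old-postgres-configuration   # must match postgres_configuration_secret
  namespace: awx
stringData:
  host: "<external host or IP resolvable from the cluster>"
  port: "5432"                # assumed default PostgreSQL port
  database: "<database name>"
  username: "<username>"
  password: "<password>"
  sslmode: prefer             # assumption, change if your server requires SSL
  type: unmanaged             # external database, no operator-managed postgres pod
type: Opaque
```

If the secret you already have lacks type: unmanaged, that may be why the awx-postgres-13-0 pod keeps appearing.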

We were finally on our way, but the deployment is just a mess:

The common error is:

The node was low on resource: ephemeral-storage. Threshold quantity: 2094169733, available: 57212Ki. Container awx-task was using 240Ki, request is 0, has larger consumption of ephemeral-storage. Container awx-ee was using 60Ki, request is 0, has larger consumption of ephemeral-storage. Container awx-rsyslog was using 108Ki, request is 0, has larger consumption of ephemeral-storage. Container redis was using 40Ki, request is 0, has larger consumption of ephemeral-storage

I’m not having any issues with any other application, just AWX.

That sounds like a Kubernetes problem you need to address, but we might be able to alleviate the symptoms in AWX. You can tune the resource requests and limits for pretty much every container of AWX, including ephemeral-storage.

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  postgres_configuration_secret: awx-old-postgres-configuration
  admin_user: admin
  admin_password_secret: awx-admin-password
  secret_key_secret: awx-old-secret-key
  projects_persistence: true
  projects_existing_claim: "awx-data"
  service_type: NodePort
  replicas: 3
  hostname: awx.nl.mdb-lab.com
  ingress_type: ingress
  ingress_class_name: avi-lb
  ingress_tls_secret: awx.nl.mdb-lab.com
  task_resource_requirements:
    requests:
      cpu: 100m
      memory: 128Mi
      ephemeral-storage: 100M
    limits:
      cpu: 2000m
      memory: 4Gi
      ephemeral-storage: 500M
  web_resource_requirements:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 1000m
      memory: 4Gi
  ee_resource_requirements:
    requests:
      cpu: 100m
      memory: 64Mi
    limits:
      cpu: 1000m
      memory: 4Gi
  redis_resource_requirements:
    requests:
      cpu: 50m
      memory: 64Mi
    limits:
      cpu: 1000m
      memory: 4Gi
  rsyslog_resource_requirements:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 1000m
      memory: 2Gi
  init_container_resource_requirements:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 1000m
      memory: 2Gi
  postgres_init_container_resource_requirements:
    requests:
      cpu: 10m
      memory: 64Mi
    limits:
      cpu: 1000m
      memory: 2Gi
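
The eviction message also lists awx-ee, awx-rsyslog and redis with an ephemeral-storage request of 0, while the spec above only sets it on the task container. A sketch of giving those containers requests too, assuming the operator passes these blocks through to the container resources unchanged. The sizes are illustrative guesses, not recommendations, and the keys should be merged into the existing blocks rather than added as duplicates.

```yaml
spec:
  ee_resource_requirements:
    requests:
      ephemeral-storage: 100M   # illustrative value
    limits:
      ephemeral-storage: 500M   # illustrative value
  redis_resource_requirements:
    requests:
      ephemeral-storage: 50M    # illustrative value
    limits:
      ephemeral-storage: 250M   # illustrative value
  rsyslog_resource_requirements:
    requests:
      ephemeral-storage: 100M   # illustrative value
    limits:
      ephemeral-storage: 500M   # illustrative value
```

Keep in mind that a container exceeding its ephemeral-storage limit gets its pod evicted, so set the limits with some headroom.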

That said, you should probably look into extending the underlying filesystem of your Kubernetes nodes. Unfortunately, that is outside my realm of expertise. If you could share some details about your platform, someone else here may be able to help. You might want to make a separate Get Help post about it.


You were absolutely right. I needed to extend the ephemeral-storage on the underlying cluster. Once I’d done that:

Massive thanks for all your help!
