Some EDA instance pods not starting correctly on native Kubernetes

Hello everyone,
I have a three-node, Rocky 9 based, native Kubernetes cluster, version 1.31.0.
I am following the ansible/eda-server-operator documentation on GitHub and used the config/default configuration to deploy the operator. That part works fine. When I try to create an EDA instance for testing purposes based on the YAML file below, some pods start correctly, as shown below, but some do not. When I check the failed ones, the root cause appears to be inside the configure-bundle-ca-cert container, whose logs show:

ln: failed to create symbolic link '/etc/pki/ca-trust/extracted/pem/directory-hash/ca-certificates.crt': Permission denied

When I check inside the pod, the symlink is indeed already present inside the container. The container image used is eda-server:main.
Did anyone else encounter this error?
EDA instance YAML:

apiVersion: eda.ansible.com/v1alpha1
kind: EDA
metadata:
  name: eda-sat
  namespace: eda-sat
spec:
  automation_server_url: https://awx.awx-test.example.com
  service_type: ClusterIP
  bundle_cacert_secret: eda-sat-custom-certs
  extra_settings:
    - setting: EDA_MAX_RUNNING_ACTIVATIONS
      value: "12"
  database:
    resource_requirements:
      requests:
        cpu: 10m
        memory: 64Mi
    storage_requirements:
      requests:
        storage: 8Gi
    postgres_storage_class: nfs-csi

The pod status inside the namespace:

NAME                                                     READY   STATUS                  RESTARTS         AGE
eda-sat-activation-worker-697998d7bf-c9qxl               0/1     Init:CrashLoopBackOff   11 (3m16s ago)   35m
eda-sat-activation-worker-697998d7bf-k8zh2               0/1     Init:CrashLoopBackOff   11 (3m40s ago)   35m
eda-sat-activation-worker-cc8f795d6-qqtn8                0/1     Init:CrashLoopBackOff   11 (43s ago)     32m
eda-sat-api-75b8dfbd74-6hl26                             0/2     Init:CrashLoopBackOff   11 (3m33s ago)   35m
eda-sat-api-7fdd6f64b7-d48v9                             0/2     Init:CrashLoopBackOff   11 (75s ago)     32m
eda-sat-default-worker-664c4848d4-hjghb                  0/1     Init:CrashLoopBackOff   11 (3m35s ago)   35m
eda-sat-default-worker-664c4848d4-r8x2p                  0/1     Init:CrashLoopBackOff   11 (3m37s ago)   35m
eda-sat-default-worker-769b996b67-d2cst                  0/1     Init:CrashLoopBackOff   11 (74s ago)     32m
eda-sat-postgres-15-0                                    1/1     Running                 0                36m
eda-sat-redis-589f6779b8-gcn6q                           1/1     Running                 0                36m
eda-sat-scheduler-5fc7dd55d-8tlv2                        1/1     Running                 0                35m
eda-sat-scheduler-5fc7dd55d-mrg8m                        1/1     Running                 0                35m
eda-sat-ui-779f7f5db5-q9swx                              1/1     Running                 0                35m
eda-server-operator-controller-manager-f888ffc68-57mnl   2/2     Running                 0                43m
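
For anyone trying to reproduce this, the log line quoted above can be collected from every failing pod at once. The following is only a minimal sketch: `failing_init_logs` is a hypothetical helper name, and it assumes the `STATUS` column is the third field of `kubectl get pods --no-headers`, as in the listing above. Since the pods are stuck in `Init:CrashLoopBackOff`, configure-bundle-ca-cert is presumably an init container, and `kubectl logs -c` accepts init container names as well.

```shell
# Hypothetical helper (not part of the operator or of kubectl):
# print the last log lines of one container for every pod in a namespace
# whose STATUS column starts with "Init:CrashLoopBackOff".
failing_init_logs() {
  ns="$1"         # namespace, e.g. eda-sat
  container="$2"  # container name, e.g. configure-bundle-ca-cert
  kubectl -n "$ns" get pods --no-headers |
    awk '$3 ~ /^Init:CrashLoopBackOff/ { print $1 }' |
    while read -r pod; do
      echo "== $pod"
      # stdin is redirected so the call cannot consume the pod list
      kubectl -n "$ns" logs "$pod" -c "$container" --tail=20 </dev/null
    done
}
```

With the names from this thread, it would be called as `failing_init_logs eda-sat configure-bundle-ca-cert`.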

Has anyone else run into this?
For comparison, my AWX instance, which is deployed in the same k8s cluster, works flawlessly.
Thank you,
Victor

Hi,
An update from my side:
I followed the same steps on a kind-based Kubernetes 1.30.0 cluster and the issue is there as well.
Can someone try to deploy an eda instance based on the current eda-server:main image to see if they encounter the same error?

Thank you,
Victor

Hi,

In case it helps, we had a similar issue.
We managed to fix it by changing the deployment to use awx-ee:24.6.1 instead of latest, so perhaps it is a bug in the latest image.
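
The thread does not say which manifest or field was changed to pin that tag, so the sketch below only shows how one might check which image each init container of a pod is actually running, before and after such a change. `init_container_images` is a hypothetical helper name; the `kubectl get pod -o jsonpath` call and the `.spec.initContainers` path are standard Kubernetes.

```shell
# Hypothetical helper: list "name<TAB>image" for each init container of a pod.
init_container_images() {
  ns="$1"   # namespace, e.g. eda-sat
  pod="$2"  # pod name taken from `kubectl get pods`
  kubectl -n "$ns" get pod "$pod" \
    -o jsonpath='{range .spec.initContainers[*]}{.name}{"\t"}{.image}{"\n"}{end}'
}
```

If the configure-bundle-ca-cert line shows an awx-ee image tagged latest, that would be consistent with the workaround described here, although this is an assumption rather than something confirmed in the thread.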

Thanks

Hi Rob,
Thank you for the information. I will test it and report back here on whether it works.

Thank you,
Victor
