Okta SAML auth error: social 'RelayState'

I am trying to set up Okta authentication via SAML on AWX 22.5.0 (operator 2.4.0). I set everything up on the Okta side as per the documentation and created a cert/key pair using:


openssl req -new -x509 -days 3652 -nodes -out sp.crt -keyout sp.key

{
  "okta": {
    "attr_user_permanent_id": "urn:oid:1.3.6.1.4.1.5555.610.2.2.1.11",
    "attr_username": "urn:oid:1.3.6.1.4.1.5555.610.2.2.1.11",
    "entity_id": "",
    "url": "",
    "x509cert": ""
  }
}

When I try to authenticate with a valid account, I get this error in the logs:

social 'RelayState'.

It then just redirects me to the login page again. On the Okta side, the login appears to be successful.

VERSIONS
DEFAULT_AWX_VERSION=22.5.0
OPERATOR_VERSION=2.4.0


Logs:

I see the following in my logs:

k logs awx-mgmt-web-ff8fc657d-g9w65 awx-mgmt-web -f
2024-04-11 22:39:53,465 ERROR [892518e519f4483eae21e9ef0a88d4a1] social 'RelayState'.
2024-04-11 22:39:53,473 DEBUG [892518e519f4483eae21e9ef0a88d4a1] awx.analytics.performance request: <WSGIRequest: GET '/sso/complete/saml/'>, response_time: 0.089s

10.160.1.28 - - [11/Apr/2024:22:39:53 +0000] "GET /sso/complete/saml/ HTTP/1.1" 302 0 ""
[Thu Apr 11 22:39:53 2024] GET /sso/complete/saml/ => generated 0 bytes in 91 msecs (HTTP/1.1 302) 10 headers in 461 bytes (1 switches on core 0)
2024-04-11 22:39:53,704 DEBUG [8c498d2fa8a6410fba930942b2ace3f0] awx.analytics.performance request: <WSGIRequest: GET '/sso/error/'>, response_time: 0.070s

Any help would be appreciated, curious if there is another log with more information.

Please check this article for help resolving this issue. It’s a bit dated, and some of it doesn’t apply to OCP, but it contains helpful information.

Resolution

Verify that the Enabled IdP Provider URL in the Ansible Tower configuration matches the one in the IdP metadata. This field should be the SSO service’s Location attribute.
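
One way to run this check by hand is to pull the SingleSignOnService Location values out of the IdP metadata XML and compare them with the configured url. The snippet below is only a sketch using the Python standard library; the sample metadata, the example.okta.com host, and the function names are made up for illustration and are not taken from Tower or Okta.

```python
import xml.etree.ElementTree as ET

# Standard SAML 2.0 metadata namespace.
MD_NS = {"md": "urn:oasis:names:tc:SAML:2.0:metadata"}

# Hypothetical, trimmed-down IdP metadata used only for this example.
SAMPLE_METADATA = """<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata" entityID="http://www.okta.com/EXAMPLE">
  <md:IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <md:SingleSignOnService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" Location="https://example.okta.com/app/example/EXAMPLE/sso/saml"/>
  </md:IDPSSODescriptor>
</md:EntityDescriptor>"""


def sso_locations(metadata_xml):
    """Return every SingleSignOnService Location found in the metadata."""
    root = ET.fromstring(metadata_xml)
    found = root.iterfind(".//md:SingleSignOnService", MD_NS)
    return [el.get("Location") for el in found if el.get("Location")]


def url_matches_metadata(configured_url, metadata_xml):
    """True only if the configured url is an exact match for a Location."""
    return configured_url in sso_locations(metadata_xml)


if __name__ == "__main__":
    configured = "https://example.okta.com/app/example/EXAMPLE/sso/saml"
    print(sso_locations(SAMPLE_METADATA))
    print(url_matches_metadata(configured, SAMPLE_METADATA))
```

The comparison is deliberately an exact string match, so a stray trailing slash or a different host shows up as a mismatch.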

If the Ansible Tower application nodes are behind a load balancer, you will need to add the following lines into /etc/tower/conf.d/custom.py on each application node followed by ansible-tower-service restart:

USE_X_FORWARDED_PORT = True
USE_X_FORWARDED_HOST = True
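
To see why these two settings matter behind a load balancer, here is a simplified sketch of how Django picks the host and port it believes a request was addressed to. It is not Django's actual code (the real logic also validates the host and handles more fallbacks), and the hostnames in the example are hypothetical.

```python
def effective_host(meta, use_x_forwarded_host):
    """Simplified host resolution: prefer X-Forwarded-Host only when enabled."""
    if use_x_forwarded_host and "HTTP_X_FORWARDED_HOST" in meta:
        return meta["HTTP_X_FORWARDED_HOST"]
    if "HTTP_HOST" in meta:
        return meta["HTTP_HOST"]
    return meta["SERVER_NAME"]


def effective_port(meta, use_x_forwarded_port):
    """Simplified port resolution: prefer X-Forwarded-Port only when enabled."""
    if use_x_forwarded_port and "HTTP_X_FORWARDED_PORT" in meta:
        return str(meta["HTTP_X_FORWARDED_PORT"])
    return str(meta["SERVER_PORT"])


if __name__ == "__main__":
    # Hypothetical request as seen by the application behind a load balancer.
    meta = {
        "HTTP_HOST": "awx-internal:8052",
        "HTTP_X_FORWARDED_HOST": "awx.example.com",
        "HTTP_X_FORWARDED_PORT": "443",
        "SERVER_NAME": "awx-internal",
        "SERVER_PORT": "8052",
    }
    for enabled in (False, True):
        print(enabled, effective_host(meta, enabled), effective_port(meta, enabled))
```

With both settings off, the application builds its SAML callback URLs from the internal host and port, which will not match what the IdP was told to post back to.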

Root Cause

Ansible Tower hands off authentication to the third party SSO. The SSO successfully authenticates, but fails a relay state check when handing auth back to Tower.

Diagnostic Steps

/var/log/tower/tower.log:

ERROR social "'RelayState'"

I hope this helps.
Thanks,
Jeff


Oops, I reread your description and saw that you have additional issues that should be addressed, too.

{
  "okta": {
    "attr_user_permanent_id": "urn:oid:1.3.6.1.4.1.5555.610.2.2.1.11",
    "attr_username": "urn:oid:1.3.6.1.4.1.5555.610.2.2.1.11",
    "entity_id": "",
    "url": "",
    "x509cert": ""
  }
}

You appear to be missing some required fields. The following docs should show you what you are missing. All fields MUST be defined.
https://docs.ansible.com/automation-controller/latest/html/administration/ent_auth.html#saml-settings
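
A quick way to spot missing or empty fields before pasting the JSON into the settings page is a small check like the one below. This is only a sketch: the list of required keys is an assumption based on the eight keys that appear in the fuller config posted later in this thread, so compare it against the docs linked above before relying on it.

```python
# ASSUMPTION: key list taken from the fuller config shown later in this
# thread, not from an authoritative schema.
ASSUMED_REQUIRED_KEYS = (
    "entity_id",
    "url",
    "x509cert",
    "attr_user_permanent_id",
    "attr_first_name",
    "attr_last_name",
    "attr_username",
    "attr_email",
)


def missing_idp_fields(enabled_idps):
    """Map each IdP name to the assumed-required keys that are absent or blank."""
    problems = {}
    for name, cfg in enabled_idps.items():
        missing = [
            key for key in ASSUMED_REQUIRED_KEYS
            if not str(cfg.get(key, "")).strip()
        ]
        if missing:
            problems[name] = missing
    return problems


if __name__ == "__main__":
    config = {
        "okta": {
            "attr_user_permanent_id": "urn:oid:1.3.6.1.4.1.5555.610.2.2.1.11",
            "attr_username": "urn:oid:1.3.6.1.4.1.5555.610.2.2.1.11",
            "entity_id": "",
            "url": "",
            "x509cert": "",
        }
    }
    print(missing_idp_fields(config))
```

An empty result means every assumed-required key is present and non-blank; it does not prove the values themselves are correct.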

Also, the following article should help as it explains how to do exactly what you are trying to do. It is also a bit dated, but the same information applies and should help you out here.

I hope this helps.
Jeff


Hi Jeff! Thank you for your reply, and sorry for the late response; I was out on a work trip.

I followed the steps you suggested by adding the following to my AWX config:

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-mgmt
spec:
  secret_key_secret: awx-mgmt-secret-key
  extra_settings:
    - setting: USE_X_FORWARDED_HOST
      value: "True"
---
apiVersion: v1
kind: Secret
metadata:
  name: awx-mgmt-secret-key
  namespace: awx
stringData:
  secret_key: 
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: ca
  namespace: awx
spec:
  securityPolicy:
    name: internal-networks
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    beta.cloud.google.com/backend-config: '{"ports": {"80":"ca"}}'
  name: awx-mgmt-manual-service
  namespace: awx
  labels:
    app: awx-mgmt
spec:
  ports:
    - port: 80
      name: http
      targetPort: 8052
      protocol: TCP
  selector:
    app.kubernetes.io/component: awx
    app.kubernetes.io/managed-by: awx-operator
    app.kubernetes.io/name: awx-mgmt-web
  type: NodePort
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: awx-mgmt-manual-ingress
  namespace: awx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt
    acme.cert-manager.io/http01-edit-in-place: "true"
    kubernetes.io/ingress.allow-http: "true"
spec:
  tls:
    - hosts:
        - awx-mgmt.com
      secretName: awx-mgmt-tls
  rules:
    - host: awx-mgmt.prd.it.com
      http:
        paths:
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: awx-mgmt-manual-service
                port:
                  name: http


Here is the end of my settings.py file:

bash-5.1$ tail -n 10 /etc/tower/settings.py

USE_X_FORWARDED_PORT = True
BROADCAST_WEBSOCKET_PORT = 8052
BROADCAST_WEBSOCKET_PROTOCOL = 'http'


RECEPTOR_LOG_LEVEL = 'info'


USE_X_FORWARDED_HOST = True

Regarding the missing fields: I had removed them for security. Below is what I have now, with some dummy data.

{
  "okta": {
    "attr_user_permanent_id": "name_id",
    "attr_first_name": "User.FirstName",
    "attr_last_name": "User.LastName",
    "attr_username": "User.email",
    "attr_email": "User.email",
    "entity_id": "http://www.okta.com/<ID>",
    "url": "https://okta.com/app/<NAME>/<ID>/sso/saml",
    "x509cert": "<CERT FROM OKTA>"
  }
}

I also restarted my node, with the same result as before: the RelayState error.

I tried upgrading our dev instance to the current latest version, and I get an error there too, although it looks a bit different.

That error takes me to the following page instead.

I am starting to think this is an issue with how I set up my ingress/service. I will keep experimenting, and if I find something I'll post it here.

I set up dev a bit differently, using the AWX operator to create the ingress; see below.

---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx-poc
spec:
  no_log: false
  csrf_cookie_secure: 'False'
  session_cookie_secure: 'False'
  secret_key_secret: awx-poc-secret-key
  service_type: NodePort
  service_annotations: |
    environment: sandbox
    beta.cloud.google.com/backend-config: '{"ports": {"80":"ca"}}'
  ingress_type: ingress
  ingress_hosts:
    - hostname: awx-poc.com
      tls_secret: sample-tls-secret
  ingress_annotations: |
    environment: sandbox
    cert-manager.io/cluster-issuer: letsencrypt
    acme.cert-manager.io/http01-edit-in-place: "true"
    kubernetes.io/ingress.allow-http: "true"
  extra_settings:
    - setting: USE_X_FORWARDED_HOST
      value: "True"
    - setting: LOG_AGGREGATOR_LEVEL
      value: "'DEBUG'"
  # error with debug level
  extra_volumes: |
    - name: awx-web-debug
      emptyDir: {}
  web_extra_volume_mounts: |
    - name: awx-web-debug
      mountPath: "/var/log/tower"
---
apiVersion: v1
kind: Secret
metadata:
  name: awx-poc-secret-key
  namespace: awx-poc
stringData:
  secret_key: key
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: ca
  namespace: awx-poc
spec:
  securityPolicy:
    name: internal-networks

Thank you,
jdp1

After a lot of testing, I went back and had a call with our Okta admin. It turns out the error was on their side. I did not need any extra settings beyond the SAML config listed above.

This article helped him set things up correctly on his side. A trailing “/” was what had been breaking this.
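
For anyone hitting the same thing, a trailing-slash mismatch is easy to miss by eye. The sketch below compares two URLs and flags the case where they differ only by a trailing "/". The /sso/complete/saml/ path comes from the logs earlier in this thread; awx.example.com is a placeholder, and which Okta field actually had the problem is not stated here, so treat the example values as assumptions.

```python
def differs_only_by_trailing_slash(a, b):
    """True when the two URLs are unequal but match once trailing slashes are removed."""
    return a != b and a.rstrip("/") == b.rstrip("/")


if __name__ == "__main__":
    # Hypothetical values: what AWX serves vs. what was typed into the IdP.
    awx_callback = "https://awx.example.com/sso/complete/saml/"
    idp_configured = "https://awx.example.com/sso/complete/saml"
    if differs_only_by_trailing_slash(awx_callback, idp_configured):
        print("URLs differ only by a trailing slash")
```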
