Help needed with an AWX 19.2.1 Kerberos issue...

Hi all, I need help with an AWX 19.2.1 Kerberos issue that is driving me crazy. My setup is K8s (one master and two worker nodes) on the 10.0.0.0/16 subnet, and my domain controller is on my internal network, 192.168.5.0/24 (along with all my other Linux and Windows servers). AWX is set up to use the MetalLB load balancer and has an IP on the 192.168.5.0 subnet. I have no issues connecting to the Web UI, and all my Linux tests and playbooks work fine against the Linux servers on my internal network. For a while now I have been trying to get Kerberos to work, but I keep getting the following error when I try to run win_ping against any of my Windows servers (all residing on the 192.168.5.0/24 subnet):

Kerberos auth failure for principal @ with pexpect:
Cannot find KDC for realm "" while getting initial credentials

All my containers inside the AWX pod have krb5.conf set to use my domain (UPPERCASE), and they also have my internal DNS servers in resolv.conf. From the containers I have no problems pinging servers on my internal network (192.168.0.0), and even kinit @ works: I do get a Kerberos ticket. However, when I run win_ping from the web interface I get the error shown above.

The Execution Environment is v0.4.0 (I also tried my own customized EE).
Other than the use of MetalLB and bringing in a krb5.conf and a resolv.conf for DNS on my internal LAN, everything is pretty much standard.

Here is my krb5.conf file:

# To opt out of the system crypto-policies configuration of krb5, remove the
# symlink at /etc/krb5.conf.d/crypto-policies which will not be recreated.

includedir /etc/krb5.conf.d/

[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log

[libdefaults]
default_realm =
dns_lookup_realm = true
dns_lookup_kdc = true
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
rdns = false
pkinit_anchors = FILE:/etc/pki/tls/certs/ca-bundle.crt
spake_preauth_groups = edwards25519
default_ccache_name = KEYRING:persistent:%{uid}

[realms]
= {
kdc = .
admin_server = .
}

[domain_realm]
. =
=

I cannot for the life of me get this to work - any tips/help on how to get this to work?

I don’t know if it helps, but try commenting out or removing this line:

default_ccache_name = KEYRING:persistent:%{uid}

I tried removing that line from the krb5.conf file, but I still get the same error when running win_ping.
One additional piece of information that probably doesn’t make any difference: in my Kubernetes environment I use Cilium for the networking. Given that I do get a Kerberos ticket using kinit from the AWX EE container, and that everything else I run in the environment works as expected, I don’t think the networking is the issue.

Curious to know how you resolved this issue.

Thanks

I still cannot get this to work - I haven’t had much time to spend on it lately, but I have tried to upgrade to the latest AWX version.

If anyone has any ideas that could help, it would be greatly appreciated.

Thanks

Hi,

Not sure whether it helps but have a look at the following.

I had a similar issue in the past, but that had to do with the K8s cluster failing to resolve DNS names. The error also somewhat suggests that: AWX is unable to find the KDC for realm “DOMAIN”.

So I’d suggest checking that the K8s DNS is working fine and is able to resolve FQDNs.

The second thing would be to check the krb5.conf file on the worker nodes on which the AWX containers run. Try running kinit user@DOMAIN there and see whether it succeeds.

Regards,
Vibin

Finally Kerberos settings are working with custom EE image(s). See https://groups.google.com/g/awx-project/c/7UV0ZABsH_I for details.

And my frustration is growing, having finally spent some time on this issue again. I have tried just about every suggestion, but I still cannot get this to work.
Since I initially posted this I have upgraded to AWX 19.5.0. I have played around with the krb5.conf settings based on many of the comments I have received, but I still get the same error when I try to run win_ping (either through Run Command or a playbook) against my domain-joined Windows servers.

The error is still:

Kerberos auth failure for principal @ with pexpect:
Cannot find KDC for realm "" while getting initial credentials

What does work is going into the EE pod and running kinit to get a Kerberos ticket with the same user I use in the GUI. If I then run “ansible <server.DOMAIN> -m win_ping -i ”, I get a successful result. But running the same thing through the Web interface, I get the error above.
The success running it from the actual EE pod tells me that DNS is working and that no firewall is blocking anything.

Bottom line: I have no clue why this does not work from the Web interface.
It would be greatly appreciated if anyone has any suggestions or ideas on how to resolve this.

Thanks

Have you tried running kinit, klist, etc. in debug mode? It could be that a firewall is blocking the ports for time sync, LDAP, Kerberos, etc.

As I mentioned, running kinit and win_ping in the EE pod works like a charm, and when I look at my domain controller I see the events I expect in the event logs. However, when I run win_ping from the UI, I see nothing logged in Event Viewer on the domain controller, not even errors.
My question is: what is the difference between the EE pod and the temporary pod that AWX spins up on execution from the UI? Shouldn’t the temporary pod be the same as the EE pod with regard to krb5.conf, resolv.conf, installed Python modules, and so on?

I really have no idea what to try next. My old AWX install, version 17 running on Docker, works just fine.
If anybody has any ideas on commands to run, config settings, etc., let me know where and how to run them :)
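
One way to answer that question empirically is to dump the Kerberos-relevant files from both pods and compare them. The following is a minimal sketch, assuming Python 3 is available in both the EE pod and the job pod; the function name `describe_kerberos_env` and the output layout are made up for illustration and are not part of AWX.

```python
import json
import os


def describe_kerberos_env(paths=("/etc/krb5.conf", "/etc/resolv.conf")):
    """Report which config files exist and list their non-comment lines."""
    files = {}
    for path in paths:
        if not os.path.isfile(path):
            files[path] = {"exists": False, "lines": []}
            continue
        with open(path, encoding="utf-8", errors="replace") as handle:
            # Keep only meaningful lines so two reports are easy to compare.
            lines = [
                line.strip()
                for line in handle
                if line.strip() and not line.lstrip().startswith("#")
            ]
        files[path] = {"exists": True, "lines": lines}
    # KRB5_CONFIG, if set, overrides the default krb5.conf location.
    return {"files": files, "KRB5_CONFIG": os.environ.get("KRB5_CONFIG")}


if __name__ == "__main__":
    print(json.dumps(describe_kerberos_env(), indent=2))
```

Running it once inside the EE pod and once from a job launched in the UI (for example from a playbook task that targets localhost) should show whether the job pod is missing krb5.conf or carries a different resolv.conf.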

What happens if you put the name/IP address of your realm controllers in /etc/hosts and set:

dns_lookup_realm = false
dns_lookup_kdc = false

[realms]
TEST.LOCAL = {
kdc = winserver.test.local:88
admin_server = winserver.test.local:749
default_domain = test.local
}

I had a similar problem with Samba whereby for some reason the Kerberos lookup to Windows failed and this was the only solution.
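
If the KDC is pinned statically like this, it can be worth confirming that the pod can actually reach it on port 88. Below is a small sketch, assuming Python 3 in the pod; `kdc_reachable` is a hypothetical helper name. It only tests TCP, while Kerberos clients may also use UDP, so a `True` result is a hint rather than proof.

```python
import socket


def kdc_reachable(host, port=88, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts alike.
        return False
```

For example, `kdc_reachable("winserver.test.local")` uses the placeholder host name from the snippet above; substitute your own KDC.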

I am still having the same issue. From the EE pod I run kinit with my AD user, and then I can run a successful ad-hoc command from there against a domain-joined Windows server.
As soon as I use the Web interface, I get the same error I have always gotten: it can’t find the KDC.
And again, I have an installation of AWX 17 where it works like a charm.

Any help would be greatly appreciated

The only thing I can think of is that DNS name resolution within the pod is somehow broken. I suggest trying ping and dig/nslookup inside your pod.
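
If dig and nslookup are not installed in the pod image, a rough equivalent of the name lookup can be done with the Python standard library. This is a sketch only, and `resolve` is a made-up helper name. It checks ordinary address records; the SRV records (such as `_kerberos._tcp.<REALM>`) that `dns_lookup_kdc = true` relies on cannot be queried with the standard library, so dig is still needed for those.

```python
import socket


def resolve(name):
    """Return the sorted unique IP addresses for name, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve()` with the FQDN of the domain controller from inside the job pod should return its 192.168.5.x address if DNS is configured the same way as in the EE pod.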

I am not sure - but given that I can successfully connect using Kerberos from the EE pod, why would the configuration of temp pod spun up during execution be any different? Is there any way to configure AWX to use the actual EE pod that is running instead of spinning up a temporary pod? If so - how can I do that?

I recall running into a similar issue. This is what I tried, and it is still working fine in a 19.2.0 environment.

  1. Created a ConfigMap with krb5.conf as mentioned at https://github.com/ansible/awx/issues/9807
  2. Used the extra_volumes mount option in the AWX instance configuration
  3. The only difference with our EE pod is that it uses a custom Docker image instead of the default one (we had to do this to add a custom CA and LDAP utilities for our jobs), and the same image is used for the temporary automation/EE pod as well

Give this a try to see if it helps.
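
As an illustration of step 1, the ConfigMap manifest can be generated from an existing krb5.conf. This is a sketch under assumptions: the name `awx-krb5-conf` and the namespace `awx` are placeholders, and the helper `krb5_configmap` is hypothetical. How the ConfigMap is then mounted into the pods is described in the linked issue and is not shown here.

```python
import json


def krb5_configmap(krb5_conf_text, name="awx-krb5-conf", namespace="awx"):
    """Build a Kubernetes ConfigMap manifest (as a dict) holding krb5.conf."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        # The key becomes the file name when the ConfigMap is mounted as a volume.
        "data": {"krb5.conf": krb5_conf_text},
    }


def to_json(manifest):
    """Serialize the manifest; kubectl apply -f accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

The output of `to_json(krb5_configmap(open("krb5.conf").read()))` can be saved to a file and applied with `kubectl apply -f`.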

http://weiyentan.github.io/2021/Installing-awx-kubernetes/
I wrote about this, including what the YAML should look like when mounting the ConfigMap, if that helps.

How can additional collections be installed so that AWX can use them, e.g.

https://docs.ansible.com/ansible/latest/collections/ansible/builtin/add_host_module.html

I am using the project: https://github.com/kurokobo/awx-on-k3s

Two ways: as part of the requirements in the project, or in the EE.

I wrote about that too.

http://weiyentan.github.io/2021/creating-execution-environments/