Windows Domain/Ansible Kerberos Auth Issues Still

(I’ve posted a bit about this before, but I want to revisit it because it’s frustrating as I try to optimize my playbooks.)

I have a playbook where I build servers from vmware templates using vmware_guest and I join the domain using that module. Once the servers are built I have an extremely long “wait_for_connection”:

  - name: Wait until server becomes available to connect
    wait_for_connection:
      delay: 900    # Wait 15 minutes (900 seconds) before trying
      sleep: 30     # After the initial delay, try every 30 seconds
      timeout: 1200 # Maximum amount of time to wait, in seconds

After this wait, I start running tasks on the new hosts. Initially, those tasks run fine, but one by one, at random, the servers start failing with Kerberos errors. During this time I can confirm I’m able to log in to these servers using the same credentials, so authentication doesn’t seem to be failing outside of Ansible, but it fails within Ansible for some reason.

The longer I wait after building the servers, the less likely this issue is to occur. It just seems insane that I have to keep adding more wait time.

Here’s me running the playbook against 4 servers. Each task runs against all four servers, but the "fatal ... UNREACHABLE!" lines show the Kerberos failures, and hosts keep dropping out until the playbook run dies off entirely:

TASK [Registry fix to enable solution for CVE-2017-8529 Part 1] ****************
Monday 08 June 2020 16:32:22 +0000 (0:00:09.368) 0:33:29.081 ***********
changed: [server4.fqdn] => {“changed”: true, “data_changed”: false, “data_type_changed”: false}
changed: [server1.fqdn] => {“changed”: true, “data_changed”: false, “data_type_changed”: false}
changed: [server3.fqdn] => {“changed”: true, “data_changed”: false, “data_type_changed”: false}
changed: [server2.fqdn] => {“changed”: true, “data_changed”: false, “data_type_changed”: false}
TASK [Registry fix to enable solution for CVE-2017-8529 Part 2] ****************
Monday 08 June 2020 16:32:25 +0000 (0:00:03.635) 0:33:32.717 ***********
changed: [server1.fqdn] => {“changed”: true, “data_changed”: false, “data_type_changed”: false}
changed: [server4.fqdn] => {“changed”: true, “data_changed”: false, “data_type_changed”: false}
changed: [server2.fqdn] => {“changed”: true, “data_changed”: false, “data_type_changed”: false}
changed: [server3.fqdn] => {“changed”: true, “data_changed”: false, “data_type_changed”: false}
TASK [Configure UAC] ************************************************************
Monday 08 June 2020 16:32:29 +0000 (0:00:03.388) 0:33:36.105 ***********
fatal: [server3.fqdn]: UNREACHABLE! => {“changed”: false, “msg”: “kerberos: the specified credentials were rejected by the server”, “unreachable”: true}
changed: [server1.fqdn] => {“changed”: true, “data_changed”: true, “data_type_changed”: false}
changed: [server2.fqdn] => {“changed”: true, “data_changed”: true, “data_type_changed”: false}
changed: [server4.fqdn] => {“changed”: true, “data_changed”: true, “data_type_changed”: false}
TASK [Initialize Disk 1] *******************************************************
Monday 08 June 2020 16:32:32 +0000 (0:00:03.335) 0:33:39.440 ***********
changed: [server4.fqdn] => {“changed”: true, “cmd”: “Initialize-Disk -Number 1”, “delta”: “0:00:04.105311”, “end”: “2020-06-08 04:32:39.137372”, “rc”: 0, “start”: “2020-06-08 04:32:35.032060”, “stderr”: “”, “stderr_lines”: [], “stdout”: “”, “stdout_lines”: []}
changed: [server1.fqdn] => {“changed”: true, “cmd”: “Initialize-Disk -Number 1”, “delta”: “0:00:03.903042”, “end”: “2020-06-08 04:32:39.527549”, “rc”: 0, “start”: “2020-06-08 04:32:35.624506”, “stderr”: “”, “stderr_lines”: [], “stdout”: “”, “stdout_lines”: []}
changed: [server2.fqdn] => {“changed”: true, “cmd”: “Initialize-Disk -Number 1”, “delta”: “0:00:05.007749”, “end”: “2020-06-08 04:32:40.903429”, “rc”: 0, “start”: “2020-06-08 04:32:35.895680”, “stderr”: “”, “stderr_lines”: [], “stdout”: “”, “stdout_lines”: []}
TASK [Wait 15 seconds for disk initilization] **********************************
Monday 08 June 2020 16:32:41 +0000 (0:00:08.457) 0:33:47.898 ***********
Pausing for 15 seconds
(ctrl+C then ‘C’ = continue early, ctrl+C then ‘A’ = abort)
ok: [server1.fqdn] => {“changed”: false, “delta”: 15, “echo”: true, “rc”: 0, “start”: “2020-06-08 16:32:41.126472”, “stderr”: “”, “stdout”: “Paused for 15.0 seconds”, “stop”: “2020-06-08 16:32:56.126843”, “user_input”: “”}
TASK [Partition Disk 1] ********************************************************
Monday 08 June 2020 16:32:56 +0000 (0:00:15.051) 0:34:02.949 ***********
changed: [server4.fqdn] => {“changed”: true}
changed: [server1.fqdn] => {“changed”: true}
changed: [server2.fqdn] => {“changed”: true}
TASK [Format Disk 1 as E drive] ************************************************
Monday 08 June 2020 16:33:03 +0000 (0:00:06.888) 0:34:09.838 ***********
changed: [server4.fqdn] => {“changed”: true}
changed: [server1.fqdn] => {“changed”: true}
changed: [server2.fqdn] => {“changed”: true}
TASK [Stage AV Setup Binaries to e:\admin\binaries] ******************
Monday 08 June 2020 16:33:39 +0000 (0:00:24.463) 0:34:46.237 ***********
changed: [server4.fqdn] => {“changed”: true, “dest”: “e:\admin\binaries\AVAgent\”, “operation”: “folder_copy”, “size”: 27713762, “src”: “\\reposerver\Applications\Production\AV”}
changed: [server1.fqdn] => {“changed”: true, “dest”: “e:\admin\binaries\AVAgent\”, “operation”: “folder_copy”, “size”: 27713762, “src”: “\\reposerver\Applications\Production\AV”}
changed: [server2.fqdn] => {“changed”: true, “dest”: “e:\admin\binaries\AVAgent\”, “operation”: “folder_copy”, “size”: 27713762, “src”: “\\reposerver\Applications\Production\AV”}
TASK [Stage SecScan Setup Binaries to e:\admin\binaries] ***********************
Monday 08 June 2020 16:33:42 +0000 (0:00:03.402) 0:34:49.639 ***********
changed: [server1.fqdn] => {“changed”: true, “dest”: “e:\admin\binaries\SecScan\64bit”, “operation”: “folder_copy”, “size”: 23530139, “src”: “\\reposerver\Applications\Production\SecScan”}
changed: [server4.fqdn] => {“changed”: true, “dest”: “e:\admin\binaries\SecScan\64bit”, “operation”: “folder_copy”, “size”: 23530139, “src”: “\\reposerver\Applications\Production\SecScan”}
changed: [server2.fqdn] => {“changed”: true, “dest”: “e:\admin\binaries\SecScan\64bit”, “operation”: “folder_copy”, “size”: 23530139, “src”: “\\reposerver\Applications\Production\SecScan”}
TASK [Stage LAPS Setup Binaries to e:\admin\binaries] *************************
Monday 08 June 2020 16:33:46 +0000 (0:00:03.674) 0:34:53.314 ***********
fatal: [server1.fqdn]: UNREACHABLE! => {“changed”: false, “msg”: “kerberos: the specified credentials were rejected by the server”, “unreachable”: true}
changed: [server2.fqdn] => {“changed”: true, “dest”: “e:\admin\binaries\LAPSAgent\x64”, “operation”: “folder_copy”, “size”: 1019904, “src”: “\\reposerver\Applications\Production\Microsoft\LAPS”}
changed: [server4.fqdn] => {“changed”: true, “dest”: “e:\admin\binaries\LAPSAgent\x64”, “operation”: “folder_copy”, “size”: 1019904, “src”: “\\reposerver\Applications\Production\Microsoft\LAPS”}
TASK [Ensure LAPS is installed] ************************************************
Monday 08 June 2020 16:33:49 +0000 (0:00:03.291) 0:34:56.606 ***********
changed: [server4.fqdn] => {“changed”: true, “rc”: 0, “reboot_required”: false}
changed: [server2.fqdn] => {“changed”: true, “rc”: 0, “reboot_required”: false}
TASK [Ensure Agent is installed] **********************************************
Monday 08 June 2020 16:33:54 +0000 (0:00:04.571) 0:35:01.177 ***********
fatal: [server2.fqdn]: UNREACHABLE! => {“changed”: false, “msg”: “kerberos: the specified credentials were rejected by the server”, “unreachable”: true}
changed: [server4.fqdn] => {“changed”: true, “rc”: 0, “reboot_required”: false}
TASK [Ensure Agent is installed] ************************************************
Monday 08 June 2020 16:34:03 +0000 (0:00:09.009) 0:35:10.187 ***********
changed: [server4.fqdn] => {“changed”: true, “rc”: 0, “reboot_required”: false}
TASK [Ensure AV is installed] ******************************************
Monday 08 June 2020 16:34:08 +0000 (0:00:04.973) 0:35:15.161 ***********
fatal: [server4.fqdn]: UNREACHABLE! => {“changed”: false, “msg”: “kerberos: the specified credentials were rejected by the server”, “unreachable”: true}

I’m a bit new to the Linux world. Is it possible this is a bug in something on the Linux node I run Ansible/Ansible Tower from? I initially thought it was something to do with AD replication, but I can authenticate fine against these servers within minutes of them being added to the domain through normal Windows/Microsoft processes.

Thanks in advance for any advice!

Are these Linux machines?
How many domain controllers are in your environment? If you have more than one, Kerberos may be round-robining between them, failing on one domain controller and not on the other. You need to start restricting things so that your Linux server only connects to one domain controller.

The machines being managed here are Windows machines, but the Ansible Tower server itself is Linux (obviously). I wonder if the Kerberos configuration on the Tower machine may be running into a flavor of what you’re suggesting, but I’m not sure exactly how I would point the Tower server directly at just one DC for authentication.
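One way to try the single-DC idea is to stop the Kerberos client on the Tower host from discovering KDCs through DNS and list exactly one domain controller instead. The play below is only a sketch under assumptions: the `tower` group, the realm `EXAMPLE.COM`, and the host `dc01.example.com` are hypothetical placeholders, and the task overwrites `/etc/krb5.conf` (keeping a backup), so any existing settings would need to be merged in by hand.

```yaml
- name: Pin Kerberos on the Tower host to a single KDC (sketch)
  hosts: tower                      # hypothetical inventory group
  become: true
  tasks:
    - name: Write a krb5.conf that names exactly one domain controller
      copy:
        dest: /etc/krb5.conf
        backup: true                # keep the previous file for rollback
        content: |
          [libdefaults]
              default_realm = EXAMPLE.COM
              # Do not look up KDCs via DNS SRV records
              dns_lookup_kdc = false

          [realms]
              EXAMPLE.COM = {
                  # Hypothetical DC name; replace with a real one
                  kdc = dc01.example.com
              }

          [domain_realm]
              .example.com = EXAMPLE.COM
              example.com = EXAMPLE.COM
```

If the failures stop with one KDC pinned, that would point toward replication lag between domain controllers rather than a bug on the Linux side. Pinning does remove KDC failover, so it is probably better treated as a diagnostic step than as a permanent setting.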

Hi Dave,

I don’t fully understand the problem, but I would like to help you.

My wait tasks are:

  - name: waiting Windows server
    wait_for:
      port: 5985
      sleep: 50
      timeout: 500
      host: "{{ groups[item[1]][-1] }}"
    when:
      - item[0].value.os_type | lower == 'windows'
      - item[0].key == item[1]
    loop_control:
      index_var: idx
    with_nested:
      - "{{ virtual_machine|dict2items }}"
      - "{{ groups }}"

  - name: waiting linux server
    wait_for:
      port: 22
      sleep: 10
      timeout: 300
      host: "{{ groups[item[1]][-1] }}"
    when:
      - item[0].value.os_type | lower == 'linux'
      - item[0].key == item[1]
    loop_control:
      index_var: idx
    with_nested:
      - "{{ virtual_machine|dict2items }}"
      - "{{ groups }}"

As for your authentication problem: the Kerberos/sssd configuration may define a default domain, but if you are working with more than one domain you need to pass the user together with the domain for authentication.
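For WinRM over Kerberos, passing the user with its domain usually means giving Ansible a full `user@REALM` principal, with the realm in upper case. A minimal sketch of the connection variables follows, assuming a hypothetical file name, service account, realm, and vault variable:

```yaml
# group_vars/windows.yml (hypothetical file name)
ansible_connection: winrm
ansible_winrm_transport: kerberos
# Full principal; the realm part is written in upper case.
# svc_ansible and EXAMPLE.COM are placeholders, not values from this thread.
ansible_user: svc_ansible@EXAMPLE.COM
# Hypothetical vaulted variable holding the account password
ansible_password: "{{ vault_winrm_password }}"
```

With an explicit realm in `ansible_user`, the Kerberos client does not have to rely on `default_realm` in `krb5.conf` to work out which domain the account belongs to.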