Windows Server install updates playbook inconsistent

Hi everyone, I’m new to the forum, so I apologize if there is already a post like this.
I have a few Windows servers with no internet access; they are in a workgroup, not joined to a domain.
My monthly process is to manually download the KBs from the Microsoft Update Catalog and install them on those servers.
Below are some of the tasks from my current playbook. The issue I am seeing is that after the 1st reboot, the job template (most times) never reconnects, so it never moves on to the 2nd task in my playbook. At other times the playbook executes flawlessly all the way to the end. I have tried various modules with the same results. Any ideas/thoughts?

  - name: Install 1st Update
    ansible.windows.win_powershell:
      script: Add-WindowsPackage -Online -PackagePath:"{{ KB1 }}" -NoRestart -LogPath "C:\windows\temp\installkb1.log"
    tags: 1st

  - name: Reboot (1st)
    ansible.windows.win_reboot:
      reboot_timeout: 600 # Wait up to 10 minutes for the system to reboot
      test_command: echo "OK" # Ensure the system is back and accepting commands
    tags: 1stReboot

  - name: Wait for SSH to be available after reboot (port 22)
    ansible.windows.win_wait_for:
      host: "{{ target_host }}"
      port: 22 # SSH port
      state: started
      delay: 15 # Wait 15 seconds before checking
      timeout: 300 # Timeout after 5 minutes
    tags: WaitForReboot

I’m not clear whether the system is not rebooting or whether it’s rebooting and AWX does not detect it coming back. In either case, you have a 10-minute timer set. What happens when that expires? Do you get the failure you expect?

Hey Kevin,
I should have explained a little better.
The win_powershell module executes and the 1st KB gets installed.

The win_reboot task also starts and triggers the restart of the VM. The server reboots and comes back to the login screen, but the task never completes and the play never moves on to the next task. Even though I can SSH in from another machine, the Ansible job keeps hanging at the reboot task. I have seen the job template stay in that state for 7-9 hours; I have to cancel the job manually.

After I posted this, I increased the verbosity to level 5 and noticed that win_reboot tries to validate the reboot by retrieving the system boot time.
Here are the last few lines of the log before canceling the job.

debug2: Received exit status from master 0
ansible.windows.win_reboot: last boot time: 133766322995000000
ansible.windows.win_reboot: last boot time check fail _TestCommandFailure 'boot time has not changed', retrying in 1.96 seconds…
Traceback (most recent call last):
  File "/usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/plugin_utils/_reboot.py", line 392, in _do_until_success_or_condition
    res = func(*args, **kwargs)
  File "/usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/plugin_utils/_reboot.py", line 305, in _check_boot_time
    raise _TestCommandFailure("boot time has not changed")
ansible_collections.ansible.windows.plugins.plugin_utils._reboot._TestCommandFailure: boot time has not changed
ansible.windows.win_reboot: attempting to get system boot time
ansible.windows.win_reboot: getting boot time
ansible.windows.win_reboot: running command: (Get-CimInstance -ClassName Win32_OperatingSystem -Property LastBootUpTime).LastBootUpTime.ToFileTime()
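
To see what win_reboot is comparing, the same query from the log can be run by hand before and after a manual restart. This is only a debugging sketch: the task names and the boot_time variable are made up for illustration, and it assumes win_shell works over the same connection as the tasks above.

```yaml
- name: Read the boot time with the same query win_reboot logs
  ansible.windows.win_shell: (Get-CimInstance -ClassName Win32_OperatingSystem -Property LastBootUpTime).LastBootUpTime.ToFileTime()
  register: boot_time        # hypothetical variable name
  changed_when: false        # read-only query, never reports a change

- name: Show the raw FILETIME value returned by the server
  ansible.builtin.debug:
    var: boot_time.stdout_lines
```

If the value printed after a restart matches the one from before, that would be consistent with the "boot time has not changed" retries above.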

I’m afraid I don’t use Ansible to patch Windows servers, so I’m not a lot of help on this. I might try to break this one task into a reboot that does not test for success, then a pause, and then another task that tests the server to see whether it’s up. Kind of clunky, but maybe more resilient?
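
A rough sketch of that split, assuming the connection settings already used in the playbook above. The 10-second shutdown delay, the 60-second pause, and the 600-second timeout are placeholder values I have not tested, and shutdown.exe is swapped in here for win_reboot so that nothing checks the boot time:

```yaml
- name: Trigger the reboot without waiting for it to be verified
  # /t 10 delays the restart so the task can return before the connection drops
  ansible.windows.win_command: shutdown.exe /r /t 10

- name: Wait until the server accepts connections again
  ansible.builtin.wait_for_connection:
    delay: 60      # pause before the first connection attempt
    sleep: 5       # seconds between attempts
    timeout: 600   # give up after 10 minutes
```

Here wait_for_connection covers both the pause and the "is it up" test, and unlike the hang described above it fails once its timeout expires.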

Check whether both the AWX host and the Windows servers are consistently synced via NTP.
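
One quick way to eyeball the difference from a playbook is to print both clocks side by side. This is a minimal sketch: the task names and the target_time variable are hypothetical, and it only displays the two values, it does not fix anything.

```yaml
- name: Read the current UTC time on the Windows server
  ansible.windows.win_shell: (Get-Date).ToUniversalTime().ToString('u')
  register: target_time      # hypothetical variable name
  changed_when: false        # read-only query

- name: Show the controller clock next to the server clock
  ansible.builtin.debug:
    msg: "controller={{ now(utc=true, fmt='%Y-%m-%d %H:%M:%SZ') }} target={{ target_time.stdout | trim }}"
```

A gap of more than a few seconds between the two values suggests the clocks are not in sync.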

They are not. I will try to adjust the time and try it again. I will update once I am done.

The win_reboot module works by checking whether the last boot time is before or after the time of the Ansible node. This is a common issue with it when NTP isn’t implemented.

Thanks for the clue. After matching the VMs’ time with the AWX time, the jobs on all 5 VMs completed successfully.

You are welcome. I noticed that when googling “ansible win_reboot hangs” and the like, this solution doesn’t come up. Maybe you can edit the title of this post?

I cannot figure out how to change the title. It’s probably because I marked it as solved.