Hi everyone, I’m new to the forum so if there is a post like this already, I apologize.
I have a few Windows servers with no internet access. They are in a workgroup, not joined to a domain.
My monthly process is to manually download the KBs from the Microsoft Update Catalog and install them on those servers.
Below are two tasks from my current playbook. The issue I am seeing is that after the 1st reboot, the job template (MOST TIMES) never reconnects, so it never moves on to the 2nd task in my playbook. But there are also times when the playbook executes flawlessly all the way to the end. I have tried various modules with the same results. Any ideas/thoughts?
- name: Reboot (1st)
  ansible.windows.win_reboot:
    reboot_timeout: 600  # Wait up to 10 minutes for the system to reboot
    test_command: echo "OK"  # Ensure the system is back and accepting commands
  tags: 1stReboot

- name: Wait for SSH to be available after reboot (port 22)
  ansible.windows.win_wait_for:
    host: "{{ target_host }}"
    port: 22  # SSH port
    state: started
    delay: 15  # Wait 15 seconds before checking
    timeout: 300  # Timeout after 5 minutes
  tags: WaitForReboot
I’m not clear whether the system is not rebooting or whether it’s rebooting and AWX does not detect it coming back. In either case, you have a 10-minute timer set. What happens when that expires? Do you get the failure you expect?
Hey Kevin,
I should have explained a little better.
The win_powershell module executes and the 1st KB gets installed.
win_reboot then runs and triggers the restart of the VM. The server reboots and comes back to the login screen, but the task never completes and the playbook doesn’t move on to the next one. Even though I can SSH in from another machine, the Ansible job keeps hanging at the reboot task. I have seen the job template sit in that state for 7-9 hours; I actually have to cancel the job.
After I posted this, I increased the verbosity to level 5 and noticed that win_reboot is trying to validate the reboot by retrieving the system boot time.
Here are the last few lines of the log before canceling the job.
debug2: Received exit status from master 0
ansible.windows.win_reboot: last boot time: 133766322995000000
ansible.windows.win_reboot: last boot time check fail _TestCommandFailure 'boot time has not changed', retrying in 1.96 seconds...
Traceback (most recent call last):
  File "/usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/plugin_utils/_reboot.py", line 392, in _do_until_success_or_condition
    res = func(*args, **kwargs)
  File "/usr/share/ansible/collections/ansible_collections/ansible/windows/plugins/plugin_utils/_reboot.py", line 305, in _check_boot_time
    raise _TestCommandFailure("boot time has not changed")
ansible_collections.ansible.windows.plugins.plugin_utils._reboot._TestCommandFailure: boot time has not changed
ansible.windows.win_reboot: attempting to get system boot time
ansible.windows.win_reboot: getting boot time
ansible.windows.win_reboot: running command: (Get-CimInstance -ClassName Win32_OperatingSystem -Property LastBootUpTime).LastBootUpTime.ToFileTime()
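For anyone who wants to see what the host actually reports, the boot-time query from the log above can be run as its own task, once before and once after a manual reboot. This is only a diagnostic sketch: it assumes the same connection settings as the rest of the playbook, and the task names and the registered variable `boot_time_check` are made up for illustration.

```yaml
- name: Query the boot time with the same command win_reboot logs (diagnostic sketch)
  ansible.windows.win_powershell:
    script: |
      $os = Get-CimInstance -ClassName Win32_OperatingSystem -Property LastBootUpTime
      [PSCustomObject]@{
        # Same value win_reboot compares before and after the reboot
        FileTime = $os.LastBootUpTime.ToFileTime()
        # Human-readable forms, to compare the boot time with the host clock
        Readable = $os.LastBootUpTime.ToString('o')
        Now      = (Get-Date).ToString('o')
      }
  register: boot_time_check  # hypothetical variable name
  changed_when: false

- name: Show the reported values
  ansible.builtin.debug:
    var: boot_time_check.output
```

If `FileTime` comes back identical before and after a reboot, that would match the "boot time has not changed" retries in the log.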
I’m afraid I don’t use Ansible to patch Windows servers, so I’m not a lot of help on this. I might try to break this one task into a reboot that does not test for success, then a pause, and then another task that tests the server to see whether it’s up. Kind of clunky, but maybe more resilient?
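That idea could look roughly like the tasks below. This is a sketch, not a tested replacement for win_reboot: the 10-second shutdown delay, the 60-second pause, and the 600-second timeout are arbitrary values chosen for illustration, and nothing here verifies that the reboot actually happened.

```yaml
- name: Trigger the reboot without testing for success (sketch)
  # /t 10 delays the restart so the task can return before the connection drops
  ansible.windows.win_command: shutdown.exe /r /t 10

- name: Pause while the host goes down
  ansible.builtin.pause:
    seconds: 60  # assumed value, tune to how long the server takes to shut down

- name: Wait until the host accepts connections again
  ansible.builtin.wait_for_connection:
    sleep: 5      # seconds between connection attempts
    timeout: 600  # give up after 10 minutes
```

The trade-off is that a failed or skipped reboot goes unnoticed, so a later task that checks the installed KB or the pending-reboot state would still be worth having.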
The win_reboot module works by checking whether the last boot time is before or after the time on the Ansible node. This is a common issue with it when NTP isn’t set up.
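If time sync is the cause, one way to address it on workgroup servers without internet access is to point the Windows Time service at an internal time source before patching. The sketch below assumes such a server exists and is reachable; `ntp.example.internal` is a placeholder hostname, not something from this thread.

```yaml
- name: Make sure the Windows Time service is running
  ansible.windows.win_service:
    name: W32Time
    state: started
    start_mode: auto

- name: Point the Windows Time service at an internal time source (sketch)
  # ntp.example.internal is a hypothetical hostname; replace it with your own server
  ansible.windows.win_command: >-
    w32tm.exe /config /manualpeerlist:ntp.example.internal
    /syncfromflags:manual /update

- name: Force a resync before rebooting
  ansible.windows.win_command: w32tm.exe /resync
```

Running these ahead of the first win_reboot task should keep the host clock close to the Ansible node's clock, provided the time source itself is accurate.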
You are welcome. I noticed that when googling “ansible win_reboot hangs” and the like, this solution doesn’t come up. Maybe you can edit the title of this post?