Ansible WinRM connection to Windows machines hangs often

Hi!

I have some Windows machines set up in virt-manager on Ubuntu, and logging in to them and so on works great. But when I run Ansible against them to install things, create an AD domain, etc., Ansible sometimes fails to connect over WinRM, even though the WinRM service is running on the machine and the port is open (I checked with netstat). I then try restarting the machines; sometimes Ansible can connect after one reboot, but sometimes two or three reboots are needed.

Why is this the case? I really want to fix it, because otherwise I can’t write a bash script that first runs Terraform to create the machines and then Ansible to provision them. I tried rebooting all the machines in virt-manager after Terraform created them, but Ansible still gets stuck connecting over WinRM on some specific tasks. Some tasks may succeed, but then others fail because the connection hangs, and I have to press Ctrl+C and run it again.

Which task is running when the connection to the Windows machines hangs? How long have you waited?

Sometimes a task takes time to complete, especially something like the reboot after a domain join. In that case, group policies and other configuration are applied on restart.

What are the task timeouts set to? What errors do you get?

Intermittent connection issues like this with WinRM following a recent domain join are common, since the service may not be completely configured or finished starting.

After a domain join, the WinRM service needs to register the WSMAN service principal names (SPNs) with Active Directory. Restarting the service after the first boot often works, or else you can try connecting to the HOST SPN instead. Run Get-ADComputer -Identity WinVmName -Properties ServicePrincipalNames to see if the WSMAN SPNs are present.
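
Since an open port does not prove the service has finished starting, one option is to poll the WinRM endpoint from the provisioning script between the Terraform and Ansible steps. The following is only a rough sketch, not a tested fix for this setup: it assumes the default HTTP listener on port 5985 and the usual /wsman path, and the function names winrm_responds and wait_for_winrm are made up for illustration. It treats any HTTP response (an unauthenticated request is normally rejected) as a sign that the listener is answering. It does not check authentication or SPN registration.

```python
import http.client
import time


def winrm_responds(host, port=5985, timeout=5.0):
    """Return True if any HTTP response comes back from the /wsman endpoint.

    Assumption: plain-HTTP listener on the default port 5985. An
    unauthenticated POST is normally rejected, but any status code
    shows the listener is answering rather than merely bound.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("POST", "/wsman")
        conn.getresponse().read()
        return True
    except (OSError, http.client.HTTPException):
        # Refused, timed out, reset, or a malformed reply: not ready yet.
        return False
    finally:
        conn.close()


def wait_for_winrm(host, port=5985, deadline=600.0, interval=10.0,
                   attempt_timeout=5.0):
    """Poll until the endpoint answers or `deadline` seconds have passed."""
    end = time.monotonic() + deadline
    while True:
        if winrm_responds(host, port, timeout=attempt_timeout):
            return True
        if time.monotonic() + interval > end:
            return False
        time.sleep(interval)
```

A wrapper script could call wait_for_winrm with each VM's address after terraform apply and only start ansible-playbook once it returns True. If it returns False, that at least separates "WinRM never came up" from "WinRM is up but a specific task hangs".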