I have 2 environments in AWS, each containing 2 Windows servers that I am running a playbook against. Ansible itself runs in a Docker container and connects to the instances over VPC peering. So the Docker server and Ansible container are in one VPC, using WinRM/CredSSP to connect to Windows servers in another VPC.
The playbook keeps randomly failing with the above errors. It happens at different places in the playbook, so I can't narrow it down to any one specific section. The file copy above is for a 1.8K file, so it's not as if the file was too large.
I run this same setup for thousands of Linux servers and have no issues, so I assume it's something about the combination of WinRM and VPC peering that it just doesn't like.
I know the best solution would be to run Ansible inside the same VPC, but that's not an option, so is there anything that can be done? These Ansible failures are putting a halt to the project, and my fear is that WinRM is not going to allow this to work.
TASK [roles/ansible-role-dc : configure AD CS certification authority] *********
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ReadTimeout: HTTPSConnectionPool(host='10.254.64.5', port=5986): Read timed out. (read timeout=30)
fatal: [10.254.64.5]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""}
to retry, use: --limit @/home/ubuntu/workspace/sales-demo/demo-idauto-salesdemo-prd-inf/ansible_dc.retry
As you can see it failed at a completely different point in the role.
'"The Certification Authority is already installed" not in (pri_adcs_enrollment_config.stderr|regex_replace("\r\n", ""))'
become: yes
become_user: SYSTEM
become_method: runas
I also see that with PSRP it falls back to a read timeout of 30 seconds, and I can't see any way to increase it. It also appears that after a read failure, it doesn't attempt the connection again.
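For reference, the connection plugin documentation does list timeout variables for both WinRM and PSRP, so this may depend on the Ansible version in use. Below is a minimal sketch of what I would expect the host variables to look like, assuming a version whose plugins support these options. The file name group_vars/windows.yml and the group name "windows" are hypothetical, and the values are illustrative, not tested against this environment.

```yaml
# group_vars/windows.yml (hypothetical file and group name)

# WinRM connection: the read timeout should be larger than the
# operation timeout, otherwise the HTTP read can expire first.
ansible_winrm_operation_timeout_sec: 60
ansible_winrm_read_timeout_sec: 70

# PSRP connection: the same idea, using the psrp plugin's variables.
ansible_psrp_operation_timeout: 60
ansible_psrp_read_timeout: 70

# Number of reconnection attempts the psrp plugin makes on
# connection-level errors (assumes the plugin version supports it).
ansible_psrp_reconnection_retries: 3
```

Only the variables for the connection plugin actually in use are read, so keeping both sets in one file should be harmless.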