Non-repeatable login error with WinRM

I'm new to Ansible and trying to do a POC while learning at the same time. I'm using CentOS for my Ansible server and the local admin account on Windows when running my remote tasks. I'm able to run a bunch of tasks, like remotely installing MSIs, joining a domain, and installing some features. The first time I attempt to copy a file, I get the error below (I used the -vvv option to get more verbose output). What is weird is that if I immediately run my playbook again, it works and completes the rest of the tasks without issue. I don't understand how this could be, as this step comes well after the domain join. Currently I join the domain, reboot, add some roles, reboot, and add some features, all before this step. I'm also not sure whether the failing step is the win_copy or the script. Either way, it runs the second time. Any help would be appreciated.

Steps in config

    - name: Setup IIS/Certificate
      win_copy:
        src: files/certs/mycert_2020.pfx
        dest: C:\scripts\

    - script: files/certs/install_cert.bat
      args:
        creates: c:\scripts\certinstalled.txt

<10.152.2.137> PUT “/etc/ansible/files/certs/mycert.pfx” TO “C:\Users\Administrator\AppData\Local\Temp\ansible-tmp-1513116318.66-1186697384671\source”
Using module file /usr/lib/python2.7/site-packages/ansible/modules/windows/win_copy.ps1
EXEC (via pipeline wrapper)
The full traceback is:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 125, in run
    res = self._execute()
  File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 521, in _execute
    result = self._handler.run(task_vars=variables)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/action/win_copy.py", line 490, in run
    copy_result = self._copy_single_file(file_src, dest, file_dest, task_vars)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/action/win_copy.py", line 281, in _copy_single_file
    copy_result = self._execute_module(module_name="copy", module_args=copy_args, task_vars=task_vars)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/action/__init__.py", line 737, in _execute_module
    res = self._low_level_execute_command(cmd, sudoable=sudoable, in_data=in_data)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/action/__init__.py", line 886, in _low_level_execute_command
    rc, stdout, stderr = self._connection.exec_command(cmd, in_data=in_data, sudoable=sudoable)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/connection/winrm.py", line 357, in exec_command
    result = self._winrm_exec(cmd_parts[0], cmd_parts[1:], from_exec=True, stdin_iterator=stdin_iterator)
  File "/usr/lib/python2.7/site-packages/ansible/plugins/connection/winrm.py", line 308, in _winrm_exec
    self.protocol.cleanup_command(self.shell_id, command_id)
  File "/usr/lib/python2.7/site-packages/winrm/protocol.py", line 314, in cleanup_command
    res = self.send_message(xmltodict.unparse(req))
  File "/usr/lib/python2.7/site-packages/winrm/protocol.py", line 214, in send_message
    return self.transport.send_message(message)
  File "/usr/lib/python2.7/site-packages/winrm/transport.py", line 229, in send_message
    response = self._send_message_request(prepared_request, message)
  File "/usr/lib/python2.7/site-packages/winrm/transport.py", line 239, in _send_message_request
    raise InvalidCredentialsError("the specified credentials were rejected by the server")
InvalidCredentialsError: the specified credentials were rejected by the server

fatal: [10.152.2.137]: FAILED! => {
    "failed": true,
    "msg": "Unexpected failure during module execution.",
    "stdout": ""
}
to retry, use: --limit @/etc/ansible/ASGTEST.retry

If the task just before the copy installs some features and reboots, the WinRM service may come back online briefly before the host reboots one more time, and that second reboot isn't caught. What I would try is:

`

- name: install problematic features
  win_feature:
    name: …
  register: feature_install

- name: reboot after feature install
  win_reboot:
  when: feature_install.reboot_required

- block:
    - name: copy file
      win_copy:
        src: files/certs/mycert_2020.pfx
        dest: C:\scripts\
  rescue:
    - name: copy failed wait for connection to become stable
      wait_for_connection:

    - name: copy the file again
      win_copy:
        src: files/certs/mycert_2020.pfx
        dest: C:\scripts\

- name: run setup script
  script: files/certs/install_cert.bat
  args:
    creates: c:\scripts\certinstalled.txt

`

The block/rescue may not work, because the failure happened in the connection plugin rather than in the module, and that is usually a fatal exception that is not recoverable. But because win_copy is an action plugin it may work; I just can't confirm right now. If it doesn't work, you may need to dig into the logs a bit more to see why your credentials are being rejected or, as a precaution, put a pause after the win_reboot step followed immediately by wait_for_connection to try to catch these exceptions.

Actually, after looking at the error again, it does still seem to be a fatal error, so I don't think the block/rescue will work in this case. Putting wait_for_connection after your reboot stage is probably the best option.
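
A pause followed by wait_for_connection after the reboot could look something like the untested sketch below. The 30-second pause and the `delay`/`timeout` values are guesses rather than tested numbers, and `feature_install` is assumed to be the variable registered by the feature install task, so adjust all of them for your environment.

```yaml
- name: reboot after feature install
  win_reboot:
  when: feature_install.reboot_required

# Assumption: 30 seconds is an arbitrary grace period in case the host
# kicks off a second reboot shortly after WinRM first comes back.
- name: give the host time to start any follow-up reboot
  pause:
    seconds: 30
  when: feature_install.reboot_required

# Assumption: the delay and timeout values here are illustrative only.
- name: wait until the connection is usable again
  wait_for_connection:
    delay: 10
    timeout: 600

- name: copy file
  win_copy:
    src: files/certs/mycert_2020.pfx
    dest: C:\scripts\
```

If the copy still fails on the first run with these in place, that would suggest the credential rejection is caused by something other than an uncaught reboot.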