Ansible 2.9 failed to transfer AnsiballZ_setup.py when gathering facts

Hi everyone,

I’m struggling to get consistent builds with Ansible 2.9.15 and Packer 1.6, using Python 3.6.10. Specifically, the gather_facts step fails because the AnsiballZ_setup.py file doesn’t get transferred properly. This worked in Ansible 2.7, albeit using setup instead of gather_facts.

I’ve run the playbook with -vvvv; the error is:

"msg": "failed to transfer file to /Users/davidzausner/.ansible/tmp/ansible-local-80071d_tih8lv/tmpvlcwfdvx ~davidzausner/.ansible/tmp/ansible-tmp-1606694025.7735732-80403-78405525784653/AnsiballZ_setup.py:\n\nExecuting: program /usr/bin/ssh host 127.0.0.1%!(PACKER_COMMA) user (unspecified)%!(PACKER_COMMA) command scp -v -t '~davidzausner/.ansible/tmp/ansible-tmp-1606694025.7735732-80403-78405525784653/AnsiballZ_setup.py'\nOpenSSH_8.1p1%!(PACKER_COMMA) LibreSSL 2.7.3\r\ndebug1: Reading configuration data /Users/davidzausner/.ssh/config\r\ndebug1: /Users/davidzausner/.ssh/config line 1: Applying options for *\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 47: Applying options for *\r\ndebug2: resolve_canonicalize: hostname 127.0.0.1 is address\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local%!(PACKER_COMMA) 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 80407\r\ndebug3: mux_client_request_session: session request sent\r\nSending file modes: C0600 265393 tmpvlcwfdvx\n\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit status from master 1\r\n"

A fuller log can be seen at https://gist.github.com/zausnerd/0fe896ffe44620933b044eb4b314bd9b.

A few of the things I tried:

  • Setting higher timeouts with -o ControlMaster=auto -o ControlPersist=30m in ansible.cfg and packer.log

  • Setting gather_facts with parallel: False

  • Using setup instead of gather_facts

  • Adding an Ansible task to delete the .ansible/tmp/ files between builds (I’m not sure why they’re not being cleaned up automatically as per https://github.com/ansible/ansible/pull/57845)

  • Setting ansible_python_interpreter in ansible.cfg and in the Packer JSON files

  • Setting ansible_scp_if_ssh=False in the Packer JSON file, and also setting scp_if_ssh to False in ansible.cfg

  • Using paramiko with transport = paramiko (this gave amazon-ebs: paramiko.ssh_exception.SSHException: Server connection dropped)

  • Turning off pipelining with ANSIBLE_PIPELINING=False

What is most odd is that this error happens many times in a row, then not at all for a few runs, and then comes back. When building a single AMI, the error comes up rarely. When building multiple AMIs at once (up to 5), it comes up much more frequently.
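
As a rough sketch, the ansible.cfg side of the attempts listed above would look something like the fragment below. It is a reconstruction for illustration, not a copy of the actual file: the section and option names are assumed from the Ansible configuration reference, and the attempts were not necessarily all applied at the same time.

```ini
# Sketch only: reconstructed from the settings described above.

[defaults]
# Alternative transport that was also tried (commented out here).
# transport = paramiko

[ssh_connection]
# Keep the multiplexed control connection open for 30 minutes.
ssh_args = -o ControlMaster=auto -o ControlPersist=30m
# Ask Ansible to use sftp rather than scp for file transfer.
scp_if_ssh = False
# Same effect as exporting ANSIBLE_PIPELINING=False.
pipelining = False
```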

Looking at other posts in the group, this was the closest I could find:
https://groups.google.com/g/ansible-project/c/ukNHUKfY5Zw/m/b_wYImfHEAAJ
However, turning off pipelining via ANSIBLE_PIPELINING unfortunately did not help.

Thanks for reading, appreciate any help with this!

Best,

David