Wacky SSH timeout bug with Ansible

tl;dr: I’m seeing an Ansible bug where commands take a long time to start and then complete quickly; it seems to affect Ubuntu VMs.

Full description: I’m using Ansible to orchestrate code deployments. These deployments frequently take much longer than expected, because individual steps that should take only a few seconds (like copying a small file or gathering facts) take up to 4 minutes, with nearly all of that time spent waiting on SSH.

The Ansible profiler output shows that commands almost always take some exact multiple of 60 seconds, plus a small, reasonable, expected delta. Here’s an example:

Monday 08 August 2016 11:26:18 -0700 (0:00:00.650) 0:13:51.907 *********
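To check the "multiple of 60 seconds plus a small delta" pattern across a whole run rather than by eye, here is a rough Python sketch. It assumes the parenthesised value in each profiler line is the per-task duration, in the same format as the sample above. The function names, the 60-second period, and the 5-second delta cutoff are all my own illustrative choices, not anything provided by Ansible.

```python
import re

# Matches the parenthesised per-task duration in a profiler line,
# e.g. "(0:00:00.650)". The format is assumed from the sample in this post.
DURATION_RE = re.compile(r"\((\d+):(\d{2}):(\d{2}\.\d+)\)")


def parse_duration(line):
    """Return the parenthesised duration in seconds, or None if absent."""
    match = DURATION_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def split_stall(duration, period=60.0):
    """Split a duration into (whole periods, leftover seconds)."""
    periods = int(duration // period)
    return periods, duration - periods * period


def looks_stalled(duration, period=60.0, max_delta=5.0):
    """True if the duration is N >= 1 whole periods plus a small delta.

    max_delta is a hypothetical cutoff for "small, expected" run time.
    """
    periods, delta = split_stall(duration, period)
    return periods >= 1 and delta <= max_delta
```

Running `parse_duration` over every line of the profiler output and filtering with `looks_stalled` should separate tasks that hit the 60-second wait from tasks that are simply slow.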

On a whim, I tried setting pipelining to true, and found that it dramatically reduced the incidence of these issues, but still did not resolve them entirely.
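For anyone who wants to try the same thing, pipelining is normally turned on in the [ssh_connection] section of ansible.cfg. This is a minimal fragment, assuming an otherwise default config; as far as I know, pipelining with sudo also requires requiretty to be disabled in sudoers on the target hosts.

```ini
[ssh_connection]
pipelining = True
```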

I still see simple commands that take about 61 seconds to complete: the 60-second wait plus roughly 1 second of actual work.

If anyone is seeing the same issues, definitely speak up. Thanks!