Hi all, I’ve recently been playing around with using threads instead of multiprocessing.Process objects for the workers in Ansible, and I think it’s ready for some real-world testing.
https://github.com/ansible/ansible/compare/threading_instead_of_forking
Here are a few synthetic test results.
First, I have a playbook that does 100 debug calls against localhost. Since debug is implemented purely as an action plugin, this does not hit the connection plugin layer or the remote side in any way. All of these numbers are the average of five runs:
threading = 1.208s
devel = 2.394s
stable-2.3 = 1.968s
stable-1.9 = 0.474s
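For anyone who wants to reproduce this kind of number, here is a minimal sketch of an "average of N runs" timing harness. It is not the script used for the results above; the function name average_runtime and the playbook name in the comment are hypothetical.

```python
import statistics
import subprocess
import sys
import time


def average_runtime(cmd, runs=5):
    """Run cmd `runs` times and return the mean wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is not averaged in.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


# Stand-in command so the sketch runs anywhere. For a real measurement, pass
# something like ["ansible-playbook", "-i", "localhost,", "debug100.yml"]
# (debug100.yml is a hypothetical playbook name).
print(f"{average_runtime([sys.executable, '-c', 'pass']):.3f}s")
```

Wall-clock time is used here because it includes process startup, which is part of what a user actually waits for.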
Next, a playbook that does 100 pings against localhost. This fully involves the connection layer and module execution paths of the executor engine:
threading = 15.44s
devel = 16.53s
stable-2.3 = 14.84s
stable-1.9 = 7.62s
Finally, a larger setup: an inventory of ~1000 hosts with 19k variables in group_vars/all. This runs with a limit of all[0] (the first host in inventory), and the playbook has a single task, which does - debug: var=hostvars[inventory_hostname]:
threading = 5.18s
devel = 5.08s
stable-2.3 = 4.81s
stable-1.9 = 2.23s
So, what do these tests tell us? First, 1.x was extremely fast, though we’ve made a lot of strides since 2.1/2.2 in closing the gap left by the major overhaul of the engine. Second, it looks like devel is slightly slower than 2.3 (probably because of the inventory rewrite), which may require some tweaking before 2.4 is released.
But what these numbers don’t reflect is memory usage, and that’s the real benefit of using threads. With forking, we’re copying all of the in-memory data to a new process space. CPython stores each object’s reference count inside the object itself, so merely taking a reference writes to the page holding that object; because of this, we’re probably quickly incurring memory copies and not benefiting much from copy-on-write (COW, there’s an Ansible joke in there somewhere…).
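To make the ref count point concrete, here is a small sketch showing that simply holding another reference to an object changes its reference count, which lives in the object’s own memory. The dictionary contents and the function name are made up for illustration, and this only demonstrates the write itself, not the resulting page copies in a forked child.

```python
import sys


def refcount_delta_after_new_reference():
    """Return how much an object's refcount changes when one more reference is taken."""
    # Hypothetical stand-in for a chunk of inventory data.
    host_vars = {"ansible_host": "127.0.0.1"}
    before = sys.getrefcount(host_vars)
    # Storing the dict in a list takes a new reference. No data in the dict
    # changes, but CPython increments the counter kept in the object header.
    holders = [host_vars]
    after = sys.getrefcount(host_vars)
    del holders
    return after - before


print(refcount_delta_after_new_reference())
```

In a forked worker, that counter update is a write to a page inherited from the parent, so the kernel has to give the child its own copy of the page even though the worker only "read" the data.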
With threading, this should not happen, as the memory space is shared and not copied. So if you’ve got a playbook/env. that shows a lot of memory pressure from Ansible, I’d be really interested in hearing if/how much the threading branch improves this.
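As a rough illustration of the shared memory space, here is a sketch of a thread-based worker pool in which every worker sees the very same variables object as the main thread. This is not the code in the branch; run_workers, the host names, and the variable data are all hypothetical.

```python
import queue
import threading


def run_workers(shared_vars, hosts, num_workers=4):
    """Process hosts on worker threads that all read the same shared_vars object."""
    tasks = queue.Queue()
    for host in hosts:
        tasks.put(host)

    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                host = tasks.get_nowait()
            except queue.Empty:
                return  # no work left
            # Record the identity of the object this thread sees. No copy is
            # made, so every thread reports the same id as the main thread.
            with results_lock:
                results[host] = id(shared_vars)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


all_vars = {"var_%d" % i: i for i in range(1000)}
seen = run_workers(all_vars, ["host%d" % i for i in range(10)])
print(set(seen.values()) == {id(all_vars)})
```

The results dict is also written directly by the workers and read by the main thread afterwards, with only a lock for coordination and no serialization between processes.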
Eventually, as we move past Python 2.x, we’d look at moving these threading bits to the asyncio framework in Py3.5+, but that would be quite a ways down the road.
Thanks!
James Cammarata