optimizing ansible-playbook boot up time

Hi,

I was looking into why ansible-playbook felt so slow to run, so I ran a profile on just displaying the help text:

$ python2 -m cProfile -o /tmp/ansible.profile $(which ansible-playbook )

$ python2 -m pstats /tmp/ansible.profile
Welcome to the profile statistics browser.
/tmp/ansible.profile% sort cumulative
/tmp/ansible.profile% stats 50
Sat Jun 10 12:43:27 2017 /tmp/ansible.profile

1472930 function calls (1466266 primitive calls) in 0.817 seconds

Ordered by: cumulative time
List reduced from 1823 to 50 due to restriction <50>

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 0.818 0.818 /usr/bin/ansible-playbook:21()
1 0.002 0.002 0.692 0.692 /usr/lib/python2.7/site-packages/pkg_resources/init.py:16()
2 0.000 0.000 0.606 0.303 /usr/lib/python2.7/site-packages/pkg_resources/init.py:3020(_call_aside)
1 0.000 0.000 0.606 0.606 /usr/lib/python2.7/site-packages/pkg_resources/init.py:3037(_initialize_master_working_set)
20 0.000 0.000 0.605 0.030 /usr/lib/python2.7/site-packages/pkg_resources/init.py:683(add_entry)
118 0.006 0.000 0.603 0.005 /usr/lib/python2.7/site-packages/pkg_resources/init.py:2004(find_on_path)
21 0.079 0.004 0.562 0.027 {sorted}
16 0.000 0.000 0.562 0.035 /usr/lib/python2.7/site-packages/pkg_resources/init.py:1978(_by_version_descending)
1 0.000 0.000 0.265 0.265 /usr/lib/python2.7/site-packages/pkg_resources/init.py:641(_build_master)
1 0.000 0.000 0.265 0.265 /usr/lib/python2.7/site-packages/pkg_resources/init.py:628(init)
9128 0.030 0.000 0.260 0.000 /usr/lib/python2.7/site-packages/pkg_resources/init.py:1993(_by_version)
21462 0.032 0.000 0.213 0.000 /usr/lib/python2.7/site-packages/packaging/version.py:24(parse)
184302 0.091 0.000 0.157 0.000 /usr/lib/python2.7/site-packages/packaging/version.py:65(_compare)
33/19 0.001 0.000 0.156 0.008 {import}
21216 0.016 0.000 0.139 0.000 /usr/lib/python2.7/site-packages/packaging/version.py:74(init)
21216 0.053 0.000 0.123 0.000 /usr/lib/python2.7/site-packages/packaging/version.py:131(_legacy_cmpkey)
95554 0.035 0.000 0.117 0.000 /usr/lib/python2.7/site-packages/packaging/version.py:53(eq)
88748 0.031 0.000 0.106 0.000 /usr/lib/python2.7/site-packages/packaging/version.py:47(lt)
460 0.002 0.000 0.104 0.000 /usr/lib/python2.7/re.py:230(_compile)
239 0.000 0.000 0.103 0.000 /usr/lib/python2.7/re.py:192(compile)
148 0.001 0.000 0.102 0.001 /usr/lib/python2.7/sre_compile.py:567(compile)
1 0.000 0.000 0.082 0.082 /usr/lib/python2.7/site-packages/ansible/cli/playbook.py:20()
1 0.000 0.000 0.081 0.081 /usr/lib/python2.7/site-packages/ansible/executor/playbook_executor.py:19()
1 0.001 0.001 0.081 0.081 /usr/lib/python2.7/site-packages/ansible/executor/task_queue_manager.py:19()
1 0.000 0.000 0.078 0.078 /usr/lib/python2.7/site-packages/ansible/executor/play_iterator.py:19()
1 0.000 0.000 0.077 0.077 /usr/lib/python2.7/site-packages/ansible/playbook/init.py:19()
1 0.000 0.000 0.076 0.076 /usr/lib/python2.7/site-packages/ansible/playbook/play.py:19()
1 0.000 0.000 0.073 0.073 /usr/lib/python2.7/site-packages/ansible/playbook/base.py:19()
148 0.001 0.000 0.059 0.000 /usr/lib/python2.7/sre_parse.py:706(parse)
674/148 0.002 0.000 0.058 0.000 /usr/lib/python2.7/sre_parse.py:317(_parse_sub)
1029/168 0.019 0.000 0.058 0.000 /usr/lib/python2.7/sre_parse.py:395(_parse)
60688 0.036 0.000 0.056 0.000 /usr/lib/python2.7/site-packages/packaging/version.py:114(_parse_version_parts)
1 0.001 0.001 0.053 0.053 /usr/lib/python2.7/site-packages/ansible/parsing/dataloader.py:19()
8 0.000 0.000 0.052 0.006 /usr/lib/python2.7/site-packages/pkg_resources/extern/init.py:35(load_module)
21664 0.022 0.000 0.050 0.000 /usr/lib/python2.7/site-packages/packaging/version.py:198(init)
148 0.000 0.000 0.042 0.000 /usr/lib/python2.7/sre_compile.py:552(_code)
1 0.000 0.000 0.035 0.035 /usr/lib/python2.7/site-packages/ansible/parsing/yaml/loader.py:19()
1 0.000 0.000 0.034 0.034 /usr/lib/python2.7/site-packages/ansible/parsing/yaml/constructor.py:19()
1 0.001 0.001 0.033 0.033 /usr/lib/python2.7/site-packages/packaging/requirements.py:4()
1 0.001 0.001 0.033 0.033 /usr/lib/python2.7/site-packages/ansible/vars/init.py:19()
199276/199275 0.032 0.000 0.033 0.000 {isinstance}
1 0.000 0.000 0.032 0.032 /usr/lib/python2.7/site-packages/ansible/inventory/init.py:19()
1 0.000 0.000 0.031 0.031 /usr/lib/python2.7/site-packages/ansible/inventory/dir.py:20()
1975/148 0.010 0.000 0.029 0.000 /usr/lib/python2.7/sre_compile.py:64(_compile)
24570 0.007 0.000 0.025 0.000 /usr/lib/python2.7/sre_parse.py:212(get)
1 0.000 0.000 0.024 0.024 /usr/lib/python2.7/site-packages/ansible/inventory/ini.py:19()
102 0.001 0.000 0.024 0.000 /usr/lib/python2.7/site-packages/pkg_resources/init.py:2436(from_location)
1 0.000 0.000 0.023 0.023 /usr/lib/python2.7/site-packages/ansible/parsing/utils/addresses.py:19()
1 0.000 0.000 0.020 0.020 /usr/lib/python2.7/site-packages/ansible/cli/init.py:20()
27076 0.017 0.000 0.020 0.000 /usr/lib/python2.7/sre_parse.py:193(__next)

It looks like there’s a ton of time spent in pkg_resources before it even got to ansible code, so I was digging around into what that was doing. I saw that in the ansible-playbook entry script, it had this code:

try:
import pkg_resources
except Exception:

Use pkg_resources to find the correct versions of libraries and set

sys.path appropriately when there are multiversion installs. But we

have code that better expresses the errors in the places where the code

is actually used (the deps are optional for many code paths) so we don’t

want to fail here.

pass

From the comment, it sounds like this import isn’t really needed in all cases, so I tried removing it and the execution time dropped from 817ms to 171ms.

Would it be worth removing this code altogether and handling multiversion installs in a different way? IMO, the gain in responsiveness in the common case installs would be worth more complexity/documentation needed on multiversion installs.

.Carlo

pkg_resources is so complex that I’m sure we don’t want to implement it ourselves. And on top of that, when implementing it ourselves, we wouldn’t be guaranteed to be any faster (it would likely depend on how many assumptions we’d want to hardcode versus what pkg_resources does).

What it gives us are access to newer pycrypto and jinja2 libraries. The newer pycrypto is required to run. the newer jinja2 is nice to have (as some useful filters like map aren’t available in older jinja2). This is mostly useful on EPEL6 and other packagesets where the base is a long term stable distro and therefore has old versions of the packages as their default imports.

In Ansible-2.4, users should have the option of using python-cryptography instead of pycrypto (but for people getting Ansible out of EPEL for RHEL6 there isn’t a cryptography rpm package available…). Users can make use of older jinja2 on EPEL6 but that means they don’t have the option to use the newer filter.

A few other thoughts…

  • pkg_resources is used by cryptography (and also in jinja2 code that I think we aren’t accessing). So probably the speedup you see from removing pkg_resources from our code will disappear once people start using cryptography in Ansible-2.4.
  • a startup time savings of less than a second seems like it would be very nice to implement but not a hotspot. In ansible-playbook it will quickly be dwarfed by running tasks. It’ll be more apparent when running /usr/bin/ansible, though… But we still need jinja2 for inventory vars so I don’t think there’s an easy answer to just omitting pkg_resource import for /usr/bin/ansible
  • Also, I’m not able to get corroborating results. I ran:

/usr/bin/time -p ansible-playbook --version

Two times so that all of the files being accessed were in the filesystem cache for the second run. Came out with a real time of around 0.65. Removed the pkg_resources import and tried again. Real time was also around 0.65. So it could be a problem only on the very first run, before the files enter the OS’s fs cache or possibly an SSD versus spinning disks difference.

I’m going on vacation in a few hours but if you have followup I think one of the other devs will be able to reply.

-Toshio