Playbook slowness when using dependencies

Hello,

While getting our playbooks, tasks, roles, and callbacks compatible with Ansible 2.x, I’ve noticed that the playbooks take noticeably longer to complete on 2.x. Today I had some time to tweak things and try to make Ansible 2.x perform as fast as possible. At times we push to hundreds of hosts, and these additional seconds really add up when pushing multiple roles.

I’ve narrowed the issue down to how we “chain-load” a total of 66 roles from one role called common, via the main.yml in its meta folder. We do this to keep the playbooks clean, and to ensure that all the roles defined in common’s meta get played out in our various playbooks with just one line. We only push the full playbook when we first provision a host; after that we often use the tags defined on the roles in common’s meta to target the one role we are interested in.

I did find one bug (https://github.com/ansible/ansible/issues/14112) that has been filed and seems related. I applied the patch and it helps, but performance is still not great.

Below are timings for our dns role, which has six tasks using the following modules: template, stat, file, and copy. I was hitting a total of 6 hosts and, to optimize for speed, ran ansible-playbook with 6 forks (-f6). I ran four tests:

1) all roles listed as dependencies in our “shell” role common,
2) the same as 1, with the patch I found relating to inter-dependency slowness,
3) all the roles moved from common’s meta main.yml directly into the playbook, and
4) only the dns role in the playbook.

Ansible 2.x got significantly faster as I made my way from scenario 1 to scenario 4. Since this runs at pretty good speeds on Ansible 1.9.2, should I expect similar speeds on 2.x? Is this a known issue? My timings for the same command across the different scenarios follow.
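For anyone unfamiliar with the layout, the arrangements compared in scenarios 1 and 3 look roughly like the sketch here. Only the role names common and dns are from our setup; ntp, the hosts pattern, and the file paths are placeholders, and the real meta list has 66 entries.

```yaml
# roles/common/meta/main.yml (scenarios 1 and 2)
# The "shell" role: it exists to pull in the other roles as dependencies.
dependencies:
  - { role: dns, tags: ['dns'] }
  - { role: ntp, tags: ['ntp'] }   # placeholder standing in for the remaining roles
---
# playbook-common.yml (scenarios 1 and 2): one line pulls in everything
- hosts: all                       # placeholder host pattern
  roles:
    - common
---
# playbook-common.yml (scenario 3): the same roles listed directly
- hosts: all
  roles:
    - { role: dns, tags: ['dns'] }
    - { role: ntp, tags: ['ntp'] }
```

In both layouts, running with -t dns selects only the tasks tagged dns.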

All roles listed in meta of common

time ansible-playbook -f6 -i hosts-nixtest -t dns playbook-common.yml

Version    User (s)   System (s)   Total (s)
1.9.2      5.35       1.3          5.152
2.0.2      13.76      2.96         12.093
2.1.0      13.46      2.87         12.472

All roles listed in meta of common

time ansible-playbook -f6 -i hosts-nixtest -t dns playbook-common.yml

Patched with https://github.com/ansible/ansible/issues/14112

Version           User (s)   System (s)   Total (s)
1.9.2             5.95       1.3          6.287
2.0.2             12.76      2.68         10.869
2.1.0             13.29      2.87         11.872
2.1.0 (Patched)   9.1        2.51         10.71

Moved dependencies listed in meta of common role to the playbook

time ansible-playbook -f6 -i hosts-nixtest -t dns playbook-common.yml

Version    User (s)   System (s)   Total (s)
1.9.2      5.94       1.36         6.703
2.0.2      5.36       2.09         8.654
2.1.0      5.83       2.34         8.774

Just the single DNS role in playbook

time ansible-playbook -f6 -i hosts-nixtest -t dns playbook-common.yml

Version    User (s)   System (s)   Total (s)
1.9.2      5.58       1.37         5.956
2.0.2      4.39       2.69         16.978
2.1.0      2.89       2.03         5.341

Thanks,

Chris

Working your way back...

I would expect 4 to be faster than 3 because your playbook is significantly shorter. In 3 Ansible is having to check every task for the tag and then run or not run it. In 4 it is doing the same, but with significantly fewer tasks.

You would think 3 and 2 should take the same time, but in 2 Ansible has to build a list of tasks from multiple roles and make sure it adds each task just once. Judging from another question about dependencies, the way they are handled seems to have changed.

In addition, the strategy you are using could be causing issues. You might want to look at http://docs.ansible.com/ansible/playbooks_strategies.html too.
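As a rough sketch of what switching strategies looks like, the strategy keyword is set per play. The host pattern and role list here are guesses at your setup, not taken from your playbook.

```yaml
# Hypothetical play; only the "strategy" line is the point of the example.
- hosts: all
  strategy: free      # the default is "linear"
  roles:
    - common
```

With free, each host runs through its tasks as fast as it can instead of waiting for every host to finish a task before moving on.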

I just tried the free strategy on Ansible 2.1.0 and the timing is better, yet scenario 3 under the default linear strategy is still faster, and its output is easier on the eyes. I have yet to dig into the dependency code, and I realize there were tons of rewrites done to optimize development. I guess I was looking more for an answer like, “we didn’t anticipate someone using dependencies this way” or “the dependency algorithm still needs optimizations”. I believe the patch I applied was a step in the direction of better optimization. It’s a huge jump going from 5.152 seconds on 1.9.2 to 12.472 seconds on 2.1.0, an increase of 142% for this particular role. That may not seem like a huge deal for 6 hosts, but at times we use Ansible to push out to over 1000 hosts.

All roles listed in meta of common, Using Free Strategy

time ansible-playbook -f6 -i hosts-nixtest -t dns playbook-common.yml

Version    User (s)   System (s)   Total (s)
1.9.2

The dependency algorithm still needs optimisation... :)

I suspect that some of the issues are due to added functionality... Things like the addition of strategies probably added extra requirements. There is another post where someone else moving from 1.9 to 2.0 now gets an error about circular dependencies, so I think they have probably also increased the dependency checking.

I am not sure I agree with the way that you are using dependencies myself... In my opinion, a dependency is something like this: some of our servers need Perl on them, and all of our web servers also need Perl on them. Installing Perl would be a role, and the web server role, which requires Perl in order to work, would declare it as a dependency.
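A minimal sketch of that kind of dependency, assuming roles named webserver and perl (both names are made up for illustration):

```yaml
# roles/webserver/meta/main.yml (hypothetical role names)
# The web server role cannot work without Perl, so it declares the perl role
# as a dependency; Ansible runs perl before webserver.
dependencies:
  - role: perl
```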

In my site.yml I group servers by OS, and then call different roles for Debian-based, RedHat6, and RedHat7 hosts. The RedHat6 and RedHat7 roles both depend on a RedHat Common role. Finally, all Linux hosts call a LinuxCommon role.

I could instead call a Linux common role and have it depend conditionally on the Debian, RedHat6, and RedHat7 roles, with the last two depending on RedHat Common. I might change to just that, particularly as I need to add AIX support.
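Roughly, that alternative could look like the sketch here. The role names and file paths are illustrative rather than my actual ones, and the conditions assume the standard ansible_os_family and ansible_distribution_major_version facts have been gathered.

```yaml
# roles/linux_common/meta/main.yml (illustrative role names)
dependencies:
  - role: debian
    when: ansible_os_family == 'Debian'
  - role: redhat6
    when: ansible_os_family == 'RedHat' and ansible_distribution_major_version == '6'
  - role: redhat7
    when: ansible_os_family == 'RedHat' and ansible_distribution_major_version == '7'
---
# roles/redhat6/meta/main.yml (roles/redhat7/meta/main.yml would look the same)
dependencies:
  - role: redhat_common
```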

Once the OS is configured using those base roles, individual roles are used to configure application-specific things. But they do not have dependencies on the base OS roles.