Feedback after "include/include_role: maximum recursion depth exceeded #23609" has been closed

As requested by James Cammarata at the end of the GitHub issue “include/include_role: maximum recursion depth exceeded”, I’m posting my feedback to the ansible-devel list.

I’ve tested the patched version of ansible with a silly playbook that emulates a useless chain of roles including each other through include_role (r1 includes r2, which includes r3, which includes r4, and so on).
With the patched version, my proof of concept now runs successfully with 26 roles that include each other, whereas it was simply impossible to use before that patch.
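A chain like this can be generated with a short script. The sketch below is only an illustration, not the actual proof of concept: the function name `generate_chain`, the `prefix` parameter, and the directory layout (`roles/<name>/tasks/main.yml` next to the playbook) are assumptions. Each role prints a debug message, pulls in the next role, then prints a second debug message once that role returns.

```python
import tempfile
from pathlib import Path


def generate_chain(base_dir, count, action="include_role", prefix="r"):
    """Create `count` roles under base_dir/roles, each pulling in the next one.

    `action` is either "include_role" or "import_role".
    Returns the path of the playbook that starts the chain.
    """
    base = Path(base_dir)
    for i in range(1, count + 1):
        name = f"{prefix}{i}"
        lines = [
            "---",
            "- debug:",
            f'    msg: "Start of {name}"',
        ]
        # Every role except the last one chains to its successor.
        if i < count:
            lines += [
                f"- {action}:",
                f"    name: {prefix}{i + 1}",
                "- debug:",
                f'    msg: "after {action} in {name}"',
            ]
        tasks_dir = base / "roles" / name / "tasks"
        tasks_dir.mkdir(parents=True, exist_ok=True)
        (tasks_dir / "main.yml").write_text("\n".join(lines) + "\n")

    # Hypothetical playbook name: <action>.yml, e.g. include_role.yml
    playbook = base / f"{action}.yml"
    playbook.write_text(
        "---\n"
        "- hosts: localhost\n"
        "  gather_facts: false\n"
        "  tasks:\n"
        f"    - {action}:\n"
        f"        name: {prefix}1\n"
    )
    return playbook


if __name__ == "__main__":
    workdir = tempfile.mkdtemp(prefix="role_chain_")
    print(generate_chain(workdir, 26))
```

Running the printed playbook with `ansible-playbook` from the generated directory should walk the whole chain; raising `count` makes it easy to look for the depth at which the error appears.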

If I emulate more roles (just for fun) that include each other in a chain, the bug still occurs with 27 included roles. That number is high enough that I can now start using the include_role statement in our production playbooks. Thanks for the work, guys!

Another thing I’ve noticed is that ansible becomes very slow as I chain more roles. Around the 18th included role, it takes something like 5 seconds just to print the debug message of the next role and move on. Each further included role is slower still, and so on…
When it comes to unwinding the roles and printing the “after include_role” debug statements, it goes blazingly fast.

That’s all for my feedback.

Again, thanks for the work guys.

Have you tested with import_role? Since it is 'static' it should
show a very different profile than the 'dynamic' include_role.

Indeed it shows a very different profile.

I’ve just updated my POC playbook to use import_role and run the same test.

Here is the result using 8 chained/nested roles:
```
$ ansible-playbook import_role.yml
[...]
PLAY [localhost] ***************************************************************************************************************************

TASK [import_r1 : debug] *******************************************************************************************************************
ok: [localhost] => {
    "attempts": 1,
    "msg": "Start of import_r1"
}

TASK [import_r2 : debug] *******************************************************************************************************************
ok: [localhost] => {
    "attempts": 1,
    "msg": "Start of import_r2"
}

TASK [import_r3 : debug] *******************************************************************************************************************
ok: [localhost] => {
    "attempts": 1,
    "msg": "Start of import_r3"
}

TASK [import_r4 : debug] *******************************************************************************************************************
ok: [localhost] => {
    "attempts": 1,
    "msg": "Start of import_r4"
}

TASK [import_r5 : debug] *******************************************************************************************************************
ok: [localhost] => {
    "attempts": 1,
    "msg": "Start of import_r5"
}

TASK [import_r6 : debug] *******************************************************************************************************************
ok: [localhost] => {
    "attempts": 1,
    "msg": "Start of import_r6"
}

TASK [import_r7 : debug] *******************************************************************************************************************
ok: [localhost] => {
    "attempts": 1,
    "msg": "Start of import_r7"
}
ERROR! Unexpected Exception, this is probably a bug: maximum recursion depth exceeded
to see the full traceback, use -vvv
```

I’m still using the latest devel ansible version in my tests.

It seems that import_role triggers the bug even sooner than include_role does.
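For context, the "maximum recursion depth exceeded" text is the message Python itself raises when its interpreter recursion limit is hit. The sketch below only illustrates that generic mechanism on a deeply nested list; it is not the Ansible code path, and the helper names (`build_nested`, `nesting_depth`, `try_depth`) are made up for the example.

```python
import sys


def build_nested(levels):
    """Build [[[...[]...]]] iteratively, so construction itself never recurses."""
    node = []
    for _ in range(levels):
        node = [node]
    return node


def nesting_depth(node):
    """Recursively count how many list levels wrap the innermost empty list."""
    if not node:
        return 0
    return 1 + nesting_depth(node[0])


def try_depth(levels):
    """Return the measured depth, or the error message if Python gives up."""
    try:
        return nesting_depth(build_nested(levels))
    except RecursionError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_depth(10))
    # Going past the interpreter limit triggers the same kind of message.
    print(try_depth(sys.getrecursionlimit() + 100))
```

Any code that walks a nested structure recursively runs into this limit once the nesting gets deep enough, which fits a failure that depends on how many roles are chained.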

Cheers

Rémi

Another update on this thread after Matt Martz’s update on GitHub issue #23609.

The merge of https://github.com/ansible/ansible/pull/36470 indeed solved the whole **include_role** problem, and everything now works fast and as expected.
Thanks to everyone who has spent time working on this!

The problem with **import_role** still seems to be present, which makes it difficult to use in production.

I’ve updated my proof of concept accordingly, so that it makes the **import_role** bug easy to trigger.