So I'm playing around with some performance testing, and a realization hit me as I was looking at the code and passing ridiculously high fork counts.
Ansible will create as many forks as you ask for before processing anything, even if your fork count is far higher than your host count. This adds a penalty to each task: the cost of spinning up all the forks and then shutting them all down again after every host has been run through the task.
In my testing, I've got 9 tasks, most of which are quick, though one or two may take a while. There is a big benefit to having enough forks to handle all the hosts I want to loop over (in my current tests, 6 hosts). The optimum fork count appears to be the number of hosts, so 6. With 6 forks, my playbook completes in 25 seconds; with 800 forks it takes over a minute to finish.
Looking through the code, it doesn't appear it would take much to add an option that sets the fork count based on the number of hosts in a run. That way I wouldn't have to fiddle with the -f option each time I increase or decrease hosts; I could set it once and keep things lined up. Would such an option be entertained? Would it be a new option like --forks-by-hosts, or a special value for the existing --forks option?
-jlk
It seems like it would be fine to cap forks at the number of hosts that are sent down into Runner (no problems at all), but it would not be good to auto-scale the fork number arbitrarily, as the user should still be able to specify it; thus --forks becomes an upper limit.
Another thing I just thought of -- when implementing this, take care to notice how Runner objects are reused in playbooks.
Using the add_host module, we might want to allocate more forks.
This may mean we actually do want to make an option like --forks="<50" etc.?
Alternative -- play with where the multiprocessing queue is initialized, allowing it to possibly be passed in from Playbook code. If you're paying the startup cost, it's likely happening more than once.
I was thinking something like this (per play).
if forks > #numhosts or forks == 0: forks = #numhosts
so no new command-line option; just setting forks=0 makes it 'smart' and we get both features in one shot.
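That one-liner could be sketched as a small runnable helper; the name `effective_forks` and its signature are illustrative here, not anything in the actual Ansible codebase:

```python
def effective_forks(forks, num_hosts):
    """Return the number of worker processes to actually spawn.

    forks == 0 means "smart" mode: use one fork per host.
    Otherwise, never spawn more forks than there are hosts,
    since the extra processes would sit idle.
    """
    if forks == 0 or forks > num_hosts:
        return num_hosts
    return forks
```

Under this sketch, the 800-fork/6-host case from the benchmark above would spawn only 6 workers.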
That could work. I'm not sure how often the _parallel_exec() function out of the run() function is called, but I suspected it was per task. I was going to do some testing today to verify.
I think it might be nice to have "if 0, set to the number of hosts," but also to have a routine that ensures you don't spin up more forks than you have hosts, as those seem to be wasted processes. So always cap the number of forks at the number of hosts in that run().
Thoughts?
-jlk
I think it's worth benchmarking to make sure fork creation overhead is not paid per task, and trying the idea of constructing the multiprocessing pool in Playbook.py, just to see whether it has any effect.
And yes, test with one host and a play that adds several via add_host, and make sure it gets the right fork count.
This could be as simple as constructing the fork pool in the playbook and then, if the needed count grows beyond the previous pool size, allocating a new pool. (Assume play B is larger than play A.)
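That pool-reuse idea could be sketched roughly like this (illustrative code only, not Ansible's actual internals): keep one pool alive across plays and replace it only when a play needs a larger one, so the fork startup cost is paid per size increase rather than per play or per task.

```python
import multiprocessing

_pool = None       # the shared pool, created lazily
_pool_size = 0     # size of the current pool

def get_pool(needed):
    """Return a pool with at least `needed` workers, reusing the
    existing one whenever it is already big enough."""
    global _pool, _pool_size
    if _pool is None or needed > _pool_size:
        if _pool is not None:
            # Tear down the too-small pool before growing.
            _pool.close()
            _pool.join()
        _pool = multiprocessing.Pool(processes=needed)
        _pool_size = needed
    return _pool
```

So if play A needs 2 workers and play B needs 4, only the transition to play B pays the startup cost again; a later play needing 3 workers reuses play B's pool.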
It should be pretty easy to figure out what the cost of initializing the multiprocessing library is, but I think we can probably pay it less often.
It seems like it would be fine to cap forks at the number of hosts that are sent down into Runner (no problems at all), but it would not be good to auto-scale the fork number arbitrarily, as the user should still be able to specify it; thus --forks becomes an upper limit.
For some reason that's what I thought it did to begin with. I'll welcome that being the actual behavior. If I get a chance I'll look back at the documentation and see whether a patch might be warranted for that. (Probably after I'm done getting my tig patches into shape and accepted, and done reviewing a couple of patches for git.)
While I like the simplicity of this approach, I also remember that setting a "limit" option (if that's what "forks" should be) to "0" oftentimes means "no limit," which doesn't appear to be exactly what this conversation is aiming at. This would be worth documenting clearly.
I’ve not peeked into the code to check, but what does “forks” default to if it is not set on the command line? (This may be in the manual, but I don’t remember off the top of my head, nor remember where I might find it.)
In cases like this, some other tools have chosen a value that would otherwise be nonsensical (like "h") as the option value to prevent confusion. I would hope that setting --forks=h would be memorable enough that the user wouldn't have to look back and find that "h" means "hosts" in that context. (That would definitely need to be added to the manual, however.)
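As a sketch of that sentinel-value idea (hypothetical parsing code; this is not how Ansible parses --forks today, and both the function name and the "h" sentinel are just the suggestion above made concrete):

```python
def parse_forks(value, num_hosts):
    """Parse a --forks option value.

    "h" is a sentinel meaning "match the host count"; any numeric
    value must be at least 1.
    """
    if value == "h":
        return num_hosts
    n = int(value)
    if n < 1:
        raise ValueError("forks must be at least 1, or 'h' for host count")
    return n
```

The advantage of a sentinel over overloading 0 is that "0 means no limit" expectations (as raised above) never come into play.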
Made a PR with this:
https://github.com/ansible/ansible/pull/3650
0 becomes 'no limit', or rather 'fork as many as possible and sensible'.
The other part of the patch makes sure it spawns only as many forks as there are targeted hosts, be it 1 or 500, since it makes little sense to fork more processes than the hosts you are going to target.
Now if you have 50 hosts but set forks to 500, you will only spawn 50 for the current task, as the other 450 forks would not be used anyway.
forks defaults to 5; it is settable on the command line, in the config file, and in plays.
Correction: it only does this as long as forks is set to 0 or to more than the number of hosts; if it is set lower, the fork count stays at the level set (i.e. forks=5 with 50 hosts will still only fork 5 at a time).
0 really should not be a special value; anything less than 1 should be treated as 1.
Really, let’s just treat --forks as an upper bound. It makes more sense that way.
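The "upper bound" reading suggested above boils down to a one-line clamp (the helper name is hypothetical):

```python
def workers_for(forks, num_hosts):
    # Treat --forks purely as an upper bound: values below 1 are
    # bumped up to 1, and we never spawn more workers than there
    # are hosts, so no value is "special".
    return min(max(1, forks), num_hosts)
```

This covers both earlier proposals without a sentinel: oversized fork counts are capped at the host count, and nonsensical values like 0 or negatives simply degrade to serial execution.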