Parallel execution of 'command' task

I’d like to be able to run a command line tool ~500 times, 4 at once, per server, to work through a (per server) list.

We currently have this in our playbook:

  - name: list existing projects
    shell: find /var/define/projects/ -mindepth 2 -maxdepth 2 -name VERSION -type f -printf "%h\n" | sort
    register: list_projects
    changed_when: False

  - name: upgrade existing projects
    when: define_upgrade is defined and define_upgrade | changed
    command: /opt/define/bin/define-project-admin {{ item }} upgrade --no-backup
    with_items: list_projects.stdout_lines

There are around 500 items in list_projects.stdout_lines so this takes a while, and seems to be particularly affected by the ssh roundtripping from ansible on the client to our servers. We’ll be starting to run ansible from within the server’s network, and/or using the accelerated mode, but we also want to start running this command with some parallelism. I can imagine an extra argument to ‘command’: “parallel=4” to run 4 commands at a time in threads/child processes. This would also help us if one of the “upgrades” takes a lot of time - we want other threads to continue working through the list.

I thought I remembered that the ‘yum’ task can read the with_items list in one go, but I don’t find evidence of that now. Is it true? If so, I could potentially enhance ‘command’ to internally work through the list using internal queues?

Another way would be to write our own action_plugin, learning from https://github.com/ansible/ansible/blob/devel/lib/ansible/runner/action_plugins/ examples. Would we still use with_items, or would we pass list_projects.stdout_lines to our new plugin as the argument?

I’ve tried adding “async: 1”, but at the moment ansible stops with: “lookup plugins (with_*) cannot be used with async tasks”. I’m reading http://docs.ansible.com/playbooks_async.html - but even if lookup plugins did work here, it’s not clear to me if this would run multiple processes on each server. Maybe it would, in the case of --forks being larger than the number of servers I have? I tried testing this out, which is what made me find you can’t use lookup plugins with async tasks. I see https://github.com/ansible/ansible/issues/5841 is regarding this limitation, and it looks like it’s not designed to be fixed so I should find another way.

I’m looking for advice on the most “ansible” way forward! Ideally I’d fix/provide an ansible feature. I think if I were to progress further myself without advice, I’d probably start on an action_plugin, which takes the list directly as a single argument. I’d include that into my local folder like I do with the callback_plugin we have. Probably also I should combine /that/ with “async” to get polling as it would be a long-running task.

Regards,

Nick

I would expect one to be able to run async tasks with lookup plugins. This restriction comes as an unpleasant surprise for the user. It does not make sense. One should be able to run a loop that e.g. fires and forgets a number of tasks. What is the reason for not allowing this?

It’s poor form to phrase a question as “what is the reason for not allowing this” when referring to a feature you’d like to request.

Now let me address the original post.

First off, this appears to be a user question asked on the development list. Please post these to -project in the future.

That feature is highly relevant to the original post, it has been referred to by the original poster himself, and it would definitely allow him to do what he wants to do. Also, that feature has already been requested by another person on github (issue #5841) and you have answered that the code “says” it is not going to be implemented. So, I asked why this is not going to be implemented. I read in your last post that this is an implementation detail. Could you please elaborate on this?

Hi Michael, thanks for your reply to this.

First off, this appears to be a user question asked on the development list. Please post these to -project in the future.

I came in with the expectation that I’d be writing a patch or new feature, however you’re right that the initial question of “How might I do this?” is a user one, so I shall be more accurate on the list next time.

I’d like to be able to run a command line tool ~500 times, 4 at once, per server, to work through a (per server) list.

Ansible is naturally a parallel system so if you have 500 hosts, that’s what is accomplished by using the host loop. If you want to limit things to run 4 times at once, use the “serial: N” keyword on the play and set it to 4.
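For reference, a sketch of what “serial” controls (the play and group names below are hypothetical) — it batches hosts, so at most 4 hosts run the play concurrently; it does not parallelise tasks within a host:

```yaml
# Hypothetical play: "serial" limits the host batch size, not per-host tasks.
- hosts: projectservers        # assumed group name
  serial: 4                    # run the play on at most 4 hosts at a time
  tasks:
    - name: upgrade existing projects
      command: /opt/define/bin/define-project-admin {{ item }} upgrade --no-backup
      with_items: list_projects.stdout_lines
```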

We have 10 hosts, and 500 tasks (generated by a with_items) for each host to perform. Each takes a couple of seconds, but the host could cope with running several in parallel, if we can find a way for ansible to do that.

We currently have this in our playbook:

  - name: list existing projects
    shell: find /var/define/projects/ -mindepth 2 -maxdepth 2 -name VERSION -type f -printf "%h\n" | sort
    register: list_projects
    changed_when: False

  - name: upgrade existing projects
    when: define_upgrade is defined and define_upgrade | changed
    command: /opt/define/bin/define-project-admin {{ item }} upgrade --no-backup
    with_items: list_projects.stdout_lines

There are around 500 items in list_projects.stdout_lines so this takes a while, and seems to be particularly affected by the ssh roundtripping from ansible on the client to our servers. We’ll be starting to run ansible from within the server’s network, and/or using the accelerated mode,

You should definitely look into Control Persist and pipelining if you are running something other than EL 5/6, otherwise accelerated mode is great.

It is EL6.

but we also want to start running this command with some parallelism. I can imagine an extra argument to ‘command’: “parallel=4” to run 4 commands at a time in threads/child processes. This would also help us if one of the “upgrades” takes a lot of time - we want other threads to continue working through the list.

Each task in Ansible is executed in parallel across servers, but it is not written to execute all tasks in a playbook at the same time. That’s simply not what it does.

You can of course use “fire and forget” async tasks to spin off 4 tasks if you don’t need to wait on them for completion.
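For completeness, a single fire-and-forget task looks like this (the project name and timeout are made up; per the limitation discussed below, this cannot currently be combined with with_items):

```yaml
# Fire-and-forget sketch: one async task, not waited on.
- name: upgrade one project in the background
  command: /opt/define/bin/define-project-admin example-project upgrade --no-backup
  async: 3600   # allow up to an hour before the job is reaped
  poll: 0       # don't wait for completion
```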

I would like to wait for completion, so that we can collect information about the success (we use a callback plugin for that.)

I thought I remembered that the ‘yum’ task can read the with_items list in one go, but I don’t find evidence of that now. Is it true? If so, I could potentially enhance ‘command’ to internally work through the list using internal queues?

It’s been true for a long time.

if you are doing:

  - yum: name={{ item }}
    with_items: list

These are all installed in one transaction.

I thought so, but I had trouble understanding how that might work given what I know of Modules and lookup plugins. I found it now though: https://github.com/ansible/ansible/blob/d9df6079724267a2cfe274ef17cc6effb1c29bfb/lib/ansible/runner/__init__.py#L625 . As it’s a special case, it means that Modules in general can’t “see” the full with_items results; instead, the Module is called once for each item. It means that a Module can’t do its own internal parallel work, so I couldn’t extend the Command module to have a “parallel=n” argument.

Another way would be to write our own action_plugin, learning from https://github.com/ansible/ansible/blob/devel/lib/ansible/runner/action_plugins/ examples. Would we still use with_items, or would we pass list_projects.stdout_lines to our new plugin as the argument?

You’ve lost me a bit here, what is your specific question and how did we get into the yum part of this discussion? Let’s talk about the use case before we dive into implementations and plugins.

The specific question is: I have 10 hosts, 500 tasks per host, which are generated by a with_items and using the ‘command’ module. I’d like to run several tasks at once, per host, so that if one of the tasks takes a long time then the remaining 499 can still be worked through while the slow one is underway.

I spoke of the yum module because I saw that it can “see” the full with_items list, and I wondered how that worked.

I’ve tried adding “async: 1”, but at the moment ansible stops with: “lookup plugins (with_*) cannot be used with async tasks”. I’m reading http://docs.ansible.com/playbooks_async.html - but even if lookup plugins did work here, it’s not clear to me if this would run multiple processes on each server. Maybe it would, in the case of --forks being larger than the number of servers I have? I tried testing this out, which is what made me find you can’t use lookup plugins with async tasks. I see https://github.com/ansible/ansible/issues/5841 is regarding this limitation, and it looks like it’s not designed to be fixed so I should find another way.

What this means is that you can’t loop to spawn async tasks currently. This is an implementation detail.

Most operations in Ansible don’t require async - so it’s usually not much of a thing.

You may wish to spawn off something using the script module that does something very specific if you have something very specific in mind, which could launch 4 things at once or whatever.
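A minimal sketch of that approach, assuming GNU xargs is available on the hosts. The project names here are placeholders for the real per-server list, and echo is used as a dry run in place of the real admin tool:

```shell
# Work through a list 4 at a time with xargs -P; -I {} consumes one line
# per invocation. Drop the echo to actually run the upgrade command.
printf '%s\n' proj-a proj-b proj-c proj-d proj-e \
  | xargs -P 4 -I {} echo "/opt/define/bin/define-project-admin {} upgrade --no-backup"
```

In the real playbook the printf would be replaced by the same find | sort pipeline used in the “list existing projects” task.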

Yep, that’s the way I probably will end up going if I can’t do this in ansible (or help improve ansible to do it, if it makes sense to you and I can write a reasonable patch). The downside is that the log information we get (via our callback plugin) is nicely structured, as each “thing” is a task which was logged individually. If I spawn a custom script, I think we lose the possibility of ansible’s existing infrastructure “seeing” information about the success/failure of each action.

I’m looking for advice on the most “ansible” way forward! Ideally I’d fix/provide an ansible feature. I think if I were to progress further myself without advice, I’d probably start on an action_plugin, which takes the list directly as a single argument. I’d include that into my local folder like I do with the callback_plugin we have. Probably also I should combine /that/ with “async” to get polling as it would be a long-running task.

If you want to make it possible to use loops to spawn async tasks this could be done with some fiddling in runner/__init__.py most likely. It hasn’t been a pressing issue for us, but if it can be done cleanly, that’s fine.

I have absolutely no idea why you are talking about action plugins - what that would even do - and that seems like the wrong direction to go in.

I got that way because I saw ‘script’ was an action plugin, and I didn’t appreciate the difference between an Ansible Module and an action plugin. I see it now though: a module is moved to the server and executed there (normally once per task, after with_items, although yum/apt/… is different); an action plugin runs locally and so can move content to the server (and then invoke a module, if necessary).