I am a newbie to Ansible, but I would like to explore how to run tasks in parallel by spawning a thread for each task instead of a process. My requirement is to run the playbook on my localhost; no remote task execution is needed.
I would also like to wait for all threads to complete before I move on to a task that has to be serialised.
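For reference, this is roughly the flow I am after, sketched here with async tasks rather than threads; the commands, timings, and task names are only placeholders, not a definitive implementation:

```yaml
# A sketch only: Ansible has no per-task threads, so this uses async tasks
# with poll: 0 to start work in the background on localhost, then waits
# for all of it before the serialised step. Commands and timings are placeholders.
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Start the first long-running command without waiting
      ansible.builtin.command: sleep 20
      async: 60          # maximum allowed runtime in seconds
      poll: 0            # fire and forget; do not block on this task
      register: job_1

    - name: Start a second command in parallel
      ansible.builtin.command: sleep 15
      async: 60
      poll: 0
      register: job_2

    - name: Wait here until every background job has finished
      ansible.builtin.async_status:
        jid: "{{ item.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 30
      delay: 2
      loop:
        - "{{ job_1 }}"
        - "{{ job_2 }}"

    - name: Serialised task that must only run after the parallel work is done
      ansible.builtin.debug:
        msg: "All background jobs completed"
```

The async/poll: 0 combination starts each command without blocking, and the async_status loop acts as the barrier before the serialised task.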
Can I choose threads vs. processes when it comes to parallel task execution?
If it is possible to spawn threads from Ansible, are they equivalent to Python green threads, pthreads, or something else?
The only current process model is forking. There has been some work done to add a threaded process model, but there are some large hurdles to overcome.
In practice, threading is not necessarily more performant, and in many cases it has proven less performant, as it causes more CPU contention on a single core that is already resource constrained.
Thank you for the prompt reply… Just out of curiosity: is the threading work that is underway based on Python threads, pthreads, or some other threading mechanism? Since you mentioned that the threading model is not likely to be performant, is the reason Python’s GIL?
Yes, it would utilize the threading library in Python. The GIL is a primary cause of the CPU restrictions. Our main process, which orchestrates all of the task executions, is already heavily CPU bound, so adding additional threads to the same core can cause a decrease in performance. Assuming we create a process model plugin type, other process models are possible, such as using asyncio, concurrent.futures, gevent, etc. But I don’t expect this work to be complete any time soon.
So for now, consider forking to be the only process model for the near future.
I have another question about concurrency support in Ansible.
Is there any way to limit the number of processes that can be spawned on a given host?
My requirement is not to execute the commands/scripts remotely; in my case, the whole play needs to be executed on localhost only.
I tried a simple test and noticed that as many as 6 processes were spawned to execute ‘sleep 20’ asynchronously.
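For context, a minimal play along those lines might look like the following; this is an illustrative sketch, not the exact test from this post:

```yaml
# Illustrative sketch (assumption, not the exact test described above):
# a single async 'sleep 20' on localhost, used to observe how many
# OS processes Ansible creates for it.
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Run sleep 20 in the background
      ansible.builtin.command: sleep 20
      async: 30   # let the background job run for up to 30 seconds
      poll: 0     # fire and forget; check later with async_status if needed
```

Running it (for example with a hypothetical `ansible-playbook test.yml --forks=1`) and watching the process tree typically shows the ansible-playbook controller, a worker fork, the async wrapper that supervises the background job, and the sleep command itself (plus any intermediate shell), so the total process count is higher than the forks value; forks only caps how many worker processes the controller runs in parallel.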
Thank you, Matt!
In the above example I explicitly passed --forks=1, but 2 worker processes (PIDs 69484 and 69520) were still spawned. Does that mean a minimum of 2 workers will always be spawned and we can’t limit that to one? I understand that there is no control to limit the total number of processes that will be spawned by the workers.