Parallel deployment issue

Hi,

We are using Ansible to deploy many different services on many servers.

We developed a backend that starts an Ansible playbook when the user sends a REST request.

In our project, we deploy entire platforms and we have a web GUI to monitor the deployment.

Since we need per-service or per-host granularity to get information during the deployment of each service (success, failure, etc.), we decided to run one ansible-playbook process per inventory host, so that our manager can collect the return code from each process.
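The "one process per host" approach described above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual backend: the playbook path (`site.yml`), inventory path, and host names are placeholders, and the `runner` parameter only exists so the fan-out logic can be exercised without Ansible installed.

```python
# Sketch of "one ansible-playbook process per inventory host", collecting
# the return code of each process. Paths and host names are placeholders.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_for_host(host):
    """Run the playbook limited to a single host; return its exit code."""
    proc = subprocess.run(
        ["ansible-playbook", "site.yml", "-i", "inventory", "--limit", host]
    )
    return proc.returncode

def deploy_all(hosts, runner=run_for_host, max_workers=20):
    """Launch one playbook run per host; map each host to its return code."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = pool.map(runner, hosts)
    return dict(zip(hosts, codes))
```

A non-zero return code flags the failed hosts, but each run spawns a full ansible-playbook parent process, which is exactly the memory pressure described below once the host count grows.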

The problem is that when we deploy to more than 20 servers, there are 20 parent ansible-playbook processes, and they are VERY resource-hungry (load average around 50); some processes then get killed because of OOM issues.

So we decided to use the "free" strategy and run only one playbook for all hosts, but then we lost the per-host return code granularity, and we really need it.
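One way to keep per-host granularity with a single playbook run is Ansible's `json` stdout callback (enabled with `ANSIBLE_STDOUT_CALLBACK=json`), whose final `stats` section reports failures and unreachable counts per host. A hedged sketch of parsing it, with illustrative sample output rather than a real run:

```python
# Sketch: recover per-host success/failure from ONE playbook run by parsing
# the "stats" section emitted by Ansible's "json" stdout callback
# (ANSIBLE_STDOUT_CALLBACK=json). The sample below is illustrative only.
import json

def per_host_status(json_output):
    """Map each host to 'ok' or 'failed' from the json callback's stats."""
    stats = json.loads(json_output)["stats"]
    return {
        host: "failed" if s.get("failures", 0) or s.get("unreachable", 0) else "ok"
        for host, s in stats.items()
    }

# Illustrative stats, shaped like the json callback's summary section.
sample = json.dumps({
    "stats": {
        "web1": {"ok": 5, "failures": 0, "unreachable": 0},
        "web2": {"ok": 3, "failures": 1, "unreachable": 0},
    }
})
```

This trades 20 heavy parent processes for one run plus some output parsing, at the cost of only learning the per-host outcome from the summary rather than from an exit code per process.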

We could add more CPU/RAM, but that doesn't seem like a scalable solution.

Our goal is to deploy to 100+ hosts simultaneously, as fast as possible.

We don't want to wait for the end of the playbook to detect errors on some hosts; we'd rather detect errors as soon as possible, so we can re-run only the failed hosts.
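For detecting failures as soon as they happen, rather than waiting for the playbook to finish, one option is a custom Ansible callback plugin: the `v2_runner_on_failed` hook fires per failed task/host during the run. The sketch below is an assumption about how the poster's manager could be notified; the plugin name and the manager endpoint are hypothetical, and the import guard only exists so the payload logic can run without Ansible present.

```python
# Sketch of a custom callback plugin that reports each failed host to the
# deployment manager the moment the failure happens. Plugin name and
# manager endpoint are hypothetical; v2_runner_on_failed is the hook
# Ansible invokes when a task fails on a host.
try:
    from ansible.plugins.callback import CallbackBase
except ImportError:            # lets the payload logic run without Ansible
    CallbackBase = object

def failure_payload(host, task, msg):
    """Build the body that would be POSTed to the manager for one failure."""
    return {"host": host, "task": task, "status": "failed", "msg": msg}

class CallbackModule(CallbackBase):
    CALLBACK_VERSION = 2.0
    CALLBACK_TYPE = "notification"
    CALLBACK_NAME = "notify_manager"   # hypothetical plugin name

    def v2_runner_on_failed(self, result, ignore_errors=False):
        payload = failure_payload(
            result._host.get_name(),
            result._task.get_name(),
            result._result.get("msg", ""),
        )
        # POST payload to the manager's REST API here (endpoint is
        # deployment-specific), so the GUI sees the failure immediately.
```

Dropped into a `callback_plugins/` directory next to the playbook, a "notification"-type callback like this runs alongside the normal stdout output, so the manager learns about each failed host mid-run and can schedule a retry limited to those hosts.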

Does Ansible Tower solve this issue? If not, how else could we solve it, please?

Thanks.

In cloud land you could, in theory, have a Lambda (or whatever comparable service GCE offers) that schedules a task/container that runs your playbook… Fully isolated, so one play won't kill another or prevent it from running… That way, you can scale horizontally across many (cheap) instances rather than vertically…

Just an idea…

Alex

@oMgSufod

I have a similar question, since I want to set up a cluster of hundreds of VMs in the cloud, with a dedicated VM that has Ansible installed.
Did you reach any conclusion in the end?

Or is there any best practice for CPU/memory sizing depending on the number of nodes in the cluster?

Thanks

On Friday, June 17, 2016 at 1:36:32 AM UTC+8, oMgSufod wrote: