Synchronization among nodes in a group

I am curious if there is any mechanism to perform synchronous tasks among a group of nodes (or nodes in a role, etc.)

There is one example online that performs a rolling update of a group of web servers, one at a time. That makes perfect sense: to keep the service up, you don’t want to shut them all down at once.

But I have another scenario, where I need to shut down all the service nodes, upgrade the database schema, upgrade the software code, and then restart the nodes. So I want a playbook that can:

  • stop all nodes in a group/role
  • perform db migration
  • perform software upgrade for all nodes
  • start service one by one

Is there a way to do this? Thanks…

- hosts: webnodes
  tasks:
    - shutdown

- hosts: db
  tasks:
    - migration

- hosts: webnodes
  tasks:
    - upgrade

… you can see where I’m going

I forgot the last one:

- hosts: webnodes
  serial: 1
  tasks:
    - startup

I am new to Ansible, obviously :)

How are we defining the “shutdown” task? Should that be under a role?

It was just a placeholder for however you manage the webnodes. In my case it is:

service: name=nginx state=stopped

That should work for any service managed by the system’s init system.
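Putting the placeholders together, the whole sequence might look like the sketch below. The group names `webnodes` and `db` and the nginx service come from this thread; the migration script path and the `myapp` package name are hypothetical stand-ins, and the connecting user is assumed to have the privileges needed to manage services and packages. Plays run in order, and each play finishes on all of its hosts before the next one starts, which is what provides the synchronization between steps.

```yaml
# Sketch only: the migration script and package name are hypothetical.
- hosts: webnodes
  tasks:
    - name: Stop the web service on every node
      service:
        name: nginx
        state: stopped

- hosts: db
  tasks:
    - name: Run the schema migration (hypothetical script path)
      command: /opt/myapp/bin/migrate

- hosts: webnodes
  tasks:
    - name: Upgrade the application (hypothetical package name)
      package:
        name: myapp
        state: latest

- hosts: webnodes
  serial: 1
  tasks:
    - name: Start the web service, one node at a time
      service:
        name: nginx
        state: started
```

Only the last play sets `serial: 1`, so the shutdown and upgrade run on all web nodes in parallel while the startup is staggered.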

Yep!!!

This is a MAJOR feature of Ansible to build Continuous Deployment systems.

You can see an example using haproxy here:

https://github.com/ansible/ansible-examples/tree/master/lamp_haproxy

You can do the same thing, often more simply, with an F5, Elastic Load Balancer, Citrix Netscaler, or whatever else you use, by swapping out the relevant modules. haproxy is free, though, so it makes a good example.

See also

http://www.ansibleworks.com/docs/playbooks_delegation.html
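As a rough illustration of what delegation looks like in a rolling update, here is a minimal sketch. It assumes the `community.general` collection is installed, and the `lbservers` group, the `myapplb` backend, the stats socket path, and the `myapp` package are all hypothetical names chosen for the example. Each web node is processed in turn, but the haproxy tasks run on the load balancers rather than on the node itself.

```yaml
# Sketch only: group, backend, socket path, and package name are assumptions.
- hosts: webnodes
  serial: 1
  tasks:
    - name: Take this node out of the load balancer pool
      community.general.haproxy:
        state: disabled
        host: "{{ inventory_hostname }}"
        backend: myapplb
        socket: /var/lib/haproxy/stats
      delegate_to: "{{ item }}"
      loop: "{{ groups['lbservers'] }}"

    - name: Upgrade the application on this node
      package:
        name: myapp
        state: latest

    - name: Put this node back into the pool
      community.general.haproxy:
        state: enabled
        host: "{{ inventory_hostname }}"
        backend: myapplb
        socket: /var/lib/haproxy/stats
      delegate_to: "{{ item }}"
      loop: "{{ groups['lbservers'] }}"
```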

You can talk to as many systems in parallel as you want by specifying --forks, e.g. --forks 100 for 100 systems at a time.

If you set “serial: 20”, that means “20 hosts must be fully configured before moving on to the next set”. So if you had 500 systems with serial: 50 (and --forks 50 for maximum speed), the update completes in 10 rolling batches, losing only 10% of your web farm’s capacity during each batch.

Lots of users use this in combination with Jenkins to do as many as 5-10 updates an hour, without any user impact.
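A minimal sketch of how `serial` and `--forks` fit together is shown here. The playbook file name `rolling.yml` and the `myapp` package are hypothetical. With `serial: 50`, each batch covers 50 hosts, which is 10% of a 500-host farm, and `--forks 50` lets Ansible work on the whole batch at once.

```yaml
# Run with: ansible-playbook --forks 50 rolling.yml
# (rolling.yml and the package name are hypothetical)
- hosts: webnodes
  serial: 50            # finish 50 hosts completely before starting the next 50
  tasks:
    - name: Upgrade the application on the current batch
      package:
        name: myapp
        state: latest

    - name: Restart the web service on the current batch
      service:
        name: nginx
        state: restarted
```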

There’s also going to be a nice guide coming soon (WIP):

https://github.com/ansible/ansible/pull/5221

Hi!

I am new to Ansible too… I don’t think that is what Xu wanted… I want something similar… I need an “all or nothing” approach like this:

  • Take an LVM snapshot on all servers
  • If I can’t take a snapshot of a single server, do not proceed
  • Do whatever I need to do on all servers
  • If a single server fails, rollback the snapshot on all servers

If one server is blocking everyone, I can decide to take it out entirely (from the inventory and from the network) before running the playbook again.

I have been reading the documentation and doing some simple tests but I can’t find a way of doing this. Ansible’s blocks with the rescue sections looked promising but the rescue will only run for the node that failed and not for all nodes…

Do you think it is possible to do something like this with Ansible?

Respectfully,
Amir Samary

I managed to implement this thanks to this Stack Overflow post:

http://stackoverflow.com/questions/39300734/how-can-i-configure-an-all-or-nothing-ansible-playbook?answertab=votes#tab-top

I hope it helps.
Kind regards,
AS
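For readers who want a starting point, one way such an all-or-nothing run could be sketched is shown here. It is an illustration under assumptions, not necessarily the approach from the linked answer: the group name `servers`, the volume path `/dev/vg0/data`, the snapshot name `pre_change`, and the `/usr/local/bin/apply_change` script are all hypothetical, and the default linear strategy is assumed. The first play uses `any_errors_fatal` so that a single failed snapshot stops the run. The second play records failures in a fact and rolls back every host if any host reported one.

```yaml
# Play 1: snapshot everywhere; abort the whole run if any host fails.
- hosts: servers
  any_errors_fatal: true
  tasks:
    - name: Take an LVM snapshot (hypothetical volume and snapshot names)
      command: lvcreate --snapshot --size 1G --name pre_change /dev/vg0/data

# Play 2: apply the change; roll back all hosts if any host failed.
- hosts: servers
  tasks:
    - name: Assume success until a failure is recorded
      set_fact:
        change_failed: false

    - block:
        - name: Apply the change (hypothetical script)
          command: /usr/local/bin/apply_change
      rescue:
        - name: Record the failure and keep this host in the play
          set_fact:
            change_failed: true

    - name: Roll back every host if at least one host failed
      command: lvconvert --merge /dev/vg0/pre_change
      when: >-
        ansible_play_hosts
        | map('extract', hostvars, 'change_failed')
        | select
        | list
        | length > 0
```

Because the `rescue` section only sets a fact, a failing host stays in the play, and the final task can inspect that fact across all hosts. A host that becomes unreachable does not trigger `rescue`, so that case would need separate handling.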