Example of multi-datacenter deployments using Ansible

Hello,

We currently have three datacenters, and all three are basically the same with some very minor differences. Up until now, what I've done is create a playbook and a hosts file for each datacenter. This was primarily because the ethernet interfaces differed slightly and we didn't want to have to remember to provide easy-to-forget variables on the command line.

Now, we're finally migrating our systems so that all datacenters will be exactly the same, which means we can use the same playbook. To keep it simple, let's say that each datacenter has one load balancer and two web servers.

DC1

[webservers]
1.1.1.20
1.1.1.30

[lbservers]
1.1.1.10

DC2

[webservers]
1.1.2.20
1.1.2.30

[lbservers]
1.1.2.10

DC3

[webservers]
1.1.3.20
1.1.3.30

[lbservers]
1.1.3.10

Typically, in our previous setup where each datacenter had its own hosts file and playbook, we’d do the following to deploy all the installation tasks:

ansible-playbook -i /ansible/dc1/hosts.txt /ansible/dc1/all_full_deploy.yml

Since our datacenters will basically be the same, and the playbook can now be the same, I understand that I could use just one playbook and swap out the hosts file to isolate a deployment to one datacenter. The issue, and the reason I'm writing, is that I'd also like to be able to, for example, deploy our website to all [webservers] regardless of which datacenter they're in, and the deploy has to perform a few tasks on the respective load balancer while it does it (taking the web server out and adding it back).

So, what is the recommended way to have a multi-datacenter hosts file so that we can work with a single datacenter, or all of them ideally using the same hosts file and the same playbook?

Thank you guys in advance for any advice you can provide.

Sincerely,
Joel Strellner

Joel,

This is untested, but might work (depending on what your playbooks are
doing). Try using multiple groups and parent groups with children.

Assuming your playbooks run against the webservers or lbservers group:

If you wanted to update only dc1:

ansible-playbook myplaybook.yml --limit dc1

If you wanted to run against both datacenters:

ansible-playbook myplaybook.yml

hosts file:

[dc1:children]
dc1_webservers
dc1_lbservers

[dc1_webservers]
1.1.1.20
1.1.1.30

[dc1_lbservers]
1.1.1.10

[dc2:children]
dc2_webservers
dc2_lbservers

[dc2_webservers]
1.1.2.20
1.1.2.30

[dc2_lbservers]
1.1.2.10

[webservers:children]
dc1_webservers
dc2_webservers

[lbservers:children]
dc1_lbservers
dc2_lbservers
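
dc3 would just follow the same pattern:

[dc3:children]
dc3_webservers
dc3_lbservers

[dc3_webservers]
1.1.3.20
1.1.3.30

[dc3_lbservers]
1.1.3.10

and you'd add dc3_webservers and dc3_lbservers to the [webservers:children] and [lbservers:children] groups above.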

- James

Hi James,

First, thanks for the response.

Would the example you provided still work when the .yml files declare their hosts as:

- hosts: lbservers

Or would it have to be changed to say:

- hosts: dc1_lbservers

We have specific tasks that would need to be run on the lbservers vs the web servers.

Thanks,

-Joel

Joel,

It would be:

- hosts: lbservers

In theory, the --limit dc1 would limit the playbook to the lbservers
in dc1. Just make sure you don't forget to limit!

- James

Hi James,

If I didn't limit, would it run the tasks against every lbserver for each webserver, or would it properly know that dc1's webservers should only interact with dc1's load balancer?

I am trying to make it so that I could run a playbook for a specific datacenter, or I could run it for all at the same time (like a code deploy). The sticking point seems to be that the lbserver in dc1 should only know about the webservers in dc1.

Joel,

I think I understand what you mean. It would help to see the
playbook(s). Perhaps you need an inventory variable for the
load balancers that defines which group they are in charge of?
I don't think Ansible is capable of automatically inferring it on its own.
You might be able to write some logic around it in your
playbook/template as well, but declaring it might be faster and easier to
understand.

- James

Hi James,

They're pretty complicated, and we have several depending on the scenario. The most common one we run, though, and the simplest, is a code deploy, and it looks like this:

# gather facts from monitoring nodes for iptables rules

- hosts: webservers
  serial: 1
  accelerate: true

  # These are the tasks to run before applying updates
  pre_tasks:
    - name: disable the staging server in haproxy
      shell: echo "disable server web_staging/{{ ansible_hostname }}" | socat stdio /var/lib/haproxy/stats
      delegate_to: "{{ item }}"
      with_items: groups.lbservers

    - name: disable the production server in haproxy
      shell: echo "disable server web_production/{{ ansible_hostname }}" | socat stdio /var/lib/haproxy/stats
      delegate_to: "{{ item }}"
      with_items: groups.lbservers

  roles:
    - deploy_web

  # These tasks run after the roles
  post_tasks:
    - name: Enable the staging server in haproxy
      shell: echo "enable server web_staging/{{ ansible_hostname }}" | socat stdio /var/lib/haproxy/stats
      delegate_to: "{{ item }}"
      with_items: groups.lbservers

    - name: Enable the production server in haproxy
      shell: echo "enable server web_production/{{ ansible_hostname }}" | socat stdio /var/lib/haproxy/stats
      delegate_to: "{{ item }}"
      with_items: groups.lbservers

In short, it removes the web server from HAProxy, then does a code update (the deploy_web role is basically a git pull), then it adds the server back to HAProxy.

I have not considered an inventory variable. I’ll look into it to see if it can be done in a way that makes sense to me.

I'm starting to think the cleanest way might just be three hosts files, and then just running the playbook three times. :(

-Joel

Thanks for the playbook.

So, based on what I was saying before: make a group_var for the
webservers that tells them which lbserver group owns them. Maybe call
it {{ lb_server_group }}.

then your delegate operations could look like:

    delegate_to: "{{ item }}"
    with_items: "{{ groups[lb_server_group] }}"
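
If you went the group_vars route, it might look something like this
(untested; the file names assume a group_vars directory next to your
hosts file):

group_vars/dc1_webservers:

    lb_server_group: dc1_lbservers

group_vars/dc2_webservers:

    lb_server_group: dc2_lbservers

That way a webserver in dc1 only ever delegates to the dc1 load
balancers, even if you forget to pass --limit.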

It does seem cleaner to do it with different hosts files.

- James

Thanks James,

I'll play with doing it that way, and I'll probably also try the multiple hosts files approach. Whichever ends up working best and is least likely to allow errors to happen is what I'll go with.

Thanks again.

-Joel

Hello Joel! I ran into the exact same problem (cf. http://serverfault.com/questions/693469/multi-datacenter-ansible-load-balancer-template) and was wondering if you could share whatever solution you ended up using?

Thank you so much!
D

Hi Davide,

We ended up going with multiple hosts files, one for each datacenter, each containing only that datacenter's information. The added benefit of going this route is that we can deploy to a single datacenter, verify that all is well, and then roll out to the others (for things like upgrading or installing new packages). To get around the annoyance for certain playbooks that I know can always be run in all datacenters, like most of our code deploys, we have alias commands in our bash profile on our Ansible server that run the ansible command against all datacenters.
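
The aliases are nothing fancy, roughly along these lines (the playbook name here is just a placeholder, not our real one):

alias deploy_dc1='ansible-playbook -i /ansible/dc1/hosts.txt /ansible/code_deploy.yml'
alias deploy_dc2='ansible-playbook -i /ansible/dc2/hosts.txt /ansible/code_deploy.yml'
alias deploy_dc3='ansible-playbook -i /ansible/dc3/hosts.txt /ansible/code_deploy.yml'
alias deploy_all='deploy_dc1 && deploy_dc2 && deploy_dc3'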

I wish there was a better way, but I just couldn’t find one.

Surprising!
There is still a better way to do a code deploy with Ansible.

The tasks should look like:

1) Remove the webserver from the LB - use the haproxy module (wait until connections drain)
2) Silence monitoring for the web health check - use the nagios/zabbix or any other monitoring module
3) Deploy the code
4) Do a manual web health check, or use the nagios socket
5) Add the webserver back into the LB
6) Re-enable the web health check in monitoring

If step 4 fails, revert to the previous version of the deployment and continue with steps 5 and 6.
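
For example, steps 1 and 5 with the haproxy module could look roughly like this (an untested sketch; the backend name, socket path, and lb group are taken from earlier in this thread and may differ in your setup):

    pre_tasks:
      - name: take the web server out of the lb and wait until haproxy reports it down
        haproxy:
          state: disabled
          host: "{{ ansible_hostname }}"
          backend: web_production
          socket: /var/lib/haproxy/stats
          wait: yes
        delegate_to: "{{ item }}"
        with_items: "{{ groups['lbservers'] }}"

    post_tasks:
      - name: put the web server back into the lb
        haproxy:
          state: enabled
          host: "{{ ansible_hostname }}"
          backend: web_production
          socket: /var/lib/haproxy/stats
          wait: yes
        delegate_to: "{{ item }}"
        with_items: "{{ groups['lbservers'] }}"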

We had a multi-dc and multi-env setup, with 5 datacenters and 3 environments in use.

For that we used group_vars to keep a folder per datacenter, with the env as a yml var file inside it; just passing the extra-vars dc=colo env=prod solved the issue.
In these env var files we added all our info. For example, for the prod env running in the colo dc:

lb: lbserver
web: "web1, web2"
watchdog: nagioserver

In the play header, just add the following, and have each task delegate to the addressed server.

    - hosts: "{{ target }}"
      vars_files:
        - "{{ dc }}/{{ env }}.yml"

cmd:
ansible-playbook deploy.yml -e "dc=colo env=prod"
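
Each task in deploy.yml can then delegate through those vars, for example (a sketch only, reusing the socat approach from earlier in the thread):

    - name: take the web server out of the lb
      shell: echo "disable server web_production/{{ ansible_hostname }}" | socat stdio /var/lib/haproxy/stats
      delegate_to: "{{ lb }}"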