The image shows that RabbitMQ runs on each node and provides the cluster functionality. With AWX you then have Docker containers on each node running the awx_web, awx_task and memcache images. On each node, settings.py is configured so that rabbitmq_host points to that machine itself. Meaning: with two nodes, node1 and node2, settings.py on node1 has rabbitmq_host pointing to node1, and settings.py on node2 has it pointing to node2. The same goes for the celery worker. That is why we added this cluster_node inventory variable. We keep the installation directory on each separate node, set cluster_node to that machine's name and run the installation playbook on each node.
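To make that concrete, here is a minimal sketch of what the per-node installer inventory could look like. The host name and the exact variable layout are placeholders; cluster_node is the variable we added:

```
# installer inventory used on node1 (the installer is run separately on each node)
localhost ansible_connection=local

[all:vars]
cluster_node=node1
# the RabbitMQ-related variables are set so that rabbitmq_host in settings.py
# resolves to node1 itself; on node2 the same inventory uses cluster_node=node2
```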
The cluster works like this: when the capacity of one node is full, the next triggered job is started on the next node in the cluster with free capacity. A failover of a running job like in your example will not work, because the playbook run is driven by one worker process on one node. When you restart that node, the worker process is lost and the job with it.
The page I linked gives a good overview of what the cluster functionality of Ansible Tower/AWX provides and how it works. For our setup I basically re-engineered that setup in AWX.
It’s probably worth pointing out that clustered AWX is now supported on OpenShift and Kubernetes without needing to hand-roll your own solution.
I did take your recommendation and updated all the config files (settings.py and the others).
I’m looking at the Tower HA configuration. /api/v2/ping/ shows both nodes within the instance group. In AWX it lists awx within instance_group, and ps -ef | grep celery shows celery@awx (on both nodes). I think a working setup should show celery@<hostname> instead.
I think you are on the right path if the API already shows both nodes within the instance group.
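A quick way to double-check on each node (the host name below is just a placeholder):

```
# /api/v2/ping/ lists the registered instances and their instance groups
curl -s -k https://node1/api/v2/ping/ | python -m json.tool

# the worker should be named after the node, e.g. celery@node1,
# not celery@awx or celery@localhost
ps -ef | grep 'celery@'
```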
There is this file:
image_build/files/supervisor_task.conf
in the installation's image_build role. If you adjust the celery worker command there to:

command = /var/lib/awx/venv/awx/bin/celery worker -A awx -l DEBUG --autoscale=50,4 -Ofair -Q tower_scheduler,tower_broadcast_all,tower,%(ENV_CLUSTER_NODE)s -n celery@%(ENV_CLUSTER_NODE)s
then ps -ef | grep celery should also show the correct output. Please be aware of the ENV_ prefix in front of the environment variable; supervisord needs it to read environment variables.
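For reference, a trimmed-down sketch of the resulting [program:...] block; the program name and the surrounding options are illustrative, only the command line itself matters here:

```
[program:celeryd]
command = /var/lib/awx/venv/awx/bin/celery worker -A awx -l DEBUG --autoscale=50,4 -Ofair -Q tower_scheduler,tower_broadcast_all,tower,%(ENV_CLUSTER_NODE)s -n celery@%(ENV_CLUSTER_NODE)s
autostart = true
autorestart = true
stopwaitsecs = 5
```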
It seems the variable is not being passed correctly in the supervisor_task.conf file: ...tower_broadcast_all,tower,%(ENV_CLUSTER_NODE)s -n celery@%(ENV_CLUSTER_NODE)s.
celeryd actually starts up as ...,tower_broadcast_all,tower,awx -n celery@localhost.
In Tower it is ...tower_broadcast_all,tower,hostname -n celery@hostname.
Within the inventory I tried cluster_node with and without the FQDN, and also both cluster_node and CLUSTER_NODE.
This was set for the awx task and web containers, providing the cluster_node variable to the environment, so that supervisor_task.conf can read it inside the container.
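If you start the containers by hand instead of through the installer role, the equivalent is simply passing the variable into the container environment, for example (image names, tag and node name are only examples, and the `...` stands for the rest of your existing options such as ports, links and volumes):

```
docker run -d --name awx_task -e CLUSTER_NODE=node1 ... ansible/awx_task:latest
docker run -d --name awx_web  -e CLUSTER_NODE=node1 ... ansible/awx_web:latest
```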
Hi Phillipp - can you tell me what version/tag you used for your setup? I wanted to see if I can reproduce it. I had made exactly the changes you described: separating out RabbitMQ (2 nodes) and installing the containers (awx/task/memcache) with the modified configuration on the same hosts running RabbitMQ, with PostgreSQL on a different node. So in all a 3-server configuration. Thanks
I am having trouble finding the parts of the file you edited to accomplish a multiple-instance (HA) installation. The current version is 3.0.1.
Which command did you replace in supervisor_task.conf with "command = /var/lib/awx/venv/awx/bin/celery worker -A awx -l ERROR --autoscale=50,4 -Ofair -Q tower_scheduler,tower_broadcast_all,tower,%(ENV_CLUSTER_NODE)s -n celery@%(ENV_CLUSTER_NODE)s"?
In addition, I do not see any references to rabbitmq in main.yml or in set_image.yml (the only task file imported in main.yml) under local_docker/tasks/.
This particular change related to celery isn't needed for the latest AWX version 3.0.1, as it automatically picks up and schedules jobs under whatever instance group is defined there.
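If you want to double-check that on 3.0.1, listing the registered instances from inside the task container should show both nodes grouped by instance group; the exact output format can vary between versions:

```
# run inside the awx_task container
awx-manage list_instances
```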