AWX instance groups question

I was under the impression that hosts in the same instance group pick up whatever job is assigned to that group, depending on which host has free resources.
I have an environment with 2 hosts in the tower instance group, and I have set up a Git project. Apparently, the project resides on only one node.
If I check inside the containers, the playbooks are fetched from Git on only one node. If I stop the containers on that node and trigger a pull from Git, the other node does not run it; the job stays queued.

This doesn’t make sense.

The decision of which node to run the job on is based on the current load, not the historical load. If you keep running a single project update, it is going to run on the same tower node every time. You have to run jobs in parallel if you want to observe jobs running on both nodes.

https://github.com/ansible/awx/blob/devel/docs/task_manager_system.md#node-affinity-decider
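
To illustrate the point about current load, here is a minimal sketch of a "most remaining capacity" picker. It is not AWX's actual task manager code; the `Instance` class, the `pick_instance` function, and the capacity numbers are all hypothetical and only meant to show why sequential runs keep landing on the same node.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    """Hypothetical stand-in for a node in an instance group."""
    hostname: str
    capacity: int        # total capacity units the node advertises
    consumed: int = 0    # units used by jobs running right now
    online: bool = True

    @property
    def remaining(self) -> int:
        # An offline node offers no capacity at all.
        return self.capacity - self.consumed if self.online else 0


def pick_instance(instances: List[Instance], task_impact: int) -> Optional[Instance]:
    """Return the node with the most free capacity right now, or None.

    Only the current load is considered; nothing remembers which node
    ran the previous job. None means the job stays queued.
    """
    candidates = [i for i in instances if i.remaining >= task_impact]
    if not candidates:
        return None
    # On a tie, max() returns the first candidate in list order.
    return max(candidates, key=lambda i: i.remaining)


if __name__ == "__main__":
    a = Instance("node-a", capacity=10)
    b = Instance("node-b", capacity=10)

    # Sequential runs: each job finishes before the next starts, so both
    # nodes are idle every time and the tie goes to the same node.
    print(pick_instance([a, b], task_impact=1).hostname)
    print(pick_instance([a, b], task_impact=1).hostname)

    # Parallel runs: the first job is still consuming capacity on node-a,
    # so the second job goes to node-b.
    a.consumed += 1
    print(pick_instance([a, b], task_impact=1).hostname)
```

In this sketch, a job only stays queued when `pick_instance` returns `None`, that is, when no registered node is online with enough free capacity.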

You didn’t get what I am saying. I stopped one of the nodes. The node that was still up did not pick up the job (its status was queued), and no other job was running.
In the UI I also see just one instance, somehow with 8 CPUs, when I actually have 2 nodes with 2 CPUs each.

It sounds like what you’re describing is a clustered setup and, more specifically (from your previous posts: https://groups.google.com/forum/#!topic/awx-project/0UKSmatgYYM), a clustered Docker setup.

AWX isn’t designed for Docker/Swarm clustering. The built-in clustering only works on Kubernetes/OpenShift and takes advantage of a lot of the auto-clustering available in RabbitMQ on Kubernetes.

However, there has been some community effort to add Docker clustering to AWX. I’m not sure if it was ever fully figured out, but it would be a good place to start looking for direction. Just understand that the project doesn’t formally recognize a clustered Swarm setup, so going this route does not guarantee that future versions of AWX won’t break or be incompatible with the setup.

Damn, makes sense. I managed to get the awx_task and awx_web containers to work in a cluster. I guess RabbitMQ is the issue since it’s not accessible from outside. I tried ‘hacking it’ by making RabbitMQ on the second node join the cluster on the first one, but it wasn’t able to join.
I wish they had this written somewhere in the documentation. It would’ve spared me a bunch of hours.