AWX cluster installation

Hi!
I don’t know anything about OpenShift or Kubernetes, and yes, that’s the problem… Does anybody have or know of a tutorial for an AWX cluster installation? I need to deploy 3 app servers with an external database server.

I just spent the last two weeks figuring that out; I’ll throw together a write-up in the next few days.

Thank you, Michael! Cannot wait! I’m really in the weeds here…

Hi Larry

I know that someone had success setting up an AWX cluster using my RPMs:

https://github.com/MrMEEE/awx-build/issues/26

Maybe you can use that, or convert the instructions to the Docker version…

Good luck

/Martin

Hi Michael,
Can you please briefly explain the architecture you used for your AWX install? I’ve been reading articles and watching OpenShift YouTube videos, and I’m getting the impression that OpenShift itself is installed on one computer/VM, and that you can install a few “Pods” within it, where a “Pod” is the Ansible Docker container. If I’m correct, this is all cool, but only for development/demo. I need to break it out onto three app servers that are not running on the same computer/VM; the point of separating them is that if one ESXi server goes down, another takes the load.
I just wanted to double-check, in a few words, how it’s done.
Thank you, and I cannot wait for your docs.

Thanks,
Larry

Thank you, Martin! I think the RPM installation isn’t officially supported, though.

My understanding is that OpenShift is simply the Red Hat flavor of managing a Kubernetes environment. I’ve skipped OpenShift and just set up a Kubernetes cluster with some external PostgreSQL servers.

I’ve used 6 servers (these can be VMs or physical servers) to set up a Kubernetes cluster: 3 are Kubernetes masters, with etcd running on them as well, and the remaining 3 are the Kubernetes minions (worker nodes). Those minions host the pods that make up the AWX instances. I’m using Weave Net for the container networking, which isn’t a requirement, but it was the easiest container networking solution to figure out and it didn’t require any special network configuration like some of the other container networking solutions do.
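To make the layout concrete, here’s a minimal sketch of an Ansible YAML inventory for this topology. All host and group names are hypothetical; nothing requires these particular groups, this is just one way to organize the 6 Kubernetes nodes and the 3 database servers so they can be reused across playbooks.

```yaml
# Hypothetical inventory for the 6-node Kubernetes cluster plus the
# 3 external PostgreSQL servers. Host and group names are illustrative.
all:
  children:
    kube_masters:            # control plane, also running etcd
      hosts:
        k8s-master-1:
        k8s-master-2:
        k8s-master-3:
    kube_workers:            # minions that will host the AWX pods
      hosts:
        k8s-worker-1:
        k8s-worker-2:
        k8s-worker-3:
    postgres:                # external DB: primary -> secondary -> tertiary
      hosts:
        pg-1: { pg_role: primary }
        pg-2: { pg_role: secondary }
        pg-3: { pg_role: tertiary }
```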

I’ve written an Ansible playbook, which I’ll see if I can share, that takes 3 brand-new CentOS servers, installs and configures PostgreSQL on them, sets up replication, and creates the AWX database and user. It also creates a VIP on an F5 with weighted priority groups and health checks, so that connections are always sent to the primary unless the primary PostgreSQL server is unreachable on port 5432 (the default port for PostgreSQL), in which case connections are sent to the secondary instance. The 3 external PostgreSQL servers are set up with cascading replication: the first is the read-write primary, the second is a read-only replica of the primary, and the third is a replica of the secondary, so that if the primary fails for whatever reason the secondary still has redundancy.
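This isn’t Michael’s actual playbook, just a minimal sketch of what the database portion could look like, assuming CentOS 7 with the distribution PostgreSQL packages, with the replication and F5 tasks omitted. The postgresql_db and postgresql_user modules are standard Ansible modules; the host name and password variable are hypothetical.

```yaml
# Minimal sketch, not the actual playbook: install PostgreSQL on the
# primary and create the AWX database/user. Replication and F5 VIP
# tasks omitted; host names and variables are assumptions.
- hosts: pg-1                     # hypothetical primary from the inventory above
  become: true
  tasks:
    - name: Install PostgreSQL server and the Python driver Ansible needs
      yum:
        name:
          - postgresql-server
          - python-psycopg2
        state: present

    - name: Initialize the database cluster (first run only)
      command: postgresql-setup initdb
      args:
        creates: /var/lib/pgsql/data/PG_VERSION

    - name: Start and enable PostgreSQL
      service:
        name: postgresql
        state: started
        enabled: true

    - name: Create the AWX database user
      postgresql_user:
        name: awx
        password: "{{ awx_db_password }}"   # supply via vault, not plaintext
      become_user: postgres

    - name: Create the AWX database
      postgresql_db:
        name: awx
        owner: awx
      become_user: postgres
```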

I’ve set the Kubernetes cluster up manually for now, since I was working through the initial design/implementation, but I’ll be working on a similar playbook to get the Kubernetes cluster set up from fresh installs.

Once the Kubernetes cluster and the PostgreSQL servers are set up, I reuse the inventory file that I originally used to set up the PostgreSQL servers, along with the official AWX installer playbook. I’ve seen more than a few places where someone was asking for help with an issue, and as soon as it came out that they hadn’t used the official installation method, the response was basically to throw up their hands and send them back to the author of the blog post, RPM, or whatever else they’d found on the subject. I really didn’t want to run into that myself.
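For reference, the relevant knobs in the official installer’s inventory for a Kubernetes install with an external database looked roughly like this at the time. The installer actually ships these as an INI-style inventory file (installer/inventory, run with `ansible-playbook -i inventory install.yml`); they’re shown as YAML variables here for consistency with the other snippets, the variable names are from memory and may differ between versions, and all values are placeholders.

```yaml
# Rough sketch of the Kubernetes/external-DB settings from the official
# installer's inventory. Names are as I recall them; verify against
# installer/inventory in your checkout. Values are placeholders.
kubernetes_context: my-cluster      # kubectl context the installer should use
kubernetes_namespace: awx           # (awx_kubernetes_namespace in some versions)
pg_hostname: pg-vip.example.com     # the external PostgreSQL VIP
pg_port: 5432
pg_database: awx
pg_username: awx
pg_password: changeme
```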

Once AWX is installed on the Kubernetes cluster, you should have a deployment, a replica set, a pod, the three necessary services, and an ingress in your cluster.

To scale up to the three separate worker nodes (or more) you described, you’d just log into the Kubernetes Dashboard, go to Deployments, select the awx deployment, click Scale, and change it from one to three; the additional pods/instances will then spin up and be added to the necessary services/ingresses so that traffic is sent to them as well. If you need to change the size of your cluster later, whether that’s growing or shrinking, you can always change that number down the line too (see the fragment after the list below). Each pod is configured the same, with just a different name and a different cluster-internal IP address, and will contain:

  • awx_web
  • awx_task
  • awx_rabbitmq
  • memcached
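If you’d rather not click through the dashboard, the same change is just the replicas field on the Deployment object; you can apply it with `kubectl scale deployment awx -n awx --replicas=3` or by editing the object directly. The deployment name and namespace here are assumptions; check with `kubectl get deploy -n awx`.

```yaml
# Fragment of the awx Deployment; only the field that changes is shown.
# Deployment name/namespace are assumptions -- check your cluster.
spec:
  replicas: 3   # was 1; each replica is one full AWX pod (web, task, rabbitmq, memcached)
```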

Once that’s all in place, you should be able to go to http://anyworkernode:XXXXX, where XXXXX is the port specified in the Service “awx-web-svc”, and you should then be connected to one of the awx-web containers.
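That works because awx-web is exposed as a NodePort service, which publishes the same port on every worker node. A minimal sketch of what such a Service looks like; the selector label and the nodePort value are assumptions, while targetPort 8052 is awx_web’s default listen port.

```yaml
# Sketch of a NodePort Service like awx-web-svc. The selector label and
# nodePort value are assumptions; 8052 is awx_web's default listen port.
apiVersion: v1
kind: Service
metadata:
  name: awx-web-svc
  namespace: awx
spec:
  type: NodePort
  selector:
    app: awx            # assumed pod label
  ports:
    - name: http
      port: 80
      targetPort: 8052  # awx_web container port
      nodePort: 30080   # the XXXXX in http://anyworkernode:XXXXX
```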

I’ve then taken the extra step of setting up a VIP on an F5 with SSL termination and health checks, so that cluster members can go offline without impacting the user experience; the load balancer should just start sending users to the remaining healthy cluster members… Once that’s set up, users can go to https://yourawxurl, which accepts HTTPS connections on port 443 like any standard website would and then passes the traffic on to the cluster on the unique port specified in “awx-web-svc”…
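For what it’s worth, if you don’t have an F5, an in-cluster alternative is to terminate TLS with a Kubernetes Ingress plus an ingress controller such as ingress-nginx. This isn’t what I’m running, just a sketch of the shape it would take; the hostname, TLS secret, and backend service/port are all assumptions.

```yaml
# Sketch of TLS termination via a Kubernetes Ingress instead of an F5.
# Requires an ingress controller (e.g. ingress-nginx) in the cluster.
# (Clusters of the OpenShift 3.9 era used apiVersion: extensions/v1beta1.)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: awx-web
  namespace: awx
spec:
  tls:
    - hosts:
        - awx.example.com
      secretName: awx-web-tls     # pre-created kubernetes.io/tls secret
  rules:
    - host: awx.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: awx-web-svc
                port:
                  number: 80
```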

Sorry that this brief explanation is a bit long, but there’s a lot to go over, and it’s been way more complicated to get going than I originally thought it would be. I had a basic knowledge of Docker but no background in Kubernetes, OpenShift, or really even PostgreSQL, so I had to figure out how all the pieces fit together, in addition to sifting through and trying a few different people’s proposed solutions, which were close, but I couldn’t quite get them to work.

Let me know if you’re still interested and I can try to throw together a doc that outlines how to set this whole thing up from the ground up. If you need clarification on something let me know and I’ll do my best to answer your question(s).

WOW!!! That’s impressive!
Didn’t expect that, but it’s even better. I’m interested! I understand the concept, and I also understood about 70% of the “brief” :) explanation.
I also have just basic Docker and good Ansible knowledge today. Looks like you’ve done a very good job of improving your knowledge!
In the meantime, I’ll jump into the Kubernetes documentation, or maybe I can find a training course.

The F5 is costly; will I be able to replace it with some open-source solution? Maybe you can suggest one?

Thanks,
Larry

Once that’s all in place, you should be able to go to http://anyworkernode:XXXXX, where XXXXX is the port specified in the Service “awx-web-svc”, and you should then be connected to one of the awx-web containers.

FWIW, you shouldn’t have to do this if you use OpenShift, as you can set up a Route. OpenShift/kube-proxy will handle proxying requests to all AWX web pods. There isn’t a specific need to have a separate load balancer in front of the OpenShift cluster; see:

https://docs.openshift.com/container-platform/3.9/architecture/networking/routes.html

specifically see an example here: https://docs.openshift.com/container-platform/3.9/architecture/networking/routes.html#route-hostnames

AWX will already install one for you: https://github.com/ansible/awx/blob/devel/installer/roles/kubernetes/templates/deployment.yml.j2#L333-L351
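For reference, a Route object is only a few lines; here’s a sketch with an assumed hostname, pointing at the web service discussed earlier (the template linked above creates something along these lines).

```yaml
# Sketch of an OpenShift Route with edge TLS termination. The hostname
# is an assumption; the service name matches the one discussed above.
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: awx-web
  namespace: awx
spec:
  host: awx.example.com
  to:
    kind: Service
    name: awx-web-svc
  tls:
    termination: edge   # the router terminates TLS, like the F5 VIP would
```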

I’m fairly certain that Kubernetes has an existing facility for doing this too (an Ingress), so you don’t have to navigate to individual node ports.

Isn’t OpenShift a paid product?

FWIW, you shouldn’t have to do this if you use OpenShift, as you can set up a Route. OpenShift/kube-proxy will handle proxying requests to all AWX web pods. There isn’t a specific need to have a separate load balancer in front of the OpenShift cluster; see:

Maybe you can clarify, since I’m not sure I follow. My AWX/Kubernetes cluster is intended to exist within a secure datacenter where users and automation services will be consuming it. Those users should not have access to any of the other resources in the Kubernetes environment, and I wouldn’t want them to have to install any additional software simply to access AWX. My intent, and maybe I’m misunderstanding something and/or using Kubernetes incorrectly, was to avoid making users run kubectl proxy (an additional piece of software they’d have to install/use/manage) and to let them simply enter a URL into a browser, the requests module, etc., which would then pass the traffic on to the appropriate servers/containers/services/etc. I also didn’t see an easy way to terminate SSL, though maybe I just missed it.

Also, my environment requires 99.99% availability. AWX is intended to be a PoC to prove the use case/business justification to invest in Tower, but in the meantime I’ll need to make sure that the service is scalable, fault tolerant, and highly available. Pointing a DNS record at a specific node in the cluster presents a single point of failure, and using round-robin DNS to point at multiple nodes can exhibit odd behavior if a node goes down; users may also hit multiple nodes without any form of session persistence. I also didn’t want to make any non-standard network changes, like dropping in a /32 or other specific routes, unless absolutely necessary, as we already run into issues with ridiculously large routing tables. Maybe I’m missing something, but the only option I saw that would ensure session reliability and fault tolerance smart enough to not forward requests to downed nodes was to stand up a VIP on a pair of F5 VEs that were already present, to use as the external load balancer forwarding to any node that’s passing health checks and listening on the appropriate port.

If there’s a better way to do it, I’d love to hear it. Is there someone at Red Hat I can contact to get more information on Tower and the recommended architecture for a deployment as large as ours will eventually grow to be? We’re already managing a large number of servers, and I’ve pushed changes to as many as 4000 network devices in a day with Ansible Core. I’m currently exploring AWX/Tower as we continue to grow, to give us an easy way to present users with a friendly interface and an API that integrates with the in-house automation that already exists today.

Hi Michael,
It’s been a while… I was working on fixing some defects and am now getting back to the AWX environment. Just wondering if you’ve had a chance to put together a document on the AWX clustered installation? We currently have only one instance deployed, and it’s an old version, 1.0.8.0, but we can migrate (if possible) or deploy one of the latest versions.

Thanks,
Larry

Hi Michael,

We have the same requirement as you do.

Have you produced any documentation for this setup? If so, how can I access it?

Regards
Ajit

Hi Michael

I’m also very much interested in this - if you have put together the document, I would really appreciate a copy.

For true HA, running only one RW postgres node concerns me.

Even though the official awx installer allows one to reference the Kubernetes StorageClass to be used for the postgres pod, I have two concerns:

  1. I’m not sure which provisioner to pick. I need something that is free and open source and can run on on-prem Linux servers. Apparently NFS is considered harmful (see 18.2.2 here).
  2. In any case, the official awx installer makes use of the deprecated stable/postgresql helm chart, which does not support HA. For HA, they themselves recommend bitnami/postgresql-ha, which, from what I’ve seen, is the only one people have had success with. So I’m thinking of using that; they also let you provide persistence.storageClass, and I was wondering if local storage would be fine, since they make use of repmgr, which will fail over if the primary fails. But then I saw that local does not support dynamic provisioning of persistent volumes. Perhaps using local with pre-created persistent volumes is the way to go (see the sketch after this list)?
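To illustrate that last option, here’s a minimal sketch of a pre-created local PersistentVolume plus the no-provisioner StorageClass it binds through; one PV like this would be needed per postgresql-ha replica, and the node name, path, and size are assumptions.

```yaml
# Sketch: a StorageClass with no dynamic provisioner, plus one
# pre-created local PV pinned to a specific node. One such PV per
# postgresql-ha replica; node name, path, and size are assumptions.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer   # bind only once the pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pg-data-worker-1
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/pg        # must exist on the node beforehand
  nodeAffinity:                # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1
```

You’d then point the chart at it via persistence.storageClass=local-storage. The usual caveat with local volumes applies: the data is pinned to its node, so if that node dies the replica doesn’t reschedule elsewhere, and you’re relying on repmgr failing over to a replica on another node.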

I would really appreciate some tips from anyone who has been able to set up AWX on Kubernetes with true HA (fully operational even if the node running the postgres pod shuts down).