Hi list,
TL;DR: I’d like to know how people model their inventory data for a large set of hosts (500+ VMs) that mostly share the same role, but with many varying application parameters, to the extent that a simple with_items list, or even with_nested, no longer suffices.
I have been pondering this subject for some time, and I’m hesitant about whether the way I started working with Ansible, and how it grew over time, is the best possible approach. In particular: how to model the inventory and the variables, but obviously also how to implement and nest groups.
Rather than showing how I did it, let me explain some of the particulars of this environment, so I can ask the community “how would you do it?”
We’re mostly a Java shop, and have a very standardized, and sometimes particular setup:
- 75% of all hosts (VMs) are Tomcat hosts (I’ll focus on just those from here);
- every specific Tomcat setup is deployed as two nodes (not a real cluster, but mostly stateless applications behind a load balancer);
- every cluster typically hosts one application (one deployed WAR with one context path, in Tomcat speak, basically providing http://node/app );
- occasionally a node/cluster hosts more than one such ‘application’. This can be on the same Tomcat instance (same TCP port 8080), but could also live on another port (which calls for a separate IP/port combination or pool on the load balancer);
- every application cluster is typically part of a larger application, which can consist of one to several application clusters;
- the big applications are part of a project, and a project is part of an organisation;
- every application has an instance in each of three environments: development, testing and production (clustered in the same way everywhere);
- the load balancer typically performs one, but sometimes more, health checks per application (a basic GET, checking for a string in the response), and will automatically mark a node as down if that check fails;
- some applications may communicate with other applications if need be, but only through the load balancer; this is also enforced by the network. So we need a configuration that says ‘node A may communicate with node B’; we do that on the load balancer at the moment, and every such set needs a separate LB config;
- every application is of course consumed in some way or another, and is defined on the load balancer (nodes, pools and virtual servers in F5 speak).
Yes, this means every Tomcat application lives on, in total, 6 instances (2 cluster nodes x 3 environments), hence 6 virtual machines.
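To make that concrete, here’s a minimal sketch of how the per-application load-balancer parameters from the list above could be captured as group_vars. All the variable names (lb_pool, lb_healthchecks, the paths) are hypothetical, just to show the shape of the data, not how we actually do it:

```yaml
# group_vars/application1.yml -- hypothetical layout for one application cluster
app_name: application1
app_context_path: /app1      # i.e. http://node/app1
tomcat_port: 8080            # same instance; a second app might need 8081
lb_pool: pool-application1   # F5 pool name (made-up convention)
lb_healthchecks:
  - path: /app1/health
    expect: "OK"             # string expected in the GET response
```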
A basic inventory would hence show as:
all (inventory)
  _ organisation 1
    _ project 1
      _ application 1
        _ dev
          _ node 1
          _ node 2
        _ test
          _ …
        _ prod
          _ …
      _ application 2
        _ …
    _ project 2
      _ …
  _ organisation 2
    _ …
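In Ansible’s YAML inventory format, that tree could be sketched roughly like this (host names are made up to match the diagram; only one branch shown):

```yaml
# inventory sketch of the organisation -> project -> application -> environment nesting
all:
  children:
    organisation1:
      children:
        project1:
          children:
            application1:
              children:
                application1-dev:
                  hosts:
                    app1-dev-node1:
                    app1-dev-node2:
                application1-test:
                  hosts:
                    app1-test-node1:
                    app1-test-node2:
                application1-prod:
                  hosts:
                    app1-prod-node1:
                    app1-prod-node2:
```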
Some other implemented groups are:
_ development
  _ organisation1-dev
    _ application1-dev
_ testing
_ production
or:
- tomcat
  _ application1
  _ application2
- <some_other_server_role_besides_tomcat>
  _ application7
  _ application9
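Since Ansible lets a host belong to several group trees at once, these cross-cutting axes can live alongside the org/project tree in the same inventory. A sketch of the environment- and role-based groupings (group names again taken from the diagrams above):

```yaml
# second axis: environment-based groups
development:
  children:
    organisation1-dev:
      children:
        application1-dev:

# third axis: role-based groups
tomcat:
  children:
    application1:
    application2:
```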
Our environment counts around 100 applications, hence 600 VMs at this moment, so keeping everything rigorously standard is very important.
Automating the load balancer from a config per application has become a key issue!
So when looking beyond the purely per-group and per-node inventory, on a node we get the following data that is important for configuring things on the load balancer:
- Within an application server:
node
  _ subapp1
    _ healthcheck1
    _ healthcheck2
  _ subapp2
    _ …
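This nesting is exactly where with_items stops being enough. One way to model it would be a list of dicts in host_vars, which a playbook can then flatten; the structure and names below are just an illustration of the idea, not a recommendation:

```yaml
# host_vars/node1.yml -- hypothetical: one node hosting two sub-applications,
# each with its own health checks and (possibly) its own port
subapps:
  - name: subapp1
    port: 8080
    healthchecks:
      - path: /subapp1/alive
        expect: "UP"
      - path: /subapp1/db
        expect: "UP"
  - name: subapp2
    port: 8081          # separate port -> separate LB pool/virtual server
    healthchecks:
      - path: /subapp2/alive
        expect: "UP"
```

A task could then iterate over this with something like `loop: "{{ subapps | subelements('healthchecks') }}"` to create one LB monitor per health check, with the subapp dict and the healthcheck dict available as item.0 and item.1.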