What I have is a service that we deploy as a website for some customers and as an API for others (the website uses the API internally). It is composed of parts outside the system (e.g. the db) and parts inside it (e.g. a redis server). We also host the same website in a white-label version for providers who customize it and have their own customers. A host may run more than one whitelabel, and some services may be shared across whitelabels (e.g. a single redis for all the whitelabels on the same host). Bigger websites may get separate hosts for different services, tiers can be sharded, etc. That’s the picture.
So, here is how I’m organizing things: I’m trying to have a single playbook, “site.yml”, that deploys any configuration. An inventory file instead represents a single whitelabel and describes which hosts it should run on (red.inv, blue.inv; for development I’d have vm.inv, localhost.inv, etc.). For instance, if Green and Blue are two small websites that can run on the same box, while Red is so big that it needs web and api on two separate machines, we could have:
green.inv

    [web]
    host1

    [api]
    host1

blue.inv

    [web]
    host1

    [api]
    host1

red.inv

    [web]
    host2

    [api]
    host3

And only one playbook to rule them all:
site.yml

    - hosts: web
      roles:
        - role: web

    - hosts: api
      roles:
        - role: api

Let’s say the api normally serves on port 4000. That is usually fine, but on a host running many of them it may need configuration. With the idea of a role being self-describing, I think it’s natural to define api_port=4000 somewhere inside the role. If the role is used in a playbook, tasks in other roles may well want to refer to that variable. If the default is fine, it works for both the provider and the consumer; if it’s not, the inventory or group can override it. So I’d have:

    roles/
        api/
            defaults/main.yml
                api_port: 4000
    group_vars/
        green
            info_email: info@green.com
        blue
            info_email: info@blue.com
            api_port: 4001

As things stand now, I cannot provide a self-contained api role that defines its own default port. If I put it in the role’s vars, it can be included but cannot be overridden. If I put it in group_vars/all, the role is no longer self-contained: the idea of modular roles that can be (re)used at leisure in different playbooks breaks, because they can only be used if sensible variables are provided in “all”, in the inventory, or somewhere else. I would have thought that the role itself is the most sensible provider of defaults for such variables.
A different organization might use a separate playbook for each whitelabel and add vars_files to each of them, but this seems much worse to maintain: if I add a new piece to the system (say it now needs rabbitmq to run), I have to add the role to every playbook, instead of adding it just to site.yml and then configuring only the groups where its port is non-standard (which may be none of them). The idea I have in mind is that the playbook describes how a system is composed and the inventory describes where the system should run. I may be wrong, but in my experiments other interpretations have led to configurations that are harder to maintain.
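To make the rabbitmq case concrete, here is a rough sketch of the only change I would expect to make under this organization. The “rabbitmq” group and role names are hypothetical, just for illustration; each inventory would list hosts under a matching [rabbitmq] group, and no group_vars would change unless a whitelabel needs a non-standard port.

```yaml
# site.yml, extended with one extra play (hypothetical "rabbitmq" group and role)
- hosts: web
  roles:
    - role: web

- hosts: api
  roles:
    - role: api

# The new piece of the system is declared once, here, for every whitelabel.
- hosts: rabbitmq
  roles:
    - role: rabbitmq
```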
Are you sure that being able to write “http_port: 80” inside roles/apache/defaults/main.yml would be bad design? I would have thought it is well encapsulated, provided it is exposed in such a way that another role (say, a firewall role) can read it. It effectively serves as the role’s interface. As things stand now, I’m experiencing “a migration of defaults”: first I define http_port inside the apache defaults, because I think it belongs there and I just need it to write the apache config file. Then I add the firewall configuration to the site and… oops: I need to know where apache wants to listen. So another variable flies into group_vars/all just because two roles need it, while 10 other roles don’t really care; and now I have to be more careful about scoping everybody’s variables, because they have become more global than they needed to be.
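A minimal sketch of what I mean, under a few assumptions: the firewall role and its task are hypothetical, both roles are applied in the same play, role defaults are visible to the other roles in that play, and the Ansible version in use ships the iptables module. The two YAML documents below stand for two separate files.

```yaml
# roles/apache/defaults/main.yml
# The apache role owns its default; an inventory or group may override it.
http_port: 80
---
# roles/firewall/tasks/main.yml (hypothetical role, for illustration only)
# The firewall role reads the port instead of redefining it.
- name: Allow inbound traffic on the port apache listens on
  iptables:
    chain: INPUT
    protocol: tcp
    destination_port: "{{ http_port }}"
    jump: ACCEPT
```

With this layout http_port never has to move into group_vars/all: apache provides it, firewall consumes it, and the other roles can ignore it.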
If there is a better way to organize my work I’d be happy to follow suggestions. As it is now I’m following the advice to put everything inside “all”, but I don’t find it optimal and I’m wondering if it can be improved.
Thank you
– Daniele