Roles purpose

I’m a little confused about the exact purpose of roles. In http://docs.ansible.com/ansible/playbooks_roles.html , the first examples use a 1:1 association between a role and a host group (e.g. webservers and dbservers). This makes the “role” terminology meaningful to me, since it represents a set of machines that have a role in the infrastructure. However, later on the same page I see examples with roles named “apache” and “postgres”, which are just applications running on one or more machines. I’d assume that setting up these applications would only need tasks, not roles.

Questions:

  • In the first case, could I not simply associate a group of hosts with variables? Why use roles?
  • In the second case, could I not simply associate a group of hosts with “apache” and “postgres” setup tasks? Why use roles?

At first sight, roles seem like overkill to me, especially in a small infrastructure with only a few machines, such as mine.

Even in a small environment, roles are quite useful. I use them for clusters as small as 4.

Roles do not have to map 1:1 to host groups (though they can). You might have roles such as apache and postgres. Each of those would install its base packages on top of your basic config, which might be handled by a common role.
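As a rough sketch of what that looks like in a playbook, using the webservers and dbservers groups from the question (the file name site.yml is just an assumption for illustration):

```yaml
# site.yml (hypothetical file name)
# Both groups share the common role; each adds its own service role on top.
- hosts: webservers
  roles:
    - common
    - apache

- hosts: dbservers
  roles:
    - common
    - postgres
```

Here the common role is reused by both groups, which is the main thing plain per-group task lists would make you duplicate.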

Then, you might have a more specific role for each service or customer. Those would layer the specific configs on top of the packages. If you use role dependencies, the roles a role depends on run before it, so you’re guaranteed that the packages will be there.
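Role dependencies are declared in the role’s meta/main.yml. A minimal sketch, assuming a customer-specific role called customer_site (a made-up name) that builds on the common and apache roles mentioned above:

```yaml
# roles/customer_site/meta/main.yml
# "customer_site" is a hypothetical role name used only for illustration.
# Ansible runs the listed roles before the tasks of customer_site itself.
dependencies:
  - role: common
  - role: apache
```

With this in place, a playbook only needs to list customer_site, and the roles it depends on are pulled in automatically.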

This allows you to have top level roles that use other (standardized) roles as building blocks.

Then you can get into situations where different groups, or even organizations, maintain the different roles. Your security office might maintain a role that ensures that your machines meet your organization’s security requirements.

Although you are free to define ‘role’ however you want, I like to think of a ‘role’ as analogous to a ‘component’ rather than as a 1:1 match for a host group, even though in most cases each host group will likely include one component that does map 1:1 to it.

For example, on a recent project I had several services (consul, cassandra, elasticsearch, postgres, rabbitmq, …) as well as a handful of custom applications that together built a data processing pipeline. Each service had its own host group and playbook. The playbook would include multiple roles, typically something like: [‘common’, ‘docker’, ‘registrator’, ‘consul-client’, ‘elasticsearch’]. Each of these roles would build up its own specific part of the instances in that host group, leveraging docker containers where we could.

Then, in cases where we didn’t need the docker stuff (the spark cluster), we could simply omit those roles for those hosts.
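Roughly, the two kinds of playbook looked like the sketch below. The elasticsearch role list is the one given above; the spark play is only an assumption about what was left once the docker-related roles were dropped (I’m guessing that includes registrator, and the spark role name is hypothetical):

```yaml
# elasticsearch.yml: role list as described above
- hosts: elasticsearch
  roles:
    - common
    - docker
    - registrator
    - consul-client
    - elasticsearch

# spark.yml: same building blocks minus the docker-related roles
# (illustrative only; "spark" is a hypothetical role name)
- hosts: spark
  roles:
    - common
    - consul-client
    - spark
```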

This setup saved us a bit of time when we realised that we actually needed 2 elasticsearch clusters. One of the apps needed to be pinned to an older version, while our ELK stack needed the latest stable. Because we had abstracted things like the docker version tag and cluster name into variables, we simply created two host groups and set the specific details for each group in its group vars. Even the playbooks themselves were basically identical, the only difference being the host group name.
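To give an idea of the shape of it, here is a sketch of the group vars for the two clusters. The group names, variable names, and values are all placeholders made up for this example, not the ones from the real project:

```yaml
# group_vars/elasticsearch_app.yml  (hypothetical group: cluster pinned for the app)
elasticsearch_docker_tag: "OLDER_PINNED_TAG"    # placeholder, not a real version
elasticsearch_cluster_name: "app-cluster"       # placeholder name

# group_vars/elasticsearch_elk.yml  (hypothetical group: cluster for the ELK stack)
# elasticsearch_docker_tag: "LATEST_STABLE_TAG"
# elasticsearch_cluster_name: "elk-cluster"
```

The second file is shown commented out only so that the two files fit in one block without duplicate keys. Each playbook then targets its own group with the same role list, and the elasticsearch role reads these variables instead of hard-coding the tag or cluster name.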