Structure for Ansible and handling of config

Heya,

I'm currently evaluating Ansible for our company, but I'm having trouble understanding how we would set up our global structure. We have roughly 10,000 hosts and hundreds of different projects with different teams and needs. Still, we need to use a common code base (roles, plugins) and not reinvent the wheel in every project. Of course I understand roles and playbooks. My problem is how settings are handled, and from there it leads to more problems.

  1. User settings vs. project settings

According to the docs, Ansible looks in different locations for the cfg file and uses the first one found, ignoring all others. So it is not possible to have settings in different locations, which makes it hard to allow user-specific settings (e.g. private_key_file) while at the same time defining company-wide or project-specific settings that should be stored in git.

Ideally Ansible would respect all cfg files, merge their contents, and apply the settings with precedence in the order described in the docs.
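For reference, this is the lookup order per the docs - the first file found wins and the rest are ignored. The only per-user hook I see today is ANSIBLE_CONFIG, which replaces the file instead of merging it:

ANSIBLE_CONFIG               # environment variable pointing at a cfg file
./ansible.cfg                # current working directory
~/.ansible.cfg               # home directory
/etc/ansible/ansible.cfg

export ANSIBLE_CONFIG=~/ansible-private.cfg   # path made up
ansible-playbook playbook-1.yml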

  2. Settings are global, not per playbook

But different playbooks might have different needs; e.g. a playbook could require hash_behaviour=merge. To achieve this, one would need to create a folder specific to the playbook, place a cfg with the settings inside, cd to that folder, and run ansible-playbook from there.
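A minimal sketch of such a per-playbook cfg (path as in the layout further below):

# playbooks/project-1/ansible.cfg
[defaults]
hash_behaviour = merge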

Another thing is callbacks. Those fire as soon as they exist in the folder defined in the settings. So every playbook with a specific callback would require its own callback directory plus settings. When following this pattern, you'd also need to create the roles within this directory, as those need to be located on the same level you execute ansible-playbook from.
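In cfg terms every project then ends up carrying something like this (the directory name matches the tree below):

# playbooks/project-1/ansible.cfg
[defaults]
callback_plugins = ./library/plugins/callback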

This all makes it very hard to re-use roles and plugins. I came up with a structure that might work, but it is symlink mayhem:

group_vars
    all
    group-1
    group-N
host_vars
    host-1
    host-N
inventory
    production
    staging
    uat
library
    modules
        mod-1.py
        mod-N.py
    plugins
        callback
            callback-1.py
            callback-N.py
        filter
            filter-1.py
            filter-N.py
        lookup
            lookup-1.py
            lookup-N.py
playbooks
    project-1
        ansible.cfg
        group_vars -> ../../group_vars
        host_vars -> ../../host_vars
        library
            modules -> ../../../library/modules
            plugins
                callback
                    callback-1.py -> ../../../../../library/plugins/callback/callback-1.py
                    callback-N.py -> ../../../../../library/plugins/callback/callback-N.py
                filter -> ../../../../library/plugins/filter
                lookup -> ../../../../library/plugins/lookup
        production -> ../../inventory/production
        playbook-1.yml
        playbook-N.yml
        roles -> ../../roles
        staging -> ../../inventory/staging
        uat -> ../../inventory/uat
    project-N
        ...
roles
    role-1
    role-N

Roles, inventory, host_vars, group_vars, modules and plugins are defined at the root level. The playbooks directory holds all projects. Every project can then define its own cfg with specific settings. The inventory files, roles, *_vars, modules, filter plugins and lookup plugins are symlinked into the project folder. With the callbacks it's trickier, as those fire as soon as they exist; every callback that a project requires needs to be symlinked explicitly.
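Setting up a new project then boils down to a handful of links, roughly like this (paths as in the tree above):

cd playbooks/project-1
ln -s ../../roles roles
ln -s ../../group_vars group_vars
ln -s ../../host_vars host_vars
ln -s ../../inventory/production production
mkdir -p library/plugins/callback
ln -s ../../../../../library/plugins/callback/callback-1.py library/plugins/callback/callback-1.py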

This enables us to globally handle all re-usable components while every project can define its own settings. A project would/could be a submodule in the main git repo.

So this seems to work, except that user-specific settings are still not possible. Does this setup make sense from the PoV of more experienced Ansible users? IMHO this looks quite complex, and I wonder what I'm missing here, because things shouldn't be this complex.

Thanks in advance,
Daniel

Hi Daniel, replies are inline.

Thanks Michael,

Tower allows you to upload a private key

Well, the key was only an example, one that I just made up because there might be the possibility that someone needs to change it. Another example would be the ssh *timeout* or, more generally, the *ssh_args*. Probably all of this could be set in the ssh config. I just want to make sure there won't be a showstopper in the future when a user requires a config tweak and can't set it because the cfg is in source control.
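For example, a user might want something like this in a cfg of their own (values made up):

[defaults]
timeout = 30

[ssh_connection]
ssh_args = -o ConnectTimeout=30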

How realistic is it that this would need to change per project versus per installation? In many cases it seems these should be able to have very good defaults that work for everyone IMHO.

The storage of ssh keys in Tower is for sure a very nice feature, which might become interesting for us in the future for security reasons.

It's recommended, to be consistent with the majority of the Ansible community, that people don't adopt hash_behaviour=merge. However, there are some that really feel like they should use it.

I don't see a way around this setting in our case, because our systems will be furnished with many different services provided by different teams.

So far this describes most all Ansible users :slight_smile:

And which config a service uses might depend on other services. A good example of this is Splunk, a tool for collecting and indexing logfiles. Which logs are collected depends on which other services run on a host. My idea was to organize this in groups.

Cool, we've got a lot of big users using Ansible for splunk configuration...

group_vars for group-A:
---
splunk_forwarder:
  file-A: sourcetype-A
...

group_vars for group-B:
---
splunk_forwarder:
  file-B: sourcetype-B
...

When the host belongs to groups A and B, the content will be merged:
---
splunk_forwarder:
  file-A: sourcetype-A
  file-B: sourcetype-B
...

So what I think I understand is that, on a per-role basis, you want to configure splunk to possibly go to different locations.

In such a case, I think this could easily be solved with a template that, based on something like group_names, decides which forwarders to add.

Ignoring splunk and generalizing it to foo.conf at the moment:

{% if 'xyz' in group_names %}
   text code to enable forwarder A
{% endif %}
{% if 'jkl' in group_names %}
   text code to enable forwarder B
{% endif %}

A playbook might look like this:
---
- name: Some playbook
  hosts: some-hosts-which-may-belong-to-A-and-or-B
  roles:
    - { role: role-A, when: "'group-A' in group_names" }
    - { role: role-B, when: "'group-B' in group_names" }
    - { role: splunk-forwarder, when: "splunk_forwarder is defined and splunk_forwarder | length > 0" }
...

So along with some other roles, the splunk-forwarder role is applied, which then uses the config of the other groups.

Another use case - for once one that is not made up and that I really have in my evaluation experiment - is a redis proxy (twemproxy by Twitter) which should forward connections to different redis clusters. Each cluster is defined in a separate group. To get the relevant configuration of all clusters, I include the *group_vars* of all clusters in a loop. The proxy config only holds references to the clusters, like so:

---
redisproxy:
  - host: some-host
    pools:
      - cluster: A
        pool: A
      - cluster: B
        pool: K
...

So this *group_vars* file holds the config for all proxies on all hosts. Here we have 2 pools, each defined in a different cluster. In a loop in the proxy role I then include the group_vars of all the clusters (redis-proxy-*A*, redis-proxy-*B*), which again will be merged hashes.
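Inside the proxy role that loop is roughly this (the path, the missing file extension and the [0] selection are simplifications on my side):

- name: load the group_vars of every referenced cluster
  include_vars: "{{ inventory_dir }}/group_vars/redis-proxy-{{ item.cluster }}"
  with_items: "{{ redisproxy[0].pools }}"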

I would probably approach this by templating the config file if possible
too, though there could be other approaches.

I recommend you set a policy on what you use so that everyone can easily read playbooks and know what might be going on.

In the case of *hash_behaviour* you are right. Since roles are services provided by different teams, we need consistent behavior which developers can rely on. For callback plugins, though, this is still a problem.

I'm not sure how you have callbacks and roles interlocking, as they are not related concepts.

It's just that you need to define the roles on the same level as the ansible.cfg, or the playbook won't find them. The simple requirement "playbook-specific callback" requires a specific ansible.cfg, which requires a root folder for every playbook where the cfg can be placed along with the playbook, which requires the roles on the same level inside this specific folder.

I'm still a bit confused why you would have 100 different chat channels. That seems pretty interesting, but also almost like you'd want a better way of recording than chat channels. Not to say this isn't novel.

You could also write a callback that paid attention to the name of the play or something, though this may require some tweaking.

However, if your custom callback requires configuration, the common mechanism is for it to read an environment variable. This environment variable could even reference the path to a configuration file.
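A minimal sketch of what I mean (file and variable names made up):

# callback_plugins/notify.py - sketch only
import os

class CallbackModule(object):
    """Example callback that reads its target room from an environment variable."""

    def __init__(self):
        # the variable could just as well hold the path to an INI file
        self.room = os.environ.get('NOTIFY_HIPCHAT_ROOM')

    def playbook_on_stats(self, stats):
        if not self.room:
            return  # unconfigured: stay silent
        # ... send the run summary to self.room here ...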

It's not that I need config (well, I do, but that's another topic ;-)) but to enable or disable a callback per playbook. Some team might want to log to a database like you say. Another team might want to send notifications to their Hipchat channel. Who knows. That's up to them; I just try to find a solution that gives them the chance to do whatever they want. Environment variables might be an option, but that's anything but convenient when a user needs to manually set 15 variables before running the playbook and then change them when running another one.

Forgot about this.

Unless I misremember implementing it, a playbook can have a "./callback_plugins" directory relative to it and that will work.

Whether this was a symlink or whatever, that callback plugin could contain
an INI file or something that included a room ID.

From the other mail thread I have seen how to access vars inside the callback plugin, and that might be a handy option. Then it would be possible to enable/disable a callback per group_vars.

but I'd first like to step back and ask what your callbacks *do*

Nothing specific. I don't really have callbacks other than the Hipchat plugin I'm playing with. I just want to find the best possible setup to give our teams the most freedom in the future. But as written before, roles come from different teams, and each team might want to get notified on failure, log changes to a database, or whatever comes to their mind. So I need a flexible framework where things can be configured per playbook and role.

I believe I can work with the settings available in the callback.

In the *all* group_vars we then can define notification settings:
---
notifications:
  playbook-name:
    role-name:
      task-name:
        fail:
          - type: Hipchat
            room: 12345
          - type: Email
            to: me@example.com
        ok:
          - type: Hipchat
            room: 12345
...

Each element (playbook-name, role-name, task-name and the actual callback type name) could be a wildcard matching any value, so a role provider could get notified of failures independent of the playbook name. Each callback would then run through those definitions and either become active or not. Then I can use a simple structure, as we do not require a custom cfg to define a separate set of callbacks.
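The lookup each callback would run could be as simple as this (pure illustration, helper names made up):

def _match(tree, key):
    # return the subtree for key, falling back to a '*' wildcard entry
    if key in tree:
        return tree[key]
    return tree.get('*')

def notifications_for(notifications, playbook, role, task, status):
    # walk playbook -> role -> task -> status through the structure above
    # and return the matching definitions, e.g. [{'type': 'Hipchat', 'room': 12345}]
    node = notifications
    for key in (playbook, role, task, status):
        node = _match(node, key)
        if node is None:
            return []
    return node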

As for user-specific settings vs. company-wide settings: Is there a reason why multiple cfgs are not merged? Would you accept a PR for such a feature?

We thought it would be confusing. Various folks agree and disagree.