Advice on variable organization.

C_Morgan_Hamill · November 21, 2013, 3:15pm

Howdy folks,

I was hoping to get some tips on how best to organize variables in
Ansible. Right now, I don't have hard-and-fast rules, but I'm generally
sticking to the following patterns:

  * 'group_vars/all' contains any variables which I will need to access
    over and over, such as domain-level DNS settings, user information,
    Yum server URL, etc.

  * Files for individual groups under 'group_vars' contain any variables
    which must be accessible for all machines in a group (but not other
    machines), and any variables which need to be overridden from
    'group_vars/all'.

  * Files for individual machines under 'host_vars' almost exclusively
    contain information related to network device configuration, but
    also may occasionally contain overrides of group variables.

  * In roles, I use the 'vars/main.yml' for any variables which:
      a) I won't need to access from outside the role.
      b) I deem more closely associated with the role itself than, say,
         the group(s) which will be having that role applied.

  * I pretty much tend not to use 'defaults/main.yml', because:
      a) I don't want inventory variables to override them through
         accidental namespace collisions.
      b) I like the explicitness of passing overrides to roles in the
         play itself.
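
For illustration only, the conventions listed above might look roughly like the sketch below. Each '---' section stands for a separate file, and every name and value in it (dns_domain, yum_server_url, the 'webservers' group, the 'tomcat' role and its tomcat_port parameter) is hypothetical rather than taken from the poster's actual setup.

```yaml
---
# group_vars/all -- values needed everywhere (hypothetical names and values)
dns_domain: example.com
yum_server_url: http://yum.example.com/repo

---
# group_vars/webservers -- group-only values, plus an override of 'all'
yum_server_url: http://yum-web.example.com/repo
webserver_port: 80

---
# host_vars/web01.example.com -- per-host network device configuration
network_interfaces:
  - device: eth0
    address: 192.0.2.10

---
# site.yml -- an override passed explicitly to a role in the play itself
- hosts: webservers
  roles:
    - { role: tomcat, tomcat_port: 8080 }
```

Passing tomcat_port in the play, rather than relying on 'defaults/main.yml', is what keeps the override visible at the point where the role is applied.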

Now, this mostly seems to work, but I've noticed a few issues:

  * Most of these choices are based on some variant of where I might
    need to access the variables, which strikes me as *almost* an
    implementation detail, with no real semantic significance.

  * 'group_vars/all' has the potential to get _huge_. I could see
    hitting thousands of lines before too long. This can be very
    annoying and error-prone to edit and browse.

  * Deciding whether a variable should be a group variable or a role
    variable is a pretty arbitrary process. I can feel my brain trying
    to ask what in the hell is the difference whenever I am making the
    decision. Part of this is that groups and roles often map 1:1, so
    in practice there may be no real distinction between the two.

Now, the middle issue is sort of the clincher. I find myself, because
of this, wanting to put more things into role variables, because then
they are kept near their associated tasks and templates, but I often
cannot because I want to be able to access them later on from hosts
which will not have that role applied.

As a concrete example, I'm currently writing a playbook to set up
a Gitolite installation, and am going to define the needed repositories
in a hash table and then write them out to the configuration file in
a template. This is likely to be quite verbose.
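
A minimal sketch of what such a hash table could look like, wherever it ends up living. The keys under each repository (path, owners, writers) and the server name are assumptions made for the example; a template would then loop over 'repo' to write out the Gitolite configuration.

```yaml
---
# Repository definitions (hypothetical structure and values)
gitolite_server: git.example.com
repo:
  foobar:
    path: apps/foobar
    owners: [alice]
    writers: [bob, carol]
  bazqux:
    path: apps/bazqux
    owners: [alice]
    writers: [dave]
```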

My initial inclination is to put this in
'roles/gitolite/vars/main.yml', because it would be nice and
centralized. I am going, however, to want to use some values from these
when configuring other roles to deploy applications kept here, doing
something like:

    - name: Deploy software from git repository.
      git: repo=ssh://{{ gitolite_server }}/{{ repo.foobar.path }} ...

So in practice I will need to put the repository definitions in
'group_vars/all' or perhaps in some specific group's variable file and
then access it indirectly via something like:

'{{ hostvars[groups['some_group'][0]]['repos'] }}'

Which would give me a really kludgey kind of namespacing.
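
Spelled out, that indirection might look like the sketch below. The group name 'some_group' comes from the expression above; the repository entry, the 'dest' path, and the assumption that gitolite_server is defined somewhere visible to the deploying host are all hypothetical.

```yaml
---
# group_vars/some_group -- the definitions live with one specific group
repos:
  foobar:
    path: apps/foobar

---
# A task in some other role, run on hosts that are NOT in 'some_group'.
# It reaches the definitions through the first host of that group.
- name: Deploy software from git repository.
  git:
    repo: "ssh://{{ gitolite_server }}/{{ hostvars[groups['some_group'][0]]['repos']['foobar']['path'] }}"
    dest: /srv/foobar
```

The lookup only works as long as 'some_group' has at least one host, which is part of what makes it feel kludgey.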

I'm feeling as though I'm doing something silly here. So how do other
folks organize variables? Am I thinking about this in an insane way?

Thanks in advance for any help.

tannerjc · November 21, 2013, 4:01pm

In this case, maybe you could move to external sources for these
vars? A couple lookup plugins come to mind ...

etcd -- source data from an etcd instance
redis_kv -- source data from redis
pipe -- get data from any arbitrary command
dnstxt -- grab the TXT record from a dns server

More detail:
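
As a rough sketch of how two of these lookups could feed variables, using the plugin names as given in this thread (in later Ansible releases some of them may live in separate collections). The variable names and the file path are hypothetical, and 'dnstxt' is assumed to have its Python DNS dependency installed.

```yaml
---
# A vars file, or the 'vars:' section of a play (hypothetical names)

# 'pipe' runs a command on the control machine and returns its output.
repo_list_raw: "{{ lookup('pipe', 'cat /etc/ansible/repos.txt') }}"

# 'dnstxt' returns the TXT record published for a DNS name.
site_txt_record: "{{ lookup('dnstxt', 'example.com') }}"
```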

jpmens1 · November 21, 2013, 4:43pm

> In this case, maybe you could move to external sources for these
> vars? A couple lookup plugins come to mind ...
>
> etcd -- source data from an etcd instance
> redis_kv -- source data from redis
> pipe -- get data from any arbitrary command
> dnstxt -- grab the TXT record from a dns server

Another one comes to mind: csvreader [1]

(Sorry: I couldn't resist, even though M. is going to impose two weeks
of Van Halen at full blast on me, and rightfully so.)

-JP

[1] https://github.com/ansible/ansible/pull/4987

Serge_van_Ginderacht · November 21, 2013, 7:24pm

>   * 'group_vars/all' contains any variables which I will need to access
>     over and over, such as domain-level DNS settings, user information,
>     Yum server URL, etc.
>
>   * Files for individual groups under 'group_vars' contain any variables
>     which must be accessible for all machines in a group (but not other
>     machines), and any variables which need to be overridden from
>     'group_vars/all'

So far this sounds pretty natural to me, and it's how I do it myself.

>   * Files for individual machines under 'host_vars' almost exclusively
>     contain information related to network device configuration, but
>     also may occasionally contain overrides of group variables.

In my case, at the moment, all I have here are the iSCSI IDs that some hosts need to connect to.

>   * In roles, I use the 'vars/main.yml' for any variables which:
>       a) I won't need to access from outside the role.
>       b) I deem more closely associated with the role itself than, say,
>          the group(s) which will be having that role applied.

As I see it, 'vars' in roles are pretty much the same as 'vars' or 'vars_files' in a play. So far, I only use them for data that is never host-specific.

>   * Most of these choices are based on some variant of where I might
>     need to access the variables, which strikes me as *almost* an
>     implementation detail, with no real semantic significance.

How I see it:

Inventory variables are host-specific, and group_vars/all is for defaults that are often overridden.
I consider play or role vars to be static variables that never change.
But, so far, I don't have many roles that are used multiple times. I mostly use them to organize things, and don't have cases where I parametrize roles.

Also at my current gig, playbooks and roles are very specific, and not something I would share, as they are too specific. Once I write things I would want to share, I definitely would start using more role/default variables, and leave role/vars empty to allow other people to customize things to their liking once they integrate them with their own plays.

>   * 'group_vars/all' has the potential to get _huge_. I could see
>     hitting thousands of lines before too long. This can be very
>     annoying and error-prone to edit and browse.

Yes, that file gets huge, although it hasn't bothered me yet. Right now I have 280 lines in that file.

Some of those variables, however, are things I don't override in subgroups. They are dicts with a bunch of settings that belong together, tied to a particular key, e.g.

    JAVA_PACKAGE_HOME:
       sun-java6: /usr/lib/jvm/java-6-sun
       sun-java7: /usr/lib/jvm/java-7-sun
    JAVA_HOME: "{{ JAVA_PACKAGE_HOME[JAVA_PACKAGE] }}"

These are not defaults, but a kind of db where I retrieve stuff based on JAVA_PACKAGE set per group.
I could put them somewhere in a vars file, but then I need to remember to include it in each play that might need it.
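
To make the lookup-table idea concrete, here is a small sketch of the per-group half of it. The 'tomcat' group file and the debug task are hypothetical; they assume the JAVA_PACKAGE_HOME table and JAVA_HOME definition shown above sit in group_vars/all.

```yaml
---
# group_vars/tomcat -- hypothetical group that selects one key of the table
JAVA_PACKAGE: sun-java7

---
# A task run on a host in that group. JAVA_HOME is only templated when it
# is used, so it picks up the JAVA_PACKAGE value that applies to this host.
- name: Show the Java home selected for this host.
  debug:
    msg: "JAVA_HOME is {{ JAVA_HOME }}"
```

With these values, JAVA_HOME should resolve to the sun-java7 entry of the table for hosts in the 'tomcat' group.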

Now that I think of it, I could see a use for a playbook-wide vars file here (not per play, but for all plays). But I guess there are better options than this kind of new syntax.

>   * Deciding whether a variable should be a group variable or a role
>     variable is a pretty arbitrary process. I can feel my brain trying
>     to ask what in the hell is the difference whenever I am making the
>     decision. Part of this is that groups and roles often map 1:1, so
>     in practice there may be no real distinction between the two.

Yes, roles often map to a group, but are all variables specific only to the role, or to that whole group?
I have a tomcat role mapping to a tomcat group, installing Tomcat applications. File locations are mostly identical everywhere, but which application to install, and which version, differs between certain application groups, etc.

> Now, the middle issue is sort of the clincher. I find myself, because
> of this, wanting to put more things into role variables, because then
> they are kept near their associated tasks and templates, but I often
> cannot because I want to be able to access them later on from hosts
> which will not have that role applied.

Yes, I can see that happening. For example, I have a vars file where I define users: basically a list of users with dicts setting usernames, descriptions, etc. It's a file I want to reuse in different plays, sometimes within a play, sometimes within a role; but in the latter case I can have only one role/vars/main.yml, so what do I do if I need more of these files in the role?

But this is now solved with the new include_vars module, which allows you to load variables dynamically and conditionally, and keeps them available in later plays.
Heck, I wonder if I could replace group_vars with it; it might even be much more efficient?

I didn’t get to look into that module in detail yet, but I think it is very promising and very flexible.
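
A minimal sketch of how include_vars might be used for this, assuming a role named 'common' with extra files next to its vars/main.yml. The role name, file names, and the condition are hypothetical; relative paths are expected to be looked up in the role's vars/ directory.

```yaml
---
# roles/common/tasks/main.yml (hypothetical role and file names)
- name: Load the shared user definitions.
  include_vars: users.yml

- name: Load extra settings only on Red Hat family hosts.
  include_vars: redhat_extras.yml
  when: ansible_os_family == 'RedHat'
```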

Serge

Giorgio_Valoti · November 22, 2013, 10:35am

"C. Morgan Hamill" <chamill@wesleyan.edu> writes:

> I was hoping to get some tips on how best to organize variables in
> Ansible. Right now, I don't have hard-and-fast rules, but I'm generally
> sticking to the following patterns:
>
>   * In roles, I use the 'vars/main.yml' for any variables which:
>       a) I won't need to access from outside the role.
>       b) I deem more closely associated with the role itself than, say,
>          the group(s) which will be having that role applied.
>
>   * I pretty much tend not to use 'defaults/main.yml', because:
>       a) I don't want to accidentally have inventory variables override
>          them because of accidental namespace collisions.
>       b) I like the explicitness of passing overrides to roles in the
>          play itself.

I pretty much do the opposite: I don't use `vars/main.yml`, and I put
only the common defaults in `defaults/main.yml`. Basically, I pass every
variable from the including playbook.

> Now, this mostly seems to work, but I've noticed a few issues:
>
>   * Most of these choices are based on some variant of where I might
>     need to access the variables, which strikes me as *almost* an
>     implementation detail, with no real semantic significance.
>
>   * 'group_vars/all' has the potential to get _huge_. I could see
>     hitting thousands of lines before too long. This can be very
>     annoying and error-prone to edit and browse.
>
>   * Deciding whether a variable should be a group variable or a role
>     variable is a pretty arbitrary process. I can feel my brain trying
>     to ask what in the hell is the difference whenever I am making the
>     decision. Part of this is that groups and roles often map 1:1, so
>     in practice there may be no real distinction between the two.

I group most of the vars by OS using this technique:

http://www.ansibleworks.com/docs/playbooks_best_practices.html#operating-system-and-distribution-variance

with the goal to separate OS-specific stuff from the rest. It's not
perfect but it seems to work quite well.
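
As far as I understand the linked section, the technique is to build groups from a gathered fact and then attach group variables to those groups. A sketch under that assumption follows; the 'os_' prefix, the web_package variable, and its value are made up for the example.

```yaml
---
# site.yml (hypothetical playbook)
- hosts: all
  tasks:
    - name: Put each host into a group named after its distribution.
      group_by:
        key: "os_{{ ansible_distribution }}"

- hosts: os_CentOS
  gather_facts: false
  tasks:
    - name: Use a value that comes from group_vars/os_CentOS.
      debug:
        msg: "Web server package here is {{ web_package }}"

---
# group_vars/os_CentOS -- OS-specific values kept apart from the rest
web_package: httpd
```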

HTH

C_Morgan_Hamill · November 22, 2013, 2:26pm

Thanks so much for all the input, folks.

Excerpts from Serge van Ginderachter's message of 2013-11-21 14:24:15 -0500:

> Also at my current gig, playbooks and roles are very specific, and not
> something I would share, as they are too specific. Once I write things I
> would want to share, I definitely would start using more role/default
> variables, and leave role/vars empty to allow other people to customize
> things to their liking, once they integrate them with their own plays.

This is the case for me also, so I've been tempted to put things in
role/vars on the grounds that I'm essentially using roles for
organization, but not so much for modularity/generality. Of course,
then I suddenly realize I need to access those variables somewhere else
entirely, and they get yanked into group_vars/all.

It has occurred to me that it might be nice to have a 'rolevars' that
functions analogously to 'hostvars', so I could do something like:

     {{ rolevars.gitolite.foo.bar }}

If something like that were accessible from outside a role, it would
mean that roles could do double-duty as a variable store. I'm not sure
how this would interface with role/defaults, however.

> Some of those variables however are things I don't overrule in sub groups.
> They are dicts with a bunch of settings that belong together, tied to a
> particular key, e.g.
>
> JAVA_PACKAGE_HOME:
>    sun-java6: /usr/lib/jvm/java-6-sun
>    sun-java7: /usr/lib/jvm/java-7-sun
> JAVA_HOME: "{{ JAVA_PACKAGE_HOME[JAVA_PACKAGE] }}"
>
> These are not defaults, but a kind of db where I retrieve stuff based on
> JAVA_PACKAGE set per group.
> I could put them somewhere in a vars file, but then I need to remember to
> include it in each play that might need it.
>
> Now I think of it, I could see a use of a playbook wide (as not per play,
> but for all plays) vars file here. But I guess there are better options,
> than this kind of new syntax.

I think this is probably the heart of the issue, for me. I want to use
our ansible inventory as our canonical source of information for almost
everything. Other folks have mentioned lookup plugins or other external
sources, but that seems silly when I'd use those sources solely from
within ansible.

> Yes, I can see that happen. I for example have a vars file where I define
> users. Basically a list of users with dicts setting usernames,
> descriptions, etc. It's a file I want to reuse in different plays,
> sometimes within a play, sometimes within a role; but in the latter case I
> can have only 1 role/vars/main.yml; so what to do if I have more of these
> files I need in the role?
>
> But this is now solved with the new include_vars module, which allows you to
> load variables dynamically, conditionally, and keep them in later plays.
> Heck, I wonder if I could replace group_vars with it, might even be much
> more efficient?

This could be a reasonable workaround.

I'm thinking that one of three things makes the most sense for my setup:

  1. Suck it up and let 'group_vars/all' get very long.

  2. Place the 'database-y' part of my inventory into an external source
     and have an inventory script that just sticks the contents of it
     into the group vars for the 'all' group.

  3. Patch the 'group_vars.py' plugin to allow 'group_vars/all' to be
     a directory, and to suck in the contents of every YAML file
     underneath that directory, so I could have separate vars files for
     each category:

        group_vars/all/users.yml
        group_vars/all/dns.yml
        group_vars/all/webapps.yml

     This would essentially be equivalent to option one, but allow me to
     break out 'group_vars/all' into individual files.

     Not sure if a patch to that effect would be accepted or if I'd have
     to maintain my own.
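
A sketch of what option 3 could look like in practice, assuming 'group_vars/all' may be a directory whose YAML files are all loaded for the 'all' group, which is exactly what the option proposes. The file contents are hypothetical.

```yaml
---
# group_vars/all/users.yml (hypothetical contents)
users:
  - name: alice
    comment: Alice Example
  - name: bob
    comment: Bob Example

---
# group_vars/all/dns.yml (hypothetical contents)
dns_domain: example.com
dns_servers:
  - 192.0.2.53
```

Every host would see the same variables as with one long 'group_vars/all' file; only the editing and browsing experience changes.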

Does anyone have any thoughts on that?

tannerjc · November 22, 2013, 2:31pm

>   3. Patch the 'group_vars.py' plugin to allow 'group_vars/all' to be
>      a directory, and to suck in the contents of every YAML file
>      underneath that directory, so I could have separate vars files for
>      each category:
>
>         group_vars/all/users.yml
>         group_vars/all/dns.yml
>         group_vars/all/webapps.yml
>
>      This would essentially be equivalent to option one, but allow me to
>      break out 'group_vars/all' into individual files.
>
>      Not sure if a patch to that effect would be accepted or if I'd have
>      to maintain my own.

This already exists =)

https://github.com/ansible/ansible/pull/4758

C_Morgan_Hamill · November 22, 2013, 3:02pm

Excerpts from James Tanner's message of 2013-11-22 09:31:37 -0500:

> This already exists =)
>
> https://github.com/ansible/ansible/pull/4758

This is the best kind of surprise!
