Playbook organization with var files

I'm looking for advice about how to organize my playbooks. Not so much
the content as their structure on the file system.

Currently I have all of my configuration management (role-level)
playbooks at the top level with things like common.yml, app.yml,
db.yml, etc. These correspond directly to roles and are pretty much
just like the examples in
http://docs.ansible.com/playbooks_best_practices.html#directory-layout
(along with a site.yml to run through them all for the whole cluster).

But I also have a bunch of playbooks for doing various things like
rolling upgrades of certain applications, updating configurations of
other applications, creating DB snapshots and replication, AWS server
creation, etc. I've started to accumulate a lot of these and have
started to put them into a plays/ directory organized into per-topic
subdirectories. This involved a little bit of rewriting, since all of
the paths (files, templates, etc.) needed "../../" prefixes because
they are loaded relative to the playbook and not the cwd. Not too bad.

But this means that the variables in group_vars, host_vars, etc. aren't
loaded automatically. I've tried adding a boilerplate "vars_files"
section to each playbook to load everything it needs, something like
this:

  vars_files:
    - ../../group_vars/all.yml
    - ../../group_vars/{{ ec2_tag_environment }}.yml
    - ../../group_vars/{{ ec2_tag_role }}.yml

(where ec2_tag_environment and ec2_tag_role are facts provided by
ec2.py that correspond to groups)

Not only would that be annoying to have to copy/paste into each of
these playbooks, but this doesn't quite seem to work. Variables loaded
by vars_files don't seem to go into hostvars for that host. I assume
they are just globally scoped but I have other plays and included
files that rely on things to be in hostvars and I'd rather not rewrite
those just because the playbooks moved around (and sometimes they
can't be changed because they are included in top-level playbooks
too). I guess changing paths to files seems reasonable when moving
things, but not the scope of variables.

So, what am I doing wrong? Should I just have dozens (and in the
future likely hundreds) of playbooks littered at the top level? Could
ansible look for group_vars in the CWD as well as the location of the
playbook? Something else?

Thanks,

I’d consider using roles and just making your role path configurable, which would still make it easy to keep all your plays in whatever location you like.

(You could even keep roles in different repos, etc)
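As a rough sketch of that suggestion: a play kept in a subdirectory can still apply roles by name if the role search path comes from configuration rather than from the playbook's location. The file path, group name, and role names below are assumptions for illustration only; the part that matters is the `roles_path` setting in the `[defaults]` section of ansible.cfg.

```yaml
# plays/mysql/provision.yml  (hypothetical location and file name)
#
# Assumes ansible.cfg contains, under [defaults], something like:
#   roles_path = /path/to/repo/roles
# so that roles resolve by name regardless of this file's directory.
- hosts: db            # assumed group name
  roles:
    - common           # assumed role names
    - mysql
```

This only addresses where roles are found; it does not by itself change where group_vars are looked up.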

I do use roles rather extensively for configuration management tasks.
But for other (more orchestration related) tasks it doesn't seem to
make much sense or at least I'm not seeing how best to organize that.
For a more concrete example I have some mysql databases, some acting
as master nodes, some as read-only slaves. There is a mysql role and a
mysql slave role that take care of making sure the servers are
provisioned and configured correctly. But I also have a playbook used
when setting up (or resetting) replication between a master and a
slave. This playbook uses some local actions, delegate_to tasks, and
some templates and variables from both of those other roles. So it
doesn't really belong to either role, and making it its own role would
involve duplicating the things it currently shares with those roles
(for instance, roles can't share templates).

But even turning all these orchestration playbooks into roles doesn't
reduce the clutter at the top level since roles can't be executed by
themselves. I'd still need lots of top-level playbooks that called
these roles. And I'm not sure different repos really help that much
either. I'm trying to keep things simple and reduce copy-paste. Having
things in separate repos would increase the amount of things that need
to be copied between those repos.

A lot of these problems could be solved if ansible could look for
group_vars in the CWD. Would that cause other problems?

I don’t feel the cwd should matter in most cases and it should not be significant.

Ansible already looks for group_vars relative to the playbook dir, so keeping your top level playbooks in the same directory is recommended.

“I don't feel the cwd should matter in most cases and it should not be
significant.”

But it does matter in some. I can't get playbooks that aren't in the
same directory as group_vars to behave like those that are. And
there's nothing in the documentation that makes this obvious as to why
or that says that using subdirectories to organize playbooks is a bad
idea.

“Ansible already looks for group_vars relative to the playbook dir, so
keeping your top level playbooks in the same directory is recommended.”

Am I doing something wrong by having so many playbooks? This is a
fairly smallish project with 20 roles, a site.yml, and 20+ other
playbooks for orchestrating other common tasks. And that's not
counting any of the multiple task files in roles that are included.
That right there is getting close to 50 playbook files sitting at the
top level if I did it that way. And this project is just getting
started.

Using the cwd as an extra path lookup seems like a pretty easy way to
add this capability without any problems that I can see.

“But it does matter in some. I can’t get playbooks that aren’t in the
same directory as group_vars to behave like those that are.”

This is not because of cwd at all, but because of the playbook basedir.

You're right that the cwd isn't causing the difference in behavior. I
was trying to point out that it seems weird that playbooks behave
differently depending on where they are located, that these
differences are more than just path differences, and that using the
cwd in addition to the playbook basedir would remove them in lots of
cases.

But you're right that it wouldn't remove them for all cases because
things could still not work if your cwd wasn't the same between
executions.

At the very least the docs should mention that you shouldn't put
playbooks in subdirectories if that's an unsupported setup or at least
mention what things won't work right if you do.

You want to change the behavior from relative to the playbook to relative to where you run it from? I find that less intuitive. Having it relative to the playbook gives you clear ‘packaging’. If I decide to run the playbook while I’m in /var/tmp and it now behaves differently, I would find that very confusing.

I don't want to change the current behavior, just add to it. This
wouldn't stop using the playbook basedir as the primary place to look.
But you're right, it would change the execution based on where you
execute the command, though only if you were using the cwd feature.

If there's a better way then I'm open to it. Maybe it's a flag that
can be turned on per playbook? Or maybe a playbook can have a
search_paths parameter to list places to look. Or something else?

All of this cwd discussion comes out of the fact that I'm trying to
avoid having 50+ (now, likely closer to 100 later on) playbooks
littering a single directory. If there's a better way, I'm open to it.

I have many more playbooks than that. I use subdirs and symlinks for common resource files and directories, and I don’t see it as a big issue. I have also been consolidating many things into roles, which simplifies the sharing.

So how do you solve loading group_vars automatically? Or if you don't,
how do you resolve the differences between how group_vars are treated
when automatically loaded vs loading them explicitly with vars_files?
That's the issue that led me on this wild chase.

group_vars are inventory related, vars_files are play related. I’m not sure where the confusion or hardship comes from.
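To make that distinction concrete, here is a minimal sketch of the two ways a play can pull in the same files. The play itself is hypothetical; the paths and the `ec2_tag_environment` fact are taken from the example earlier in the thread. Whether variables loaded by the `include_vars` task end up visible through `hostvars` the way group_vars do is an assumption worth checking on your Ansible version, which is what the final debug task is for.

```yaml
# Hypothetical play in plays/<topic>/, two levels below group_vars/.
- hosts: all
  vars_files:
    # Play-level: these variables are attached to this play.
    - ../../group_vars/all.yml
  tasks:
    # Task-level alternative: include_vars runs once per host, so the
    # file name can depend on per-host facts. playbook_dir is the
    # directory containing this playbook.
    - name: Load environment-specific variables for this host
      include_vars: "{{ playbook_dir }}/../../group_vars/{{ ec2_tag_environment }}.yml"

    # Inspect what is actually reachable through hostvars afterwards.
    - name: Show this host's entry in hostvars
      debug:
        var: hostvars[inventory_hostname]
```

Neither form replaces the automatic group_vars loading; both still have to be repeated in every play that needs them.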

I do mostly use group_vars relative to the inventory dir, not the plays.
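A sketch of what that arrangement might look like. Ansible loads group_vars found next to the inventory source as well as next to the playbook, so a play in a subdirectory can pick them up without any vars_files boilerplate. All directory, file, and variable names here are assumptions for illustration.

```yaml
# Assumed layout (all names hypothetical):
#
#   inventory/
#     ec2.py
#     group_vars/
#       all.yml            # defines example_setting
#   plays/
#     mysql/
#       replication.yml    # this file
#
- hosts: all
  tasks:
    - name: Show a variable defined in inventory/group_vars/all.yml
      debug:
        var: example_setting   # hypothetical variable name
```

With this layout the play would be run as something like `ansible-playbook -i inventory/ec2.py plays/mysql/replication.yml`, and the group_vars lookup follows the `-i` argument rather than the playbook's directory.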