Modeling question: roles vs. plays

We have roughly 30 Python web applications because we use a service-oriented architecture (SOA). Each listens on a different port, needs different nginx routing rules, and needs different Python packages (and sometimes OS packages), but they share a fairly consistent structure. Each one takes a few minutes to install, mostly because of the pip installs. In some environments they run on separate hosts (e.g. prod); in others they all run on one host (e.g. dev).

We've currently modeled each app as a role, with each role having some role dependencies on common roles that actually do most of the work (because the apps have similar structure) and we pass params for app name and such.

This works, though we've hit a few problems with the way role dependencies behave: they are pre-computed, and once Ansible decides to exclude a role because one role doesn't need it, Ansible doesn't run that role even if other roles need it. When we mentioned these problems on the mailing list and IRC, a lot of people remarked that role dependencies were a feature to be avoided, because they were only added to placate folks who kept asking for them. At least one person thought role dependencies should be removed (this person may or may not have any influence).

I'm wondering if we would be better off modeling our apps as separate plays rather than roles. That is motivated by:

1. The subtle behavior of role dependencies.
2. The fact that it takes a long time to install all of our apps to one host because it's serial. We don't benefit from ansible's ability to parallelize across hosts. One possible way to address this is to add a bunch of fake hosts to the inventory to trick Ansible into parallelizing (hacky). Another is to make Ansible only worry about doing one app and then launch multiple ansible-playbook procs in parallel from the shell.
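
A minimal sketch of that second option, driven from Python rather than the shell. It assumes a hypothetical playbook `app.yml` that installs exactly one app, selected by a hypothetical `app_name` extra variable; the app names are made up too. It's an illustration of the idea, not something we actually run.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical app names and playbook path; substitute your own.
APPS = ["billing", "search", "accounts"]
PLAYBOOK = "app.yml"


def build_command(app, playbook=PLAYBOOK):
    """Return the ansible-playbook argv that installs a single app."""
    # -e (--extra-vars) passes the app name into the playbook as a variable.
    return ["ansible-playbook", playbook, "-e", f"app_name={app}"]


def run_all(apps, max_workers=4, runner=subprocess.call):
    """Run one ansible-playbook process per app, at most max_workers at once.

    Returns a dict mapping each app name to its process exit code.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = pool.map(lambda app: runner(build_command(app)), apps)
        return dict(zip(apps, codes))


if __name__ == "__main__":
    results = run_all(APPS)
    failed = [app for app, code in results.items() if code != 0]
    if failed:
        raise SystemExit("failed: " + ", ".join(failed))
```

One caveat with this approach: parallel runs against the same host can still contend with each other, for example on the OS package manager's lock, so the OS-package steps may need to stay serial.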

Wondering if others have been in the same boat and what they did. Or if there are some guidelines that help with deciding...?

Marc

We have a good 30 or so Python web applications because we have an SOA architecture. Each listens on a different port, requires different nginx routing rules, requires different Python packages (and sometimes OS packages) but they have a fairly consistent structure. Each one takes a few minutes to install, mostly because of the pip installing. In some environments, they will run on separate hosts (e.g.: prod); in others they’ll run on one host (e.g.: dev)

We’ve currently modeled each app as a role, with each role having some role dependencies on common roles that actually do most of the work (because the apps have similar structure) and we pass params for app name and such.

That’s about how we do it as well. We do split the tasks that aren’t related to the app into separate roles, though. So the first play runs the ‘common’, ‘collectd’, etc. roles on all hosts, and then there’s a play for each hostgroup with the app roles in it.

This works though we’ve had a few problems where the way role dependencies work caused some problems for us (role dependencies are pre-computed and once ansible decides to exclude a role because one role doesn’t need it, that means ansible doesn’t use that role even if other roles need it).

Can you give an example? I’ve never experienced it myself, and I use role deps quite a lot.

Anyway, when we mentioned these problems on the mailing list and IRC, a lot of people remarked that role dependencies were a feature to be avoided, because they were added to placate folks that kept asking for them. At least one person thought role dependencies should be removed (this person may or may not have any influence).

I’m wondering if we would be better off modeling our apps as separate plays rather than roles. That is motivated by:

  1. The subtle behavior of role dependencies.
  2. The fact that it takes a long time to install all of our apps to one host because it’s serial. We don’t benefit from ansible’s ability to parallelize across hosts. One possible way to address this is to add a bunch of fake hosts to the inventory to trick Ansible into parallelizing (hacky). Another is to

Installing a dev machine from scratch does take longer, but you only do it once. After that it’s all updates, so you don’t redeploy everything. To speed things up, you can try these:

  • Tag the tasks that only need to run the first time (e.g. apt-get install of dependencies, creating users, etc.) and then --skip-tags them on later deployments. This can save some time if ansible takes a while to realize there’s nothing to be done.
  • Create binary dists/virtual envs/docker images and deploy those. Can save some ‘pip install’ time.
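
To make the tagging idea concrete, here is a rough sketch of a small wrapper that adds `--skip-tags` on every run except the first. The tag name `bootstrap` and the playbook name are hypothetical; it assumes the one-time tasks in the playbook carry that tag.

```python
def playbook_command(playbook, first_run, bootstrap_tag="bootstrap"):
    """Build the ansible-playbook argv for a first install or a later update."""
    cmd = ["ansible-playbook", playbook]
    if not first_run:
        # --skip-tags leaves out every task carrying the given tag.
        cmd += ["--skip-tags", bootstrap_tag]
    return cmd


if __name__ == "__main__":
    # Later deployments: the one-time tasks are skipped.
    print(" ".join(playbook_command("site.yml", first_run=False)))
```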

Other than that, I work pretty much like you. Improving playbook run times is on my todo list somewhere :P