Choosing a subset of hosts to run a playbook on

We do not have any slicing yet. (Like ansible-playbook foo.yml --slice 0-10%)

Options include:

(A)

Create subgroups and pass them in like so:

hosts: $hosts

ansible-playbook foo.yml --extra-vars="hosts=blarg"

(B)

Use a stage environment

> Hi,
>
> (disclaimer: I’m completely new to ansible)
>
> Say I have a playbook set up to run on “hosts: webservers”. Is there a way to run this playbook only on a subset of these hosts? I’d like to deploy the playbook gradually (on a human time scale, not just “don’t forkbomb”) instead of pushing the changes everywhere at once.

We do have “-i hostname,hostname,hostname” (at least one comma is required), which is a hair gross and ignores your inventory file. It is not really recommended, and I still want a cleaner way of doing that.

> I saw a post that said to use “hosts: $group” and pass $group on the command line but (IMHO) that is ugly and not self-explanatory. I was a bit surprised ansible-playbook does not take a host-list parameter. Would a patch to add it stand a chance of being accepted?

Yes. In this case, it might make sense to do an ansible-playbook --slice-range "0:10", which would select the first 10. That seems cleaner than having to remember and type in a ton of them all at once. I’d really like to see this; though it may not be super trivial, it is definitely not impossible. Most of the love would have to happen in inventory.py, with values passed up from Playbook() via the command line arguments to /usr/bin/ansible-playbook.

Workaround: use child groups, such that “webservers” has child groups “webservers1,2,3,4” etc. The INI format file supports subgroups.

If anyone wants to attempt this, I've made some notes on implementation on

https://github.com/ansible/ansible/issues/691

On Wednesday, 25 July 2012 23:30:41 UTC+2, Michael DeHaan wrote:

> We do not have any slicing yet. (Like ansible-playbook foo.yml --slice 0-10%)

I’d rather know which machines get updated (they’re not identical and replaceable), so “any 10%” doesn’t cut it for me.

> Options include:
>
> (A)
>
> Create subgroups and pass them in like so:
>
> hosts: $hosts
>
> ansible-playbook foo.yml --extra-vars="hosts=blarg"

Is there a way to specify a default, so that:

ansible-playbook foo.yml

runs e.g. on the webservers group while still allowing override from the command line?

> (B)
>
> Use a stage environment

Doesn’t let me deploy changes gradually, only increases my confidence in the changes (up to the point I’m supposed to push to all hosts at the same time?).

> We do have “-i hostname,hostname,hostname” (at least one comma is required), which is a hair gross and ignores your inventory file. It is not really recommended, and I still want a cleaner way of doing that.

A -H inventory_host,inventory_group,inventory_wildcard* would be best for me.

> Yes. In this case, it might make sense to do an ansible-playbook --slice-range "0:10", which would select the first 10. That seems cleaner than having to remember and type in a ton of them all at once. I’d really like to see this; though it may not be super trivial, it is definitely not impossible. Most of the love would have to happen in inventory.py, with values passed up from Playbook() via the command line arguments to /usr/bin/ansible-playbook.

I understand this is often desirable (with e.g. a large number of identical machines fronted by a load balancer) but does not fit my use case. My servers are identical only on some levels (as in, they all have customers managed the same way), but I cannot deploy “somewhere”.

> Workaround: use child groups, such that “webservers” has child groups “webservers1,2,3,4” etc. The INI format file supports subgroups.

This would still require extensions to ansible, wouldn’t it? If I have a playbook with “hosts: webservers”, how do I run it on the group “webservers1”?

Best regards,
Grzegorz Nosek

>> We do not have any slicing yet. (Like ansible-playbook foo.yml --slice 0-10%)
>
> I'd rather know which machines get updated (they're not identical and replaceable), so "any 10%" doesn't cut it for me.

Yeah, but if you slice by ranges (numbers) you would, and the hosts that get updated would be included in the output anyway.

If you have to feed in the full list of hosts, that seems bad to me, because it wouldn't scale in terms of the command line interface experience to a very large number of hosts.

However, this would assume the hosts you want to guinea pig aren't very very specific hosts.
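
To make the range idea a bit more concrete, here is a minimal Python sketch of how a value like "0:10" might be applied to a host list. Nothing like this exists in ansible yet; the function name slice_hosts, the host names, and the choice of alphabetical ordering (so that the same range always picks the same machines) are assumptions made purely for illustration.

```python
def slice_hosts(hosts, spec):
    """Return the hosts selected by a "start:stop" spec such as "0:10"."""
    start_text, stop_text = spec.split(":")
    # Assumption: sort alphabetically so a given range is stable between runs.
    ordered = sorted(hosts)
    start = int(start_text) if start_text else 0
    stop = int(stop_text) if stop_text else len(ordered)
    return ordered[start:stop]


# Hypothetical group of 30 web servers: web01 .. web30.
webservers = ["web%02d" % n for n in range(1, 31)]
first_batch = slice_hosts(webservers, "0:10")
print(first_batch)
```

Because the selected hosts are printed, you would still know exactly which machines a given range touches, as long as the ordering is consistent.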

The great challenge of building these kinds of apps is everyone wants to do it differently, and I'm trying not to build in ten ways to do the same thing … so yeah… need to think about this.

> Is there a way to specify a default, so that:
>
> ansible-playbook foo.yml
>
> runs e.g. on the webservers group while still allowing override from the command line?
>
> A -H inventory_host,inventory_group,inventory_wildcard* would be best for me.

--override-hosts used to exist, but it did things I didn't care for. It added hosts to inventory that were not there, and it also didn't support wildcards.

I can see something like what you say existing, though I'd be tempted to call it something like "--additional-filtering" where hosts must ALSO be in the groups listed in the playbook.

Doing so could be a basis of the slicing functionality too, internals wise
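
As a rough illustration of the "must ALSO be in the groups listed in the playbook" idea, here is a small Python sketch. It is not ansible code: the function name filter_hosts, the dict-of-lists inventory, and the use of fnmatch for wildcards are assumptions. Each pattern may be a host name, a group name, or a shell-style wildcard, along the lines of the -H suggestion.

```python
from fnmatch import fnmatch


def filter_hosts(inventory, play_group, patterns):
    """Keep hosts of play_group that match at least one pattern.

    A pattern may name a host, name a group, or be a shell-style wildcard.
    Hosts outside play_group are never added.
    """
    selected = []
    for host in inventory[play_group]:
        for pattern in patterns:
            in_named_group = host in inventory.get(pattern, [])
            if in_named_group or fnmatch(host, pattern):
                selected.append(host)
                break
    return selected


# Hypothetical inventory, expressed as group name -> list of hosts.
inventory = {
    "webservers": ["web1", "web2", "web3", "db-web1"],
    "webservers1": ["web1", "web2"],
    "dbservers": ["db1"],
}

print(filter_hosts(inventory, "webservers", ["webservers1"]))
print(filter_hosts(inventory, "webservers", ["web3", "db*"]))
print(filter_hosts(inventory, "webservers", ["dbservers"]))
```

The last call returns an empty list: db1 matches the filter but is not in the playbook's group, so nothing is added to inventory behind your back, which was the problem with --override-hosts.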

>> Yes. In this case, it might make sense to do an ansible-playbook --slice-range "0:10", which would select the first 10. That seems cleaner than having to remember and type in a ton of them all at once. I'd really like to see this; though it may not be super trivial, it is definitely not impossible. Most of the love would have to happen in inventory.py, with values passed up from Playbook() via the command line arguments to /usr/bin/ansible-playbook.
>
> I understand this is often desirable (with e.g. a large number of identical machines fronted by a load balancer) but does not fit my use case. My servers are identical only on some levels (as in, they all have customers managed the same way), but I cannot deploy "somewhere".
>
>> Workaround: use child groups, such that "webservers" has child groups "webservers1,2,3,4" etc. The INI format file supports subgroups.
>
> This would still require extensions to ansible, wouldn't it? If I have a playbook with "hosts: webservers", how do I run it on the group "webservers1"?

Ansible already supports groups of groups today, and has had this in 0.5 as well, for INI format files. This is listed in the web docs where it talks about inventory files.

You would have to change it like so:

hosts: $webservers

--extra-vars="webservers=webservers1"

I know you said you didn't like this, but it does give you some degree of control to do some of the above today.

I need to think about the filtering part a bit, but would be curious on thoughts.

On 26.07.2012 13:26, Michael DeHaan wrote:

> Yeah, but if you slice by ranges (numbers) you would, and the hosts
> that get updated would be included in the output anyway.
>
> If you have to feed in the full list of hosts, that seems bad to me,
> because it wouldn't scale in terms of the command line interface
> experience to a very large number of hosts.
>
> However, this would assume the hosts you want to guinea pig aren't
> very very specific hosts.

Actually I *do* want specific hosts. Thinking more about it, what I'd
really like would be to define a hierarchy:

* master group 1
   * subgroup 1
     * host 1
     * host 2
     * etc.
   * subgroup 2
* master group 2

and specify "hosts: master_group_1" in playbooks but still be able to
run these playbooks on:
- a single named host (inside master_group_1)
- a subset of hosts (e.g. 1/3 of subgroup_1)
- a subgroup
- the whole master group

I'd need a way of knowing what exactly "1/3 of subgroup_1" means for
ansible, but that should be doable with a playbook that goes ping (and
consistent ordering of hosts in slices).

> The great challenge of building these kinds of apps is everyone wants
> to do it differently, and I'm trying not to build in ten ways to do
> the same thing … so yeah… need to think about this.

How about letting the user specify an expression like in only_if? If the
expression returned True or False for every host, combined with slicing
it should be quite expressive, right? My examples above could look like:

-F '$name == "host1"'
-F '"subgroup1" in $groups' --slice 0:20
  (hosts 0..19, sorted alphabetically, or in inventory order etc)
-F '"subgroup1" in $groups'
(no filter)

(the $ is nasty due to shell quoting but consistent with only_if)

OK, it would be ugly and wouldn't scale the best if you wanted to pick a
couple of hosts from a group of thousands, but it would be a start.

> I can see something like what you say existing, though I'd be tempted
> to call it something like "--additional-filtering" where hosts must
> ALSO be in the groups listed in the playbook.

Fine for me, I don't care too much about adding hosts ad-hoc. BTW,
--hosts-filter?

> Doing so could be a basis of the slicing functionality too, internals
> wise

Or the slicing could be applied later.

> Ansible already supports groups of groups today, and has had this in
> 0.5 as well, for INI format files. This is listed in the web docs
> where it talks about inventory files.
>
> You would have to change it like so:
>
> hosts: $webservers
>
> --extra-vars="webservers=webservers1"
>
> I know you said you didn't like this, but it does give you some
> degree of control to do some of the above today.

OK, I thought I missed something.

> I need to think about the filtering part a bit, but would be curious
> on thoughts.

Thinking out loud :)

Best regards,
Grzegorz Nosek

It seems to me like this complexity should be contained in the
configuration, not the application, since it's not applicable to
everyone, and even the people who want it sometimes probably don't want
it most of the time.

If I were to address the OP's subset-of-a-group problem in
configuration, I'd create a webservers-a group and a webservers-b group,
and then make the webservers group contain webservers-a and
webservers-b. This lets you address parts of a group specifically, so
you can roll out apps to part of your infrastructure, without adding
complex filtering logic that can introduce parsing and semantic bugs for
everyone.
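
For illustration, here is roughly what that layout looks like, together with a Python sketch that expands the parent group into its hosts. This is only a stand-in built on Python's configparser, not ansible's own inventory parser, and the host names and the expand helper are made up for the example.

```python
import configparser

# Hypothetical INI inventory: "webservers" contains two child groups.
INVENTORY = """
[webservers-a]
web1.example.com
web2.example.com

[webservers-b]
web3.example.com

[webservers:children]
webservers-a
webservers-b
"""


def expand(parser, group):
    """Return the hosts of a group, following its ':children' section if any."""
    hosts = list(parser[group]) if parser.has_section(group) else []
    children = group + ":children"
    if parser.has_section(children):
        for child in parser[children]:
            hosts.extend(expand(parser, child))
    return hosts


# Host lines have no "key = value" form, so allow keys without values.
parser = configparser.ConfigParser(allow_no_value=True, delimiters=("=",))
parser.read_string(INVENTORY)

print(expand(parser, "webservers-a"))
print(expand(parser, "webservers"))
```

A playbook can then target webservers-a for the first stage of a rollout and webservers for everything.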

If this is insufficient for some needs, I would go so far as to say that
what is needed is a separate app that dynamically writes inventory files
for ansible, that would allow for complex selection criteria without
complexifying ansible.

> If this is insufficient for some needs, I would go so far as to say that
> what is needed is a separate app that dynamically writes inventory files
> for ansible, that would allow for complex selection criteria without
> complexifying ansible.

It's called "external inventory". [1]

        -JP

[1] http://ansible.github.com/api.html#external-inventory
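
A minimal sketch of such a script in Python, assuming the usual contract: called with --list it prints a JSON dict mapping group names to lists of hosts, and called with --host <hostname> it prints a JSON dict of variables for that host. The host names, the customer_tier variable, and the beta_customers group are invented for the example; a real script would pull them from whatever data source you have.

```python
import json
import sys

# Hypothetical per-host data; a real script would query an external source.
HOST_VARS = {
    "web1.example.com": {"customer_tier": "beta"},
    "web2.example.com": {"customer_tier": "standard"},
    "web3.example.com": {"customer_tier": "beta"},
}

# Groups are computed, so selection criteria live here and not in ansible.
GROUPS = {
    "webservers": sorted(HOST_VARS),
    "beta_customers": sorted(
        host for host, facts in HOST_VARS.items()
        if facts["customer_tier"] == "beta"
    ),
}


def render(argv):
    """Return the JSON text an inventory script would print for argv."""
    if argv and argv[0] == "--host":
        hostname = argv[1] if len(argv) > 1 else ""
        return json.dumps(HOST_VARS.get(hostname, {}))
    # Treat anything else, including no arguments, as --list.
    return json.dumps(GROUPS)


if __name__ == "__main__":
    print(render(sys.argv[1:]))
```

Saved as an executable file, it would be passed to ansible-playbook with -i in place of a static inventory file, and a playbook could then use "hosts: beta_customers".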

On 26.07.2012 15:31, Jan-Piet Mens wrote:

>> If this is insufficient for some needs, I would go so far as to say that
>> what is needed is a separate app that dynamically writes inventory files
>> for ansible, that would allow for complex selection criteria without
>> complexifying ansible.
>
> It's called "external inventory". [1]
>
>         -JP
>
> [1] http://ansible.github.com/api.html#external-inventory

Perfect! Well, almost, but I'm sure I can massage it into something that
fits my needs.

FWIW, here's my reply to Darren Chamberlain written before your mail
(still somewhat relevant).

Yep, if people have a lot of “I really want this workflow” requirements, the external inventory stuff is indeed the best option.

We will probably still pursue some native slicing support, since some other apps have that and I find it interesting, and it’s easy to do.

> On 26.07.2012 15:31, Jan-Piet Mens wrote:
>
>>> If this is insufficient for some needs, I would go so far as to say that
>>> what is needed is a separate app that dynamically writes inventory files
>>> for ansible, that would allow for complex selection criteria without
>>> complexifying ansible.
>>
>> It's called "external inventory". [1]
>>
>>          -JP
>>
>> [1] http://ansible.github.com/api.html#external-inventory
>
> Perfect! Well, almost, but I'm sure I can massage it into something that
> fits my needs.

Yes, this is basically what I meant. Obviously I need to reread the docs. :)

> FWIW, here's my reply to Darren Chamberlain written before your mail
> (still somewhat relevant).
>
> -------
> My inventory files are already going to be generated, but this would
> require a per-playbook group and a tool to maintain the lot.

Actually, this part isn't true, at least if I correctly understand what
you mean. If you're always generating an inventory file that contains
only and exactly the hosts to operate on, the hosts: key in your
playbook can simply be "all".

On 26.07.2012 15:45, Chamberlain, Darren wrote:

>> Perfect! Well, almost, but I'm sure I can massage it into something that
>> fits my needs.
>
> Yes, this is basically what I meant. Obviously I need to reread the docs. :)

Yeah, especially for such a young project ansible's documentation is
really good and we still don't read it ;)

BTW, the external inventory would be better still, if ansible passed a
parameter to --list with the host/group name it's interested in. The
return value wouldn't need to change at all (still a dict of lists) but
it would enable creating groups on the fly (or e.g. storing them in an
external data source that's fast to query for individual records but
heavy to dump).

With a little bit of luck it would be backwards compatible (though I can
see even from git grep that cobbler_external_inventory would need a tweak).

Anyway, I see it would be rather hard to do cleanly with the current
Inventory* classes interface.

>> My inventory files are already going to be generated, but this would
>> require a per-playbook group and a tool to maintain the lot.
>
> Actually, this part isn't true, at least if I correctly understand what
> you mean. If you're always generating an inventory file that contains
> only and exactly the hosts to operate on, the hosts: key in your
> playbook can simply be "all".

This breaks horribly the moment two people want to push something to a
different subset of hosts via ansible. It's a slim chance, but still
it's bad.

Best regards,
Grzegorz Nosek

--host host

already exists

I think you mean something like --list-groups <group>

and

--group <group>

versus the existing --list

Unfortunately, doing so would break compatibility with everyone who has
already written one.

On 26.07.2012 16:13, Michael DeHaan wrote:

> --host host
>
> already exists

Yes, but it returns detailed info about a single host, not a list of
host names.

> I think you mean something like --list-groups <group>
>
> and
>
> --group <group>
>
> versus the existing --list

Yes, something like this.

> Unfortunately, doing so would break compatibility with everyone who has
> already written one.

I know. Too bad. I just have to get my hands dirty and see what works
for me, either with the external inventory, or with a patch.

Best regards,
Grzegorz Nosek

We're kind of getting off-topic here, but....

I'm thinking something like:

   $ ansible-playbook -i /tmp/ansDkf9S pb1.yaml

And another person could simultaneously run:

   $ ansible-playbook -i /tmp/anscI9wi pb1.yaml

Where /tmp/ansDkf9S and /tmp/anscI9wi are true temp files generated by
the external tool. I don't see how these could collide. (I'm
legitimately interested in tracking down potential problems with this
approach, because I want to implement something along these lines in our
setup.)
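
As a sketch of what I have in mind for the external tool, assuming Python and its tempfile module: each invocation writes its selected hosts to a fresh temp file and hands that path to ansible-playbook -i. The function name write_temp_inventory, the group name, and the host names are placeholders, and the command is only printed here, not executed.

```python
import os
import tempfile


def write_temp_inventory(hosts, group="targets"):
    """Write hosts into a fresh temp inventory file and return its path."""
    # mkstemp creates a uniquely named file, so concurrent runs get
    # different paths.
    fd, path = tempfile.mkstemp(prefix="ans")
    with os.fdopen(fd, "w") as handle:
        handle.write("[%s]\n" % group)
        handle.write("\n".join(hosts) + "\n")
    return path


# Two people selecting different subsets at the same time.
path_a = write_temp_inventory(["web1.example.com"])
path_b = write_temp_inventory(["web7.example.com", "web8.example.com"])

# The command each of them would run (printed only, not executed here).
print(" ".join(["ansible-playbook", "-i", path_a, "pb1.yaml"]))
print(" ".join(["ansible-playbook", "-i", path_b, "pb1.yaml"]))

# Clean up the temp files once the runs are done.
os.remove(path_a)
os.remove(path_b)
```

Since every file contains only and exactly the hosts to operate on, the playbook itself can keep "hosts: all".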

There shouldn’t be any. Ansible tempdirs contain random components, and we no longer push the setup file down at all (which used to be the only thing that would be a problem with that).

On 26.07.2012 17:30, Chamberlain, Darren wrote:

> We're kind of getting off-topic here, but....
>
> I'm thinking something like:
>
>    $ ansible-playbook -i /tmp/ansDkf9S pb1.yaml
>
> And another person could simultaneously run:
>
>    $ ansible-playbook -i /tmp/anscI9wi pb1.yaml
>
> Where /tmp/ansDkf9S and /tmp/anscI9wi are true temp files generated by
> the external tool. I don't see how these could collide. (I'm
> legitimately interested in tracking down potential problems with this
> approach, because I want to implement something along these lines in our
> setup.)

Oh, right. Didn't think of temporary inventory files. Definitely looks
good for me, thanks a lot for this idea.

Best regards,
Grzegorz Nosek

I feel your pain as I need something similar.

When we had yaml inventories, we could add groups on a per node basis. If the same would be possible with the current per-host variable files, it would know what group a host belongs to, and "-i hostname,hostname" would still act on plays for e.g. group "webserver".

With YAML I could copy just a host's YAML node and get the groups it was part of along with it. That let me create a subset YAML file, containing only the nodes I want to run something on, directly from my big YAML inventory, and pass that subset file to ansible-playbook -i. This is no longer possible: INI files are harder to automate/generate, so I am forced to either program my inventory or revert to plain INI files :-/