Ansible EC2 inventory

Hi all,

I have started on the EC2 external inventory script for Ansible, and things are going well thanks to Boto, which has done most of the hard work.

What I want to discuss is how best to name groups of instances. I am planning on grouping instances by instance IDs, tags, security groups, regions, availability zones, and possibly more. Things like instance IDs are simple, but security groups can have funky names, and tags make it even more complicated.

I am hoping whatever naming solution we come up with can be used by both ansible and ansible-playbook. Current naming and pattern matching are described here:

http://ansible.github.com/patterns.html

Examples:

ansible -i ./ec2.py i-f08f8689 -m ping

ansible -i ./ec2.py 'security-group[default]' -m ping
ansible -i ./ec2.py 'tag[Name=Some FuNny name]' -m ping

So far, this style works, but it is far from ideal. Currently, the colon is used as a list separator, so I could not use it to do something like

ansible -i ./ec2.py 'tag:Name:Value' -m ping

because it first tries to match “tag”, then “Name”, and finally “Value” separately.
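
To make the problem concrete, here is a minimal sketch of what splitting a pattern on colons does. The `split_pattern` helper is hypothetical and is not Ansible's actual pattern parser; it only illustrates why a colon inside a group name cannot survive a colon-separated list.

```python
# Hypothetical illustration only: NOT Ansible's real pattern-matching code.
def split_pattern(pattern):
    """Split a host pattern on colons, treating the colon as a list separator."""
    return pattern.split(":")


# The colon-delimited tag name falls apart into three separate patterns.
print(split_pattern("tag:Name:Value"))             # ['tag', 'Name', 'Value']

# The bracket style contains no colon, so it stays in one piece.
print(split_pattern("tag[Name=Some FuNny name]"))  # ['tag[Name=Some FuNny name]']
```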

ansible-playbook also supports groups of groups, while ansible does not.

Any suggestions on the best way of naming these fields when they can contain a wide variety of characters?

Example playbook:


---
- hosts: tag[Name=Some FuNny name]
- tasks:
  - name: Do stuff
    action: ping

Gives:

$ ansible-playbook -i ./ec2.py play.yml

PLAY [tag[Name=Some FuNny name]] *********************

SETUP PHASE *********************

ok: [ec2-23-22-197-78.compute-1.amazonaws.com]

ERROR: hosts declaration is required

Not quite sure what that error is about, since it appears to work.

Replacing colons with dashes or underscores seems the most obvious fix, since ansible uses the colon as a separator.

One word of warning -- ansible inventory scripts are queried more than once per playbook run. It may be worthwhile to optimize this in Ansible itself, or otherwise in the inventory script. The first is probably the better solution, but the second isn't bad either -- the script could cache its results to a file, and only hit the EC2 API again when the file is older than X seconds. Either way...
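
The file-cache idea could look roughly like the sketch below. It is not taken from the actual inventory script: the function names are made up, and `fetch` is a hypothetical callable standing in for whatever queries the EC2 API.

```python
import json
import os
import time


def cache_is_fresh(path, max_age_seconds):
    """Return True if the cache file exists and is younger than max_age_seconds."""
    if not os.path.isfile(path):
        return False
    age = time.time() - os.path.getmtime(path)
    return age < max_age_seconds


def load_inventory(path, max_age_seconds, fetch):
    """Return the cached inventory if it is fresh; otherwise refetch and rewrite the cache.

    `fetch` is a hypothetical callable that returns a JSON-serializable dict,
    standing in for the real EC2 API query.
    """
    if cache_is_fresh(path, max_age_seconds):
        with open(path) as handle:
            return json.load(handle)
    inventory = fetch()
    with open(path, "w") as handle:
        json.dump(inventory, handle)
    return inventory
```

With something like this, repeated queries within one playbook run would read the file instead of calling EC2 each time, as long as the run finishes inside the chosen maximum age.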

Alternatively you might consider a script that looks at ec2 and then statically generates your YAML inventory file, though I'd really like to see inventory scripts be the way to go, just because it's a nice concept and doesn't make anything into two steps. On the other hand, generating the inventory file is nice for being able to show what you've got, which isn't necessarily a bad thing either.

As far as naming, hosts could also be in more than one group, so maybe you could do something like "zone_XYZ" and have other groups like "tag_XYZ" ?

As I mentioned on list, something like this definitely deserves to be its own sub-project in the ansible org, and once you have something configurable and generic, I'll gladly fork it and give you commit rights on it. I can see this getting a lot of use.

--Michael

Example playbook:
> ---
> - hosts: tag[Name=Some FuNny name]
> - tasks:
>   - name: Do stuff
>     action: ping

Perhaps replace spaces with underscores as well, drop the square brackets altogether, and separate the word "tag" from the rest with a double underscore.

tag__Some_FuNny_name

Not too hideous.

Gives:
> $ ansible-playbook -i ./ec2.py play.yml
>
> PLAY [tag[Name=Some FuNny name]] *********************
>
> SETUP PHASE *********************
>
> ok: [ec2-23-22-197-78.compute-1.amazonaws.com]
>
> ERROR: hosts declaration is required

> Not quite sure what that error is about, since it appears to work.

Is there more of the playbook to see?

That should be pretty easy to debug regardless; look in ansible/playbook/*


> Replacing colons with dashes or underscores seems most obvious, as ansible uses it as a tag separator.

OK, character substitutions it is.
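
A minimal sketch of what such a substitution might look like follows. The helper names and the exact character class are assumptions for illustration, not necessarily what the script ends up using.

```python
import re


def to_safe(name):
    """Replace every character other than letters, digits and underscores with '_'."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


def tag_group(key, value):
    """Build a group name for an EC2 tag (hypothetical 'tag_<key>_<value>' scheme)."""
    return to_safe("tag_%s_%s" % (key, value))


print(tag_group("Name", "Some FuNny name"))  # tag_Name_Some_FuNny_name
print(to_safe("zone_us-east-1a"))            # zone_us_east_1a
```

One trade-off of this approach is that distinct names can collapse into the same group name; for example, "a b" and "a-b" would both become "a_b".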

> One word of warning -- ansible inventory scripts are queried more than once per playbook run. It may be worthwhile to try to optimize this in Ansible, or otherwise it may be worthwhile to optimize this in the inventory script. The first is probably a better solution, but the second isn't bad either -- it could choose to cache its results to a file, and then just hit EC2 API again when the file was older than X seconds. Either way...

Yes I already have the file cache in place.

> Alternatively you might consider a script that looks at ec2 and then statically generates your YAML inventory file, though I'd really like to see inventory scripts be the way to go, just because it's a nice concept and doesn't make anything into two steps. On the other hand, generating the inventory file is nice for being able to show what you've got, which isn't necessarily a bad thing either.

At the moment, the script returns JSON, while inventory files are either ini-style or yaml. I guess I could write a different output method, but that will be something to do later.
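
For readers unfamiliar with the JSON side, here is a rough sketch of an external inventory script's output, assuming the usual convention that Ansible calls the script with `--list` (expecting a mapping of group names to host lists) and `--host <hostname>` (expecting that host's variables). The host names, group names, and the `ec2_id` variable below are made-up placeholders, not output from the real script.

```python
import argparse
import json

# Hypothetical, hard-coded data standing in for the result of an EC2 query.
GROUPS = {
    "us-east-1": ["host-a.example.com", "host-b.example.com"],
    "security_group_default": ["host-a.example.com"],
}
HOST_VARS = {
    "host-a.example.com": {"ec2_id": "i-00000001"},
    "host-b.example.com": {"ec2_id": "i-00000002"},
}


def main(argv):
    """Return the JSON text for --list (groups) or --host NAME (that host's variables)."""
    parser = argparse.ArgumentParser(description="Toy external inventory")
    parser.add_argument("--list", action="store_true", help="print all groups")
    parser.add_argument("--host", help="print variables for a single host")
    args = parser.parse_args(argv)
    if args.host:
        # Unknown hosts get an empty variable dictionary.
        return json.dumps(HOST_VARS.get(args.host, {}), sort_keys=True)
    return json.dumps(GROUPS, sort_keys=True, indent=2)


print(main(["--list"]))
```

A static-file output mode, as mentioned above, would only need a different serializer over the same `GROUPS` dictionary.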

> As far as naming, hosts could also be in more than one group, so maybe you could do something like "zone_XYZ" and have other groups like "tag_XYZ" ?
>
> As I mentioned on list, something like this definitely deserves to be its own sub-project in the ansible org, and once you have something configurable and generic, I'll gladly fork it and give you commit rights on it. I can see this getting a lot of use.

Other than the inventory file script, what else did you see going into this sub-project? I have my own ideas, but want to hear about yours as well.

I didn't have a whole lot in mind.

A simple script to provision a new EC2 node and then automatically ansible-ize it to a given playbook would be pretty slick for starters, even if it's just wrapping a few basic things.

Later, it might be nice if I could say "make me 5 new acme-foo-servers", then "now make it 10", etc.

For bonus points, you could also integrate virt-control features, so you could decide that, of your pool of 10 acme-foo-servers (an ansible group corresponding to a tag, maybe), you only want 5 running right now, etc.

Because of how ansible now allows nodes to see the groups other nodes are in, and the variables for other nodes, you could do some pretty clever multi-node scaling tricks.

I always wanted to do something lightweight like this on top of Cobbler, Func, and libvirt (like a mini-cloud controller), but really EC2 is a much cleaner place to do it.

--Michael

First pass of the EC2 external inventory script is done. You can find it here:
https://github.com/pas256/ansible/blob/devel/examples/scripts/ec2_external_inventory.py

Please take a look at it and, if you are happy, I will open a PR.

Thank you for your feedback so far.

Peter

This looks *quite* sharp. Nice code.

Any comments from EC2 users? It might be useful to make some notes on what variables it makes available and where groups come from at the top of the file.

I am very tempted to remove the cobbler one from the tree (doesn't take parameters correctly currently, doesn't cache) and let this be the example we point folks to.

--Michael

Hi Peter,

Nice EC2 inventory script. I did some tests (skipped vpc variable below):

$ export ANSIBLE_HOSTS=$(pwd)/ec2_external_inventory.py

$ cat ec2.ini

[ec2]
regions: us-east-1
cache_path: /tmp
cache_max_age: 3600
destination_variable: public_dns_name
vpc_destination_variable: vpc_dest_variable
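
As a rough illustration of how settings like these can be read, here is a sketch using Python 3's standard `configparser` (the script of that era would have used the Python 2 `ConfigParser` module instead). The `read_settings` helper is hypothetical, and treating `regions` as a comma-separated list is an assumption.

```python
import configparser

# The same settings as in the ec2.ini listing above, inlined for the example.
SAMPLE_INI = """\
[ec2]
regions: us-east-1
cache_path: /tmp
cache_max_age: 3600
destination_variable: public_dns_name
vpc_destination_variable: vpc_dest_variable
"""


def read_settings(text):
    """Parse the [ec2] section into a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        # Assumption: several regions would be given as a comma-separated list.
        "regions": [r.strip() for r in parser.get("ec2", "regions").split(",")],
        "cache_path": parser.get("ec2", "cache_path"),
        "cache_max_age": parser.getint("ec2", "cache_max_age"),
        "destination_variable": parser.get("ec2", "destination_variable"),
    }


print(read_settings(SAMPLE_INI))
```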

$ cat play.yaml

Thanks for testing.

PR done: https://github.com/ansible/ansible/pull/600

Peter

There is more than an example; there is a complete, working one on GitHub:
https://github.com/ansible/ansible-plugins/blob/master/inventory/ec2.ini

Please let me know if you have any issues.

Peter

Thanks Peter, however that URL gives me a 404?

Does it work for you?

We've merged the repos.

You want to look over here:

https://github.com/ansible/ansible/tree/devel/plugins/inventory

Brilliant, thanks Michael.