[newbie] Start/stop EC2 instances based on tags

Hey all,

I’m trying to use the EC2 module to start/stop instances, but I don’t have the specific IDs ahead of time. I’ve been trying to run one task to grab the IDs from the EC2 inventory and another to actually run the command on localhost, but I can’t seem to find a way to satisfy both: I either get “no hosts match” (when I use a [local] inventory), or localhost can’t be found when using the ec2 inventory.

Is there an example anywhere of collecting ec2 instance ids and reusing them?

Yes, I just opened a pull request for a module I’ve written to aid in tasks like this: https://github.com/ansible/ansible/pull/6349

Example:

- name: Obtain list of existing stopped instances with tag 'environment' equal to 'prod'
  local_action:
    module: ec2_instance_facts
    states:
      - stopped
    tags:
      environment: "prod"
    region: "{{ vpc_region }}"
    aws_access_key: "{{ aws_access_key }}"
    aws_secret_key: "{{ aws_secret_key }}"
  register: instance_facts
  ignore_errors: yes

- name: Start the stopped instances
  local_action:
    module: ec2
    state: running
    instance_ids: "{{ item.id }}"
    wait: yes
    wait_timeout: 600
    region: "{{ vpc_region }}"
    aws_access_key: "{{ aws_access_key }}"
    aws_secret_key: "{{ aws_secret_key }}"
  with_items: instance_facts.instances

Regards,
-scott

This feels like it belongs in the inventory plugin, so it is automatic; can you expand on your thoughts?

– Michael

Yes, I wrote this precisely because the inventory plugin didn’t suit my needs. The inventory plugin only works for existing instances.

I create a maintenance instance on the fly for burning AMIs. I need to check to see if the instance exists; if not, I create it using a base AMI in a particular VPC, then configure the instance with whatever roles I need (web, monitoring, syslog, etc) by passing the newly created host IP to add_host. Finally, I burn an AMI from the instance and shut it down.
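A rough sketch of that workflow (the parameter and variable names below are illustrative, not the actual playbook, and the “does it already exist?” check is elided):

- hosts: localhost
  tasks:
    - name: Create the maintenance instance from a base AMI in the VPC
      local_action:
        module: ec2
        image: "{{ base_ami }}"
        instance_type: "{{ maintenance_instance_type }}"
        vpc_subnet_id: "{{ maintenance_subnet }}"
        region: "{{ vpc_region }}"
        wait: yes
      register: maint

    - name: Hand the new host IP to the plays that configure it
      add_host:
        name: "{{ maint.instances[0].private_ip }}"
        groups: maintenance

# Configure the instance with whatever roles are needed
- hosts: maintenance
  roles:
    - web
    - monitoring
    - syslog

# Burn an AMI from the configured instance, then shut it down
- hosts: localhost
  tasks:
    - local_action:
        module: ec2_ami
        instance_id: "{{ maint.instances[0].id }}"
        name: "{{ ami_name }}"
        region: "{{ vpc_region }}"
        wait: yes

    - local_action:
        module: ec2
        state: stopped
        instance_ids: "{{ maint.instances[0].id }}"
        region: "{{ vpc_region }}"
        wait: yes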

Regards,
-scott

I forgot to add: the inventory script only gives the instance IDs, not the actual facts about those instances. If an instance isn’t running, I don’t want to have to start it up, log in, and use ec2_facts (which only works on a single instance at a time) to get this information. That’s very, very slow compared to gathering the ec2_instance_facts information directly, without having to log into the instances individually.

-scott

“Yes, I wrote this precisely because the inventory plugin didn’t suit my needs”

The problem here is Ansible is trying to serve everyone’s needs. If it becomes a conglomeration of 500-ways-to-do-something, we quickly get into a lot of sprawl.

“I need to check to see if the instance exists; if not, I create it using a base AMI in a particular VPC”

The exact_count parameter in 1.5 would be a good choice here. There’s also a pull request to configure autoscaling groups (set the size to 1).
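A minimal sketch of the exact_count approach (the tag value and variable names here are illustrative):

- name: Ensure exactly one maintenance instance exists
  local_action:
    module: ec2
    image: "{{ base_ami }}"
    instance_type: "{{ maintenance_instance_type }}"
    vpc_subnet_id: "{{ maintenance_subnet }}"
    region: "{{ vpc_region }}"
    exact_count: 1
    count_tag:
      Name: maintenance
    instance_tags:
      Name: maintenance
    wait: yes
  register: maint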

It also seems like you could just use the ec2_ami module, aminator, or Packer.

“Yes, I wrote this precisely because the inventory plugin didn’t suit my needs”

The problem here is Ansible is trying to serve everyone’s needs. If it becomes a conglomeration of 500-ways-to-do-something, we quickly get into a lot of sprawl.

Well, someone highly placed in the Ansible project once told me that the inventory plugins were just examples, and not really supported… this would add a supported means of doing this… :wink:

While I appreciate your point, the issue remains that the inventory script (apart from being an “example”), is not a callable piece of functionality. Having a module able to provide information about instances is highly valuable by allowing more flexibility. I get the feeling that the only reason this is objectionable is that it has to do with instances, which are considered inventory and hence in some way special. This seems to be a pretty good way to ensure that they are always treated specially, which will increase the friction any time someone needs to do something slightly different.

Perhaps the better approach is to refactor both the ec2.py script and this module to share underpinnings, giving both an external inventory and internal module view on the same functionality.

“I need to check to see if the instance exists; if not, I create it using a base AMI in a particular VPC”

The exact_count parameter in 1.5 would be a good choice here. There’s also a pull request to configure autoscaling groups (set the size to 1).

Figures, that didn’t exist when I wrote my playbook.

It also seems like you could just use the ec2_ami module, aminator, or Packer.

What Packer does is exactly what I’ve already done in my playbook using ec2_ami, but I’d much rather use Ansible natively than integrate a bunch of other tools to do something Ansible is capable of doing on its own.

Regards,
-scott

"Yes, I wrote this precisely because the inventory plugin didn't suit my
needs"

The problem here is Ansible is trying to serve *everyone's* needs. If it
becomes a conglomeration of 500-ways-to-do-something, we quickly get into a
lot of sprawl.

Well, someone highly placed in the Ansible project once told me that the
inventory plugins were just examples, and not really supported... this would
add a supported means of doing this... :wink:

Things change a lot over time, and my meaning was also different.

In particular, if there are non-standard use cases, they are things that
can be adapted and modified. There are already a ton of people
contributing to all of the inventory scripts.

Still, they are in fact examples -- meaning people are also free to modify
them if they want to add more groups and so on.

Thus they don't need to be all things to everyone, either.

While I appreciate your point, the issue remains that the inventory script
(apart from being an "example"), is not a callable piece of functionality.
Having a module able to provide information about instances is highly
valuable by allowing more flexibility. I get the feeling that the only
reason this is objectionable is that it has to do with instances, which are
considered inventory and hence in some way special. This seems to be a
pretty good way to ensure that they are always treated specially, which
will increase the friction any time someone needs to do something slightly
different.

I'd need to better understand the use cases to see why this didn't fit. It seems like one of the best places to put data-driven information about existing instances.

I'm also not sure what you mean about "increasing friction here" -- they
are in fact inventory, so this is more of a discussion about not
duplicating things that can be easily sourced from inventory.

And yes, part of my job is exactly to be the friction or sieve to decide
what we take on and what we don't -- so I do need to understand why
something couldn't be done with the dynamic inventory data.

Are you using ec2.py?

What data is missing from what ec2.py returns and what needs to be
"callable?"

Help me understand more.

Perhaps the better approach is to refactor both the ec2.py script and this
module to share underpinnings, giving both an external inventory and
internal module view on the same functionality.

I'm still having trouble understanding what "internal module view" here
would serve, use case wise.

You'll need to convince me, but I can be convinced. I'm just not there yet.

In particular, if there are non-standard use cases, they are things that can be adapted and modified. There are already a ton of people contributing to all of the inventory scripts.

Still, they are in fact examples -- meaning people are also free to modify them if they want to add more groups and so on.

From the viewpoint of a CTO, I’m not interested in using and modifying an example if there is a way to get the same functionality in a supported fashion. As soon as I modify something like that for my own purposes, I’m walking down a path that leads to just doing it myself. It’s not a great place to be from a maintenance standpoint. I use tools like Ansible to avoid precisely that situation. I’m happy to contribute, but only if by doing so I can get code into a community-maintained and supported path.

I'd need to better understand the use cases to see why this didn't fit. It seems like one of the best places to put data-driven information about existing instances.

I'm also not sure what you mean about "increasing friction here" -- they are in fact inventory, so this is more of a discussion about not duplicating things that can be easily sourced from inventory.

One of my points is that everything else other than inventory is handled via modules; for inventory you’re saying “here, do this another unsupported way with this example script”. Inventory is somehow “special”, and as a result it’s not treated similarly to other resources in the infrastructure (such as RDS instances, ElastiCache instances, etc). Because it has an IP address and it’s something that can be logged into, it’s now handled via a completely different pathway.

I can already use add_host to perform playbook-internal additions to inventory. But there’s no way within a playbook to query facts about existing instances: that has to be done at the start of the playbook with an external script, or by logging into every individual instance and using ec2_facts.

Most of my playbooks are doing things via AWS and boto-related modules, not on hosts that are being accessed directly. I just want to be able to treat instances the same way as other resources from a consistency standpoint.

Are you using ec2.py?

I have used it, yes. Given that I’m moving my infrastructure to an AMI-based stem cell approach, however, ec2 inventory doesn’t really mean much any longer. Every instance is ephemeral and the only thing I really need to manage directly in AWS (apart from other AWS services) is a maintenance instance that is used to create an AMI.

What data is missing from what ec2.py returns and what needs to be "callable?"

There’s nothing missing per se; it’s the means of obtaining that information that I’m finding inconsistent and insufficient for my design.

I need to filter by two tags at once (environment and type), which the inventory script isn’t suited to: I would have to start with one tag or the other as a group and filter further within the playbook.

Another major miss, however, is this: I need to update a set of machines not in ec2 with information about ec2 instances. So, my inventory file is going to be something other than ec2.py, but I want to automatically gather information about the relevant ec2 instances and do things to the local inventory with the results. I think this example most clearly demonstrates why constraining gathering inventory information to an actual inventory script is inconsistent with treating instances like other resources.
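For instance, a sketch of that kind of play, assuming the proposed ec2_instance_facts module from the pull request above (the group name and template file are hypothetical):

- hosts: monitoring_servers        # a static, non-ec2 inventory group
  tasks:
    - name: Gather facts about instances matching two tags at once
      local_action:
        module: ec2_instance_facts
        tags:
          environment: "prod"
          type: "web"
        region: "{{ vpc_region }}"
      register: web_instances

    - name: Use the results to update local configuration
      # the template would iterate over web_instances.instances
      template:
        src: upstreams.conf.j2
        dest: /etc/nginx/conf.d/upstreams.conf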

Regards,
-scott

> In particular, if there are non-standard use cases, they are things that
can be adapted and modified. There are already a ton of people
contributing to all of the inventory scripts.
>
> Still, they are in fact examples -- meaning people are also free to
modify them if they want to add more groups and so on.

From the viewpoint of a CTO, I'm not interested in using and modifying an
example if there is a way to get the same functionality in a supported
fashion. As soon as I modify something like that for my own purposes, I'm
walking down a path that leads to just doing it myself. It's not a great
place to be from a maintenance standpoint. I use tools like Ansible to
avoid precisely that situation. I'm happy to contribute, but only if by
doing so I can get code into a community-maintained and supported path.

I'm not understanding the part about job titles.

Just seeing a hypothetical playbook example would help me understand quite
a lot, where right now I have gaps in understanding.

> I'd need to better understand the use cases to see why this didn't fit. It seems like one of the best places to put data-driven information about existing instances.
>
> I'm also not sure what you mean about "increasing friction here" -- they
are in fact inventory, so this is more of a discussion about not
duplicating things that can be easily sourced from inventory.

One of my points is that everything else other than inventory is handled
via modules; for inventory you're saying "here, do this another unsupported
way with this example script". Inventory is somehow "special", and as a
result it's not treated similarly to other resources in the infrastructure
(such as RDS instances, ElastiCache instances, etc). Because it has an IP
address and it's something that can be logged into, it's now handled via a
completely different pathway.

There are two types of things.

There are things that are *done* and need doing, and then there are things that simply *are* -- sources of truth about what exists. Inventory scripts are very much like input: they don't express state, but they relate a ton of information about what you have running.

I can already use add_host to perform playbook-internal additions to
inventory. But there's no way within a playbook to query facts about
existing instances: that has to be done at the start of the playbook with
an external script, or by logging into every individual instance and using
ec2_facts.

I'm still curious why the inventory script can't just return this data.

A playbook or deeper use case example would help me understand more.

Most of my playbooks are doing things via AWS and boto-related modules,
not on hosts that are being accessed directly. I just want to be able to
treat instances the same way as other resources from a consistency
standpoint.

> Are you using ec2.py?

I have used it, yes. Given that I'm moving my infrastructure to an
AMI-based stem cell approach, however, ec2 inventory doesn't really mean
much any longer. Every instance is ephemeral and the only thing I really
need to manage directly in AWS (apart from other AWS services) is a
maintenance instance that is used to create an AMI.

If doing AMI builds, I think the need to interrogate the instance would
actually be *reduced* for most people, as you'd just let the ASG take over.

Another major miss, however, is this: I need to update a set of machines
not in ec2 with information about ec2 instances. So, my inventory file is
going to be something other than ec2.py, but I want to automatically gather
information about the relevant ec2 instances and do things to the local
inventory with the results. I think this example most clearly demonstrates
why constraining gathering inventory information to an actual inventory
script is inconsistent with treating instances like other resources.

Technically you can, however, use an inventory script and static data at
the same time, as long as they are all in a common directory.

Just trying to understand: what is your proposed solution for, say ‘starting all EC2 VMs that match tag_env_prod in zone US-WEST-1’?

Technically you can, however, use an inventory script and static data at the same time, as long as they are all in a common directory.

PS.: BTW, I tried using multiple inventory files but it didn’t work with ansible.cfg, only with the -i param.

> Technically you can, however, use an inventory script and static data at
the same time, as long as they are all in a common directory.

PS.: BTW, I tried using multiple inventory files but it didn't work with
ansible.cfg, only with the -i param.

If you really can't set the inventory location to a directory, please file a
ticket and include the behavior you witnessed.

Thanks!

You can iterate over hosts and check states like so:

- hosts: localhost
  tasks:

    - shell: blah
      when: hostvars[item]['ec2_state'] == 'blah' and hostvars[item]['ec2_region'] == 'blah'
      with_items: groups.tag_env_prod

The region should also be available as a group:

when: item in groups.foo

etc

Ah, thank you very much, this indeed allowed me to shut down all instances in an environment (code pasted below if anyone ever needs to google this). Unfortunately, shutdown didn’t work because ec2.py doesn’t seem to list stopped instances (even after --refresh-cache).

Thank you both.

-------------------------------- sample below
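A minimal sketch of that kind of play, assuming the tag_env_prod group and the ec2_id / ec2_region host variables that ec2.py provides:

- hosts: localhost
  tasks:
    - name: Stop every instance that ec2.py grouped under tag_env_prod
      local_action:
        module: ec2
        state: stopped
        instance_ids: "{{ hostvars[item]['ec2_id'] }}"
        region: "{{ hostvars[item]['ec2_region'] }}"
      with_items: groups.tag_env_prod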

Sorry,
“Unfortunately, shutdown didn’t work…”
should read
“Unfortunately, starting ‘stopped’ instances didn’t work…”.

Hi,

I ran into the same issue today: the ec2.py plugin only shows details of running instances. To get around it I used the Boto-provided list_instances command. Here’s my playbook.

Hi,

I am not sure if this problem has been solved already; however, I recently implemented a small Python script to retrieve the instance_ids.

`

import json
import boto.ec2

conn = boto.ec2.connect_to_region(region,
                                  aws_access_key_id=aws_access_key_id,
                                  aws_secret_access_key=aws_secret_access_key)

# Unlike ec2.py, get_only_instances returns instances in any state,
# so stopped instances are included.
instances = conn.get_only_instances(filters={'tag:tag_name': [tag_value]})

instance_ids = []
for instance in instances:
    instance_ids.append(instance.id)

print json.dumps(instance_ids)

`

This script allows you to retrieve the instance_ids even if the instances have been stopped.
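A rough sketch of wiring the script’s output into a play (the script name is hypothetical, it assumes the script reads its region/credentials/tag inputs itself, and the from_json filter needs a new enough Ansible):

- name: Look up instance ids for a tag, including stopped instances
  local_action: command python get_instance_ids.py
  register: id_lookup

- name: Start everything that matched
  local_action:
    module: ec2
    state: running
    instance_ids: "{{ id_lookup.stdout | from_json }}"
    region: "{{ vpc_region }}"
    wait: yes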