In particular, if there are non-standard use cases, they are things that can be adapted and modified. There are already a ton of people contributing to all of the inventory scripts.
Still, they are in fact examples -- meaning people are also free to modify them if they want to add more groups and so on.
From the viewpoint of a CTO, I’m not interested in using and modifying an example if there is a way to get the same functionality in a supported fashion. As soon as I modify something like that for my own purposes, I’m walking down a path that leads to just doing it myself. It’s not a great place to be from a maintenance standpoint. I use tools like Ansible to avoid precisely that situation. I’m happy to contribute, but only if by doing so I can get code into a community-maintained and supported path.
I'd need to better understand use cases to see why this didn't fit. It seems one of the best places to put data-driven information about existing instances.
I'm also not sure what you mean about "increasing friction here" -- they are in fact inventory, so this is more of a discussion about not duplicating things that can be easily sourced from inventory.
One of my points is that everything else other than inventory is handled via modules; for inventory you’re saying “here, do this another unsupported way with this example script”. Inventory is somehow “special”, and as a result it’s not treated similarly to other resources in the infrastructure (such as RDS instances, ElastiCache instances, etc). Because it has an IP address and it’s something that can be logged into, it’s now handled via a completely different pathway.
I can already use add_host to perform playbook-internal additions to inventory. But there’s no way within a playbook to query facts about existing instances: that has to be done at the start of the playbook with an external script, or by logging into every individual instance and using ec2_facts.
Most of my playbooks are doing things via AWS and boto-related modules, not on hosts that are being accessed directly. I just want to be able to treat instances the same way as other resources from a consistency standpoint.
Are you using ec2.py?
I have used it, yes. Given that I’m moving my infrastructure to an AMI-based stem cell approach, however, ec2 inventory doesn’t really mean much any longer. Every instance is ephemeral and the only thing I really need to manage directly in AWS (apart from other AWS services) is a maintenance instance that is used to create an AMI.
What data is missing from what ec2.py returns and what needs to be "callable"?
There’s nothing missing per se; it’s the means of obtaining that information that I’m finding inconsistent and insufficient for my design.
I need to filter by two tags at once (environment and type), and the inventory script is not suited to that. I would have to start with one tag or the other as a group and then filter further within the playbook.
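To make the two-tag case concrete, here is a minimal Python sketch of the filtering I have in mind. It assumes instance records shaped like the dicts the EC2 DescribeInstances API returns (a "Tags" list of Key/Value pairs); the function names and the tag values are hypothetical, chosen only for illustration, and none of this is an existing Ansible module.

```python
def build_tag_filters(tags):
    """Build EC2 API filters that require every given tag (filters are ANDed).

    Uses the "tag:<key>" filter name accepted by DescribeInstances.
    """
    return [{"Name": "tag:" + key, "Values": [value]}
            for key, value in sorted(tags.items())]


def matches_all_tags(instance, tags):
    """Client-side equivalent: True only if the instance carries every tag."""
    # Assumption: tags arrive as [{"Key": ..., "Value": ...}, ...].
    found = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
    return all(found.get(key) == value for key, value in tags.items())


# Hypothetical tag values, for illustration only.
wanted = {"environment": "prod", "type": "web"}
filters = build_tag_filters(wanted)
```

The resulting list could be passed as the `Filters` argument of a describe-instances call, so both tags are applied in a single query instead of grouping on one tag and narrowing on the other inside the playbook.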
Another major miss, however, is this: I need to update a set of machines not in ec2 with information about ec2 instances. So, my inventory file is going to be something other than ec2.py, but I want to automatically gather information about the relevant ec2 instances and do things to the local inventory with the results. I think this example most clearly demonstrates why constraining gathering inventory information to an actual inventory script is inconsistent with treating instances like other resources.
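As a rough sketch of that second case, the snippet below turns a list of EC2 instance records into host-file style lines that could then be pushed to machines outside EC2. The record shape ("PrivateIpAddress" plus a "Tags" list) is assumed to match the DescribeInstances response, and `hosts_entries` is a hypothetical helper, not part of Ansible or ec2.py.

```python
def instance_tags(instance):
    """Flatten the assumed [{"Key": ..., "Value": ...}] tag list into a dict."""
    return {t["Key"]: t["Value"] for t in instance.get("Tags", [])}


def hosts_entries(instances):
    """Return sorted "ip name" lines for instances with a private IP and a Name tag."""
    entries = []
    for inst in instances:
        ip = inst.get("PrivateIpAddress")
        name = instance_tags(inst).get("Name")
        if ip and name:  # skip records missing either field
            entries.append("%s %s" % (ip, name))
    return sorted(entries)
```

The point is that the EC2 data is gathered as a result inside the run and applied to whatever hosts the inventory file actually lists, rather than being the inventory itself.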
Regards,
-scott