We’re using a custom inventory script that calls a web service, which in turn calls the AWS EC2 API to list instances.
The reason we do this is that we store auto-generated, unique credentials per instance; the web service retrieves these, decrypts them using AWS KMS and presents them to Ansible.
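For context, the expensive part of that flow is the KMS decryption. A minimal sketch of what that looks like on the web service side, assuming the encrypted credential blob has already been fetched (the `encrypted_credentials` argument and the surrounding function are placeholders, not our actual code):

```python
import boto3

def decrypt_credentials(encrypted_credentials):
    """Decrypt an instance credential blob that was encrypted with AWS KMS.

    Every call results in a separate KMS Decrypt API request, which is
    what makes per-host lookups expensive.
    """
    kms = boto3.client("kms")
    response = kms.decrypt(CiphertextBlob=encrypted_credentials)
    return response["Plaintext"].decode("utf-8")
```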
From testing, it looks like Ansible will always grab every host in the inventory, even if ansible-playbook is started with the “--limit” parameter. So for example:
I run ‘ansible-playbook site.yml --limit server10’. My (dynamic) inventory consists of servers server01 through server99.
Ansible’s behavior in this case seems to be:
- Call the inventory script with the ‘--list’ parameter to get all hosts/groups.
- For each host, call the inventory script with the ‘--host <hostname>’ parameter to get that host’s metadata (sketched below).
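To illustrate the two-step protocol, here is a minimal sketch of how an inventory script like ours handles those two invocations. The web service URL and endpoints are hypothetical placeholders; the point is that every ‘--host’ call triggers a separate credential lookup (and therefore a KMS decrypt) on the backend:

```python
#!/usr/bin/env python
import argparse
import json

import requests

WEBSERVICE_URL = "https://inventory.example.internal"  # placeholder


def list_hosts():
    """Return all hosts/groups; Ansible calls this once with --list."""
    return requests.get(WEBSERVICE_URL + "/hosts").json()


def host_vars(hostname):
    """Return hostvars for one host; Ansible calls this once per host with --host.

    On the backend this triggers a KMS Decrypt for that host's credentials,
    which is the expensive part.
    """
    return requests.get(WEBSERVICE_URL + "/hosts/" + hostname).json()


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--list", action="store_true")
    group.add_argument("--host")
    args = parser.parse_args()

    if args.list:
        print(json.dumps(list_hosts()))
    else:
        print(json.dumps(host_vars(args.host)))
```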
I would argue that the second step is inefficient: after step 1, Ansible already has enough information to filter out the “unneeded” hosts, so it should only have to perform a single ‘--host’ lookup to grab the hostvars of ‘server10’.
Because we use KMS to decrypt credentials, every API call to KMS is an expensive operation, which means that Ansible starts quite slowly. Before I start building something myself to “pre-populate” an inventory file and just hand that over to Ansible, I’d like to know if there are any plans to improve this behavior.
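The pre-population workaround I have in mind would look roughly like this: fetch the hostvars for only the hosts we actually want to target, write them into a static inventory file, and pass that file to ansible-playbook with -i. This is only a sketch, assuming a YAML inventory file is acceptable to the Ansible version in use; `fetch_hostvars` is a hypothetical stand-in for the web service call:

```python
import sys

import yaml


def fetch_hostvars(hostname):
    """Placeholder for the web service call that returns decrypted hostvars."""
    raise NotImplementedError


def write_static_inventory(hostnames, path="generated_inventory.yml"):
    """Pre-populate a static YAML inventory for just the targeted hosts."""
    inventory = {"all": {"hosts": {}}}
    for name in hostnames:
        inventory["all"]["hosts"][name] = fetch_hostvars(name)
    with open(path, "w") as f:
        yaml.safe_dump(inventory, f, default_flow_style=False)


if __name__ == "__main__":
    # e.g.: python prepopulate.py server10
    #       ansible-playbook -i generated_inventory.yml site.yml
    write_static_inventory(sys.argv[1:])
```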
We do not consider caching an alternative: we want credentials safely tucked away, so storing them in an “open” cache file somewhere is not an option for us. We also use Kubernetes pods to trigger Ansible jobs, so each Ansible “job container” only lives for a few minutes and nothing is persisted after it is deleted once the job has run.