How to optimise many hosts which are all localhost?

I have a use case which is a bit different to most, but Ansible seems to do a pretty good job.
I’m trying to figure out the best way to leverage Ansible’s inherent concurrency for multiple hosts, when deploying to serverless infrastructure, which has no hosts. (Instead just using tasks which do API calls on localhost.)

The Task

I’m building, testing and deploying a project using Ansible.

My target infrastructure is all serverless. I’m deploying code to AWS Lambda functions, with CloudFormation. So there are no ‘hosts’ for Ansible to connect to.

I have something currently which works, but is slow and a bit messy.

I’ve got a big playbook, all connecting to one host, which is ‘localhost’ (connection=local).
I’ve got a yaml file with a list of my lambda functions and relevant config (e.g. timeout, name, comment, environment variables etc).
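For concreteness, that vars file has roughly this shape (the `lambda_functions` key, the function names and the exact fields here are invented for illustration):

```yaml
# Hypothetical vars file: one entry per Lambda function, keyed by name.
lambda_functions:
  thumbnailer:
    cf_name: thumbnailer
    timeout: 30
    comment: "Resizes uploaded images"
    environment:
      LOG_LEVEL: INFO
  cleanup:
    cf_name: cleanup
    timeout: 300
    comment: "Deletes expired objects"
```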

I've got one role for all my Lambda function stuff, which does the following:

  1. Use include_vars to load the aforementioned variable file

  2. Using with_dict and the variable from that file, pip install the modules each function depends on

  3. Copy over the code for each Lambda function to where those dependencies were installed, also with_dict

  4. Run a unit test for each (shell: python main.py), also with_dict

  5. Zip up each folder, also with_dict

  6. Upload each zip to S3, using with_nested (for each lambda function, for each region); a sketch of steps 2 and 6 follows this list
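Here is that sketch. The module choices, bucket names and the `code_dir`/`regions` variables are my stand-ins, not the real code:

```yaml
# Step 2: install each function's dependencies into its own build dir.
- name: pip install dependencies for each Lambda function
  pip:
    requirements: "{{ code_dir }}/{{ item.key }}/requirements.txt"
    virtualenv: "{{ build_dir }}/{{ item.key }}"
  with_dict: "{{ lambda_functions }}"

# Step 6: upload every zip to every region (this is the slow, serial part).
- name: Upload each zip to S3 in each region
  aws_s3:
    mode: put
    bucket: "my-deploy-bucket-{{ item.1 }}"
    object: "{{ item.0 }}.zip"
    src: "{{ build_dir }}/{{ item.0 }}.zip"
    region: "{{ item.1 }}"
  with_nested:
    - "{{ lambda_functions.keys() | list }}"
    - "{{ regions }}"
```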

In particular, the uploading is slow because it runs in series. I'd like to do it in parallel.

Then I have other roles for other things which are not per-lambda. (e.g. deploying cloudformation templates, configuring a firewall etc.)

The problem

Inside that role for Lambda stuff, almost every task has with_dict or with_items, or sometimes with_nested (e.g. upload each lambda zip for each of multiple regions), in addition to extensive when conditions. (I made it so I can pass in -e only_lambda=x to Ansible, to skip tasks for Lambdas I haven’t changed.)
This ends up quite messy.
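The guard ends up on almost every task, something like this (a sketch, reusing the made-up `lambda_functions` from above):

```yaml
# Skip any Lambda that wasn't named via -e only_lambda=x;
# run everything when the flag isn't passed at all.
- name: Zip up each function's folder
  archive:
    path: "{{ build_dir }}/{{ item.key }}"
    dest: "{{ build_dir }}/{{ item.key }}.zip"
    format: zip
  with_dict: "{{ lambda_functions }}"
  when: only_lambda is not defined or item.key == only_lambda
```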

Overall the whole thing is quite slow too, because it does everything in series, not in parallel. Some things are I/O bound and could be sped up a lot by doing them concurrently.
For example, I used async to do the uploads in parallel.
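The async/poll pattern for the uploads looks roughly like this (a sketch; bucket and variable names are placeholders as before):

```yaml
# Fire off every upload without waiting for it (poll: 0)...
- name: Start all S3 uploads in the background
  aws_s3:
    mode: put
    bucket: "my-deploy-bucket-{{ item.1 }}"
    object: "{{ item.0 }}.zip"
    src: "{{ build_dir }}/{{ item.0 }}.zip"
    region: "{{ item.1 }}"
  with_nested:
    - "{{ lambda_functions.keys() | list }}"
    - "{{ regions }}"
  async: 600
  poll: 0
  register: upload_jobs

# ...then wait for all of the background jobs to finish.
- name: Wait for the uploads to complete
  async_status:
    jid: "{{ item.ansible_job_id }}"
  with_items: "{{ upload_jobs.results }}"
  register: upload_result
  until: upload_result.finished
  retries: 60
  delay: 10
```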

Another thing that gets messy is that I want to keep that variable file of lambda config minimal. I don't want to copy-paste boilerplate for each function. (It's already too huge.)
So each lambda has a field called 'Name'.
I want to add another, derived field like: local_zip_fname: "{{ build_dir }}/{{ self.cf_name }}.zip". Modifying a dict in Ansible is surprisingly messy. Currently I do this using `set_fact`.
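The set_fact juggling I mean is roughly this (a sketch; the real variable names differ):

```yaml
# Merge a derived local_zip_fname into every function's config dict.
- set_fact:
    lambda_confs: >-
      {{ lambda_confs | default({})
         | combine({ item.key:
             item.value | combine({'local_zip_fname':
               build_dir ~ '/' ~ item.value.cf_name ~ '.zip'}) }) }}
  with_dict: "{{ lambda_functions }}"
```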

Although it turns out that if you load the variables from a file with -e @file.yaml, changes made with set_fact don't take effect (extra vars have the highest precedence in Ansible, so they override anything set_fact sets).
So I have to use an include_vars task instead of loading the var file with command-line arguments.

Attempted solution

I’m trying to add each ‘lambda function’ to static inventory.
Each one being localhost, but having unique host variables equal to what I had in that yaml file.
This way I can leverage all the work Ansible devs have already done to do things concurrently.

It turns out you can omit the IP address, and it defaults to localhost anyway, and Ansible will happily work with multiple hosts that are actually the exact same host.
It all just works. Yay!
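A minimal sketch of that inventory, written in YAML inventory format (the aliases and host vars are invented; the real file has one entry per Lambda function):

```yaml
# Every alias under 'lambdas' is really localhost, carrying one function's
# config as host vars. 'control' is the single host for the one-off work.
all:
  children:
    lambdas:
      vars:
        ansible_connection: local
      hosts:
        thumbnailer:
          cf_name: thumbnailer
          timeout: 30
        cleanup:
          cf_name: cleanup
          timeout: 300
    control:
      hosts:
        localhost:
          ansible_connection: local
```

With ansible_connection=local on every alias, Ansible never tries to actually connect anywhere, so the missing IP addresses don't matter.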

I can get rid of all the with_items, with_dict etc, and just run the role against the group of hosts.
And I can reduce the number of tasks by moving set_fact definitions into host definitions.

Then for other tasks which I only need to do once, I run those roles with a different host group, that’s one Ansible host connecting to localhost.
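So the playbook splits into roughly two plays (a sketch; the role names are made up):

```yaml
# Fan out across the per-function aliases for the per-Lambda work...
- hosts: lambdas
  gather_facts: no
  roles:
    - lambda_build_and_upload

# ...and run the one-off work (CloudFormation, firewall, etc.) once.
- hosts: control
  gather_facts: no
  roles:
    - deploy_cloudformation
    - configure_firewall
```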

This host approach also makes it nice to set things like local_zip_fname (mentioned above). I can set that variable once for the whole group, and lazy variable evaluation means it’s evaluated differently for each host. This means I don’t have to worry about dependency issues about which variables are defined first, and there’s one place for all the default variables.
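One way this looks (my sketch; it could equally live under the group's vars in the inventory) is a single definition in group_vars/lambdas.yml:

```yaml
# group_vars/lambdas.yml: evaluated lazily, so cf_name resolves to each
# alias's own host var at the moment the variable is actually used.
build_dir: /tmp/lambda_build
local_zip_fname: "{{ build_dir }}/{{ cf_name }}.zip"
```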

But, there are 2 issues.

  1. I want to reference that local_zip_fname variable (defined using Jinja2 templating for each Lambda host) from the other host (one host, localhost). But local_zip_fname is not defined for that other host. So I tried the thing where you dig into hostvars['other_host']['my_var'] to get it. But now lazy variable evaluation bites back. The variable I want (local_zip_fname) includes Jinja2 templating for another variable (cf_name), which is present on those per-Lambda hosts, but not on this one.

Question: How can I access variables for other hosts, forcing Jinja2 evaluation based on that other host’s environment?
The only solution I can think of is to save the variables to disk for each of those other hosts (using | to_yaml to force Jinja evaluation), and then loading that file from the host I want to read the variables. Is there a better way?
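For reference, this is roughly the lookup that bites back (run from the single 'control' host; the alias name is from the sketches above):

```yaml
# Fails (or comes back unrendered), because local_zip_fname's template
# references cf_name, which is only defined on the 'thumbnailer' alias,
# not on the host doing the lookup.
- debug:
    msg: "{{ hostvars['thumbnailer']['local_zip_fname'] }}"
```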

The other issue is that the speed improvement is nowhere near as good as I hoped. If I deploy from a beefy machine with 8 CPUs instead of my usual 2, the relevant tasks only speed up by about a factor of 2, not the factor of 4 (8/2) you might hope for. If I deploy with my usual 2 CPUs, there's no noticeable difference in speed.
It seems there is a lot of per-host overhead, which eats into the speed gained by converting sequential with_items loops into concurrent hosts.

I've tried to optimise things by disabling fact gathering (gather_facts: no), using fact caching, setting forks=50 (more than the number of hosts), and enabling pipelining (which probably does nothing for connection=local).

Is there any other way to optimise this? Is there a way to tell Ansible that these 30 hosts are actually all the same machine, so it only needs to do some things once not 30 times?

Or is there some other approach to solve my problem? Is this an XY problem?

Thanks,
Matt

(GH: matt-telstra and mlda065)

Hmm, no replies. Damn.

Here’s my solution, in case anyone else has a similar problem.

I realised I can skip the approach of writing to disk, and keep it all in memory.

So I have a task run by all ‘lambda’ hosts, which is:

```yaml
# Render this host's own variables now (to_yaml | from_yaml forces the
# Jinja2 templates to be evaluated in this host's context), keeping only
# the keys named in filter_by.
- set_fact:
    this_host_vars: "{{ hostvars[inventory_hostname] | dict_filter(filter_by) | to_yaml | from_yaml }}"
```

Where dict_filter is a custom filter I wrote to do

```python
x = {k: v for k, v in d.items() if k in filter_by}
```
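For completeness, the whole filter plugin is only a few lines. A sketch (the file name and docstring are mine):

```python
# filter_plugins/dict_filter.py
def dict_filter(d, filter_by):
    """Return only the keys of d that are named in filter_by."""
    return {k: v for k, v in d.items() if k in filter_by}


class FilterModule(object):
    def filters(self):
        return {'dict_filter': dict_filter}
```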

Then on the single main host, I do:

```yaml
# Collect every lambda alias's pre-rendered vars into one dict on this
# host, keyed by that alias's inventory name.
- set_fact:
    lambda_confs: "{{ lambda_confs | default({}) | combine(to_add) }}"
  vars:
    to_add: "{{ { item: hostvars[item]['this_host_vars'] } }}"
  with_items: "{{ groups.lambdas }}"
```

For two dozen ‘lambdas’, it takes a few seconds to do these two tasks.

Regards,
Matt