I’m looking for advice on a clean AWX workflow design to implement the following workflow.
- I need to create an EC2 instance from scratch: launch a new EC2 instance from a base OS AMI and then run the provisioning.
- Usually these new nodes have a curl callback request to an AWX job template in their user-data (roughly as sketched below), so at boot time they ask AWX to configure them. This works well, but it seems impossible to integrate into a workflow. I would like it integrated into the workflow so the user knows straight away, after a few minutes, whether the instance succeeded or not.
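For reference, the callback in user-data is essentially this (the AWX hostname, job template ID, and `host_config_key` are placeholders):

```yaml
#cloud-config
runcmd:
  # Ask AWX to run the provisioning job template against this new host.
  # Retries cover the window where the instance boots before the
  # dynamic inventory has picked it up.
  - >-
    curl -sS --retry 10 --retry-delay 30
    --data "host_config_key=EXAMPLEKEY123"
    https://awx.example.com/api/v2/job_templates/42/callback/
```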
One workaround is to create the instance and add it with `add_host` to a hardcoded group (e.g. `just_created`, as in the docs) that the next play targets in its `hosts:` section so it only touches that host (sketched below). But `add_host` only persists for the duration of the same playbook run, and this would also require me to duplicate the existing `initial-config` playbook, which itself imports multiple playbooks that all use `hosts: all`, and clone every one of them just to change `hosts: all` to `hosts: just_created`.
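A minimal sketch of that workaround, assuming the `amazon.aws` collection (AMI ID, names, and sizes are placeholders):

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Create the instance
      amazon.aws.ec2_instance:
        name: new-node
        image_id: ami-0123456789abcdef0   # placeholder base OS AMI
        instance_type: t3.micro
        wait: true
      register: ec2

    - name: Put it in an in-memory group for the following plays
      ansible.builtin.add_host:
        name: "{{ ec2.instances[0].private_ip_address }}"
        groups: just_created

# This part works, but every play imported by initial-config would
# have to be cloned from hosts: all to hosts: just_created.
- hosts: just_created
  tasks:
    - name: initial-config tasks would go here
      ansible.builtin.ping:
```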
My idea was to use the existing limit capability to:
- Start a workflow
- Workflow Node 1: Create the EC2 instance
- Wait for it to be ready
- Force a refresh of the dynamic inventory (these two steps are sketched after this list)
- Set a new limit: `{{ ec2.instances[0].private_ip_address }}`
- Workflow Node 2: Start `initial-config` with the full inventory, but with the limit changed earlier to just the new instance’s IP (`{{ ec2.instances[0].private_ip_address }}`)
- This would allow me to use the existing `initial-config` without any changes at all, as if it had been initiated by the callback, but inside the AWX workflow, providing the final result to the user at the end.
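To make the Node 1 steps concrete, here is a rough sketch of how the “wait” and “refresh” parts could look, using the AWX API’s inventory source update endpoint (AWX URL, inventory source ID, and credentials are placeholders); the missing piece is still the “set a new limit” step:

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Wait until SSH answers on the new instance
      ansible.builtin.wait_for:
        host: "{{ ec2.instances[0].private_ip_address }}"
        port: 22
        timeout: 600

    - name: Force a sync of the dynamic inventory source
      ansible.builtin.uri:
        url: "https://awx.example.com/api/v2/inventory_sources/7/update/"
        method: POST
        user: admin                       # placeholder credentials
        password: "{{ awx_password }}"
        force_basic_auth: true
        status_code: [200, 201, 202]      # AWX answers 202 Accepted
```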
I understand that the limit apparently cannot be changed dynamically during workflow execution, but I wonder what the recommended way to approach this would be, while avoiding refactoring all the existing playbooks that have their own values for the `hosts:` parameter.
I also found while testing that I cannot start the workflow or job with a limit of `just_created` (the group would be non-existent or empty and error out), have a first task that creates the EC2 instance (`delegate_to: localhost`), a second one that calls `add_host` (which can’t be delegated; as I understand it, it always runs on the controller), and then have the rest of the playbook use `hosts: just_created` to target only the new instance.
Could I have a `just_created` group with some kind of dummy entry, so I could run the EC2 instance creation with `delegate_to: localhost`, and in the next step have `add_host` somehow replace that dummy entry with the new instance?
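As a sketch of that idea (untested; I don’t know whether AWX would accept it), the group could carry a placeholder entry so the limit is never empty, with the obvious caveat that later plays would still match the placeholder unless it is filtered out:

```yaml
# Hypothetical static inventory stanza so the limit is non-empty:
# [just_created]
# placeholder ansible_host=127.0.0.1

- hosts: just_created
  gather_facts: false
  tasks:
    - name: Create the instance from the controller
      amazon.aws.ec2_instance:
        image_id: ami-0123456789abcdef0   # placeholder AMI
        instance_type: t3.micro
        wait: true
      delegate_to: localhost
      run_once: true
      register: ec2

    - name: Add the real instance next to the placeholder
      ansible.builtin.add_host:
        name: "{{ ec2.instances[0].private_ip_address }}"
        groups: just_created
      run_once: true

# Subsequent plays on hosts: just_created would still include
# 'placeholder' unless skipped, e.g. with
# when: inventory_hostname != 'placeholder'
```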
I’m going in circles, unable to find a solution that works at all. I’m looking for something that is, first, “AWX idiomatic” and, second, if possible, avoids duplicating existing playbooks or becoming difficult to understand.
I’m leaning toward the “looks impossible currently” mindset and accepting that the current setup, creating the instance in one playbook with cloud-init user-data that contains the curl callback to AWX to self-initiate the provisioning, is the best workaround for now. But it leads to decoupled job runs that may require more complex error checking, and it is not as simple as having a workflow node that runs on failure and just terminates (kills) the instance that failed to provision.
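That failure path could be as small as this, with `failed_instance_id` standing in for however the workflow would pass the ID along (hypothetical):

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Terminate the instance that failed to provision
      amazon.aws.ec2_instance:
        instance_ids:
          - "{{ failed_instance_id }}"
        state: terminated
```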
Any ideas?