callback question

hi awx,
Here is an interesting problem, would love to know what others think about it.

We use callbacks to provision servers that are launched as part of Autoscaling groups.
We run into issues when a new server is launched in this way b/c it may not yet be registered in the inventory.

In that case we expect the callback curl call to FAIL and it retries until the server is alive.

What surprises us is when the callback curl command SUCCEEDS but the job later fails in AWX with this message:
[WARNING]: Could not match supplied host pattern, ignoring: 10.0.1.XX

Isn’t that an error we would have expected to get at the callback level? How can the callback be ‘allowed’ to launch the template if the job then ends with:
ERROR! Specified hosts and/or --limit does not match any hosts

I am planning to add a delay to the callback but that’s just kind of hoping.

I also would be fine if I could find a way to re-trigger the failure automatically.

The callback is being called like so:
/api/v2/job_templates/XX/callback/

One thing to wonder about: if the new IP matches an old one, could it be that AWX doesn’t know the server has changed at all? Could it then be something like a host key verification problem? I would expect to see that explicit error, though.

Going to try a bunch of stuff but just wondering if anyone else has run into this and what you did to resolve it :slight_smile:

Another thing to mention: the inventory groups are based on tags and are ‘smart inventories’, which I think have fewer options for triggering an automatic inventory refresh.

thanks,
Alex

ok I’m going to try and write a cron job to relaunch any failed templates :slight_smile:

First attempt didn’t work, but next I plan to create a fresh job from the failed job’s params and “Create” it.

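# Run from the AWX Django shell (e.g. awx-manage shell_plus)
from awx.main.models import UnifiedJob

for job in UnifiedJob.objects.filter(status='failed'):
    # relaunch attempt: try to start the same (already failed) job again
    job.signal_start()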

console just showed “False” for the attempted job.
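
For the "create a fresh job from the failed params" idea, a rough sketch that goes through the REST API instead of the Django models. It assumes the job list can be filtered with the query parameters shown, and that POSTing to the job's relaunch endpoint creates a new job from the original job's parameters; the base URL, the token and the `since` timestamp are placeholders, and `awx_request` / `relaunch_failed_callbacks` are names made up for this example.

```python
import json
import urllib.parse
import urllib.request


def awx_request(base_url, token, method, path, body=None):
    """Send an authenticated JSON request to AWX and return the decoded reply."""
    data = json.dumps(body).encode() if body is not None else None
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        raw = response.read()
        return json.loads(raw) if raw else {}


def relaunch_failed_callbacks(call, template_id, since=None):
    """Relaunch failed callback-launched jobs of one template.

    `call(method, path, body=None)` performs the HTTP request, so it can be
    swapped for a fake in tests. Returns the ids of the newly created jobs.
    """
    # Assumption: these fields are filterable on the job list endpoint.
    query = {
        "status": "failed",
        "launch_type": "callback",
        "job_template": template_id,
    }
    if since:
        # Only look at jobs finished after the previous cron run, so the
        # same failure is not relaunched again on every run.
        query["finished__gt"] = since
    path = "/api/v2/jobs/?" + urllib.parse.urlencode(query)

    new_ids = []
    for job in call("GET", path).get("results", []):
        reply = call("POST", f"/api/v2/jobs/{job['id']}/relaunch/", {})
        new_ids.append(reply.get("id"))
    return new_ids
```

From cron this could be wired up with something like `functools.partial(awx_request, "https://awx.example.com", "TOKEN")` passed as `call`. It only reads the first page of results and does not cap the number of relaunches per job, so a host that never becomes reachable would be retried on every run that sees a new failure.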

We run into issues when a new server is launched in this way b/c it may not yet be registered in the inventory.

How are you adding the new servers to the inventory? Is that also done automatically as part of the provisioning steps? Why are the callbacks being invoked before the hosts are added?

AWX Team

Hi yeah - the servers are added as part of AWS inventory sync and then put into “smart” inventories by tag.

The callbacks run as part of cloud-init: after a short sleep, curl is retried for 10 minutes until the server is there.
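
Roughly what that retry loop does, written out in Python rather than our actual cloud-init script. This is only a sketch: it assumes the callback takes a `host_config_key` form field and treats any 2xx response as success; the 600 second limit matches the 10 minutes above, while the 15 second interval and the function names are made up for the example.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def post_callback(url, host_config_key, timeout=30):
    """POST the host config key to the callback URL; return the HTTP status."""
    data = urllib.parse.urlencode({"host_config_key": host_config_key}).encode()
    request = urllib.request.Request(url, data=data, method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # e.g. a 4xx while the host is not yet in the inventory
        return err.code
    except OSError:
        # AWX unreachable or timed out; treat as retryable
        return None


def retry_callback(post, max_wait=600, interval=15,
                   sleep=time.sleep, clock=time.monotonic):
    """Call post() until it returns a 2xx status or max_wait seconds pass.

    Returns (succeeded, number_of_attempts).
    """
    deadline = clock() + max_wait
    attempts = 0
    while True:
        attempts += 1
        status = post()
        if status is not None and 200 <= status < 300:
            return True, attempts
        if clock() + interval > deadline:
            return False, attempts
        sleep(interval)
```

Usage would be something like `retry_callback(lambda: post_callback(url, key))`. Note that this only covers the case where the callback request itself is rejected; it does nothing for the case where the request is accepted and the job fails afterwards.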

What is weird is that when the inventory hasn’t been registered by AWX the API call properly “fails”.

There is some kind of in-between state, though, where the API call passes but then the job fails. I wonder if it’s something to do with state propagation across “api” and “web” caches or sources of truth?

I should say this is relatively rare but I have time to look into it now so I’m trying to solve this edge case.