I am trying to automate some tasks by launching AWX jobs via the API. The playbook produces some JSON data via set_fact or debug, and I want to retrieve it after the job finishes.
I do:
POST to launch the job and get job URL
GET the job status in a loop (or via a webhook) until it finishes
GET to collect the results
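In shell, the first two steps look roughly like this (a minimal sketch assuming curl and jq; the host, token, and template ID are placeholders):

```bash
AWX_HOST=https://awx.example.com
TOKEN=xxxx   # an AWX OAuth2 token

# Launch the job template and read the job URL from the response
JOB_URL=$(curl -s -X POST -H "Authorization: Bearer $TOKEN" \
  "$AWX_HOST/api/v2/job_templates/42/launch/" | jq -r '.url')

# Poll until the job reaches a terminal state
while :; do
  STATUS=$(curl -s -H "Authorization: Bearer $TOKEN" "$AWX_HOST$JOB_URL" | jq -r '.status')
  case "$STATUS" in successful|failed|error|canceled) break ;; esac
  sleep 5
done
```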
The last step is my headache. I tried:
GET $job_url/stdout/?format=json
The playbook stdout comes back as a single string full of ANSI color sequences. Not what I want.
GET $job_url/job_events/ and filter it via jq -r '.results[].event_data'
It is a list with many items.
The same, but looking for the task name in the job_events, etc. A single playbook task named "report resources" has, for instance, 9 matches across 3 events, while stdout shows only a single TASK [report resources] run on a single localhost, which prints some data variable via debug.
Can anyone suggest the best way to return data from an AWX job using pure REST, without tricks like saving data into intermediary files? The data exists, but I still have no clear understanding of how to find it in the job_events. All I need is a single JSON variable available at the playbook level.
In your playbook you can add a set_stats task and give it the exact data you want to make available at the end of the run, under the "data" keyword argument.
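For example (a minimal sketch; the variable name is made up):

```yaml
- name: Expose report data to AWX
  set_stats:
    data:
      report: "{{ my_report_json }}"   # the JSON variable you want back
```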
Run your playbook locally on the CLI with the environment variable ANSIBLE_SHOW_CUSTOM_STATS=True set, so you can see exactly what data it will show.
Then push your playbook to source control, sync the project in AWX, and run it in a job template. After it finishes, the job should show the data in the “artifacts” key as native JSON along with the rest of the job data.
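Reading it back over REST is then a single GET (sketch; host, token, and job URL are the same placeholders as above):

```bash
# After the job finishes, the set_stats data appears under the "artifacts" key
curl -s -H "Authorization: Bearer $TOKEN" "$AWX_HOST$JOB_URL" | jq '.artifacts'
```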
One more question:
It works with per_host=no, but if I change it to per_host=yes, the artifacts key is empty ({}) in AWX (it is properly shown with show custom stats, though).
Is there any way to get per-host stats, other than combining them in the playbook?
The per_host=yes case is actually called out in the official documentation as not supported. There is a separate feature, fact caching, which is probably much closer to what you want if you need per-host data.
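That said, if having the values in the artifacts is enough, you can fold them into a single dictionary yourself before calling set_stats (a sketch; my_result is a made-up per-host fact):

```yaml
- name: Combine per-host results into one artifact
  run_once: true
  set_stats:
    data:
      per_host_report: >-
        {{ dict(ansible_play_hosts | zip(
             ansible_play_hosts | map('extract', hostvars, 'my_result'))) }}
```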
It is true that workflows do not have artifacts. When workflows-in-workflows were being worked on, the idea of aggregating artifacts from all jobs that ran as part of a workflow came up. There were concerns about data size, since the number of jobs contributing to the combined artifact pool could grow large. We also didn't have much concrete interest or many use cases for it.
Here is my use case; maybe you can help me with a workaround:
I use an Ansible playbook to provision cloud resources:
using a cloud module (Scaleway in my case), create a new server (playbook runs on localhost);
refresh the Scaleway dynamic inventory (plugin) to update host information (including the just-added host);
install packages on the new host (playbook runs on the remote host).
When I run this with plain Ansible, I use meta: refresh_inventory at step 2.
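Roughly like this (a sketch; the scaleway_compute parameters and host names are placeholders):

```yaml
- hosts: localhost
  tasks:
    - name: Create a new server
      scaleway_compute:
        name: new-server
        state: running
        # other required parameters omitted

    - name: Re-read the dynamic inventory so the new host appears
      meta: refresh_inventory

- hosts: new-server
  tasks:
    - name: Install packages on the new host
      package:
        name: nginx
        state: present
```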
When I run this in AWX, this meta does not help, since it refreshes from the locally synced copy, and the new host is not available.
So I have to use a workflow with 3 steps (step 2 is an inventory sync), but then I can't return data easily.
Can you give me advice on what to do? Maybe I should add the new host dynamically? But I have no host info until it is added to the inventory.
Any other options?
You could do a provisioning callback with the inventory source set to update on launch. That's the closest thing we have to meta: refresh_inventory. It is designed to update the inventory source so that the new host is present before the job template starts. I don't know that this would be better than what you have now, since it involves the extra step of having the host run a curl command at startup, but it deserves a mention.
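The callback itself is a single request made from the new host (sketch; the template ID and host config key are placeholders):

```bash
# Run on the new host at first boot (e.g. from cloud-init or a systemd unit).
# AWX updates the inventory source, matches the caller, and launches the job.
curl -s --data "host_config_key=CONFIG_KEY" \
  https://awx.example.com/api/v2/job_templates/42/callback/
```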
This question is just for my own interest, but how did you create the inventory source with the Scaleway plugin? Did you source it from a project, and did you use a custom credential type to supply your token or API key? I ask because we have work related to inventory plugin support going on right now.
Perhaps I did not understand your proposal.
I have to update the inventory in the middle of a single job run. I can't use more than one job (like a workflow) if I need to return artifacts. So the best I could do is launch an inventory sync from the playbook, but I think the playbook would continue running with the old inventory, and the actual sync would wait for the job to complete.
My project has an inventory source which uses the plugin.
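The config looks roughly like this (a sketch of a typical scaleway inventory plugin file; the region and token are placeholders):

```yaml
# scaleway.yml, kept in the project repo
plugin: scaleway
regions:
  - par1
oauth_token: "hardcoded-token"   # hardcoded for now, see below
```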
The AWX inventory uses this as a "Sourced from a Project" source. Currently I have to hardcode the token (my playbooks use vault). I have not yet found a way to use custom credentials with inventories. I am new to both Ansible and AWX, but as always, I spend a lot of time trying to do something more than "Hello world" on a new platform, compared to how I would do it in plain bash/nodejs/etc.
When running a workflow, the results from set_stats are passed to all downstream jobs. So if your workflow is:
job1 -> inventory update -> job2
And if the entire issue is that you are trying to get artifacts (set via set_stats) from job1 to job2, they should already be passed as extra_vars to job2; you just need to reference them in the playbook.
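For example (a sketch; new_host_ip is a made-up variable name):

```yaml
# In job1's playbook: publish the address of the freshly created server
- set_stats:
    data:
      new_host_ip: "{{ created_server.public_ip }}"   # hypothetical fact

# In job2's playbook, the same key arrives as an ordinary extra var
- debug:
    msg: "Configuring {{ new_host_ip }}"
```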
There may still be more I don’t understand about your use case, but maybe there is a simple solution this way?
The lack of support for inline vault in inventory plugins is something on my radar. I just have not gotten around to filing an issue for it yet.
The goal is to launch an AWX job via the REST API and get some JSON data back.
Steps are:
POST …/launch/ and read the job URL
GET that URL in a loop to read the job status until it finishes (or use a webhook where possible)
Use the last returned result to read the job artifacts.
In the case of a single job template this is fine, since set_stats returns data via the artifacts key.
But if I have to update the inventory in the middle of the job run, I have to either:
use a workflow (and lose easy-to-read artifacts), OR
use 2 separate jobs launched one after the other, with two identical POST/WAIT/GET sequences.
It would be perfectly fine if, in the case of a workflow, it returned at least the set_stats from the last job. Currently it returns nothing.
It is fine when you just install software on a remote host and only need to know whether it succeeded. But when you need some data back from the job, it becomes non-trivial.
Of course, it's possible to use the uri module to send data back to the calling platform directly from the playbook. But that again is not trivial if you POST/GET from, say, a bash script.
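For example (a sketch; the receiving endpoint is whatever your calling script exposes):

```yaml
- name: Push results back to the caller
  uri:
    url: "https://ci.example.com/hooks/awx-results"   # hypothetical receiver
    method: POST
    body_format: json
    body: "{{ my_report_json }}"
    status_code: 200
```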
> It would be perfectly fine if, in the case of a workflow, it returned at least the set_stats from the last job. Currently it returns nothing.
I have to inject a slight modification here, because we can't assume that a workflow is a single linear path. We could add artifacts to workflow jobs, where those artifacts are built from a combination of the artifacts from all terminal nodes in the workflow job. Deterministic rules for variable precedence would also need to be established, in case two paths define conflicting values for the same keys.
You could file an issue as an enhancement request for this. I don’t think it’s a bad idea, myself.