Question on the best way to monitor AWX deployed in Kubernetes

Can someone share ideas on how to monitor AWX deployed with awx-operator in
Kubernetes? In particular, scenarios like the ones below:

→ AWX jobs spin forever, with the status showing as waiting or pending.
→ The Postgres database fills up.
→ Execution environments fail to start because the CPUs/pods don't get spun up.

Thanks,
Rakesh Boinapally

Any guidance or help would be much appreciated.

Hi, were you able to find a good monitoring setup?

There are a few built-in ways to monitor AWX, such as the api/v2/metrics endpoint: https://docs.ansible.com/automation-controller/4.2.0/html/administration/metrics.html

You can point a Prometheus instance at that endpoint to scrape it, and maybe use Grafana to display the results nicely.
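
If you want to check what the endpoint returns before wiring up Prometheus, here is a minimal Python sketch that fetches and parses it. It assumes the endpoint answers in the Prometheus text format with no timestamps and accepts an OAuth2 bearer token. The base URL, the token, and the metric name awx_pending_jobs_total are assumptions to verify against your own instance.

```python
import urllib.request


def fetch_metrics(base_url: str, token: str) -> str:
    """Download the raw metrics text (base_url and token are placeholders)."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/v2/metrics/",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text: str) -> dict:
    """Map each sample (metric name plus labels) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumes "name{labels} value" with no trailing timestamp.
        key, _, value = line.rpartition(" ")
        try:
            samples[key] = float(value)
        except ValueError:
            continue
    return samples
```

Usage would be something like parse_metrics(fetch_metrics("https://awx.example.com", "TOKEN")).get("awx_pending_jobs_total"), where the host and token are hypothetical.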

However, that endpoint doesn’t report database or disk usage. There may be a tool that can monitor a k8s persistent volume.
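
One possible approach for the volume side, sketched here under assumptions: if your Prometheus already scrapes the kubelet, it exposes kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes per PVC. The helper below takes the JSON from a Prometheus /api/v1/query call for their ratio and lists the PVCs above a threshold. The 0.8 threshold and the helper name are hypothetical, and whether your volume driver reports these stats is something to check.

```python
# PromQL expression to send to Prometheus' /api/v1/query endpoint.
PVC_USAGE_QUERY = (
    "kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes"
)


def volumes_over_threshold(response: dict, threshold: float = 0.8) -> list:
    """Return (pvc_name, usage_ratio) pairs at or above the threshold.

    `response` is the decoded JSON body of an instant query for
    PVC_USAGE_QUERY; each sample's value is [timestamp, "ratio"].
    """
    full = []
    for sample in response.get("data", {}).get("result", []):
        ratio = float(sample["value"][1])
        if ratio >= threshold:
            pvc = sample["metric"].get("persistentvolumeclaim", "<unknown>")
            full.append((pvc, round(ratio, 2)))
    return sorted(full)
```

The same expression could also back a Prometheus alert rule or a Grafana panel instead of a script.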

AWX Team

Also, you may look at the job_lifecycle logs (LOG_AGGREGATOR_LEVEL needs to be DEBUG to see these), which show why a job is blocked or pending and why it is not entering the running state.
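
As a complement to the logs, you could also flag jobs that have sat in pending or waiting for too long. This is only a sketch: it assumes you already fetched the job list from the API (for example /api/v2/jobs/ filtered on status) and that each entry carries id, status, and an ISO 8601 created timestamp. The 30-minute limit and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_awx_time(stamp: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-01-01T10:00:00.123456Z."""
    # Older Python versions do not accept the trailing "Z" directly.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def stuck_jobs(jobs: list, now: datetime,
               max_wait: timedelta = timedelta(minutes=30)) -> list:
    """Return ids of jobs still pending/waiting longer than max_wait."""
    stuck = []
    for job in jobs:
        if job["status"] not in ("pending", "waiting"):
            continue
        if now - parse_awx_time(job["created"]) > max_wait:
            stuck.append(job["id"])
    return stuck
```

Call it with now=datetime.now(timezone.utc) and alert on a non-empty result; the job_lifecycle logs should then explain why those particular jobs are blocked.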