I recently installed AWX 24.6.1 on a RHEL 9 machine and synced my project's playbooks, but when I run a simple ping playbook, the job takes 2 minutes to complete. I have attached a screenshot. Please help me fix it.
I have referred to previous topics and applied the necessary changes, but the slowness persists. I have attached a reference image.
To be more clear.
Long start time for jobs
I’ve got an issue with jobs taking a “long” time to start: about 1 min 30 secs. I have already scaled up the resources on the node, with no improvement. I have also consulted the AWX forum and changed some settings, but still no luck.
I have pasted the output of one playbook, which took 2 min 29 secs to complete. Even a simple ping playbook takes 1 min 30 secs for a single host.
Please advise if you have any suggestions to fix the problem.
I have attached logs for your reference.
I think you should review your PostgreSQL database. Run VACUUM ANALYZE on the tables that contain a large number of dead tuples. I opened a monitoring topic in the thread; you can review it.
Thank you @crimroseKiechan,
we will check and enable it.
Today, I have some free time to walk you through how I troubleshoot a slow AWX UI issue.
Performance issues with the UI and job execution can stem from several sources, such as the database, Nginx, or Kubernetes resource limitations.
Nginx: Start by checking the Nginx logs for a high number of 499 errors. A 499 means the client closed the connection before Nginx sent a response, which usually indicates that the web UI is timing out. If this is the case, consider increasing the number of Nginx worker processes (the worker_processes directive), setting the value to 2 or higher depending on your system’s capacity.
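As a rough sketch of the Nginx check above, the following Python counts status codes in access log lines and reports the share of 499s. It assumes a log layout where the status code follows the quoted request line, as in the standard "combined" format; the sample lines and function names are made up for illustration, so adjust the pattern if your log format differs.

```python
import re
from collections import Counter

# Matches the status code that follows the quoted request, e.g.
# ... "GET /api/v2/jobs/ HTTP/1.1" 499 0 ...
# Assumption: the log uses a "combined"-style layout.
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def count_statuses(lines):
    """Return a Counter of HTTP status codes found in access log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def share_of_499(lines):
    """Fraction of parsed requests that ended with status 499."""
    counts = count_statuses(lines)
    total = sum(counts.values())
    return counts["499"] / total if total else 0.0


# Hypothetical sample lines, for illustration only.
sample = [
    '10.0.0.5 - - [01/Jan/2025:10:00:00 +0000] "GET /api/v2/jobs/ HTTP/1.1" 200 512 "-" "curl/8.0"',
    '10.0.0.5 - - [01/Jan/2025:10:00:31 +0000] "GET /api/v2/jobs/ HTTP/1.1" 499 0 "-" "curl/8.0"',
]

print(count_statuses(sample))
print(f"share of 499s: {share_of_499(sample):.0%}")
```

To use it on real data, pass an open log file (or the captured output of the web container's logs) to count_statuses instead of the sample list.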
Kubernetes Resources: Next, verify whether your pods are experiencing throttling. If they are, increase the resource limits for the affected pods to ensure they have sufficient CPU and memory.
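One way to check for CPU throttling is to look at the nr_periods and nr_throttled counters in the container's cgroup cpu.stat file. The sketch below parses that text and computes the fraction of throttled periods. The sample content is invented, and the file location (typically /sys/fs/cgroup/cpu.stat inside a container on cgroup v2) is an assumption that may differ on your cluster.

```python
def parse_cpu_stat(text):
    """Parse 'key value' lines from a cgroup cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats):
    """Fraction of CPU enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


# Hypothetical cpu.stat content, for illustration only.
sample = """usage_usec 4200000
user_usec 3000000
system_usec 1200000
nr_periods 1000
nr_throttled 250
throttled_usec 900000
"""

stats = parse_cpu_stat(sample)
print(f"{throttled_ratio(stats):.0%} of CPU periods were throttled")
```

A ratio that stays well above zero while jobs are starting suggests the CPU limit on that pod is too low. The counters are cumulative, so comparing two readings taken a few minutes apart gives a clearer picture than a single snapshot.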
PostgreSQL Tuning: Based on your available resource limits, you can use PGTune to optimize your PostgreSQL configuration.
To pinpoint the cause of slowness during peak traffic, use profile_sql to analyze database performance. For example:
awx-manage profile_sql --threshold=1.5 --minutes=5
You should also schedule regular maintenance tasks for PostgreSQL, such as running VACUUM ANALYZE and REINDEX, to clean up dead tuples and refresh table statistics.
To check if PostgreSQL performance is contributing to the slowdown, you can run the following commands to measure query execution time:
\timing on
SELECT COUNT(*) FROM main_jobevent;
Additionally, you can run cleanup jobs to reduce the number of job events stored in the database, helping to optimize performance.
If possible, you can also tune the autovacuum thresholds for big tables.
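To illustrate why big tables benefit from their own autovacuum settings: PostgreSQL vacuums a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * (number of rows), with defaults of 50 and 0.2. The sketch below just evaluates that formula; the row count and the tuned values are hypothetical, not recommendations.

```python
def autovacuum_trigger(row_count, threshold=50, scale_factor=0.2):
    """Dead tuples needed before autovacuum vacuums a table.

    Mirrors PostgreSQL's rule:
    autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * rows
    """
    return threshold + scale_factor * row_count


rows = 50_000_000  # hypothetical size of a large job-event table

default_trigger = autovacuum_trigger(rows)
tuned_trigger = autovacuum_trigger(rows, threshold=10_000, scale_factor=0.0)

print(f"default settings: vacuum after ~{default_trigger:,.0f} dead tuples")
print(f"tuned settings:   vacuum after ~{tuned_trigger:,.0f} dead tuples")
```

With the defaults, a table of this size accumulates millions of dead tuples before autovacuum runs, which is why lowering the scale factor for large tables (for example through per-table storage parameters) is a common tuning step.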
If possible, please share your metrics dashboard with me.
Thank you @Crimrose, we found a solution: we created a temporary Git repo and ran the job from it, and it completed in less than a minute. The old repo was 4 GB in size, which was causing the latency problem.
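For anyone hitting the same symptom, a quick way to spot an oversized project checkout is to measure its size on disk. This is a minimal sketch; the /var/lib/awx/projects path is an assumption about where your projects are stored, and the helper names are made up for illustration.

```python
import os


def dir_size_bytes(path):
    """Total size of regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def human(n):
    """Format a byte count using binary units up to GiB."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if n < 1024 or unit == "GiB":
            return f"{n:.1f} {unit}"
        n /= 1024


# Assumed project base path; change it to match your installation.
projects_root = "/var/lib/awx/projects"
if os.path.isdir(projects_root):
    for entry in sorted(os.listdir(projects_root)):
        full = os.path.join(projects_root, entry)
        if os.path.isdir(full):
            print(f"{human(dir_size_bytes(full)):>12}  {entry}")
```

The total includes the .git directory, so a repository with a long history or large binary files will show up here even if the playbooks themselves are small.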