AWX Jobs stuck pending after upgrade to 2.19.1 / v24.6.1

I upgraded through awx-operator using my usual process. The upgrade seemed to complete, but jobs do not start; they just stay pending. I see this in the logs on the task container:

django.db.utils.ProgrammingError: unrecognized configuration parameter "idle_session_timeout"
2024-10-11 16:35:06,664 DEBUG    [d949237159a54888b7c5ab8450725256] awx.main.dispatch.periodic scheduler found send_subsystem_metrics to run, 0.01080179214477539 seconds after target
2024-10-11 16:35:06,665 DEBUG    [d949237159a54888b7c5ab8450725256] awx.main.dispatch.periodic Scheduler next run is metrics_gather in 7.988242149353027 seconds
2024-10-11 16:35:06,665 DEBUG    [d949237159a54888b7c5ab8450725256] awx.main.dispatch task 2616f50c-c5e3-4f84-8326-e2b7104fb047 starting awx.main.analytics.analytics_tasks.send_subsystem_metrics(*[])
2024-10-11 16:35:06,811 INFO     [-] awx.main.wsrelay Producer 10.51.194.142-job_events-25636 has no subscribers, shutting down.
2024-10-11 16:35:06,813 INFO     [-] awx.main.wsrelay Producer 10.51.194.142-jobs-summary has no subscribers, shutting down.
2024-10-11 16:35:14,506 DEBUG    [-] awx.main.wsrelay Web host awx-demo-web-c4d6f5ff-jp88b (10.51.194.142) online heartbeat received.
2024-10-11 16:35:14,667 DEBUG    [d949237159a54888b7c5ab8450725256] awx.main.dispatch.periodic scheduler found metrics_gather to run, 0.013670206069946289 seconds after target
2024-10-11 16:35:14,671 DEBUG    [d949237159a54888b7c5ab8450725256] awx.main.dispatch.periodic Scheduler next run is tower_scheduler in 1.9822404384613037 seconds
2024-10-11 16:35:16,660 DEBUG    [d949237159a54888b7c5ab8450725256] awx.main.dispatch.periodic scheduler found tower_scheduler to run, 0.007393598556518555 seconds after target
2024-10-11 16:35:16,661 DEBUG    [d949237159a54888b7c5ab8450725256] awx.main.dispatch.periodic Scheduler next run is cluster_heartbeat in 0.9922580718994141 seconds
2024-10-11 16:35:16,662 DEBUG    [d949237159a54888b7c5ab8450725256] awx.main.dispatch task 1eb4d8e3-fd9b-4290-b89a-5f0fe4433a7e starting awx.main.tasks.system.awx_periodic_scheduler(*[])
2024-10-11 16:35:16,683 ERROR    [d949237159a54888b7c5ab8450725256] awx.main.dispatch Worker failed to run task awx.main.tasks.system.awx_periodic_scheduler(*[], **{}
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 87, in _execute
    return self.cursor.execute(sql)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/cursor.py", line 732, in execute
    raise ex.with_traceback(None)
psycopg.errors.UndefinedObject: unrecognized configuration parameter "idle_session_timeout"
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/dispatch/worker/task.py", line 103, in perform_work
    result = self.run_callable(body)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/dispatch/worker/task.py", line 78, in run_callable
    return _call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/system.py", line 719, in awx_periodic_scheduler
    with advisory_lock('awx_periodic_scheduler_lock', lock_session_timeout_milliseconds=lock_session_timeout_milliseconds, wait=False) as acquired:
  File "/usr/lib64/python3.11/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/utils/pglock.py", line 19, in advisory_lock
    idle_session_timeout = cur.execute('SHOW idle_session_timeout').fetchone()[0]
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 84, in _execute
    with self.db.wrap_database_errors:
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 87, in _execute
    return self.cursor.execute(sql)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/cursor.py", line 732, in execute
    raise ex.with_traceback(None)

Hi @trippinnik

EDIT: Sorry, I see from the title that you’re using 24.6.1.

I think, based on this line from the traceback:

 File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/utils/pglock.py", line 19, in advisory_lock
    idle_session_timeout = cur.execute('SHOW idle_session_timeout').fetchone()[0]

This looks to be related to the pull request “Add TASK_MANAGER_LOCK_TIMEOUT” by TheRealHaoLiu (Pull Request #15300, ansible/awx on github.com)?

cc @TheRealHaoLiu

So it’s not included yet? How do I get it working?

‘SHOW idle_session_timeout’ is a PostgreSQL query, so the error comes from the database rather than from AWX itself. The idle_session_timeout parameter was added in PostgreSQL 14, so if you’re still on 13, you probably need to upgrade PostgreSQL.
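
To confirm whether the database is too old, you can compare its version against 14. Below is a minimal Python sketch of that check. It is not part of AWX; the helper names are made up for illustration, and it assumes you pass in the text returned by ‘SHOW server_version’ or ‘SELECT version()’.

```python
import re

# First PostgreSQL major version that recognizes idle_session_timeout.
MIN_MAJOR_FOR_IDLE_SESSION_TIMEOUT = 14


def postgres_major(version_text: str) -> int:
    """Extract the leading major version number from text such as
    '13.16' or 'PostgreSQL 15.8 on x86_64-pc-linux-gnu, ...'."""
    match = re.search(r"(\d+)(?:\.\d+)*", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def supports_idle_session_timeout(version_text: str) -> bool:
    """Return True if the server should accept SHOW idle_session_timeout."""
    return postgres_major(version_text) >= MIN_MAJOR_FOR_IDLE_SESSION_TIMEOUT


if __name__ == "__main__":
    for text in ("13.16", "15.8"):
        print(text, supports_idle_session_timeout(text))
```

If this reports False for your server, the ‘unrecognized configuration parameter’ error above is expected.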

I upgraded Postgres to 15.8 and now I get:

Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 87, in _execute
    return self.cursor.execute(sql)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/cursor.py", line 732, in execute
    raise ex.with_traceback(None)
psycopg.errors.SyntaxError: trailing junk after numeric literal at or near "1d"
LINE 1: SET idle_in_transaction_session_timeout = 1d
                                                  ^
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/dispatch/worker/task.py", line 103, in perform_work
    result = self.run_callable(body)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/dispatch/worker/task.py", line 78, in run_callable
    return _call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/system.py", line 719, in awx_periodic_scheduler
    with advisory_lock('awx_periodic_scheduler_lock', lock_session_timeout_milliseconds=lock_session_timeout_milliseconds, wait=False) as acquired:
  File "/usr/lib64/python3.11/contextlib.py", line 144, in __exit__
    next(self.gen)
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/utils/pglock.py", line 26, in advisory_lock
    cur.execute(f"SET idle_in_transaction_session_timeout = {idle_in_transaction_session_timeout}")
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 84, in _execute
    with self.db.wrap_database_errors:
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/django/db/backends/utils.py", line 87, in _execute
    return self.cursor.execute(sql)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/psycopg/cursor.py", line 732, in execute
    raise ex.with_traceback(None)
django.db.utils.ProgrammingError: trailing junk after numeric literal at or near "1d"
LINE 1: SET idle_in_transaction_session_timeout = 1d

Also, I don’t see anything in the release notes about a change in Postgres version requirements.

Sorry, I can’t help you with that error.
But see the Database Configuration page of the Ansible AWX Operator Documentation for the database requirement.
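
A note on the second traceback, in case it helps someone: the statement PostgreSQL rejected was ‘SET idle_in_transaction_session_timeout = 1d’. A value with a unit suffix such as 1d has to be written as a quoted string in a SET statement, and line 26 of pglock.py in the traceback formats the saved value into the statement without quotes. It looks like the value is whatever the server reported earlier via SHOW, which would mean the database has this timeout configured to one day. The Python sketch below only illustrates the quoting problem. The helper names are hypothetical; this is not AWX code and not a patch for it.

```python
def quote_setting_value(value: str) -> str:
    """Wrap a setting value in single quotes, doubling any embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def unquoted_set(name: str, saved_value: str) -> str:
    # Mirrors the shape of the failing statement from the traceback:
    # the saved value is interpolated as-is, so '1d' becomes a bare 1d.
    return f"SET {name} = {saved_value}"


def quoted_set(name: str, saved_value: str) -> str:
    # Hypothetical alternative: quote the value so that unit suffixes
    # like '1d' or '30s' are passed to PostgreSQL as a string literal.
    return f"SET {name} = {quote_setting_value(saved_value)}"


if __name__ == "__main__":
    setting = "idle_in_transaction_session_timeout"
    print(unquoted_set(setting, "1d"))  # the form PostgreSQL rejected
    print(quoted_set(setting, "1d"))    # the form PostgreSQL accepts
```

If that reading is right, a server whose idle_in_transaction_session_timeout is shown as a bare number such as 0 would not trigger the error, since the unquoted statement would still parse. That is an assumption and untested, so check your database settings before relying on it.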