Hi folks,
I’ve only been working with Ansible for a short time and I don’t have a production instance yet, so this is all new to me.
I’m trying to install AWX in a multi-instance (clustered) setup.
Initially, I based my work on this repository: https://github.com/sujiar37/AWX-HA-InstanceGroup. It worked perfectly well up to version 9.3.0.
Since version 10.0.0 removed RabbitMQ, I have adapted the deployment to take that change into account (Redis added).
The instances install correctly:
- 1 node with web and task (called web1)
- 1 node with task only (called task1)
Every node also runs its own Redis and memcached.
However, when I run a job on task1, the job completes correctly but its status never reaches the web UI: the UI keeps showing the job as running until I refresh the page.
- task1 logs
2020-04-09 13:45:16,322 DEBUG awx.main.dispatch task cc9b6dd3-b8b5-4900-9ca4-60705a3d2ee0 starting awx.main.tasks.awx_periodic_scheduler([])
2020-04-09 13:45:16,333 DEBUG awx.main.tasks Starting periodic scheduler
2020-04-09 13:45:16,336 DEBUG awx.main.tasks Last scheduler run was: 2020-04-09 13:45:15.731662+00:00
2020-04-09 13:45:17,369 DEBUG awx.main.commands.run_callback_receiver ProjectUpdateEvent.objects.bulk_create(1)
[…]
2020-04-09 13:45:21,556 DEBUG awx.main.commands.run_callback_receiver ProjectUpdateEvent.objects.bulk_create(2)
2020-04-09 13:45:21,645 INFO awx.main.commands.run_callback_receiver Event processing is finished for Job 11, sending notifications
2020-04-09 13:45:21,645 INFO awx.main.commands.run_callback_receiver Event processing is finished for Job 11, sending notifications
2020-04-09 13:45:21,646 DEBUG awx.main.tasks project_update 11 (running) finished running, producing 25 events.
2020-04-09 13:45:21,695 DEBUG awx.main.dispatch task 4d2bf842-1b2e-4724-9150-119df9ee44a0 starting awx.main.tasks.handle_success_and_failure_notifications([11])
Meanwhile, in the web1 logs, I can see errors from the wsbroadcast connections:
2020-04-09 14:06:32,956 DEBUG awx.main.wsbroadcast web1 connect attempt 274 to task1
2020-04-09 14:06:37,961 WARNING awx.main.wsbroadcast Failed to connect to task1: 'Cannot connect to host task1:443 ssl:False [Connect call failed ('10.64.10.123', 443)]'. Reconnecting …
2020-04-09 14:06:37,962 DEBUG awx.main.wsbroadcast web1 connect attempt 275 to task1
2020-04-09 14:06:42,972 WARNING awx.main.wsbroadcast Failed to connect to task1: 'Cannot connect to host task1:443 ssl:False [Connect call failed ('10.64.10.123', 443)]'. Reconnecting …
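To rule out a plain network problem, the failing connection can be reproduced outside AWX. This is only a minimal sketch, meant to be run from web1: the host name and port (task1, 443) come from the log lines above, while the helper name can_connect and the 3-second timeout are my own choices for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example call, using the values from the wsbroadcast log (run on web1):
# print(can_connect("task1", 443))
```

If this returns False, nothing is listening on task1:443 (or a firewall blocks it), which would match a node that has no awx_web container.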
After some searching, I understand that since version 10.0.0 the AWX nodes are fully connected to each other through a dedicated websocket endpoint (https://github.com/ansible/awx/blob/devel/docs/websockets.md).
So does awx_web now have to run on every node?
I then did a fresh installation with two nodes (web1 and web2), each running both awx_web and awx_task. The broadcast clients now connect:
2020-04-09 14:17:27,163 INFO awx.main.consumers Broadcast client connected.
2020-04-09 14:17:27,163 INFO Broadcast client connected.
I then launched a new job from the web1 UI, and it ran on the web2 task node.
I have exactly the same problem: the web UI is not updated until I refresh the page.
I also read https://github.com/ansible/awx/issues/5443 to understand how this is supposed to work from AWX 10.0.0 onwards.
But I’m not sure I understand where my problem is. Since the status is updated once I refresh the page, I assume it is correctly written to the database? So is a notification failing somewhere between the nodes?
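To check the database side independently of the websocket, the job status can be read directly from the REST API. Below is a rough sketch under a few assumptions: basic authentication is enabled, the TLS certificate is trusted by the client, and the list of finished states is the one I believe AWX uses. The helper names (job_url, is_finished, fetch_job_status) are mine. Note that job 11 in the logs above is a project update, which lives under /api/v2/project_updates/ rather than /api/v2/jobs/.

```python
import base64
import json
import urllib.request

# States in which a job is no longer running (assumed list).
FINISHED = {"successful", "failed", "error", "canceled"}


def job_url(base_url: str, job_id: int) -> str:
    """Build the REST API URL of a job, e.g. https://web1/api/v2/jobs/12/."""
    return f"{base_url.rstrip('/')}/api/v2/jobs/{job_id}/"


def is_finished(status: str) -> bool:
    """Return True if the status means the job has stopped running."""
    return status in FINISHED


def fetch_job_status(base_url: str, job_id: int, user: str, password: str) -> str:
    """Read the 'status' field of a job straight from the API."""
    req = urllib.request.Request(job_url(base_url, job_id))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["status"]
```

If fetch_job_status already returns a finished state while the UI still shows the job as running, the database is fine and only the live update to the browser is missing.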
Any ideas?
Thanks a lot for your help!
Wilfried.