Hi folks,
I am trying to use Ansible to deploy OpenShift on 2 hosts (okd01a and okd01b, CentOS 7.6). Problem: apparently some remote task gets stuck waiting at a password prompt:
root 7406 1 0 Mar26 ? 00:00:00 /usr/sbin/sshd -D
root 7945 7406 0 Mar26 ? 00:00:00 _ sshd: root@pts/0
root 7948 7945 0 Mar26 pts/0 00:00:00 | _ -bash
root 896 7948 0 11:32 pts/0 00:00:00 | _ ps -ef --forest
root 897 7948 0 11:32 pts/0 00:00:00 | _ cat
root 48897 7406 0 09:59 ? 00:00:00 _ sshd: root@pts/1
root 49097 48897 0 09:59 pts/1 00:00:00 _ /bin/sh -c /usr/bin/python /root/.ansible/tmp/ansible-tmp-1553677155.03-134576205842945/AnsiballZ_systemd.py && sleep 0
root 49109 49097 0 09:59 pts/1 00:00:00 _ /usr/bin/python /root/.ansible/tmp/ansible-tmp-1553677155.03-134576205842945/AnsiballZ_systemd.py
root 49117 49109 0 09:59 pts/1 00:00:00 _ /usr/bin/systemctl restart docker
root 49118 49117 0 09:59 pts/1 00:00:00 _ /usr/bin/systemd-tty-ask-password-agent --watch
root 49119 49117 0 09:59 pts/1 00:00:00 _ /usr/bin/pkttyagent --notify-fd 5 --fallback
Isn’t Ansible supposed to run remote tasks without a controlling terminal to avoid this kind of problem?
Regards
Harri
Can you provide more information? It’s hard to tell what exactly is going on without looking at the code and logs.
Apparently systemd got confused. After I ran “systemctl daemon-reload” in another terminal, Ansible continued with its playbook. Success.
I just wonder: shouldn’t Ansible run “systemctl daemon-reload” before starting, stopping, or restarting services on the remote host?
It works differently: you need to add a handler that does a systemctl daemon-reload. That’s the proper way to do it. If the task in question doesn’t have one, I suggest you submit a PR.
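Roughly something like this (just a sketch; the host group, file names, and the task that touches the unit file are made up for illustration):

# Sketch only: group name, file names, and the unit-file change are hypothetical.
- hosts: okd_nodes
  tasks:
    - name: Drop in a docker unit override (whatever changes the unit file)
      copy:
        src: docker-override.conf
        dest: /etc/systemd/system/docker.service.d/override.conf
      notify: reload systemd units

    # Run pending handlers now, so the daemon-reload happens before the restart
    - meta: flush_handlers

    - name: Restart docker
      systemd:
        name: docker
        state: restarted

  handlers:
    - name: reload systemd units
      systemd:
        daemon_reload: yes

The systemd module also accepts daemon_reload: yes directly on the restart task, which gets you the same effect without a handler.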
I am not sure that’s reasonable. Isn’t the service module supposed to hide the underlying implementation (systemd, SysV init, Upstart, whatever)? It should not get stuck and break the whole ansible-playbook session, no matter what.
You mean, you hardwire "systemd-only" into your playbooks? Or
do you explicitly list the systemd hosts in your inventory file,
making sure the general service module isn't used by accident?
Ansible is very new to me, but I had the impression that it is
supposed to hide these internal details.
Regards
Harri
You can make some tasks depend on the OS version. For example, use the service module with RHEL <= 6 and the systemd module with RHEL >= 7.
A “when” statement using ansible_os_family, or ansible_distribution together with ansible_distribution_major_version, could be interesting facts to use here.
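Something along these lines, for instance (a sketch only; the task and service names are illustrative):

# Sketch: branch on gathered facts instead of static inventory groups.
- name: Restart docker via SysV init on EL6 and older
  service:
    name: docker
    state: restarted
  when:
    - ansible_os_family == "RedHat"
    - ansible_distribution_major_version | int <= 6

- name: Restart docker via systemd on EL7 and newer
  systemd:
    name: docker
    state: restarted
    daemon_reload: yes
  when:
    - ansible_os_family == "RedHat"
    - ansible_distribution_major_version | int >= 7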
Regards
I don't like to distinguish between these hosts. The RedHat
host might have been upgraded from 6.x to 7.y, for example. It's
unreasonable to assume that everybody updates their inventory
files in such a case.
But that's not the point. I am highly concerned that one bad
host can force the whole ansible-playbook run to halt for
>70 minutes. Surely this is not best practice, but I wonder
if this is seen as an acceptable hiccup of using Ansible?
Regards
Harri
It depends on how you write the automation. You can have Ansible work through the inventory in order, or fire off the work on every host so that it doesn’t matter if one is held up. https://docs.ansible.com/ansible/latest/user_guide/playbooks_async.html
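A fire-and-forget variant could look roughly like this (a sketch; the timeout value is arbitrary):

# Sketch: start the restart in the background and don't wait for it,
# so one hung host doesn't hold the others at this task.
- name: Restart docker without waiting
  systemd:
    name: docker
    state: restarted
  async: 300   # let the background job live for up to 5 minutes
  poll: 0      # fire and forget; move straight on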
The “ansible_” variables are facts gathered from the remote system and not set in your inventory. They allow you to customize your plays based on the target system.
Regarding your other question about how one bad host can force the playbook to halt: it happens, and you have to find ways to work around it. We’ve run into scenarios where fact gathering halts because of hung NFS mounts, the yum module hangs because yum is having problems, etc. There are ways to work around these. I frequently use “async” to set a timeout on tasks that could potentially hang, so that they at least fail without ruining the playbook for the rest of the hosts.
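For example, something along these lines (the values are arbitrary):

# Sketch: put a hard time limit on a task that might hang.
- name: Restart docker, but give up after 2 minutes
  systemd:
    name: docker
    state: restarted
  async: 120   # fail this host's task if it isn't done in 120 seconds
  poll: 10     # check on it every 10 seconds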
–Steve