I am using AWX Tower to patch a Linux VM, and we use HashiCorp Vault signed SSH to connect to all our Linux VMs. Gathering facts works, but my task performs a pre-patch reboot, and after the VM restarts the connection is lost with the error “timeout waiting for last boot time”.
Can anyone guide me on this, and tell me whether I need any configuration on the AWX Tower side?
Unless HashiCorp Vault is rotating the authorized keys on every reboot, there’s no issue with that or with AWX.
The reboot module looks at the “last boot time” timestamp on the remote host and polls it periodically until the timestamp changes. There are a handful of scenarios where Ansible will time out waiting for this value to change.
1. The device is something like a Raspberry Pi 3B, where this timestamp exists but never changes. I forget the technical reason, but I mention the RPi 3B specifically because I have run into this on one.
1A. To work around this, send a shell reboot command to the host, then have Ansible pause and wait for the connection to come back.
2. The remote host is hung for some reason and cannot proceed with the reboot. This warrants a root-cause analysis to determine why the host stalled, followed by a fix for the underlying issue.
2A. This might be addressed simply with reboot -f, but that may lead to data loss, depending on the root issue.
3. The host really needs more than 1500 seconds to reboot. Depending on factors like hardware specs, system load, pending system patches, and everything that has to happen for a safe shutdown to complete, it could very well take a long time.
3A. Generally your only option may be to extend the timeout. However, if the issue is how long a particular service takes to stop and start, it might be easier to stop and start that service manually before and after the reboot.
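As a rough illustration of workarounds 1A and 3A, here is a minimal playbook sketch. The inventory group `linux_vms`, the `use_reboot_module` toggle, and all delay and timeout values are assumptions made up for the example, not values taken from this thread, so adjust them to your environment.

```yaml
---
# Sketch only: the host group, toggle variable, delays, and timeouts are
# illustrative assumptions, not values from this thread.
- name: Pre-patch reboot with a longer timeout or a manual fallback
  hosts: linux_vms            # hypothetical inventory group
  become: true
  gather_facts: false
  vars:
    # Hypothetical toggle: set to false for hosts whose boot time never changes.
    use_reboot_module: true

  tasks:
    # 3A: keep the reboot module, but give slow hosts more time.
    - name: Reboot and wait up to one hour for the boot time to change
      ansible.builtin.reboot:
        reboot_timeout: 3600
        msg: "Pre-patch reboot initiated by Ansible"
      when: use_reboot_module | bool

    # 1A: skip the boot-time check and reboot through a plain command.
    # Scheduling the reboot one minute out lets this task return cleanly
    # before the SSH connection drops.
    - name: Schedule a reboot one minute from now
      ansible.builtin.command: shutdown -r +1 "Pre-patch reboot initiated by Ansible"
      when: not (use_reboot_module | bool)

    # Start checking only after the host should already be going down,
    # then retry every 10 seconds until SSH answers again.
    - name: Wait for SSH to come back after the scheduled reboot
      ansible.builtin.wait_for_connection:
        delay: 120
        sleep: 10
        timeout: 1500
      when: not (use_reboot_module | bool)
```

Note that `wait_for_connection` only confirms that the host is reachable again. It does not verify that a reboot actually happened, so if the shutdown takes longer than the `delay`, the task could succeed against the old boot. Pick a `delay` comfortably longer than your hosts need to go down.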
@thirumalairaja sorry to jump in, but I’m trying to accomplish (I believe) exactly what you already have: using HashiCorp Vault to sign a public key so I can SSH into the target machines with ease.
Can you please share how you set this up? I always get ‘Invalid credentials’, as if AWX is not taking my signed certificate into account.
I know my Vault configuration works, because I went through the process manually and was able to SSH into the target machine with no problems.
I’m following this old doc, and perhaps that’s the problem: