Posted this on Stack Overflow with no response so far, so I’m hoping someone can answer this in the group:
http://stackoverflow.com/questions/19642167/ansible-test-if-ssh-login-possible-without-fatal-error
I have a setup playbook that takes a freshly installed Linux instance, logs in as the default user (we’ll call it user1), creates another user (we’ll call it user2), then disables user1. Because user1 can only access the instance before this set of tasks is executed, the tasks live in a special playbook we have to remember to run on new instances. After that, all the common tasks are run by user2, because user1 no longer exists.
I want to combine the setup and common playbooks so we don’t have to run the setup playbook manually anymore. To make the original setup tasks conditional, I tried to create a task that determines which user exists on the instance by attempting to log in via SSH as user1. The problem is that if I try the SSH login for either user, Ansible exits with a FATAL error because it can’t log in: user2 doesn’t exist yet on new instances, and user1 has been disabled after the setup playbook executes. ignore_errors: yes has no effect in this case.
I believe testing the login via SSH is the only way to determine externally what condition the instance is in. Is there a way to test SSH logins without getting a FATAL error to then execute tasks conditionally based on the results?
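For what it’s worth, one direction I’ve considered is probing the login from the control machine itself, so a failed login is just a failed local command rather than a fatal connection error. This is a rough, untested sketch (the play structure, variable names, and the BatchMode probe are my own, not an established Ansible feature):

```yaml
# Hypothetical sketch: probe the user1 login from the control machine.
# Because the task runs locally, ignore_errors does take effect here.
- hosts: all
  gather_facts: false
  tasks:
    - name: check whether user1 can still log in
      local_action: shell ssh -o BatchMode=yes -o ConnectTimeout=5 user1@{{ inventory_hostname }} true
      ignore_errors: yes
      register: user1_login
    # user1_login.rc == 0 would mean user1 still works,
    # so the original setup tasks should run.
```

I haven’t verified this end to end, but it might sidestep the fatal-error problem entirely.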
Quick Public Service Announcement — There’s a much larger community of Ansible users here, so I’d prefer if you’re going to post here, just post here – and then we don’t have to bother with duplicating the response on Stack Overflow. Our support guys read Stack Overflow some, but everybody is here. Stack Overflow has some policies around discussion and closing topics that I also don’t care for, but that’s their right, obviously. Back to the regularly scheduled program…
There isn’t presently a way to try one user and then another. I think it would be a useful feature, but it’s not something I’m going to implement any time soon, as I can see it getting out of hand…
try this user with this password
then this user with that password
etc
etc
I’d much rather people know what machines are using what keys and passwords rather than doing guess and check.
Don’t get me wrong, I can see it being useful… but this seems to be a very good reason to get to common SSH keys.
Fair enough on the Stack Overflow issue. Some groups prefer usage questions on SO and dev questions in the google group, but I’ve been very happy with the quality of discussion here, so that’s fine with me. My intent was not to duplicate, only to try and post to the correct forum first. I’ll follow your suggestion going forward.
Regarding the SSH keys, I’m not necessarily advocating any features. But as a security issue, we routinely disable the default users that some OS distros ship with in favor of less discoverable usernames with completely different SSH keys. Let me also make it very clear that we realize this is not maximum security, just another deterrent. So the default user only exists right after instantiation, then it’s disabled. I just don’t know of a good way using ansible to detect whether that switch has happened, so we end up running one playbook on new instances, then our master playbook from that point forward. Perhaps we can come up with a workaround if fatal errors can’t be avoided.
Thanks for the response.
We have ansible-devel for development topics.
This is the user list actually.
You can set ansible_ssh_user: foo as a host_vars/ variable to specify a different username to try for the host if you want.
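For instance, a one-line host_vars file would do it (the hostname here is just an example):

```yaml
# host_vars/web01.example.com -- hypothetical host name
ansible_ssh_user: foo
```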
Thanks. I believe ansible_ssh_user would still fail. The same host will only allow login to one or the other username depending on the state. So one will always fail. I’m considering a different, very limited login shell instead of disabling the initial user or simply leaving the burden with the user to run the initial playbook.
Ideally it would be nice to have a switch like “ignore_errors” for login failures that doesn’t produce a fatal error and allows execution to continue. I could then change users based on that result and try again with a different user. I’m not pushing for that feature, but there is a use case for disabling a login for security and needing the ability to detect that and continue with another one. Perhaps it’s a niche and we just need to remember to run the initial playbook first. Or perhaps my process could use improvement - I’m just accustomed to locking down boxes on install. Open to ideas on process.
you might want your first play to be something like this:
- hosts: all
  gather_facts: False
  remote_user: normal_one
  tasks:
    - setup:
      ignore_errors: true
      register: normalworks

    - include: bootstrap.yml
      remote_user: bootstrap_user
      when: normalworks is not defined

# bootstrap should call setup again to make sure you have host facts
.... continue as normal
Brian,
I know it took me a long while, but I finally got to working on this and your suggestion was extremely helpful.
I implemented a variation that works really well for me.
In this example, user “original” is the one the OS ships with and that we want to disable.
User “newuser” is the one we want to enable.
# Secure the server and disable user original if this is a new instance
- hosts: all:!localhost
  gather_facts: false
  user: original
  sudo: yes
  tasks:
    - name: Attempt basic command as user original to determine setup status. Failure means the machine has been secured.
      raw: hostname
      ignore_errors: yes
      register: setup_status

    - include: roles/common/tasks/user-newuser.yml
      when: "'Account disabled.' not in setup_status.stdout"
  handlers:
    - include: roles/common/handlers/main.yml
# Continue with standard setup when user newuser is enabled and original disabled
- hosts: all:!localhost
  user: newuser
  sudo: yes
  roles:
    - common
  handlers:
    - include: roles/common/handlers/main.yml