Handling Windows Local Admin Passwords in Ansible?

Hi Everyone,

I wanted to hear your thoughts about some of the pain points I’ve seen around handling Windows local admin passwords in Ansible. My primary use case is with AWS, though many of these concepts would apply beyond just AWS.

Currently, Ansible (as far as I’m aware) supports only basic auth (i.e., username and password) and Kerberos auth for WinRM connections. The Kerberos support is pretty inflexible (as best I read it), and anyway, I still would need to bootstrap my Windows instance to join an AD domain to use Kerberos. The basic auth requires passing in a username and password in plain text to the connection plugin, via host variables. So, for AWS, what I end up doing is having a separate task which calls out to the AWS APIs, reads in the encrypted password data, decrypts it, and then sets the appropriate ansible_* host variables.

There are a few problems with this approach, including:

  1. Conceptually, I think this kind of discovery about how to connect to an instance should live in inventory gathering rather than in a separate task. However, inventory gathering doesn’t support lazily loading these passwords, so I could end up trying to decrypt passwords for instances whose passwords I don’t need for the given play(s). That could take a lot of time for large Windows fleets, and the fleet could include instances that never generated new passwords and instances for which I don’t have the private key to decrypt the passwords.
  2. The passwords aren’t particularly well protected as part of the hostvars. It makes me really, really uncomfortable to just dump a secret like this unencrypted into a hostvar.
  3. It doesn’t really support rotating the password, which I definitely want to do.

When I think about how Ansible handles Linux SSH connections, I like it a lot more. We just pass a pointer to the secret (i.e., the name of a file containing the SSH private key), and the connection plugin accesses the secret in order to establish the connection. Nothing outside of the low-level connection needs access to the secret, so why should it get access? It’s much more tightly controlled that way. I could imagine something similar in which the encrypted password data and the private key file (discovered as part of the inventory) are passed in to the WinRM connection plugin, and the connection plugin then decrypts the password and uses that to connect. This could just be another option for connecting using WinRM and be backwards compatible with what we have now, so it wouldn’t be too AWS-specific. This also solves (I think!) problems 1 and 2 that I identified.

The third problem is, I think, trickier, and I don’t have as well formed thoughts about how to solve it in Ansible. Looking 6-12 months ahead, I could imagine some combination of using HashiCorp’s Vault to fully manage the Windows local admin password (doesn’t support it yet, but their model fits pretty closely into what I would want to do, so assuming they add support for this) or just ditching WinRM entirely and moving to OpenSSH with private keys, once (if?) MSFT fully supports it. If I go with the former, I probably still have to solve the bootstrapping problem somehow (i.e., I have to get the new instance into Vault), and I wouldn’t want to couple Ansible too tightly to Vault anyway. I’m thinking something along the lines of a pluggable set of authentication providers/callbacks that I could pass into the WinRM connection module and have it call back to my code to, e.g., first try Vault and if that doesn’t work then fall back to decrypting the password from AWS. This is still all very vague and off in the future.
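
To make the "pluggable set of authentication providers" idea a little more concrete, here is a minimal sketch of a fallback chain. Everything in it is hypothetical: the names `resolve_password`, `vault_provider`, and `aws_provider` are made up for illustration, the two providers are stand-ins that do not talk to Vault or AWS, and nothing like this exists in the WinRM connection plugin today.

```python
from typing import Callable, Iterable, Optional

# A provider takes a hostname and returns a password, or None if it has
# nothing for that host. (Hypothetical interface, not an Ansible API.)
PasswordProvider = Callable[[str], Optional[str]]


def resolve_password(host: str, providers: Iterable[PasswordProvider]) -> str:
    """Try each provider in order and return the first password found."""
    for provider in providers:
        try:
            password = provider(host)
        except LookupError:
            # This provider could not answer; fall through to the next one.
            continue
        if password is not None:
            return password
    raise LookupError("no provider could supply a password for %s" % host)


def vault_provider(host: str) -> Optional[str]:
    """Stand-in for a Vault lookup; pretends the host is not enrolled yet."""
    return None


def aws_provider(host: str) -> Optional[str]:
    """Stand-in for fetching and decrypting the AWS password data."""
    return "decrypted-from-aws"
```

With these stand-ins, `resolve_password("win-1", [vault_provider, aws_provider])` falls through the first provider and returns the value from the second, which is the "first try Vault, then fall back to AWS" ordering described above.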

Anyway, I’d love to hear what you all think about either my specific concrete idea of passing in the encrypted password data and the private key file to the WinRM connection plugin or the more vague question about how to better handle password rotation.

Thanks!

–Joel

Good points Joel,
Just to clarify some of your questions regarding the Windows platform:

  1. SSH support is coming. We don’t know when, but I assume we’ll see a relatively stable release within the next 12 months (they’ve made great progress here in just 6 months, so my impression is that they are working hard on this).
  2. WinRM absolutely supports certificate-based authentication. I have no idea if it’s possible for pywinrm (which Ansible uses) to authenticate using certs, but on the Windows/WinRM side it is fully supported.

BTW; if you look at the dynamic inventory spec, it is fully supported to supply hostvars on a per-node basis, which would support lazily loading the creds from aws. It would probably require you to tweak the aws inventory script, but it’s probably not too hard.
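
For reference, a rough sketch of what per-node hostvars look like in a dynamic inventory script. The script answers `--list` with the groups and `--host <name>` with that host's variables; when the `--list` output has no `_meta` key, Ansible asks for each host's variables with a separate `--host` call. The group and host names, the `fetch_host_vars` helper, and the placeholder value are all made up here, and `ansible_encrypted_password` is the variable proposed in this thread rather than something Ansible understands today.

```python
import json
import sys

# Hypothetical static group data; a real script would query the AWS APIs.
INVENTORY = {"windows": {"hosts": ["win-1", "win-2"]}}


def fetch_host_vars(hostname):
    """Stand-in for a per-host lookup of connection details."""
    return {
        "ansible_connection": "winrm",
        # Placeholder only; a real script would put the password data here.
        "ansible_encrypted_password": "<password data for %s>" % hostname,
    }


def main(argv):
    """Return the JSON document Ansible expects for the given arguments."""
    if len(argv) == 2 and argv[1] == "--list":
        # No "_meta" key, so host variables are served through --host.
        return json.dumps(INVENTORY)
    if len(argv) == 3 and argv[1] == "--host":
        return json.dumps(fetch_host_vars(argv[2]))
    return json.dumps({})


if __name__ == "__main__":
    print(main(sys.argv))
```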

Hi all,

I finally got some time to implement a POC of what I think this might look like: https://github.com/joelthompson/ansible/commit/4a5b2647605a7340b18379390f673657ecf885d9

My desired semantics are in the git commit message, but repeating here:

If you define ansible_private_key_file and ansible_encrypted_password variables for the host, then this will attempt to decrypt the encrypted password with the private key specified and, if that succeeds, use that decrypted value as the password.
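
As an illustration only (this is not the code from the commit above), the decryption step can be sketched with nothing but the standard library. It assumes the password data is base64-encoded RSA ciphertext with PKCS#1 v1.5 padding, and it takes the private key as the raw integers `n` (modulus) and `d` (private exponent); parsing those out of the PEM key file is omitted. The function names are made up, and hand-rolled RSA like this is only good for showing the data flow. A real plugin should use a vetted crypto library.

```python
import base64


def pkcs1_v15_decrypt(ciphertext: bytes, n: int, d: int) -> bytes:
    """Textbook RSA decryption followed by PKCS#1 v1.5 unpadding."""
    k = (n.bit_length() + 7) // 8  # modulus length in bytes
    if len(ciphertext) != k:
        raise ValueError("ciphertext length does not match modulus size")
    c = int.from_bytes(ciphertext, "big")
    if c >= n:
        raise ValueError("ciphertext out of range")
    em = pow(c, d, n).to_bytes(k, "big")
    # Encoded message layout: 0x00 || 0x02 || PS (>= 8 nonzero bytes) || 0x00 || M
    sep = em.find(b"\x00", 2)
    if em[:2] != b"\x00\x02" or sep < 10:
        raise ValueError("decryption error")
    return em[sep + 1:]


def decrypt_password_data(password_data_b64: str, n: int, d: int) -> str:
    """Decode base64 password data and return the plaintext password."""
    raw = base64.b64decode(password_data_b64)
    return pkcs1_v15_decrypt(raw, n, d).decode("utf-8")
```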

ansible_encrypted_password could be populated by a dynamic inventory script, or it could be populated by a modified ec2_win_password module which, if no private key is passed in, will return the encrypted password instead of the unencrypted password. This also has the benefit that it can wait for the password to be populated, which is probably cleaner than trying to hack that in to an

Let me stress again that this is a POC implementation that still needs work (hence why this isn’t a real PR). I say this for a couple of reasons:

  1. Raw usage of RSA like this is an anti-pattern, and the usage of PKCS#1 v1.5 is suboptimal (as opposed to PKCS#1 v2 OAEP), but that’s just what AWS does, so we’re forced to do it for AWS. Perhaps we could also add a hostvar of something like ansible_encrypted_password_scheme, and when set to “aws” that would invoke this code, and it adds flexibility for future password encryption schemes?

  2. I’d like to think about protecting the decrypted password a little more strongly. I don’t like the fact that set_host_overrides persists the WinRM password in self._winrm_pass when the only time it is used is to construct self._winrm_kwargs, and _winrm_kwargs is really only used in _winrm_connect. It would help me sleep better at night if the decryption happened on demand in _winrm_connect and then the decrypted password were used only to construct the winrm.protocol.Protocol object. The flip side is that you incur a very slight performance hit for every new connection by doing the decryption then, and if something were to happen to the RSA private key in the middle of a play (seems unlikely to me), then that could cause Ansible to fail in the future. I think the added protection would be worth it, but what does everyone else think?

  3. I’m not broadly aware of other use cases, but perhaps the concept of an encrypted password could be promoted to be more of a first-class citizen and stored in the play context, so rather than calling host_vars.get(‘ansible_encrypted_password’) it would be self._play_context.password. Would that fit better? Or is the vision that this sort of thing should be handled by Ansible Vault and thus this is really only useful in the bootstrapping phase?
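
To illustrate points 1 and 2 together, here is a rough sketch of decrypting on demand at connect time, with the scheme looked up by name. This is not the POC code: `SCHEMES`, `register_scheme`, `LazyWinRMCredentials`, and the "demo" scheme (which just reverses a string so the example runs) are all hypothetical, and `open_session` stands in for whatever actually builds the WinRM protocol object.

```python
from typing import Callable, Dict

# A decryptor maps (encrypted_password, private_key_file) to the plaintext.
Decryptor = Callable[[str, str], str]
SCHEMES: Dict[str, Decryptor] = {}


def register_scheme(name: str) -> Callable[[Decryptor], Decryptor]:
    """Register a decryptor under a scheme name (e.g. "aws")."""
    def decorator(func: Decryptor) -> Decryptor:
        SCHEMES[name] = func
        return func
    return decorator


@register_scheme("demo")
def _demo_decrypt(encrypted_password: str, private_key_file: str) -> str:
    # Stand-in "decryption" so the sketch runs; NOT real cryptography.
    return encrypted_password[::-1]


class LazyWinRMCredentials:
    """Holds only the encrypted password and a pointer to the key file."""

    def __init__(self, username, encrypted_password, private_key_file, scheme):
        self.username = username
        self.encrypted_password = encrypted_password
        self.private_key_file = private_key_file
        self.scheme = scheme

    def connect(self, open_session):
        # Decrypt at the last moment; the plaintext is a local variable
        # and is never stored on the object.
        decrypt = SCHEMES[self.scheme]
        password = decrypt(self.encrypted_password, self.private_key_file)
        return open_session(self.username, password)
```

The trade-off is the one described in point 2: every new connection pays for a decryption, and the private key has to remain readable for the whole play.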

Thanks!

–Joel

I don’t have the rotating password issue or a particularly large group of VMs to manage, but I’ve gone with a vault-encrypted password file where, during inventory generation, ansible_ssh_pass gets set for each host/group. I tried Kerberos auth but it just didn’t work for me. I like your idea of certificate authentication while we wait for the SSH implementation to make it into Windows. I would suspect this is some time out (> 1 yr).

If you are concerned with a plain text password being available in memory (in the initialized inventory variables), I am guessing you are concerned with someone running your automation suite and sniffing out passwords. What is your policy on SSH key auth? How does the PPK pair get placed on your automation machine? Is there a passphrase associated with your private key? My point is: if you think about how you’re handling SSH auth from a key perspective, is having a variable in memory that concerning?

–Marc

Hi Marc,

Thanks for the feedback.

The short answer: a combination of defense in depth and least privilege.

Long answer:

I worry that storing something as sensitive as a password in hostvars is just asking for trouble. More than once, I’ve done something like “debug: var=hostvars” in order to just look at all the hostvars in a play, and that will expose the password. That can seem innocuous enough, until all your passwords get dumped into a log file because somebody forgot about a debug: var=hostvars when promoting playbooks to production (or thought it would be innocuous).

So, following the principle of least privilege, the only thing that really needs the plaintext password is the WinRM connection plugin, so why leave it sitting somewhere where it’s easily exposed? I agree that this isn’t perfect protection, but I think it’s much better than what exists today. You’d have to explicitly go out of your way to decrypt the password in your playbooks, which means that it’s both harder to do the wrong thing (expose the password) and more obvious to people who are reviewing the code.

I also want to have a more dynamic Windows environment, where I’m constantly spinning Windows machines up and down, so to use Ansible Vault fully, I would need my inventory to wait for my Windows hosts to actually output the password data, decrypt it, then store it in Vault. That just feels a little gross on a number of levels (and then it would also leave the password relatively unprotected), while this solution feels a lot more natural to me.

You’re absolutely right about the need to have a good story around secure generation and distribution of RSA private keys, protecting those from getting dumped out of memory, etc., and I’m not intending to replace those controls but rather to augment them.

Thanks again,

–Joel

Yep, pluggable secret handling is likely the way we’d tackle this. We’ve talked about a few different implementations, but it’s probably not going to happen in the next couple of releases. We’d probably also do that in conjunction with display/disclosure masking (eg, “known” secrets like ansible_password might not be disclosable to anything but a connection plugin, whereas other secrets sourced from the secret mechanism might be masked from bulk display, but could still be used in templates/tasks/etc (eg, DB passwords)). Since our support obligations for previous releases are becoming longer, we probably wouldn’t ship an interim setup like you’ve proposed (ie, connection/use-case specific that’s not centrally implemented)…
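
As a rough illustration of the display-masking idea (purely a sketch, not how Ansible does or will implement it), bulk display could redact a set of "known" secret keys while leaving the underlying hostvars untouched. The function name, the key set, and the placeholder string are all assumptions.

```python
# Keys treated as connection-only secrets in this sketch (assumed set).
CONNECTION_ONLY_KEYS = frozenset({"ansible_password", "ansible_ssh_pass"})


def mask_for_display(hostvars, secret_keys=CONNECTION_ONLY_KEYS,
                     placeholder="********"):
    """Return a copy of hostvars that is safe to print in bulk."""
    masked = {}
    for host, variables in hostvars.items():
        masked[host] = {
            key: (placeholder if key in secret_keys else value)
            for key, value in variables.items()
        }
    return masked
```

Something like a debug dump would go through `mask_for_display`, while the connection plugin would read the original, unmasked dictionary.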

In the meantime, though, I’d suggest the following:

  • Just use your patched connection plugin (call it winrm_enc.py or something, place it alongside your content in a connection_plugins/ dir ala roles/, library/, etc, and adjust your inventory ansible_connection var to use the new name). You might occasionally need to update it when you update Ansible, but it should just work in most cases.

  • Adjust the ec2.py inventory script to bring down the encrypted Windows password as a fact (so your custom connection plugin can access the facts directly). So long as it doesn’t significantly delay the inventory process (or make it configurable default-off if it does), we’d probably accept a PR for that.

-Matt