ansible-pull checks out your entire project repository, then runs whichever playbook you tell it to. That repo is basically a map to your entire infrastructure.
So, how do you ensure a compromised server doesn’t reveal all that information to an attacker? (With the assumption that the attacker has root access, and that a single rooted server doesn’t mean your entire infrastructure is rooted.)
ansible-pull can purge the repo after it runs, but that doesn’t stop an attacker from running ansible-pull with that option turned off to get a copy of the whole repo, or from simply reading the repo the next time ansible-pull runs.
If you use ansible-vault, then your vault password is either in the cron job, or in a file on the server that the attacker has access to, and knows the location of.
So far, all I can think of to mitigate these issues is a repo per server and a vault password per repo… which kinda destroys most of the reason people use configuration management.
Am I just not thinking of it in the right way, or maybe misunderstanding how something works?
Hello David,
I am using push right now exclusively and thought about ansible-pull as well.
My idea was to tag all tasks that need passwords/secret keys and run them only in push mode. Most (of my) tasks do not need secrets.
Regards
Mirko
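A minimal sketch of the tagging idea, assuming the secret-bearing tasks carry a tag called "secrets" (a hypothetical name, as are the repo URL and playbook names). It only builds the two command lines: the pull run skips the tagged tasks, and the push run selects only them. Skipping a task does not remove secret data from the repo, so the secrets themselves still need to live somewhere the pulling host cannot read.

```python
import shlex

SECRET_TAG = "secrets"  # hypothetical tag name, not from the thread


def pull_command(repo_url, playbook, secret_tag=SECRET_TAG):
    """Command a managed node would run: everything except tagged tasks."""
    return ["ansible-pull", "-U", repo_url, "--skip-tags", secret_tag, playbook]


def push_command(inventory, playbook, secret_tag=SECRET_TAG):
    """Command the control machine would run: only the tagged tasks."""
    return ["ansible-playbook", "-i", inventory, "--tags", secret_tag, playbook]


if __name__ == "__main__":
    # Example values are placeholders for illustration only.
    print(shlex.join(pull_command("https://git.example.com/infra.git", "local.yml")))
    print(shlex.join(push_command("hosts", "site.yml")))
```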
So, two repos? One with passwords in it, another without?
So there are several approaches to this:
- making a repo per host, which would isolate the compromisable data,
this is a LOT of work and requires workarounds for shared things
(roles, includes, etc) but this would work right now.
- use sparse checkouts; this is not currently supported by the git
module and requires newer versions of git. It still copies all the
data, it just does not make it available in the working directory (this
might still change at the git level).
- use git archive's prefix option, also not currently supported by the
git module, but this would provide the best protection against leaking
data unnecessarily to each target machine.
All require that the repo is structured in such a way that each host
(or similar group of hosts) can have access to only their subset of
data and yet still get the shared resources they need (symlinks?).
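As a rough illustration of the git archive route, here is a sketch that builds a `git archive` command limited to the paths one host needs. The repo layout (`roles/common`, `site.yml`, `host_vars/<host>.yml`), the remote URL and the prefix are all assumptions made for the example. The remote also has to allow `git upload-archive`, and the path list only narrows what a well-behaved client asks for.

```python
def paths_for_host(host, shared=("roles/common", "site.yml")):
    """Shared paths plus this host's own vars file (assumed layout)."""
    return [*shared, f"host_vars/{host}.yml"]


def archive_command(remote, paths, ref="HEAD", prefix="ansible/"):
    """Build a `git archive` call that exports only the given paths."""
    if not paths:
        # With no paths, git archive would export the entire tree.
        raise ValueError("refusing to archive the whole repository")
    return [
        "git", "archive",
        f"--remote={remote}",
        f"--prefix={prefix}",  # directory name prepended inside the tarball
        ref,
        *paths,
    ]


if __name__ == "__main__":
    cmd = archive_command("git@git.example.com:infra.git", paths_for_host("web1"))
    print(" ".join(cmd))
```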
A different approach is to vault all sensitive data with different
passwords for the different host 'security zones', each ansible-pull
will only be able to decrypt the data relevant to themselves.
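A small sketch of the per-zone vault idea, assuming two made-up zones ("dmz" and "internal"), a made-up host-to-zone mapping, and one password file per zone under a hypothetical `/etc/ansible/vault` directory. It uses the `--vault-id label@source` form and shows which vaulted files a given host could decrypt if it only holds its own zone's password.

```python
# Hypothetical mapping of hosts to security zones.
HOST_ZONE = {"web1": "dmz", "web2": "dmz", "db1": "internal"}

# Hypothetical vaulted files, each encrypted with its zone's password.
FILE_ZONE = {
    "group_vars/dmz/vault.yml": "dmz",
    "group_vars/internal/vault.yml": "internal",
}


def vault_id_args(host, password_dir="/etc/ansible/vault"):
    """Arguments a host would add so it only supplies its own zone's password."""
    zone = HOST_ZONE[host]
    return ["--vault-id", f"{zone}@{password_dir}/{zone}.pass"]


def decryptable_by(host):
    """Vaulted files readable with the single password this host holds."""
    zone = HOST_ZONE[host]
    return sorted(path for path, z in FILE_ZONE.items() if z == zone)


if __name__ == "__main__":
    print(vault_id_args("web1"))
    print(decryptable_by("web1"))
```

In this model a rooted "dmz" host still sees the encrypted "internal" file in the checkout, but holds no password for it.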
Would a sparse checkout or using git-archive prevent an attacker from simply initiating a pull of the repo without those options, so that they can get everything?
Hello David,
yes, sorry. I have all my secrets in a different directory/repository, my playbooks and roles are completely clean of secrets.
You may of course deduce the general structure and machine names and maybe even the topology.
Regards
Mirko
you might need to play with git hooks to get this kind of fine-grained
permission.
Hi David,
We had the same issue, where we didn’t like our entire git repository exposed on all servers while using ansible-pull.
We have solved our problem differently. Instead of letting ansible-pull do a git checkout, we have a small shell script on each host, which uses rsync to pull in the playbooks, roles, group_vars and host_vars.
We try not to keep any private information in the playbooks and roles. All private information is in group_vars and host_vars. And then on the server side, we use rsync filters to only send those group_vars and host_vars files to a host that it needs. So a host should never have access to private data of another host.
Perhaps you should consider such an approach. It’s more work, of course, but we like our idea, and it works well for us.
Regards,
Anand
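The per-host rsync filtering described above could look something like the following sketch. It is not the actual script from this setup: the `.yml` file naming, the group names and the function name are assumptions. It generates rsync filter rules in which the host's own group_vars and host_vars entries are included, everything else in those two directories is excluded, and playbooks and roles pass through by default.

```python
def filter_rules(host, groups):
    """Return rsync filter rules exposing only this host's variable files."""
    rules = []
    # Ansible's implicit "all" group plus the host's own groups.
    for group in ["all", *groups]:
        rules.append(f"+ /group_vars/{group}.yml")  # single-file form
        rules.append(f"+ /group_vars/{group}/***")  # directory form
    rules.append("- /group_vars/*")  # hide every other group's vars
    rules.append(f"+ /host_vars/{host}.yml")
    rules.append(f"+ /host_vars/{host}/***")
    rules.append("- /host_vars/*")  # hide every other host's vars
    return rules


if __name__ == "__main__":
    # Hypothetical host and group names.
    print("\n".join(filter_rules("web1", ["webservers"])))
```

rsync applies the first matching rule, so each include has to come before the catch-all exclude for its directory. The rules only protect anything if they are enforced on the server side rather than chosen by the client.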
Hmm… So, use rsync to push the scripts to each server, making sure to only send the data relevant to each specific server. Right?
Um, isn’t that basically what Ansible does by default? It pulls together the relevant information for each server, then transfers the scripts to that server, and runs them.
What am I missing?
No no, we don’t push scripts. Each managed node pulls playbooks and related bits over SSH. The rsync server runs a “restricted rsync” that will allow the managed node to fetch only some files. This way, a compromised node should never be able to get anything it’s not supposed to.
Anand
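One way a "restricted rsync" over SSH is sometimes set up is a forced command in `authorized_keys`. The sketch below assumes the `rrsync` helper script that ships with rsync, a per-host export directory under a hypothetical `/srv/ansible-export`, and an OpenSSH version that supports the `restrict` option. Whether the setup described above works this way is not stated in the thread.

```python
def authorized_keys_line(host, public_key,
                         export_root="/srv/ansible-export",
                         rrsync="/usr/bin/rrsync"):
    """Build an authorized_keys entry pinning a node's key to read-only rsync."""
    # Reject names that could escape the export root.
    if not host or "/" in host or host in (".", ".."):
        raise ValueError(f"unsafe host name: {host!r}")
    # -ro makes rrsync refuse any transfer that writes to the server.
    command = f"{rrsync} -ro {export_root}/{host}"
    # "restrict" disables pty allocation, forwarding and similar features.
    return f'command="{command}",restrict {public_key}'


if __name__ == "__main__":
    # The key below is a placeholder, not a real public key.
    print(authorized_keys_line("web1", "ssh-ed25519 AAAAEXAMPLE web1"))
```

With an entry like this, whatever command the node asks for, the server only runs the read-only rsync confined to that node's directory.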
Once you have this kind of restrictive environment, you might want to
look into Tower, it will pull/push provision servers on request and
keeps things pretty tight and secure. It also has audit trails and
reports which tend to be needed when security is at this level.