Configuring Ansible to run playbooks through a bastion host on AWS/EC2

Hello,

I am building out an environment in AWS using Ansible and would like to configure all of my hosts by running through a single bastion host, which has port 22 open.
Laptop → AWS Bastion → AWS private network instances

Is there a good example of how to configure this kind of proxy setup?

Thank You in advance,

Jeff

I've had musings on that too. Currently, I think you'd have to manually configure $HOME/.ssh/config with a ProxyCommand.

However, I just had a thought: what if there were an ansible_ssh_proxy=$other_inventory_host feature? When set, Ansible would auto-add the -o ProxyCommand="$something".
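
To make that concrete (purely hypothetical syntax, none of this exists today), an inventory entry might look like:

`
[private]
web01 ansible_ssh_host=10.0.0.10 ansible_ssh_proxy=bastion
`

and Ansible would then invoke ssh for web01 with something like -o ProxyCommand="ssh -W %h:%p bastion" added automatically.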

This is just some random brainstorm ramblings.

I just looked over ssh.py and ssh_old.py; if I were to actually sit down and do this, I would refactor those two classes into a common base class, then introduce a third version that supported ProxyCommand.

ps: I notice something odd in the two files above:

I use bastions for nearly all of my communication with servers. It is all done via my ~/.ssh/config file. Something like:

Host bastion
User myuser
HostName bastion.example.org
ProxyCommand none
IdentityFile ~/.ssh/id_rsa
BatchMode yes
PasswordAuthentication no

Host *
ServerAliveInterval 60
TCPKeepAlive yes
ProxyCommand ssh -qaY bastion 'nc -w 14400 %h %p'
ControlMaster auto
ControlPath ~/.ssh/mux-%r@%h:%p
ControlPersist 8h

In ~/.ansible.cfg I then have

[ssh_connection]
ssh_args = -o ControlPersist=15m -F ~/.ssh/config
scp_if_ssh = True
control_path = ~/.ssh/mux-%%r@%%h:%%p

Nothing else required. I execute ansible and all my connections go through the bastion. Your “Host *” might benefit from being more targeted. In any case, I also have to use these same configs for normal SSH access, so for me it makes sense to just have them in my ssh config.

I really don’t see a need to modify anything within Ansible to do this.

When Ansible is configured to auto-create a cluster of brand-new virtual machines, all connected to a brand-new auto-generated VLAN and all behind a single front-end router (again, also a virtual machine), a series of test cases is run on this isolated universe of machines, and then the entire virtualized cluster is thrown away. The machines in the isolated cluster will have the exact same addresses as *real* internet servers, so it's not possible to connect to them directly.

You'd have to auto-generate the config file for ssh in this case.
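
As a rough sketch of that (the template name and destination path here are invented for illustration), the cluster-setup play could render the ssh config as soon as the nodes exist:

`
- hosts: localhost
  gather_facts: no
  tasks:
    # render Host/ProxyCommand stanzas for every node in the freshly built cluster
    - name: generate a per-cluster ssh config
      template:
        src: cluster_ssh_config.j2
        dest: ./ssh_config.cluster
`

Subsequent ansible-playbook runs can then pick that file up via -F in their ssh_args.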

ps: Someone kick me later to publish my OpenNebula dynamic inventory script and OpenNebula task (support for template instantiation and VM deletion). I'd need to rewrite the former from Perl to Python, and the latter from shell to Python.

A different approach is to set up an OpenVPN server on your bastion; your laptop then uses the bastion as its network gateway, so your connections appear to originate from the bastion server.

I’d love a feature that let you set ansible_ssh_proxy in this way. I’d be able to set it from my OpenStack inventory module.

I think I rejected this in the past, when we were young, saying you could set this in ~/.ssh/config (as you can).

I’m open to it now though, for exactly those reasons.

Would need to be implemented in ssh.py and probably raise warnings if found in paramiko.py.

Code submissions would be great; otherwise, file a feature idea on GitHub.

I’ve been hacking around this for my AWS VPCs by having my VPC setup playbook drop an ansible.cfg in the playbook dir with the appropriate ProxyCommand ssh_args set to use the jump box. When it gets to provisioning it fails (since it can’t re-read ansible.cfg), so we re-run the VPC setup and provision playbooks and everything works through the jump box as expected. Hacky, but it’s the cleanest thing I could come up with that works in a fully dynamic VPC env (where each dev can stand up/tear down their own multiple times a day).
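
Something along these lines is the kind of ansible.cfg being dropped (the address is just a placeholder for the jump box’s public IP; the exact options may differ):

`
[ssh_connection]
ssh_args = -o ProxyCommand="ssh -W %h:%p ec2-user@203.0.113.10" -o ControlPersist=15m
scp_if_ssh = True
`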

If I were going to take this to the next level, I’d probably add ansible_ssh_proxy_host, _user, and _port vars and ssh.py support to generate the right ProxyCommand config. That part looks pretty straightforward, and would probably solve a lot of folks’ issues (since you could then use set_fact to configure the jump box on the fly).
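
As a sketch of what that could look like (the ansible_ssh_proxy_* variables are the proposal, not something Ansible supports today, and jump_box_public_ip is an invented fact name):

`
- name: point private hosts at the jump box on the fly (proposed vars, illustrative only)
  set_fact:
    ansible_ssh_proxy_host: "{{ jump_box_public_ip }}"
    ansible_ssh_proxy_user: ec2-user
    ansible_ssh_proxy_port: 22
`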

The part that seems tougher to get a general-purpose solution for is getting ec2.py/ec2_vpc doing something sane for automatic proxy support on private VPC hosts. I think the cleanest approach would probably be to add first-class support for jump box provisioning to ec2_vpc (as has been discussed for NAT support), at which point ec2.py could have a mode to set the ansible_ssh_proxy_X vars to the jump box for hosts without a public IP. I think that would solve 99% of the issues people have with jump box/bastion host access for dynamic VPC environments.

Thoughts? I can just push forward and kick out a PR, but if folks generally disagree with the approach, I’d rather spend my time elsewhere.

-Matt

Based on our own experience doing this (fully automated “single click” AWS VPC deployment), the only gap in the automation is dynamic assignment of the bastion host during playbook execution. I would imagine the parameters could be exposed as variables that playbook developers can assign via set_fact.

I’m a little unclear about your last paragraph though:

The part that seems tougher to get a general-purpose solution for is getting ec2.py/ec2_vpc doing something sane for automatic proxy support on private VPC hosts. I think the cleanest approach would probably be to add first-class support for jump box provisioning to ec2_vpc (as has been discussed for NAT support), at which point ec2.py could have a mode to set the ansible_ssh_proxy_X vars to the jump box for hosts without a public IP. I think that would solve 99% of the issues people have with jump box/bastion host access for dynamic VPC environments.

I’m not sure that you want to add more to ec2_vpc (it’s pretty busy already). Wouldn’t having the dynamic bastion host variables be sufficient? You run ec2_vpc, set up instances, routing and security groups, then you call set_fact and set the bastion host to an instance that you have in your registered variables. The selection criteria for which host you choose will obviously vary, but you could, for example, base it on a host group assignment.
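
Something like this is what I have in mind (a sketch only; the group names are placeholders, and ec2_ip_address assumes the public-IP fact that ec2.py exposes):

`
- hosts: private_subnet_hosts
  gather_facts: no
  tasks:
    - name: route SSH for each private host through the chosen bastion (proposed var)
      set_fact:
        ansible_ssh_proxy_host: "{{ hostvars[groups['bastion'][0]]['ec2_ip_address'] }}"
`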

Hi Guys,

I’m new to the Ansible world, and consequently, late to this discussion.

Here’s what I’ve done to address the bastion host issue. I created a small module called “cloudformation_extract_ssh_config”. It requires three parameters:

  • A cloudformation stack name
  • An AWS region name
  • A substring of a DNS name to identify the bastion server (probably gonna change that to an IP address)

Given those values, it walks through the stacks, finds one that is active, then queries each machine therein. After a little munging, it spits out an SSH config with the proxies pre-configured.

Ex.

`
host test-bastion-01
  batchmode yes
  passwordauthentication no
  hostname 111.111.111.111
  user ubuntu
  proxycommand none

host test-private-ip-01
  hostname 10.0.0.10
  user ubuntu
  proxycommand ssh -qaYy test-bastion-01 'nc -w 14400 %h %p'

host test-private-ip-02
  hostname 10.0.1.10
  user ubuntu
  proxycommand ssh -qaYy test-bastion-01 'nc -w 14400 %h %p'
`

The downside is I now have to figure out an elegant way to dynamically associate the SSH config file with the playbook without having to do complicated things with extra vars or hard-coding values.
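
One low-tech way to wire the generated file in (file names invented here) is to pass it through ANSIBLE_SSH_ARGS on the command line, e.g.:

`
ANSIBLE_SSH_ARGS="-F ./generated_ssh_config" ansible-playbook -i inventory site.yml
`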

$0.02
-T

Thinking on this a bit more … it seems there are two use cases here: how to dynamically change your SSH control connection during playbook execution, and how to refer to the new bastion host on subsequent calls to ansible-playbook. If you could set SSH arguments per play, then I think both of these cases are addressed:

- hosts: all
  connection: ssh
  connection_args:
    proxy_host: "{{ groups.bastion[0] }}"
    proxy_port: 22
    user: johndoe

The ‘connection_args’ feature implies you no longer require SSH config files (but could optionally use them if preferred). It could be used dynamically within a playbook to override your defaults that come from “ANSIBLE_SSH_ARGS”, for example.

I can see an argument for just specifying raw SSH command line arguments as well, something like:

- hosts: all
  connection: ssh
  connection_args:
    command_line: "-o ProxyCommand='ssh -W %h:%p johndoe@{{ groups.bastion[0] }}'"

Another use case to consider (that I myself have come up against) is configuring the bastion per-host from a dynamic inventory. The servers need to use a different bastion depending on their role and location.

Good point – so configuration per play might be inflexible. I guess the better choice is a variable that can be modified per host/group/play. Call it “ssh_args” and give it the same meaning as ANSIBLE_SSH_ARGS. Assign it per host, group or play where required and use the “-o” option to pass in ProxyCommand parameters.

This seems pretty clean, although I’m not sure what the convention is for exposing new “global” variable state in Ansible. :)
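
For instance (the ssh_args variable here is the proposal above, not an existing Ansible variable):

`
# group_vars/private_vpc.yml
ssh_args: '-o ProxyCommand="ssh -W %h:%p ec2-user@bastion.example.org"'
`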

How about implementing “ansible_ssh_proxy” to match “ansible_ssh_user” and “ansible_ssh_host”?

-T

I think that would be my preference. I know in the past there’s been some pushback against implementing more ansible_ssh_* parameters, because that’s a long rabbit hole considering the number of SSH configuration parameters that exist. I agree with this point, so if adding one more (ansible_ssh_proxy) is too much, then maybe a last-and-final ansible_ssh_config to point to a config file on a per-host/play/task level. Then anything you want can be put into that config file and Ansible itself wouldn’t ever have to add any other ansible_ssh_* parameters.

Either way would solve the problem, although the latter is more complicated for users to implement (the dynamic inventory would probably need to dynamically generate SSH config files too).
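
In inventory terms, the latter might look like this (hypothetical variable; host names and file paths invented):

`
[eu_west_private]
app01 ansible_ssh_host=10.10.0.5 ansible_ssh_config=./ssh_config.eu-west

[us_east_private]
app02 ansible_ssh_host=10.20.0.5 ansible_ssh_config=./ssh_config.us-east
`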

It sounds like everyone’s in agreement that the first step needs to be the ability to dynamically set the SSH proxy (and I’d argue that we need proxy_host, proxy_port, and proxy_user). I’m going to start there, since anything else would need to build on that anyway.

If there were a more generic “ansible_ssh_args” parameter, it could be used however the user sees fit. It’s a more flexible approach because it assumes less about how the parameter might be used or implemented in SSH. It just means a little more overhead for the user to know how to construct that string correctly.

I suppose I could do both. I personally prefer the explicit vars; I find it makes the playbooks more readable and maintainable than deciphering ssh_args line noise. I can see where it’d be nice to have the “escape hatch” to do unsupported things, though, too. The trick is getting everything to behave if things get set with the explicit vars and also in ssh_args; there’s a bit of code in ssh.py to deal with that already (ssh_args takes precedence, IIRC), but there would probably need to be a lot more to make it really robust.

Here is a possible compromise. Another way to use a proxy is via this kind of ssh construct:

`
ssh -o "ForwardAgent=yes" -tt foo@1.2.3.4 ssh -tt bar@10.0.0.10
`

It accomplishes the same thing. Could that be passed somehow in Ansible’s current code?

-T