Reading the SSH config file discussion (devel branch)

Hi...

So we have some new code on the devel branch that reads the SSH config
file to augment data coming from the Ansible host file or the
playbook config. This is pretty nice stuff, but I think we should
discuss the implementation and how far we should go with it. As
presently written, the values from the SSH config file override
whatever comes in from the playbook or the inventory file. I am not
sure I agree with that, in which case we probably need to make some
tweaks. I would like to bring this up so everybody is clear on what
the plan is and how much we should or should not be reading from this
file.

Replacing the inventory hostname with a different Hostname from the SSH
config doesn't seem useful to me. Was this intended to support
bastion hosts? If so, I would think more changes would be required,
right? Maybe I'm reading this wrong.

        if 'hostname' in credentials:
            self.host = credentials['hostname']

Always using the port if it is defined there seems great.

        if 'port' in credentials:
            self.port = int(credentials['port'])

Using the user? This seems wrong to me as I might want to log in and
do things as different user accounts in the course
of a deployment. I think the proper place to set the user is in the
playbook play.

        if 'user' in credentials:
            user = credentials['user']

Keypair seems fine to me.

        if 'identityfile' in credentials:
            keypair = os.path.expanduser(credentials['identityfile'])

Thoughts?

PROPOSAL -- do not pay attention to the user from this file, but keep
other parts of the implementation as is. Also, figure out what our
story is about how we nicely do configuration for bastion hosts, or do
we just tell folks to run Ansible from the bastion host? Maybe the
latter unless someone wants to try to make the former work?
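
For illustration, the proposal could be sketched roughly as below. This is not the code on devel: the helper name `apply_ssh_config` and its return shape are made up, and `credentials` is assumed to be a plain dict with lower-case keys and string values, as in the snippets quoted above.

```python
import os


def apply_ssh_config(credentials, host, port, user, keypair=None):
    """Overlay SSH config values onto connection settings, ignoring 'user'.

    Hypothetical helper: 'credentials' is assumed to be a dict with
    lower-case keys such as 'hostname', 'port' and 'identityfile'.
    """
    if 'hostname' in credentials:
        host = credentials['hostname']
    if 'port' in credentials:
        port = int(credentials['port'])
    if 'identityfile' in credentials:
        keypair = os.path.expanduser(credentials['identityfile'])
    # 'user' is deliberately not read: it stays whatever the play set.
    return host, port, user, keypair
```

With this shape, a `User` entry in the SSH config is simply never consulted, so a play can still log in as different accounts on the same host.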

Thanks!

--Michael

I agree: always use the Ansible user and skip getting the username from
the SSH config file.

Hmm, what if the steps were (in connection.py):
1. Try the defaults, without reading the SSH config file.
2. If #1 fails, keep the user from the playbook and retry using the SSH
config settings (omitting hostname and user).

Another case I have (though it might not be related):
each host gets the same keypair/port, though they have different
hostnames but belong to the same domain (this is working fine right
now):

Host *.mydomain.com
  Port 22
  Identityfile /path/to-keypair

Ansible host:
[web]
web1.mydomain.com
web2.mydomain.com

--Cocoy

–Michael

I agree: always use the Ansible user and skip getting the username from
the SSH config file.

Hmm, what if the steps were (in connection.py):

  1. Try the defaults, without reading the SSH config file.
  2. If #1 fails, keep the user from the playbook and retry using the SSH
    config settings (omitting hostname and user).

When you say “the default”, the defaults for what setting? Just the default user, or did you mean
that for other things as well?

The retry logic would slow down operations, and I want ansible to be as streamlined as possible.

Another case I have (though it might not be related):
each host gets the same keypair/port, though they have different
hostnames but belong to the same domain (this is working fine right
now):

This implies expressing the key pair by group in YAML inventory would be useful, I think?

I’m not sure how many people have this problem.

When you say "the default", the defaults for what setting? Just the default user, or did you mean
that for other things as well?

I mean: run as is (without reading the ssh_config file).

The retry logic would slow down operations, and I want ansible to be as streamlined as possible.

OK, I see, that makes sense.

This implies expressing the key pair by group in YAML inventory would be useful, I think?

As long as the keypairs are added via ssh-agent, ansible will
automatically match them to the servers. :slight_smile:

I think for now: "just make sure to add all private keys using
ssh-agent before running ansible"?
Or we set this ssh config thing to a lower priority.

Or get port + keypair only from ~/.ssh/config.

Erase my comment -- " Or we set this ssh config thing to a lower
priority. " :slight_smile:

Since ansible is all about SSH, I think there should be a clear waterfall (inheritance) for SSH configuration.

From what I can see, SSH config can come from:

  1. ansible.constants (default values)

  2. inventory file

  3. ~/.ssh/config

  4. playbook

  5. command line

There is no clear order or inheritance for these. Currently

  • the inventory file can contain only the SSH port (as ansible_ssh_port)

  • only the port and identity file are picked from ~/.ssh/config

I believe there should be a clear hierarchy of what location overrides what. I propose the order is the one above, where command line is king. This allows the ~/.ssh/config file to be fully utilized, while allowing playbooks and the command line to overwrite values as needed.
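
A minimal sketch of the proposed order of precedence, assuming each source has already been read into a plain dict. The function name and the dict keys are illustrative only, and treating `None` as "not set" is an assumption of this sketch.

```python
from collections import ChainMap


def resolve_ssh_settings(defaults, inventory, ssh_config, playbook, command_line):
    """Merge settings so that later sources override earlier ones.

    All five arguments are plain dicts. Keys whose value is None are
    treated as "not set" and fall through to the next source.
    """
    layers = [command_line, playbook, ssh_config, inventory, defaults]
    # ChainMap searches left to right, so the highest priority goes first.
    cleaned = [{k: v for k, v in layer.items() if v is not None}
               for layer in layers]
    return dict(ChainMap(*cleaned))
```

For example, with a default user of `root`, `User ubuntu` in the SSH config and nothing on the command line, the resolved user is `ubuntu`. Passing a user on the command line replaces it.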

Also ssh_config makes a distinction between Host and Hostname. This is great for creating human-friendly aliases, and it would be great if the inventory file could merge with ~/.ssh/config. Currently, this distinction is not in ansible, though I would like it to be. Perhaps reading ~/.ssh/config should be part of the inventory file code?

Background on my use case:

  • Using Ubuntu on AWS

  • root SSH is disabled (which is what I prefer)

  • SSH in using the user ‘ubuntu’ and then use sudo for root commands (e.g. apt)

  • Each instance may have a different SSH identity file

  • Only AWS provided DNS (no mydomain.com), so ~/.ssh/config provides friendly Host to Hostname mapping

An example ~/.ssh/config file is:

# Host with elastic IP
Host my-host-1
  Hostname 123.123.123.1
  User ubuntu
  IdentityFile ~/credentials/us-west-2-admin.pem
  ForwardAgent yes

# Host without EIP
Host my-host-2
  Hostname ec2-123-123-253-0.compute-1.amazonaws.com
  User ubuntu
  IdentityFile ~/credentials/us-east-1-admin.pem
  ForwardAgent yes

and so my inventory file could look like this:

# AWS
- group: grp-by-host
  hosts:
  - my-host-1
  - my-host-2

but not this:


AWS

What are your thoughts and use cases? Does the above proposal work?

Further down the track, I would love to have the inventory management be a module/plugin. I can see value in having the inventory coming directly from the EC2 API, or Rackspace Cloud API, but that is for another day :slight_smile:

I am happy to volunteer to make the changes in a separate branch once a decision is made.

Since ansible is all about SSH, I think there should be a clear waterfall (inheritance) for SSH configuration.

Waterfall and inheritance mean other things. I think you mean “order of precedence”.

From what I can see, SSH config can come from:

  1. ansible.constants (default values)

  2. inventory file

  3. ~/.ssh/config

  4. playbook

  5. command line

Overrides actually do this:

  1. the defaults mechanism is SOURCE CODE and basically just uses the default SSH port. Not remotely strange :slight_smile:
  2. the command line on the devel branch has no options to control the port anymore, because we want people to keep that in inventory. The simple INI format takes a host:port, for instance, which is pretty simple.
  3. the playbook will override the inventory file if you put a port in there, but it’s really the least ideal place to put a port, as you’ll end up repeating yourself, and we should basically remove this from examples now
  4. if there is a port setting in the SSH config, or a key pair setting, those always win, but ~/.ssh/config is not required.
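
Read as code, the order described in this list for the port would look roughly like the sketch below. It only illustrates the description; the function name and arguments are invented and this is not taken from the Ansible source.

```python
DEFAULT_SSH_PORT = 22


def effective_port(inventory_port=None, playbook_port=None, ssh_config_port=None):
    """Return the port following the order described in the list above."""
    port = DEFAULT_SSH_PORT
    if inventory_port is not None:
        # The inventory (for example host:port) replaces the built-in default.
        port = inventory_port
    if playbook_port is not None:
        # A port in the playbook overrides the inventory.
        port = playbook_port
    if ssh_config_port is not None:
        # A port in ~/.ssh/config, when present, always wins.
        port = int(ssh_config_port)
    return port
```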

There is no clear order or inheritance for these. Currently

It may not be clear, but there is an order.

  • the inventory file can contain only the SSH port (as ansible_ssh_port)

  • only the port and identity file are picked from ~/.ssh/config

I believe there should be a clear hierarchy of what location overrides what. I propose the order is the one above, where command line is king. This allows the ~/.ssh/config file to be fully utilized, while allowing playbooks and the command line to overwrite values as needed.

There’s already an order.

The SSH config file can already be fully utilized for setting the port or key pair, so I don’t understand what you are requesting. It should not be used to set the user because you can have perfectly valid reasons for logging
into a host as multiple users.

Also ssh_config makes a distinction between Host and Hostname. This is great for creating human-friendly aliases, and it would be great if the inventory file could merge with ~/.ssh/config. Currently, this distinction is not in ansible, though I would like it to be. Perhaps reading ~/.ssh/config should be part of the inventory file code?

I think the group designations are quite important and think that just further confuses things.

Background on my use case:

What are your thoughts and use cases? Does the above proposal work?

I’m interested in hearing what is currently broken and impossible for you, right now, before we discuss any idea of change. It’s easier to understand the problem from the problem rather than a proposed change or solution.
So far I’m mostly hearing that things are unclear, which I think is explained above.

Further down the track, I would love to have the inventory management be a module/plugin. I can see value in having the inventory coming directly from the EC2 API, or Rackspace Cloud API, but that is for another day :slight_smile:

No, it’s already today. This is what the external inventory stuff does if your inventory file is an executable script. This is covered in the docs and there’s an example of how to do it with the Cobbler API.

So you can source groups and variables from anywhere, basically. This was modeled more or less directly on puppet’s external nodes concept, which many people may be familiar with.

I am happy to volunteer to make the changes in a separate branch once a decision is made.

I don’t think I’ve agreed to any changes yet :slight_smile: I need to be convinced something is not possible first.

Basically, I want one place to specify a default user, port, hostname, and identity file for each server.

I have no playbooks, so they can be taken out of the equation for now.

To get a ‘/usr/bin/whoami’ command to return ‘root’ for all servers, I need to:

  • Write hostnames (ugly EC2 versions) to my inventory file (cannot use ~/.ssh/config Host aliases)

  • Specify the user ‘ubuntu’ on the command line (can’t add that to the inventory file)

  • My ~/.ssh/config file has the identity key because that also cannot be specified in the inventory file

  • I need to have double definitions for each server in my ~/.ssh/config file (1 with Host alias, 1 with Host being the hostname) because ansible does not lookup by Hostname, it looks up by Host

  • Add “-s” on the command line because that cannot go in the inventory file

If any of those points are wrong, please let me know, as I am trying to understand the best way to set it up.

Other comments inline:

Since ansible is all about SSH, I think there should be a clear waterfall (inheritance) for SSH configuration.

Waterfall and inheritance mean other things. I think you mean “order of precedence”.

Yes, “order of precedence” is what I mean.

From what I can see, SSH config can come from:

  1. ansible.constants (default values)

  2. inventory file

  3. ~/.ssh/config

  4. playbook

  5. command line

Overrides actually do this:

  1. the defaults mechanism is SOURCE CODE and basically just uses the default SSH port. Not remotely strange :slight_smile:

No, not strange at all, but it does require a code change to make the default user something else.

  2. the command line on the devel branch has no options to control the port anymore, because we want people to keep that in inventory. The simple INI format takes a host:port, for instance, which is pretty simple.

But the user can still be specified on the command line.

  3. the playbook will override the inventory file if you put a port in there, but it’s really the least ideal place to put a port, as you’ll end up repeating yourself, and we should basically remove this from examples now

Agreed, and repetition is what I am trying to avoid with inventory vs ssh/config

  4. if there is a port setting in the SSH config, or a key pair setting, those always win, but ~/.ssh/config is not required.

There is no clear order or inheritance for these. Currently

It may not be clear, but there is an order.

  • the inventory file can contain only the SSH port (as ansible_ssh_port)

  • only the port and identity file are picked from ~/.ssh/config

I believe there should be a clear hierarchy of what location overrides what. I propose the order is the one above, where command line is king. This allows the ~/.ssh/config file to be fully utilized, while allowing playbooks and the command line to overwrite values as needed.

There’s already an order.

The SSH config file can already be fully utilized for setting the port or key pair, so I don’t understand what you are requesting.

It is not fully utilized because it does not use the username nor host/hostname. Only 2 values are hand picked. There is a lot more value in .ssh/config.

It should not be used to set the user because you can have perfectly valid reasons for logging
into a host as multiple users.

Yes, I totally agree, which is why specific playbooks can overwrite the default values that the combination of ~/.ssh/config and the inventory file provide.

Also ssh_config makes a distinction between Host and Hostname. This is great for creating human-friendly aliases, and it would be great if the inventory file could merge with ~/.ssh/config. Currently, this distinction is not in ansible, though I would like it to be. Perhaps reading ~/.ssh/config should be part of the inventory file code?

I think the group designations are quite important and think that just further confuses things.

How so?

In my example, the inventory file is clear because it uses the human friendly names (Host instead of Hostname from ~/.ssh/config).

Background on my use case:

What are your thoughts and use cases? Does the above proposal work?

I’m interested in hearing what is currently broken and impossible for you, right now, before we discuss any idea of change. It’s easier to understand the problem from the problem rather than a proposed change or solution.
So far I’m mostly hearing that things are unclear, which I think is explained above.

See the top of this post: a default user and identity file per server,
with Host to Hostname mapping as an added bonus.

Further down the track, I would love to have the inventory management be a module/plugin. I can see value in having the inventory coming directly from the EC2 API, or Rackspace Cloud API, but that is for another day :slight_smile:

No, it’s already today. This is what the external inventory stuff does if your inventory file is an executable script. This is covered in the docs and there’s an example of how to do it with the Cobbler API.

So you can source groups and variables from anywhere, basically. This was modeled more or less directly on puppet’s external nodes concept, which many people may be familiar with.

Awesome… I didn’t see this before. Do you know if anyone is working on an EC2 version?

I am happy to volunteer to make the changes in a separate branch once a decision is made.

I don’t think I’ve agreed to any changes yet :slight_smile: I need to be convinced something is not possible first.

Exactly, but rest assured when/if we do, you are not the only one doing the work :slight_smile:

I have no playbooks, so they can be taken out of the equation for now.

To get a ‘/usr/bin/whoami’ command to return ‘root’ for all servers, I need to:

Ok, this is better info, thanks…

  • Write hostnames (ugly EC2 versions) to my inventory file (cannot use ~/.ssh/config Host aliases)

I’m not sure why it matters whether the names in the inventory file are ugly or not, but when running ansible you can already use wildcards such as “foo*.example.com” when not addressing a group.

  • Specify the user ‘ubuntu’ on the command line (can’t add that to the inventory file)

You can’t because you may want to talk to the host using multiple user accounts and that would break things pretty horribly if you couldn’t.

  • My ~/.ssh/config file has the identity key because that also cannot be specified in the inventory file

I like the idea of being able to specify it in the inventory file somewhat because it makes ~/.ssh/config completely optional. The idea is ~/.ssh/config should be useful IF THERE, to supplement data you might already have.

  • I need to have double definitions for each server in my ~/.ssh/config file (1 with Host alias, 1 with Host being the hostname) because ansible does not lookup by Hostname, it looks up by Host

Ansible looks up by whatever’s in the inventory file. We might as well call that a hostname.

  • Add “-s” on the command line because that cannot go in the inventory file

Right, because there are reasons for running Ansible via multiple users (sudo and not) and it would be wrong to force it to use just one of them.

It is not fully utilized because it does not use the username nor host/hostname. Only 2 values are hand picked. There is a lot more value in .ssh/config.

See the comments about the user above; that’s more or less why.

Further down the track, I would love to have the inventory management be a module/plugin. I can see value in having the inventory coming directly from the EC2 API, or Rackspace Cloud API, but that is for another day :slight_smile:

No, it’s already today. This is what the external inventory stuff does if your inventory file is an executable script. This is covered in the docs and there’s an example of how to do it with the Cobbler API.

So you can source groups and variables from anywhere, basically. This was modeled more or less directly on puppet’s external nodes concept, which many people may be familiar with.

Awesome… I didn’t see this before. Do you know if anyone is working on an EC2 version?

There was just a thread about how to do this, I am pretty sure the answer is nothing yet.

+1 that suggestion. Our inventory is in a database. I don't really want
to write it to a text file before running a playbook.

I would even be happy if I could specify the hostname/IP-address on the
ansible-playbook command line. Yes, I know that it's dangerous but no
human being will type these commands.

    -- Art Z.

It should be relatively simple to write an inventory script to pull host data from EC2 if you’re already familiar with their API. I’m doing this from Cobbler using an xmlrpc API and it works well.
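
A rough sketch of what such an inventory script could look like. The calling convention (`--list` printing a JSON mapping of group names to host lists) is an assumption here, so check the external inventory docs for the exact contract. The rows are hypothetical stand-ins for whatever EC2, Cobbler or a database would return.

```python
import json
import sys


def build_inventory(rows):
    """Group (hostname, group) pairs into {group: [hostnames]}."""
    groups = {}
    for hostname, group in rows:
        groups.setdefault(group, []).append(hostname)
    return groups


def main(argv, rows):
    # Assumed calling convention: '--list' prints all groups as JSON.
    # Anything else (such as a per-host query) returns an empty mapping here.
    if len(argv) >= 2 and argv[1] == '--list':
        return json.dumps(build_inventory(rows), sort_keys=True)
    return json.dumps({})


# Hypothetical data. A real script would query an API or database here.
EXAMPLE_ROWS = [
    ('web1.mydomain.com', 'web'),
    ('web2.mydomain.com', 'web'),
    ('db1.mydomain.com', 'db'),
]

if __name__ == '__main__':
    print(main(sys.argv, EXAMPLE_ROWS))
```

The point is only that nothing needs to be written to a text file first: the script is the inventory.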

To get a ‘/usr/bin/whoami’ command to return ‘root’ for all servers, I need to:

Ok, this is better info, thanks…

  • Write hostnames (ugly EC2 versions) to my inventory file (cannot use ~/.ssh/config Host aliases)

I’m not sure why it matters whether the names in the inventory file are ugly or not, but when running ansible you can already use wildcards such as “foo*.example.com” when not addressing a group.

It matters when I want to target a specific server and not a group. Without making use of the Host to Hostname mapping in ~/.ssh/config, I need to specify a host by its IP address, or by something ugly and unmemorable like “ec2-123-123-253-0.compute-1.amazonaws.com”.

  • Specify the user ‘ubuntu’ on the command line (can’t add that to the inventory file)

You can’t because you may want to talk to the host using multiple user accounts and that would break things pretty horribly if you couldn’t.

I agree, but how do I specify a default without changing code? There is a great default value in ~/.ssh/config that cannot be used because of the current order.

I am not just thinking about playbooks, but also about ad-hoc tasks.

  • My ~/.ssh/config file has the identity key because that also cannot be specified in the inventory file

I like the idea of being able to specify it in the inventory file somewhat because it makes ~/.ssh/config completely optional. The idea is ~/.ssh/config should be useful IF THERE, to supplement data you might already have.

Yes, this would be cool, but it also needs to include the user, since it is a combination of user and identity file. Again, only as a default that can be overridden by a playbook.

  • I need to have double definitions for each server in my ~/.ssh/config file (1 with Host alias, 1 with Host being the hostname) because ansible does not lookup by Hostname, it looks up by Host

Ansible looks up by whatever’s in the inventory file. We might as well call that a hostname.

Correct, it is a hostname, but the lookup:

credentials = ssh_config.lookup(self.host)

is by host, which means if the host from the inventory is ‘my-friendly-name’, then matching this config

# Host without EIP
Host my-friendly-name
  Hostname ec2-123-123-253-0.compute-1.amazonaws.com
  User ubuntu
  IdentityFile ~/credentials/us-east-1-admin.pem

fails.

my-friendly-name | FAILED => FAILED: [Errno 8] nodename nor servname provided, or not known

The inventory file needs to have “ec2-123-123-253-0.compute-1.amazonaws.com”, and the SSH config needs to be

# Host without EIP
Host ec2-123-123-253-0.compute-1.amazonaws.com
  User ubuntu
  IdentityFile ~/credentials/us-east-1-admin.pem

for a match to work. This results in losing the friendly name :frowning:
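
To make the Host versus Hostname distinction concrete, here is a small self-contained sketch. It is a simplified stand-in for the `ssh_config.lookup` call quoted above, not the real parser: it handles only `Host` blocks with glob patterns and `Key value` lines, and all function names are invented for illustration.

```python
import fnmatch


def parse_ssh_config(text):
    """Parse a small subset of ssh_config into a list of (patterns, options)."""
    blocks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith('#'):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip()
        if key == 'host':
            current = (value.split(), {})
            blocks.append(current)
        elif current is not None:
            # Within a block, the first value seen for a key is kept.
            current[1].setdefault(key, value)
    return blocks


def lookup(blocks, host):
    """Return merged options from every Host block whose pattern matches."""
    result = {}
    for patterns, options in blocks:
        if any(fnmatch.fnmatchcase(host, p) for p in patterns):
            for key, value in options.items():
                result.setdefault(key, value)
    return result


def connect_address(blocks, inventory_name):
    """Map an inventory name (a Host alias) to the address to connect to."""
    return lookup(blocks, inventory_name).get('hostname', inventory_name)
```

Under these assumptions, an inventory entry of `my-friendly-name` would be looked up by alias and then connected to via its `Hostname`, while names without a matching block are used unchanged.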

It matters when I want to target a specific server and not a group. Without making use of the Host to Hostname mapping in ~/.ssh/config, I need to specify a host by its IP address, or by something ugly and unmemorable like “ec2-123-123-253-0.compute-1.amazonaws.com”.

This sounds best served by an RFE for the YAML inventory file to support host aliases.

I agree, but how do I specify a default without changing code? There is a great default value in ~/.ssh/config that cannot be used because of the current order.

It’s not intended for users to change. It’s intended for users to set in the inventory file.

I disagree that the port value in ~/.ssh/config can’t be used, in fact, it is always used if present.

Ultimately the only sane solution I can find, to end this ~/.ssh/config debate, is to make sure EVERYTHING can be specified in the inventory file; otherwise there’s always going to be the argument
that some settings are here and some are there.

We can’t put arbitrary metadata in the ~/.ssh/config.

The argument could be made that paramiko is not ~/.ssh anyway, in which case this seems reasonable. We could also revert the support for reading ~/.ssh/config completely to make it even more clear.

–Michael

Rodney’s comment below:

I think for now “just make sure to add all private keypairs using ssh-
agent before running ansible”?

If you had a ton of them you could write a nice script to add all of them, even. Besides, some of them could be locked anyway, right?
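
Such a script might be as small as the sketch below. The `*.pem` pattern, the directory argument and the function names are assumptions; keys protected by a passphrase would still prompt when `ssh-add` runs.

```python
import glob
import os
import subprocess


def find_keys(directory, pattern='*.pem'):
    """Return the sorted key files in 'directory' matching 'pattern'."""
    return sorted(glob.glob(os.path.join(os.path.expanduser(directory), pattern)))


def add_keys(paths, run=subprocess.call):
    """Run 'ssh-add <path>' for each key and return the exit codes."""
    return [run(['ssh-add', path]) for path in paths]
```

Usage would be along the lines of `add_keys(find_keys('~/credentials'))` before running ansible.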

Peter and I were talking on IRC. Apologies for the long thread.

What if we never read ~/.ssh/config?

We teach inventory to be able to express in host OR group variables:

ansible_default_user: True/False
ansible_default_sudo: True/False
ansible_keypairs:
  user1: keypair_file
  user2: keypair_file
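
As a sketch of how a connection could pick a key from these variables: nothing below exists yet, the function name is invented, and letting host variables override group variables is an assumption rather than part of the proposal.

```python
def select_keypair(user, group_vars, host_vars):
    """Pick the key file for 'user', letting host variables override group ones."""
    keypairs = {}
    keypairs.update(group_vars.get('ansible_keypairs', {}))
    keypairs.update(host_vars.get('ansible_keypairs', {}))
    # None means no key was configured, so fall back to ssh-agent or defaults.
    return keypairs.get(user)
```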

“ansible_” is a bit arduous, but we have it there with “ansible_ssh_port” already. We could make both work and start using the shorter form.

It might be nice to also be able to set variables on the ‘all’ group, in case people’s systems are consistent between groups and other groups, but groups of groups is also another thing we need to do.

–Michael

Seems a long thread to read about .ssh/config, but nice discussion
though. :slight_smile: