SSH failing for newly created EC2 instances

My apologies for posting this as a new question, but I think it is a separate issue. My understanding is that newly launched EC2 instances are added to an in-memory host group, and since I’m passing the private key of the EC2 instances to the ansible-playbook command, I should be able to log in.

I was able to start up EC2 instances by following the example link below.
https://github.com/ansible/ansible-examples/blob/master/language_features/eucalyptus-ec2.yml

However, I’m having trouble connecting to the instances after they are created.

This is my playbook.

- name: Stage instance(s)
  hosts: cfgsvr
  connection: local
  user: myuserid
  gather_facts: false

  vars:
    keypair: mykeypair
    image: ami-2efa9d47
    instance_type: m1.small #t1.micro
    subnet: subnet-xxxxxx
    region: us-east-1

  # Launch 1 instance with the following parameters. Register the output.

  tasks:

    - name: Launch instance
      local_action: ec2 keypair={{keypair}} vpc_subnet_id={{subnet}} instance_type={{instance_type}} image={{image}} wait=true count=1
      register: ec2

    # Use with_items to add each instance's public IP to a new hostgroup for use in the next play.

    - name: Add new instances to host group
      local_action: add_host hostname={{item.public_ip}} groupname=deploy
      with_items: ${ec2.instances}

    - name: Wait for the instances to boot by checking the ssh port
      local_action: wait_for host={{item.public_dns_name}} port=22 delay=60 timeout=320 state=started
      with_items: ${ec2.instances}

    - name: Breathing room
      pause: seconds=30

# This play targets the new host group

- name: Configure instance
  hosts: deploy #must match groupname in "add_host" above
  user: ubuntu
  sudo: yes
  gather_facts: true

  # Install the necessary software on each instance

  tasks:

    - name: Get the latest updates for instance
      action: command apt-get update

    - name: Install JDK
      apt: pkg=openjdk-6-jre-headless state=latest install_recommends=no update_cache=yes
      #action: apt pkg=java-1.7.0-openjdk state=latest

    - name: Install Maven2
      apt: pkg=maven2 state=latest update_cache=yes

I’m launching the playbook using the following command.

$ ansible-playbook -v ec2_launch.yml -vvvv -i inventory/ansible_hosts --private-key=/path/to/private/key

GATHERING FACTS ***************************************************************

ESTABLISH CONNECTION FOR USER: ubuntu
EXEC ['ssh', '-tt', '-vvv', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/home/myuserid/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'Port=22', '-o', 'IdentityFile=/home/myuserid/.ec2/myprivatekey.pem', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=ubuntu', '-o', 'ConnectTimeout=10', u'None', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-1382270163.81-64347444346956 && chmod a+rx $HOME/.ansible/tmp/ansible-1382270163.81-64347444346956 && echo $HOME/.ansible/tmp/ansible-1382270163.81-64347444346956'"]
fatal: [None] => SSH encountered an unknown error. The output was:
OpenSSH_5.9p1 Debian-5ubuntu1, OpenSSL 1.0.1 14 Mar 2012
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/home/myuserid/.ansible/cp/ansible-ssh-None-22-ubuntu" does not exist
debug2: ssh_connect: needpriv 0
ssh: Could not resolve hostname None: Name or service not known

TASK: [Get the latest updates for instance] ***********************************
FATAL: no hosts matched or all hosts have already failed – aborting

I can log into the launched instance by using the private key I’m passing to the ansible-playbook command.

$ ansible-playbook -v ec2_launch.yml -vvvv -i inventory/ansible_hosts --private-key=/path/to/private/key

Any idea why this is failing?

Thanks.

Looks like ssh is not open on the hosts.

I can ssh into the launched instance from command line without any issues.

$ssh -i /path/to/private/key ubuntu@X.X.X.X works just fine.

Off topic: one of the reasons I chose Ansible over other DevOps tools was the advertised easy configuration. However, based on my initial experience, that does not seem to be the case. Either I’m missing some basic things or the configuration is not as easy as it seems.

-Soumya

Have you tried invoking ssh-agent, adding the key and then running ansible playbook?

1) eval $(ssh-agent)
2) ssh-add /path/to/private/key
3) ansible-playbook <args>

@James.

I tried invoking ssh-agent and adding my private key to it before running ansible playbook. However, it still doesn’t work :-(.

I also tried the ansible-playbook command with and without the --private-key option, but it still gives the same error.

I’m almost ready to give up on this now.

-Soumya

Here's the relevant portion of the -vvvv output. It's the last three
things that Ansible passed to ssh on the command line. (I've truncated the
last one to just the starting 10 characters, for clarity):

'-o', 'ConnectTimeout=10', u'None', "/bin/sh -c"

The "-o ConnectTimeout=10" is the last of the command-line options given to
ssh. The "u'None'" is interpreted by ssh as the remote hostname to connect
to. The "/bin/sh -c" and following text is interpreted by ssh as the
command to execute on the remote host.

According to the ssh error message, the hostname "u'None'" is not resolving
to a usable hostname or IP address. And it doesn't look like the kind of
name that will properly resolve.

So the evidence points to the playbook not successfully passing the EC2
instance's hostname (or ip address) to the ssh command, not a problem in
the EC2 instance itself or in ssh keys.
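
One quick way to spot this kind of problem is to list the host names Ansible reports as failed. A name of None means the inventory entry itself was empty. The following is only a minimal sketch, assuming the failure lines look like the ones quoted in this thread (fatal: [HOST] => ...). The function name failed_hosts and the file name run.log are made up for illustration.

```shell
# Print the host name from each "fatal: [HOST] => ..." line read on stdin.
# The pattern is based on the log lines quoted in this thread; other
# Ansible versions may format failures differently.
failed_hosts() {
    sed -n 's/^fatal: \[\([^]]*\)] => .*/\1/p'
}

# Hypothetical usage: save the verbose output to a file, then count how
# many failed hosts were the literal string "None".
#   ansible-playbook ec2_launch.yml -vvvv -i inventory/ansible_hosts > run.log 2>&1
#   failed_hosts < run.log | grep -c '^None$'
```

If None shows up in that list, the variable used for hostname in the add_host task is the first thing to check.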

  -Greg

@Greg,

Thank you. Your analysis makes perfect sense.

The host name is set in the playbook task below. However, because I’m launching this instance in a VPC (Virtual Private Cloud), the public_ip value is not set in the Ansible variable.

# Use with_items to add each instance's public IP to a new hostgroup for use in the next play.

- name: Add new instances to host group
  local_action: add_host hostname={{item.public_ip}} groupname=deploy
  with_items: ${ec2.instances}

I need to use hostname={{item.private_ip}} for this to work.

However, now I’m getting the following error.

GATHERING FACTS ***************************************************************

<10.128.3.113> ESTABLISH CONNECTION FOR USER: ubuntu
<10.128.3.113> EXEC ['ssh', '-tt', '-vvv', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/home/myuserid/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'Port=22', '-o', 'IdentityFile=/home/myuserid/.ec2/myprivatekey.pem', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=ubuntu', '-o', 'ConnectTimeout=10', u'10.128.3.113', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-1382457536.35-172524262324962 && chmod a+rx $HOME/.ansible/tmp/ansible-1382457536.35-172524262324962 && echo $HOME/.ansible/tmp/ansible-1382457536.35-172524262324962'"]
fatal: [10.128.3.113] => SSH encountered an unknown error. The output was:
OpenSSH_5.9p1 Debian-5ubuntu1, OpenSSL 1.0.1 14 Mar 2012
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/home/myuserid/.ansible/cp/ansible-ssh-10.128.3.113-22-ubuntu" does not exist
debug2: ssh_connect: needpriv 0
debug1: Connecting to 10.128.3.113 [10.128.3.113] port 22.
debug2: fd 3 setting O_NONBLOCK
debug1: fd 3 clearing O_NONBLOCK
debug1: Connection established.
debug3: timeout: 9999 ms remain after connect
debug3: Incorrect RSA1 identifier
debug3: Could not load "/home/myuserid/.ec2/myprivatekey.pem" as a RSA1 public key
debug1: identity file /home/myuserid/.ec2/myprivatekey.pem type -1
debug1: identity file /home/myuserid/.ec2/myprivatekey.pem-cert type -1
debug1: Remote protocol version 2.0, remote software version OpenSSH_5.9p1 Debian-5ubuntu1
debug1: match: OpenSSH_5.9p1 Debian-5ubuntu1 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_5.9p1 Debian-5ubuntu1
debug2: fd 3 setting O_NONBLOCK
debug3: load_hostkeys: loading entries for host "10.128.3.113" from file "/home/myuserid/.ssh/known_hosts"
debug3: load_hostkeys: found key type ECDSA in file /home/myuserid/.ssh/known_hosts:35
debug3: load_hostkeys: loaded 1 keys
debug3: order_hostkeyalgs: prefer hostkeyalgs: ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug2: kex_parse_kexinit: ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-rsa-cert-v01@openssh.com,ssh-dss-cert-v01@openssh.com,ssh-rsa-cert-v00@openssh.com,ssh-dss-cert-v00@openssh.com,ssh-rsa,ssh-dss
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-256-96,hmac-sha2-512,hmac-sha2-512-96,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-256-96,hmac-sha2-512,hmac-sha2-512-96,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: none,zlib@openssh.com,zlib
debug2: kex_parse_kexinit: none,zlib@openssh.com,zlib
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: kex_parse_kexinit: ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ssh-rsa,ssh-dss,ecdsa-sha2-nistp256
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-256-96,hmac-sha2-512,hmac-sha2-512-96,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-256-96,hmac-sha2-512,hmac-sha2-512-96,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: none,zlib@openssh.com
debug2: kex_parse_kexinit: none,zlib@openssh.com
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: mac_setup: found hmac-md5
debug1: kex: server->client aes128-ctr hmac-md5 none
debug2: mac_setup: found hmac-md5
debug1: kex: client->server aes128-ctr hmac-md5 none
debug1: sending SSH2_MSG_KEX_ECDH_INIT
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ECDSA cc:cc:cc:cc:cc:29:f7:62:c4:21:4a:cc:cc:cc:cc:02
debug3: load_hostkeys: loading entries for host "10.128.3.113" from file "/home/myuserid/.ssh/known_hosts"
debug3: load_hostkeys: found key type ECDSA in file /home/myuserid/.ssh/known_hosts:35
debug3: load_hostkeys: loaded 1 keys
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
Please contact your system administrator.
Add correct host key in /home/myuserid/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /home/myuserid/.ssh/known_hosts:35
remove with: ssh-keygen -f "/home/myuserid/.ssh/known_hosts" -R 10.128.3.113
ECDSA host key for 10.128.3.113 has changed and you have requested strict checking.
Host key verification failed.

Take a look at http://www.ansibleworks.com/docs/intro_getting_started.html#host-key-checking
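
For short-lived EC2 instances that end up reusing a private IP, there are roughly two ways to get past this warning, sketched below. The first removes the stale known_hosts entry with the ssh-keygen command that the warning itself prints. The second uses the ANSIBLE_HOST_KEY_CHECKING environment variable described on the page linked above. The helper name forget_host_key is invented for this example, and which option fits depends on how much you trust the network between the control machine and the instances.

```shell
# Option 1: drop the stale known_hosts entry for one address, as the ssh
# warning suggests. "forget_host_key" is a made-up helper name; the second
# argument (known_hosts path) is optional.
forget_host_key() {
    ssh-keygen -f "${2:-$HOME/.ssh/known_hosts}" -R "$1"
}
# Example: forget_host_key 10.128.3.113

# Option 2: turn off Ansible's host key checking for this shell session.
# This gives up protection against man-in-the-middle attacks, so it is
# best kept to throwaway test instances.
export ANSIBLE_HOST_KEY_CHECKING=False
```

With option 2 in effect, Ansible passes StrictHostKeyChecking=no to ssh, which you can confirm in the -vvvv output.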

@James - thanks. Looks like it worked. Now the “gathering facts” part works.

GATHERING FACTS ***************************************************************
<10.128.3.96> ESTABLISH CONNECTION FOR USER: ubuntu
<10.128.3.96> EXEC ['ssh', '-tt', '-vvv', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/home/dhkarimi/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'Port=22', '-o', 'IdentityFile=/home/myuserid/.ec2/myprivatekey.pem', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=ubuntu', '-o', 'ConnectTimeout=10', u'10.128.3.96', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-1382460221.33-1627522380647 && chmod a+rx $HOME/.ansible/tmp/ansible-1382460221.33-1627522380647 && echo $HOME/.ansible/tmp/ansible-1382460221.33-1627522380647'"]
<10.128.3.96> REMOTE_MODULE setup
<10.128.3.96> PUT /tmp/tmpoW_ql9 TO /home/ubuntu/.ansible/tmp/ansible-1382460221.33-1627522380647/setup
<10.128.3.96> EXEC ['ssh', '-tt', '-vvv', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/home/dhkarimi/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'Port=22', '-o', 'IdentityFile=/home/myuserid/.ec2/myprivatekey.pem', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=ubuntu', '-o', 'ConnectTimeout=10', u'10.128.3.96', '/bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=rvylffsgfldcrdflelruftijxekjhdqu] password: " -u root /bin/sh -c '"'"'/usr/bin/python /home/ubuntu/.ansible/tmp/ansible-1382460221.33-1627522380647/setup; rm -rf /home/ubuntu/.ansible/tmp/ansible-1382460221.33-1627522380647/ >/dev/null 2>&1'"'"''']
ok: [10.128.3.96]

However, it fails in the next step in the playbook. I believe that is because it’s trying to log in as myuserid and not as ubuntu as defined in the following section of the playbook. The keys on the launched instance are set for the user ubuntu and not the user myuserid.

- name: Configure instance
  hosts: deploy #must match groupname in "add_host" above
  user: ubuntu
  sudo: yes
  gather_facts: true

My understanding was that all tasks will use the user defined in the above section of the playbook.

But it looks like I’ve resolved most of the important issues.

Thanks all for the help.
-Soumya

When passing "-vvvv" to ansible playbook, every task will output something like ...

<10.128.3.96> ESTABLISH CONNECTION FOR USER: ubuntu

If you suspect a task is connecting as the wrong user, take a look at the -vvvv outputs to confirm.
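
Because the -vvvv output is long, it can help to reduce it to just the host and user of each connection attempt. This is a rough sketch that assumes the lines have the exact form shown above; connection_users is a hypothetical helper name, not an Ansible command.

```shell
# Reduce -vvvv output (read on stdin) to "HOST USER" pairs, one per
# connection attempt. The pattern matches the
# "<HOST> ESTABLISH CONNECTION FOR USER: NAME" lines quoted in this thread.
connection_users() {
    sed -n 's/^<\([^>]*\)> ESTABLISH CONNECTION FOR USER: \(.*\)$/\1 \2/p'
}

# Hypothetical usage: count connection attempts per host and user.
#   ansible-playbook ec2_launch.yml -vvvv -i inventory/ansible_hosts 2>&1 | connection_users | sort | uniq -c
```

Any host that appears with more than one user name is a sign that some task is not picking up the play-level user setting.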

@James - I know that it is trying to connect as a different user (myuserid) [see log below], and the ssh connection fails because the EC2 instance is configured for user ubuntu and not user myuserid.
Any idea how to fix this?

TASK: [Get the latest updates for instance] ***********************************
<10.128.3.240> ESTABLISH CONNECTION FOR USER: myuserid

Can we see an example playbook?

Sure. I posted a version with my initial question. But here is the updated one.
I believe it fails here because of the incorrect user.

# Install the necessary software on each instance

- name: Stage instance(s)
  hosts: cfgsvr
  connection: local
  user: dhkarimi
  gather_facts: false

  vars:
    keypair: mykeypair
    image: ami-2efa9d47
    instance_type: m1.small
    subnet: subnet-XXXXXX
    region: us-east-1

  # Launch 1 instance with the following parameters. Register the output.
  # You can change the number of instances to launch by changing the "count" variable in the local action.

  tasks:

    - name: Launch instance
      local_action: ec2 keypair={{keypair}} vpc_subnet_id={{subnet}} instance_type={{instance_type}} image={{image}} wait=true count=1
      register: ec2

    # Use with_items to add each instance's private IP to a new hostgroup for use in the next play.

    - name: Add new instances to host group
      local_action: add_host hostname={{item.private_ip}} groupname=deploy
      with_items: ${ec2.instances}

    - name: Wait for the instances to boot by checking the ssh port
      local_action: wait_for host={{item.public_dns_name}} port=22 delay=60 timeout=320 state=started
      with_items: ${ec2.instances}

# This play targets the new host group

- name: Configure instance
  hosts: deploy
  user: ubuntu
  sudo: yes
  gather_facts: true

  # Install the necessary software on each instance
  tasks:

    - name: Get the latest updates for instance
      action: command apt-get update

    - name: Install JDK
      apt: pkg=openjdk-6-jre-headless state=latest install_recommends=no update_cache=yes
      #action: apt pkg=java-1.7.0-openjdk state=latest

    - name: Install Maven2
      apt: pkg=maven2 state=latest update_cache=yes

Which task is connecting as dhkarimi when it should be connecting as ubuntu?

The one marked in bold in the playbook.

  # Install the necessary software on each instance
  tasks:

    - name: Get the latest updates for instance
      action: command apt-get update

This conversation should probably move to IRC, as it is a very interactive, basic-user-question kind of discussion and keeps shifting between different types of questions.

Rather than having a long thread about a lot of different topics, I would suggest taking it over there.

I had an issue that initially looked similar to this. I pored over this thread carefully, trying a number of different things. I found a workaround I hadn’t seen mentioned and wanted to mention it here in case this is useful for others.

Background/Environment:
Ansible 1.7.1 on OSX, provisioning a CentOS 6.5 EC2 instance

My goal was to write scripts to dynamically determine the IP of an EC2 instance, then run yum update/installs against that new instance.

wait_for ssh consistently indicated that I’d correctly gained the IP of the new instance, and something was answering at the ssh port, but ssh -vvvv debug output would indicate yum installs failed because ssh to the instance was failing. The ssh trace looked nearly identical to debug traces from successful manual command-line ssh tests until it failed at the end.

debug2: we did not send a packet, disable method
debug1: No more authentication methods to try.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password)

I was suspicious that ssh was being run against the wrong user, but the ansible debug output said it was using the user id I’d specified.

Workaround: I added a 1-minute pause before trying to run yum. I have concluded that there was a race condition: the key had not yet been installed on the EC2 instance when Ansible first tried to ssh there to run yum.

Yep, that’s a thing.

Typically you’ll only need to pause a few seconds after an SSH port comes up and before you can connect.

I’d reduce the 1 minute pause to only 5 seconds (immediately following the existing wait_for task), 10 if you are feeling like it.
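
As an alternative to guessing a fixed pause length, you could poll from the control machine until a key-based login actually succeeds. This is only a sketch of that idea, not something taken from the posts above: retry_until_ok is a made-up helper, and the key path, user, and address in the usage comment are placeholders.

```shell
# Run a command until it succeeds, sleeping between attempts.
# Usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt fails.
retry_until_ok() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only if another attempt is still to come.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Hypothetical usage: try key-based login up to 6 times, 5 seconds apart.
# BatchMode=yes makes ssh fail instead of prompting for a password.
#   retry_until_ok 6 5 ssh -o BatchMode=yes -o ConnectTimeout=5 -i /path/to/private/key ubuntu@X.X.X.X true
```

This waits only as long as the instance needs, at the cost of a few failed login attempts showing up in the instance's auth log.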