Issue deploying playbook through GCP IAP behind a bastion host using ssh config and ProxyCommand

TL;DR

We have an ssh configuration issue that prevents us from deploying using ansible-playbook through GCP Identity Aware Proxy to an internal host behind a bastion host. We have requested help from GCP, but they are reaching the point where they can’t help us any further, and they suggested we contact you here.

Long form

In order to improve our security posture, we have enabled Identity Aware Proxy against our Compute Engine VMs (for SSH connections).
As a result, we are no longer able to deploy using ansible-playbook, as ansible ssh is unable to connect to the hosts to deploy to.

Infrastructure

Our infrastructure is as follows:

[project-a]  | [project-b]
-------------|-------------
bastion-host | backend-host

Where:

  • bastion-host is the only host that is SSH-accessible from public internet, but is protected with IAP;
  • backend-host is SSH-accessible internally via VPC-peering between project-a and project-b, but is also protected with IAP.

Prior to enabling IAP, we were able to connect to backend-host without any issue using ssh-add ~/.ssh/google_compute_engine to forward our ssh key through the tunnel; not anymore.

We want to deploy to backend-host using ansible-playbook.
Given the structure of the network, we need to ssh-hop through bastion-host first to reach backend-host.

Ansible setup

Our ansible is set up as follows:

ansible.cfg

[ssh_connection]
ssh_args = -F ssh.config -C -o ControlMaster=auto -o ControlPersist=360s -o ConnectTimeout=30

ssh.config

Host bastion.example.com
   HostName <PUBLIC IP>
   IdentityFile ~/.ssh/google_compute_engine
   ServerAliveInterval 60
   ProxyCommand gcloud compute ssh bastion --project "project-a" --zone "northamerica-northeast1-a" --tunnel-through-iap --impersonate-service-account="gce-bastion-prod@project-a.iam.gserviceaccount.com"
   StrictHostKeyChecking no
   UserKnownHostsFile=/dev/null
   RequestTTY force                   # Force TTY allocation for interactive sessions
   LogLevel DEBUG3

Host backend.example.com
   HostName backend.example.com
   ProxyJump bastion.example.com
   ProxyCommand gcloud compute ssh backend --project "project-b" --zone "northamerica-northeast1-a" --impersonate-service-account="gce-backend-prod@project-b.iam.gserviceaccount.com"
   StrictHostKeyChecking no
   ServerAliveInterval 60
   UserKnownHostsFile=/dev/null
   RequestTTY force                   # Force TTY allocation for interactive sessions
   LogLevel DEBUG3

Important note

This issue is about the ssh.config and how to configure ansible to connect to the backend-host; the above proxy commands, when executed from the command line, successfully log in to the backend-host.

Error & logs

When running

ansible-playbook Playbook.yml -l bastion -e environ=production

The ProxyCommand for host bastion.example.com is executed and the connection is established, but then it errors out:

...
WARNING: This command is using service account impersonation. All API calls will be executed as [gce-bastion-prod@project-a.iam.gserviceaccount.com].
debug1: kex_exchange_identification: banner line 0: Linux bastion 4.19.0-27-cloud-amd64 #1 SMP Debian 4.19.316-1 (2024-06-25) x86_64
debug1: kex_exchange_identification: banner line 1: 
debug1: kex_exchange_identification: banner line 2: The programs included with the Debian GNU/Linux system are free software;
debug1: kex_exchange_identification: banner line 3: the exact distribution terms for each program are described in the
debug1: kex_exchange_identification: banner line 4: individual files in /usr/share/doc/*/copyright.
debug1: kex_exchange_identification: banner line 5: 
debug1: kex_exchange_identification: banner line 6: Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
debug1: kex_exchange_identification: banner line 7: permitted by applicable law.
-bash: line 1: $'SSH-2.0-OpenSSH_9.6\\r': command not found
Connection timed out during banner exchange
Connection to UNKNOWN port 65535 timed out

I can’t figure it out. I searched for solutions to $'SSH-2.0-OpenSSH_9.6\\r': command not found and Connection timed out during banner exchange, but none of them applied to my issue. I also found alternative, more complex answers online:

[1] google cloud platform - Ansible GCP IAP tunnel - Stack Overflow
[2] ssh - With Ansible, is it possible to connect connect to hosts that are behind Cloud IAP (Identity-Aware Proxy) in GCP? - Unix & Linux Stack Exchange

but to be honest, I’m afraid of introducing complexity that I don’t understand: if one of these solutions doesn’t work right away, I won’t know what’s wrong, and it’ll be like throwing darts in the dark :confused:

Can you help me out with this SSH configuration issue? :pray:
Thank you very much :bowing_man:

I remember seeing something similar to this, which turned out to be caused by banners that confused things downstream.
Try experimenting with '--quiet' in the 'gcloud compute ssh' stanzas.
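
As a rough sketch of that suggestion, the bastion stanza from the ssh.config above could be tried with --quiet added to the gcloud command. This is untested against this setup: --quiet is gcloud's global flag for disabling interactive prompts, and whether it has any effect on the login banner here is an assumption, not a confirmed fix. All other options in the stanza are assumed to stay as posted.

```
Host bastion.example.com
   # Same ProxyCommand as in the posted ssh.config; the only change is the added --quiet (untested)
   ProxyCommand gcloud compute ssh bastion --quiet --project "project-a" --zone "northamerica-northeast1-a" --tunnel-through-iap --impersonate-service-account="gce-bastion-prod@project-a.iam.gserviceaccount.com"
```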

Dick

Hi Dick!

Thanks for your answer!

If it is that simple, I will be really happy. Unfortunately, our VM instance currently runs an older OpenSSH version that predates -q actually removing the banners (I tried --quiet, -- -q, and --quiet -- -q, all to no effect).

I will have to upgrade the OS before I can upgrade the version of OpenSSH. I’ll get on it and let you know when it’s done. Expect an answer by Tue Oct 22nd, 2024, 9PM EDT.

Have a great weekend!
Philippe