Request to get ELB instances fails randomly

Hi there,

We have been seeing a very high number of SSL errors ('The read operation timed out') while using the AWS API via the ec2_elb Ansible module.

{"failed": true, "item": "eb-development", "module_stderr": "Traceback (most recent call last):\n File "/root/.ansible/tmp/ansible-tmp-1464953539.92-56419966755081/ec2_elb",
line 2583, in <module>\n main()\n File "/root/.ansible/tmp/ansible-tmp-1464953539.92-56419966755081/ec2_elb",
line 356, in main\n elb_man = ElbManager(module, instance_id, ec2_elbs, region=region, **aws_connect_params)\n File "/root/.ansible/tmp/ansible-tmp-1464953539.92-56419966755081/ec2_elb",
line 121, in __init__\n self.lbs = self._get_instance_lbs(ec2_elbs)\n File "/root/.ansible/tmp/ansible-tmp-1464953539.92-56419966755081/ec2_elb",
line 268, in _get_instance_lbs\n newelbs = elb.get_all_load_balancers(marker=marker)\n File "/usr/local/lib/python2.7/dist-packages/boto/ec2/elb/__init__.py",
line 135, in get_all_load_balancers\n [('member', LoadBalancer)])\n File "/usr/local/lib/python2.7/dist-packages/boto/connection.py",
line 1171, in get_list\n body = response.read()\n File "/usr/local/lib/python2.7/dist-packages/boto/connection.py",
line 410, in read\n self._cached_response = http_client.HTTPResponse.read(self)\n File "/usr/lib/python2.7/httplib.py",
line 557, in read\n s = self._safe_read(self.length)\n File "/usr/lib/python2.7/httplib.py",
line 664, in _safe_read\n chunk = self.fp.read(min(amt, MAXAMOUNT))\n File "/usr/lib/python2.7/socket.py",
line 380, in read\n data = self._sock.recv(left)\n File "/usr/lib/python2.7/ssl.py",
line 714, in recv\n return self.read(buflen)\n File "/usr/lib/python2.7/ssl.py",
line 608, in read\n v = self._sslobj.read(len or 1024)\nssl.SSLError: ('The read operation timed out',)\n",
"module_stdout": "", "msg": "MODULE FAILURE", "parsed": false}

We're seeing this error only on API calls that register or deregister instances with an ELB, when we execute these tasks:

- name: Instance De-register
  tags: balancer
  environment: "{{ aws_env }}"
  local_action:
    module: ec2_elb
    instance_id: "{{ ansible_ec2_instance_id }}"
    state: 'absent'

- name: Instance Register
  tags: balancer
  environment: "{{ aws_env }}"
  local_action:
    module: ec2_elb
    instance_id: "{{ ansible_ec2_instance_id }}"
    ec2_elbs: "{{ item }}"
    state: 'present'
  with_items: '{{ elbs }}'

This happens on Jenkins slaves running Debian 8.2 with these packages:
ansible==2.0.2.0
boto==2.40.0
Python 2.7.9
python2.7-minimal 2.7.9-2

Has anyone else seen this? We don't think we are reaching the AWS API request limit for our account.

Yes, seeing it. Also think it’s rate limiting.

You seem to be hitting a known bug: https://github.com/ansible/ansible-modules-core/issues/3444

It’s fixed in Ansible devel, check the bug report for a workaround. At least it worked for me.

Thanks for your replies. We tried changing from a NAT instance to a NAT gateway, because we have some connectivity issues: our NAT instance has low network performance. That bug is related to the ec2_elb_lb module rather than ec2_elb; in our case the trouble is with the latter.
I've seen this workaround in another issue, and we will try it:

- name: registering instance to its respective groups ELB
  # instances that do not require ELBs do not need to run this part of the playbook
  when: elb_shortname is defined and lights_on|default("false") == "true"
  local_action: ec2_elb
  args:
    region: 'us-west-2'
    enable_availability_zone: 'no'
    instance_id: "{{ instance_id }}"
    ec2_elbs: "{{ env }}-{{ elb_shortname }}"
    state: 'present'
    wait: 'yes'
    wait_timeout: '120'
  register: ec2_elb_result
  until: ec2_elb_result|success
  retries: 10
  delay: 30
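
For anyone calling boto directly instead of going through the module, the same retry idea can be sketched in Python. This is only a rough illustration, not what ec2_elb does internally: `call_with_retries` and its `attempts`, `delay` and `sleep` parameters are names made up for this example, and the defaults simply mirror the `retries: 10` / `delay: 30` values in the task above.

```python
import socket
import ssl
import time


def call_with_retries(func, attempts=10, delay=30, sleep=time.sleep):
    """Call func() until it returns or `attempts` calls have timed out.

    Hypothetical helper for illustration only. It retries on read
    timeouts and lets every other exception propagate immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        # Python 2.7 reports the timeout as ssl.SSLError (as in the
        # traceback above); Python 3 raises socket.timeout instead.
        except (ssl.SSLError, socket.timeout):
            if attempt == attempts:
                raise  # out of attempts: surface the last timeout
            sleep(delay)  # wait before the next attempt


# Assumed usage with boto 2 (not executed here):
#   import boto.ec2.elb
#   conn = boto.ec2.elb.connect_to_region('us-west-2')
#   lbs = call_with_retries(conn.get_all_load_balancers)
```

Note that a retry like this only papers over the timeouts. If the underlying cause is a slow NAT instance, the calls may still fail once the attempts run out.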

Cheers

I don't see the error 'The read operation timed out' anywhere in that bug report?