Change of behaviour of delegate_to between Ansible 1.9.4 and 2.0.1.0


I have had a playbook in production for about 9 months that carries out a rolling update of our application across a number of servers. It uses a pre_tasks section to remove a server from the load balancers before deploying updated code, and then adds the server back under post_tasks.

It was based on the play documented at:

http://docs.ansible.com/ansible/guide_rolling_upgrade.html

It has been working fine under Ansible 1.9.4. However, after upgrading to Ansible 2.0.1.0, the pre_tasks are carried out on only one of the servers in the proxies group (but twice), and the same happens for the post_tasks when the file is uncommented. I have set up an example playbook on a few VirtualBox machines to troubleshoot and still experience the same issue.

My test setup is the Ansible machine 192.168.56.103 (or local 127.0.0.1)

and the two proxy servers

Web1 192.168.56.101
Web2 192.168.56.102

The command I am running

[root@web ~]# ansible-playbook -i inventory/hosts ansible-test.yml -vvv

The Inventory file: (inventory/hosts)

```
[local]
127.0.0.1

[proxies]
192.168.56.101 ansible_ssh_user=vagrant
192.168.56.102 ansible_ssh_user=vagrant
```

and the playbook: (ansible-test.yml)

```
- name: poll proxies
  hosts: proxies
  tasks:

- name: test deployment script
  hosts: localhost

  pre_tasks:

    - name: debug groups
      debug:
        msg: "{% for name in groups['proxies'] %} {{ name }} {% endfor %}"

    - name: take out admin backend from upstream
      shell: rpl "server {{ inventory_hostname }}" "#server {{ inventory_hostname }}" /etc/rpl
      delegate_to: "{{ item }}"
      with_items: "{{ groups['proxies'] }}"
      become: true

  tasks:

    - name: pause
      pause:
        seconds: 60

  post_tasks:

    - name: add out admin backend from upstream
      shell: rpl "#server {{ inventory_hostname }}" "server {{ inventory_hostname }}" /etc/rpl
      delegate_to: "{{ item }}"
      with_items: "{{ groups['proxies'] }}"
      become: true
```

The behaviour I expect is that the rpl command is run on both 192.168.56.101 and 192.168.56.102, commenting out the server line on each machine, before the main task pauses for 60 seconds. If I check the file /etc/rpl during the 60 second delay, each server should show

#server localhost

Once the post_tasks have completed, these files should have returned to

server localhost

Under Ansible 1.9.4 this is the case. Under Ansible 2.0.1.0, however, the /etc/rpl file on 192.168.56.101 shows

##server localhost

and 192.168.56.102 shows

server localhost

Checking the /var/log/secure logs, I can see that Ansible hasn't logged into 192.168.56.102 at all.

The Ansible output, however, suggests that both servers have been updated:
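To illustrate why one host ends up with `##server localhost`, here is a minimal Python sketch of the replacement. It assumes rpl behaves like a plain substring replace, which is what the "partial words matched" message in its output suggests; the function names `rpl`, `take_out` and `put_back` are made up for this illustration and are not part of Ansible or rpl.

```python
def rpl(text, old, new):
    """Rough model of `rpl OLD NEW file`: replace every occurrence,
    including matches inside longer strings (partial words)."""
    return text.replace(old, new)


def take_out(contents, host="localhost"):
    # Mirrors the pre_task: comment out the server line.
    return rpl(contents, f"server {host}", f"#server {host}")


def put_back(contents, host="localhost"):
    # Mirrors the post_task: uncomment the server line.
    return rpl(contents, f"#server {host}", f"server {host}")


original = "server localhost"

once = take_out(original)   # what each proxy should look like during the pause
twice = take_out(once)      # what happens if both loop items hit the same proxy

print(once)    # #server localhost
print(twice)   # ##server localhost

# Running the post_task twice against the same proxy undoes both edits.
print(put_back(put_back(twice)))  # server localhost
```

Under this model, a file that was edited once shows a single `#`, so a double `#` on 192.168.56.101 together with an untouched file on 192.168.56.102 is consistent with both loop iterations connecting to the same machine.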

```
TASK [take out admin backend from upstream] ************************************
changed: [localhost -> 192.168.56.101] => (item=192.168.56.101) => {"changed": true, "cmd": "rpl \"server localhost\" \"#server localhost\" /etc/rpl", "delta": "0:00:00.016656", "end": "2016-03-18 16:52:39.937608", "item": "192.168.56.101", "rc": 0, "start": "2016-03-18 16:52:39.920952", "stderr": "Replacing \"server localhost\" with \"#server localhost\" (case sensitive) (partial words matched)\n.\nA Total of 1 matches replaced in 1 file searched.", "stdout": "\u001b[?1034h", "stdout_lines": ["\u001b[?1034h"], "warnings": []}
changed: [localhost -> 192.168.56.102] => (item=192.168.56.102) => {"changed": true, "cmd": "rpl \"server localhost\" \"#server localhost\" /etc/rpl", "delta": "0:00:00.016714", "end": "2016-03-18 16:52:40.154668", "item": "192.168.56.102", "rc": 0, "start": "2016-03-18 16:52:40.137954", "stderr": "Replacing \"server localhost\" with \"#server localhost\" (case sensitive) (partial words matched)\n.\nA Total of 1 matches replaced in 1 file searched.", "stdout": "\u001b[?1034h", "stdout_lines": ["\u001b[?1034h"], "warnings": []}

TASK [echo upstream] ***********************************************************
changed: [localhost -> 192.168.56.101] => (item=192.168.56.101) => {"changed": true, "cmd": "cat /etc/rpl", "delta": "0:00:00.001938", "end": "2016-03-18 16:52:40.410069", "item": "192.168.56.101", "rc": 0, "start": "2016-03-18 16:52:40.408131", "stderr": "", "stdout": "##server localhost", "stdout_lines": ["##server localhost"], "warnings": []}
changed: [localhost -> 192.168.56.102] => (item=192.168.56.102) => {"changed": true, "cmd": "cat /etc/rpl", "delta": "0:00:00.001713", "end": "2016-03-18 16:52:40.618319", "item": "192.168.56.102", "rc": 0, "start": "2016-03-18 16:52:40.616606", "stderr": "", "stdout": "##server localhost", "stdout_lines": ["##server localhost"], "warnings": []}
```

However, if I manually ssh to both web servers from the server running Ansible during the 60 second pause task, I can see that only one server is being updated, albeit twice. (This also shows that I haven't got the two IPs mapped to the same server.)

```
[root@web ~]# ssh vagrant@192.168.56.101 "cat /etc/rpl"
##server localhost

[root@web ~]# ssh vagrant@192.168.56.102 "cat /etc/rpl"
server localhost
```
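For anyone wanting to repeat this check without typing the ssh commands by hand, the following is a rough Python sketch that reads /etc/rpl on each proxy and counts the leading `#` characters. The helper names (`read_rpl`, `comment_depth`, `report`) are hypothetical, and it assumes key-based ssh access as the vagrant user from the Ansible machine, as in the test setup described above.

```python
import subprocess

# The two proxies from the inventory file.
PROXIES = ["192.168.56.101", "192.168.56.102"]


def read_rpl(host, user="vagrant"):
    """Fetch /etc/rpl from a proxy over ssh and return its contents."""
    result = subprocess.run(
        ["ssh", f"{user}@{host}", "cat /etc/rpl"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def comment_depth(line):
    """Number of leading '#' characters: 0 = active, 1 = taken out once."""
    return len(line) - len(line.lstrip("#"))


def report(hosts, reader=read_rpl):
    """Map each host to the comment depth of its /etc/rpl contents.

    `reader` can be swapped for a stub so the logic can be tried
    without real ssh access.
    """
    return {host: comment_depth(reader(host)) for host in hosts}
```

Calling `report(PROXIES)` during the pause should give a depth of 1 for both hosts when delegation works as expected. With the output shown above it would give 2 for 192.168.56.101 and 0 for 192.168.56.102.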

I'm really not sure what is going on and cannot see any mention of behaviour changing between the two versions of Ansible. Can anybody help?

Thanks

Check against current devel, we just fixed an issue with delegation connection vars not being set correctly.

I've just tested against devel and can confirm that the issue is resolved.

Thanks