Ansible version 1.5.3
Our playbook looks like this:
debug.yml:
- hosts:
    - cnode463
  tasks:
    - include: roles/conf/tasks/hadoop.yml
hadoop.yml:
- name: copy hadoop conf
  sudo: yes
  template: src={{ TEMPLATE_DIR }}/hadoop/{{item}}.j2 dest=/etc/hadoop/conf/{{item}}
  with_items:
    - core-site.xml
    - hdfs-site.xml
    - hdfs-site.private.xml
    - log4j.properties
    - hadoop-env.sh
When we run the playbook, a task sometimes fails, for example:
TASK: [copy hbase conf] *******************************************************
ok: [cnode463] => (item=hbase-site.xml)
ok: [cnode463] => (item=log4j.properties)
failed: [cnode463] => (item=hbase-env.sh) => {"failed": true, "item": "hbase-env.sh", "parsed": false}
FATAL: all hosts have already failed -- aborting
PLAY RECAP ********************************************************************
cnode463 : ok=1 changed=0 unreachable=0 failed=1
I debugged the Ansible code and added the following lines to print the result of runner._low_level_exec_command:
print "****"
print "cmd "+str(cmd)
print "out "+str(out)
print "err "+str(err)
print "____"
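A possible refinement of these debug lines, sketched under assumptions: printing with repr() makes an empty string visibly different from whitespace-only output, which matters when the suspicion is that the output is lost. The helper name `format_exec_debug` is hypothetical and not part of Ansible; `cmd`, `out` and `err` are assumed to be the same values printed above.

```python
def format_exec_debug(cmd, out, err):
    """Build one debug block for a command (hypothetical helper, not Ansible code).

    repr() shows quotes around each value, so an empty result appears as ''
    instead of a blank line, and stray newlines become visible as \\n.
    """
    out_text = str(out)
    err_text = str(err)
    lines = [
        "****",
        "cmd " + repr(str(cmd)),
        "out " + repr(out_text) + " (%d chars)" % len(out_text),
        "err " + repr(err_text) + " (%d chars)" % len(err_text),
        "____",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example with an empty stdout and stderr, as in the failing case.
    print(format_exec_debug("echo hello", "", ""))
```

With this form, a run whose output is truly empty can be told apart from one that returned only a newline.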
In the end I found that _low_level_exec_command sometimes does not capture the output of the command: "out" is empty even for commands that should print something.
The debug log is below:
****
cmd mkdir -p $HOME/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534 && echo $HOME/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534
out /home/hadoop/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534
err
____
****
cmd rc=0; [ -r "/etc/hadoop/conf/yarn-site.private.xml" ] || rc=2; [ -f "/etc/hadoop/conf/yarn-site.private.xml" ] || rc=1; [ -d "/etc/hadoop/conf/yarn-site.private.xml" ] && echo 3 && exit 0; (/usr/bin/md5sum /etc/hadoop/conf/yarn-site.private.xml ) || (/sbin/md5sum -q /etc/hadoop/conf/yarn-site.private.xml ) || (/usr/bin/digest -a md5 /etc/hadoop/conf/yarn-site.private.xml ) || (/sbin/md5 -q /etc/hadoop/conf/yarn-site.private.xml ) || (/usr/bin/md5 -n /etc/hadoop/conf/yarn-site.private.xml ) || (/bin/md5 -q /etc/hadoop/conf/yarn-site.private.xml ) || (/usr/bin/csum -h MD5 /etc/hadoop/conf/yarn-site.private.xml ) || (/bin/csum -h MD5 /etc/hadoop/conf/yarn-site.private.xml ) || (echo "${rc} /etc/hadoop/conf/yarn-site.private.xml")
out
err
____
****
cmd /usr/bin/python /home/hadoop/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534/copy; rm -rf /home/hadoop/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534/ >/dev/null 2>&1
out
err
____
failed: [cnode463] => (item=yarn-site.private.xml) => {"failed": true, "item": "yarn-site.private.xml", "parsed": false}
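Ansible modules report their result as JSON on stdout, so the "parsed": false in the failure would be consistent with the empty "out" shown above: an empty string is not valid JSON. The following is a minimal sketch of that failure mode, not Ansible's actual code; the function name `parse_module_output` is hypothetical, and the returned dictionary simply mirrors the shape seen in the log.

```python
import json


def parse_module_output(stdout):
    """Interpret a module's stdout as JSON (simplified, hypothetical stand-in)."""
    try:
        return json.loads(stdout)
    except ValueError:
        # Empty or non-JSON output cannot be parsed; return the same shape
        # of result that appears in the failure message in the log.
        return {"failed": True, "parsed": False}


if __name__ == "__main__":
    # Empty output, as captured in the failing run.
    print(parse_module_output(""))
    # Well-formed module output parses normally.
    print(parse_module_output('{"changed": false}'))
```

If this reading is right, the task itself may have run fine on the remote host and only the captured output was lost on the way back.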
This problem is appearing more and more frequently. Is it possible to fix it?