how to exit the ansible run, or handle it, when any remote node fails a task

Kathy_Allen · August 6, 2015, 7:05pm

Hi.

I’m working on some orchestration where I need to run a task across sets of N remote nodes. If that task fails on any one of the remote nodes, the orchestration needs to halt (or be handled somehow). In my test, I cause one node to fail and I expected the entire ansible run to bomb out, but that’s not what happened. The failed node is reported, but the playbook continues on.

How can I make ansible exit upon the failure of any one of these nodes?

Or, how can I have some kind of handler to pause the run before continuing? (I’ve not yet looked into handlers)

Playbook, plays, tasks, and output are shown below. One question about the output: for the node that failed, the task “debug: var=output” is absent. That task only fires for the successful node. Should I expect that task to also fire for the failed node? I was surprised by that.

Thanks!
kallen

`

$ cat testplaybook.yml

jorginator74 · August 6, 2015, 7:50pm

Hi

Hi.

I'm working on some orchestration where I need to run a task across
sets of N remote nodes. If that task fails on any one of the remote
nodes, the orchestration needs to halt (or be handled somehow). In my
test, I cause one node to fail and I expected the entire ansible run
to bomb out, but that's not what happened. The failed node is
reported, but the playbook continues on.

That is by design.

How can I make ansible exit upon the failure of any one of these
nodes?

http://docs.ansible.com/ansible/playbooks_delegation.html#maximum-failure-percentage

You can set max_fail_percentage: 0 on the play.
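As a rough sketch of where that setting goes (the group name "webapps" and the script path are made up for illustration, they are not from the original playbook), max_fail_percentage is a play-level keyword:

```yaml
# Minimal sketch; "webapps" and the command path are hypothetical.
- hosts: webapps
  # Abort the play once the share of failed hosts exceeds 0%,
  # i.e. as soon as any single host fails a task.
  max_fail_percentage: 0
  tasks:
    - name: run the per-node task
      command: /usr/local/bin/check_node
```

The threshold has to be exceeded, not just reached, so 0 means "stop on the first failed host".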

Or, how can I have some kind of handler to pause the run before
continuing? (I've not yet looked into handlers)

Don't think so ....

Playbook, plays, tasks, and output are shown below. One question about
the output: for the node that failed, the task "debug: var=output" is
absent. That task only fires for the successful node. Should I expect
that task to also fire for the failed node? I was surprised by that.

No - once a node fails (without "ignore_errors: True"), it is no longer
part of the remainder of the play, so no further tasks will be executed
on the failed node.
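If you do want the debug task to run on the failed node as well, something along these lines should keep that host in the play. This is only a sketch: the group name and command path are hypothetical, and "output" is assumed to be the registered variable from your playbook.

```yaml
# Minimal sketch; "webapps" and the command path are hypothetical.
- hosts: webapps
  tasks:
    - name: run the task, but keep the host in the play if it fails
      command: /usr/local/bin/check_node
      register: output
      ignore_errors: true

    - name: show the registered result for every host, failed or not
      debug:
        var: output
```

With ignore_errors set, the failure is still reported in the output, but the host is not dropped, so later tasks see it.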

Hope this helps

Kathy_Allen · August 6, 2015, 8:15pm

Ah! Fantastic. Thank you. I put in max_fail_percentage, and the thing I wanted to happen happened.

I do wonder how to handle a node failure more elegantly, with a handler. Something simple to start with, like: “prompt: pause here, go fix that node if you can. If you can’t, ctrl-c now.” Perhaps I should add “ignore_errors: true” and experiment?

It’s strange … I do have another task that runs per webapp node and runs a local check script – it’s a ruby program that exits non-zero on an error condition. When any node fails that check, the ansible run comes to a screeching halt. That play contains no max_fail_percentage and no ignore_errors: true.

We use ansible 1.8.2.

I’ll move forward with your advice. And, FWIW … this bombs out the entire run when any node fails:

`
