Ansible 2.2.1 - Module pause : Strange Behavior

Hello

I just encountered a use case of the 'pause' module whose behavior seems strange to me.

It seems that this module is executed only for the first host of the playbook.

I would like to run a pause task inside a block.

Example:

    - block:

        # Stop MDM
        - name: "{{ playbook_name }} - stop service RTU_JBOSSMDM_SANS_NETRICS"
          win_service:
            name: "RTU_JBOSSMDM_SANS_NETRICS"
            state: "stopped"

        # Pause for 40 seconds to give the service time to stop
        - name: "{{ playbook_name }} - wait 40s"
          pause:
            seconds: "40"

        # Start MDM
        - name: "{{ playbook_name }} - start service RTU_JBOSSMDM_SANS_NETRICS"
          win_service:
            name: "RTU_JBOSSMDM_SANS_NETRICS"
            state: "started"

        # Pause for 180 seconds to give the service time to start
        - name: "{{ playbook_name }} - wait 180s"
          pause:
            seconds: "180"

      when: (cluster == "non" or (cluster == "oui" and win_stat_disque_f_result.stat.exists == True))

The big problem is this: if the block's condition excludes the first host from the block, the pause tasks within the block are not executed at all.

I solved the problem by removing the block and putting the 'when' condition on each task (except for the pause tasks).

It works, but it does not seem very clean to me.

Example:

    # Stop MDM
    - name: "{{ playbook_name }} - stop service RTU_JBOSSMDM_SANS_NETRICS"
      win_service:
        name: "RTU_JBOSSMDM_SANS_NETRICS"
        state: "stopped"
      when: (cluster == "non" or (cluster == "oui" and win_stat_disque_f_result.stat.exists == True))

    # Pause for 40 seconds to give the service time to stop
    - name: "{{ playbook_name }} - wait 40s"
      pause:
        seconds: "40"

    # Start MDM
    - name: "{{ playbook_name }} - start service RTU_JBOSSMDM_SANS_NETRICS"
      win_service:
        name: "RTU_JBOSSMDM_SANS_NETRICS"
        state: "started"
      when: (cluster == "non" or (cluster == "oui" and win_stat_disque_f_result.stat.exists == True))

    # Pause for 180 seconds to give the service time to start
    - name: "{{ playbook_name }} - wait 180s"
      pause:
        seconds: "180"
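
One possible way to avoid repeating the long condition on every task, sketched here under assumptions: evaluate it once per host with set_fact and reference the resulting fact in each 'when'. The fact name `mdm_restart_needed` is hypothetical, and this sketch has not been tested against Ansible 2.2.1. The pause tasks stay unconditional, as in the example above.

```yaml
# Evaluate the condition once per host and keep the result in a fact.
# "mdm_restart_needed" is a hypothetical name chosen for this sketch.
- name: "{{ playbook_name }} - evaluate restart condition"
  set_fact:
    mdm_restart_needed: "{{ cluster == 'non' or (cluster == 'oui' and win_stat_disque_f_result.stat.exists == True) }}"

- name: "{{ playbook_name }} - stop service RTU_JBOSSMDM_SANS_NETRICS"
  win_service:
    name: "RTU_JBOSSMDM_SANS_NETRICS"
    state: "stopped"
  when: mdm_restart_needed | bool

# Unconditional: pause is handled once for the whole play.
- name: "{{ playbook_name }} - wait 40s"
  pause:
    seconds: "40"

- name: "{{ playbook_name }} - start service RTU_JBOSSMDM_SANS_NETRICS"
  win_service:
    name: "RTU_JBOSSMDM_SANS_NETRICS"
    state: "started"
  when: mdm_restart_needed | bool

- name: "{{ playbook_name }} - wait 180s"
  pause:
    seconds: "180"
```

The `| bool` filter is there so the check behaves the same whether the fact ends up stored as a boolean or as the string "True"/"False".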

Regards,

Fabrice Perko

This is 'expected' with pause or any other module that bypasses the
normal host loop, or if you add `run_once: yes` to a task.
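
A minimal reproduction of the behavior described in this thread might look like the sketch below. It assumes an inventory with at least two hosts and that the first host of the play is `groups['all'][0]`; the task names are made up for illustration.

```yaml
# Hypothetical reproduction; assumes an inventory with at least two hosts.
- hosts: all
  gather_facts: no
  tasks:
    - block:
        # An ordinary module: the condition is evaluated for each host.
        - name: ordinary task inside the block
          debug:
            msg: "block ran on {{ inventory_hostname }}"

        # pause bypasses the normal host loop, so only the first host counts.
        - name: pause inside the block
          pause:
            seconds: 5

      # False only for the first host in the inventory.
      when: inventory_hostname != groups['all'][0]
```

As reported above, the debug task would be expected to run on every host except the first, while the pause would be skipped entirely because the condition is false for the first host.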

https://github.com/ansible/ansible/issues/19966 is a ticket about this.

As I commented there: It's very unintuitive that if I have a playbook that
runs on ten hosts, but with a when condition such that some of the hosts
are sometimes skipped (e.g. if they're down at the moment, if they're in
the green rather than the blue group, etc), and somewhere in the middle is
a pause or other run_once task, that task will or won't run in a
completely unpredictable way based on the order of the host list.

Saying "if any hosts are skipped, a run_once task might or might not run"
seems obviously wrong to me, and not like a "feature" at all.

If run_once picked the first non-skipped host on the list, that seems like
it would completely solve this. (If the entire host list was being
skipped, then of course the task would be skipped too.) Is that possible?
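
For ordinary modules (not pause, which bypasses the host loop), the "first non-skipped host" idea can be approximated in a playbook without run_once. This is only a sketch: the fact names `eligible` and `first_eligible_host` and the sample condition are hypothetical, it relies on the `ansible_play_hosts` variable and the `extract` filter, and it has not been tested against Ansible 2.2.1.

```yaml
- hosts: all
  gather_facts: no
  tasks:
    # Hypothetical condition: every host except the first is eligible.
    - name: record whether this host is eligible
      set_fact:
        eligible: "{{ inventory_hostname != groups['all'][0] }}"

    # Walk the play's hosts in order and keep the first eligible one.
    # Falls back to an empty string when no host is eligible.
    - name: work out the first eligible host
      set_fact:
        first_eligible_host: "{{ ansible_play_hosts | map('extract', hostvars) | selectattr('eligible') | map(attribute='inventory_hostname') | list | first | default('') }}"

    # No run_once here: the condition is true for at most one host.
    - name: task that runs on the first eligible host only
      debug:
        msg: "running once on {{ inventory_hostname }}"
      when: inventory_hostname == first_eligible_host
```

If no host is eligible, `first_eligible_host` is empty and the last task is skipped everywhere, which matches the "entire host list skipped" case in the proposal.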

                                      -Josh (jbs@care.com)

(apologies for the automatic corporate disclaimer that follows)

This email is intended for the person(s) to whom it is addressed and may contain information that is PRIVILEGED or CONFIDENTIAL. Any unauthorized use, distribution, copying, or disclosure by any person other than the addressee(s) is strictly prohibited. If you have received this email in error, please notify the sender immediately by return email and delete the message and any attachments from your system.

Would that be a different feature, 'run_for_first_true'? The design of
the current one is not intuitive, but it has been used and relied on for
a long time.

> it would be a diff feature 'run_for_first_true'?

Or maybe the existing feature should be renamed "run_once_or_zero_times". :^)

> the design of the current one is not intuitive, but it has been used
> and relied on for a long time.

I'm definitely on board with not changing the behavior of things when
people are relying on them to behave the way they always have, but does
anyone actually rely on tasks with run_once sometimes not running? What's
a use case for "I want to run this once, or zero times, depending on
whether the first host in the list is skipped"?

I may just be unimaginative, but I can't think of *any* situation where
I'd want that behavior. And it's hard to see how anyone could be relying
on it, since it's by definition reliable, when you don't know whether the
task is going to run or not.

                                      -Josh (jbs@care.com)

> I may just be unimaginative, but I can't think of *any* situation
> where I'd want that behavior. And it's hard to see how anyone could
> be relying on it, since it's by definition reliable, when you don't
> know whether the task is going to run or not.

Bah, "by definition not reliable" I meant to say there.

                                      -Josh (jbs@care.com)

I've actually used it that way when dealing with cluster masters
(which normally come first in the inventory). I'm not saying it's ideal,
but I know plenty of playbooks that do the same.

Ok, thanks for the information.

To me it is rather a bug: the task is executed only for the first host (e.g. groups['all'][0]).

If groups['all'][0] is excluded, the task is not executed.

I think the problem would be solved if we could use delegate_to with this module (e.g. delegate_to: localhost). Currently this is not possible.
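
Since delegate_to cannot be combined with pause, one alternative worth trying is the wait_for module with only a timeout, delegated to the control machine. The sketch below assumes that wait_for simply sleeps for `timeout` seconds when no port or path is given, which is how later Ansible documentation describes it; whether 2.2.1 behaves the same way is not verified here.

```yaml
- block:

    - name: "{{ playbook_name }} - stop service RTU_JBOSSMDM_SANS_NETRICS"
      win_service:
        name: "RTU_JBOSSMDM_SANS_NETRICS"
        state: "stopped"

    # An ordinary module, so the block's condition is evaluated per host.
    # The sleep itself runs on the control machine, once per eligible host.
    - name: "{{ playbook_name }} - wait 40s"
      wait_for:
        timeout: 40
      delegate_to: localhost

  when: (cluster == "non" or (cluster == "oui" and win_stat_disque_f_result.stat.exists == True))
```

Because wait_for goes through the normal host loop, the wait would be skipped only for the hosts that the condition excludes, regardless of their position in the inventory.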

Brian, the cluster is in fact a Windows failover cluster, so the master server can change.

In the end, I think it would be good to add a line or two of information to the 'pause' module documentation, as is done for run_once. I lost a few hours figuring out where the problem came from. For example:

"The pause module is executed only for the first host of the playbook. If this host is excluded from the task, the task will not be executed."

If other modules work in the same way, the documentation should say so more explicitly.

Thank you very much.

Fabrice Perko