Maintaining state of a set of machines which are not always up or connected

What is the recommendation or best practice for applying a playbook to a set of machines which may often not be running? Shouldn’t there be a registry of which playbooks have been applied where? Even if playbooks are perfectly idempotent, it is highly inefficient to keep applying them again and again…

I would like to understand the approach the Ansible designers suggest for this use case before looking elsewhere or doing any development work on it.

Thanks,
eskhool

what is the recommendation or best practice to apply a playbook to a set of
machines which may often not be running?

Dynamic inventory might help you there.
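
As a rough illustration of the dynamic inventory idea, here is a minimal sketch of an inventory script that only reports machines that currently answer on the SSH port. The host names, the group name "online", and the port probe are all assumptions made for this example; nothing in this thread prescribes them. The JSON layout (a group with a "hosts" list plus a "_meta" section) is the format Ansible expects from an inventory script called with --list.

```python
#!/usr/bin/env python3
"""Sketch: dynamic inventory that lists only hosts reachable right now."""
import json
import socket
import sys

# Hypothetical candidate machines; replace with your own list or lookup.
CANDIDATES = ["laptop1.example.com", "laptop2.example.com"]


def is_reachable(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def build_inventory(candidates, probe=is_reachable):
    """Build the structure Ansible reads from `--list`, keeping reachable hosts."""
    up = [host for host in candidates if probe(host)]
    # "online" is an illustrative group name, not an Ansible built-in.
    return {"online": {"hosts": up}, "_meta": {"hostvars": {}}}


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "--list":
        print(json.dumps(build_inventory(CANDIDATES)))
    elif len(sys.argv) > 1 and sys.argv[1] == "--host":
        # Host variables are already supplied (empty) via "_meta".
        print(json.dumps({}))
```

Saved as an executable file (say inventory.py, a made-up name), it could be passed with -i to ansible-playbook so that a run only touches machines that are up at that moment.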

Shouldn't there be a registry of
which playbooks have been applied where? Even if playbooks are perfectly
idempotent, it is highly inefficient to keep applying them again and
again...

Not in my experience - I'd do some measurements before you spend too much
time solving a problem that may not exist.

Would like to understand the suggested approach from the ansible designers
for this use case before looking elsewhere/doing any dev work for the same

Have a quick read of http://www.ansible.com/scaling-and-performance-whitepaper

Cheers,
Dick.

what is the recommendation or best practice to apply a playbook to a set of
machines which may often not be running?

You might want to look at running ansible-pull on boot; each machine will then get into a consistent state as soon as it comes up.
But that takes some control away from you, and can't really be used for orchestration.

Shouldn't there be a registry of
which playbooks have been applied where? Even if playbooks are perfectly
idempotent, it is highly inefficient to keep applying them again and
again...

No, but you can build one pretty easily using callback plugins and a wrapper for filtering.
Not that I think it's needed, as running a play on all machines is often as quick as running it on just the changed one. Modules do a pretty good job of not doing things that don't have to be done, and tasks are run in parallel, which means the only host slowing down a play is the one that you wanted to target anyway.
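
For anyone who still wants such a registry, here is a minimal sketch of the bookkeeping and filtering half. It is plain Python and does not use any Ansible API: the JSON file layout and all function names are invented for this example. The assumption is that a callback plugin (not shown) would call record_applied for each host that finished a run without failures, and that a wrapper script would use build_command to limit the next run to hosts that have not yet seen the current version of the playbook.

```python
import hashlib
import json
from pathlib import Path


def playbook_digest(path):
    """Hash the playbook file so an edited playbook counts as not yet applied."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def load_registry(path):
    """Load the registry from a JSON file, or start empty if it is missing."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def save_registry(registry, path):
    """Write the registry back to disk as JSON."""
    Path(path).write_text(json.dumps(registry, indent=2, sort_keys=True))


def record_applied(registry, host, playbook, digest):
    """Remember that `host` successfully ran this version of `playbook`."""
    registry.setdefault(host, {})[playbook] = digest


def hosts_needing_run(registry, hosts, playbook, digest):
    """Return the hosts that have not run this exact playbook version."""
    return [h for h in hosts if registry.get(h, {}).get(playbook) != digest]


def build_command(playbook, pending):
    """Return an ansible-playbook argv limited to pending hosts, or None."""
    if not pending:
        return None  # nothing to do; the wrapper should skip the run
    return ["ansible-playbook", playbook, "--limit", ",".join(pending)]
```

Note that hashing only the playbook file ignores changes in roles, variables and included files, so a real registry would need a broader notion of "version".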

Would like to understand the suggested approach from the ansible designers
for this use case before looking elsewhere/doing any dev work for the same

Well, I'm not an Ansible designer, but I still hope I had some useful suggestions :)

what is the recommendation or best practice to apply a playbook to a set
of machines which may often not be running?

Depends on what behavior you want. Should the machines boot up when you
want to run Ansible on them, or should the run fail?

Shouldn't there be a registry of which playbooks have been applied where?

Not really. A fact cache is in the pipeline AFAIK; it would also be nice
to be able to feed facts back from specific playbook outputs (e.g. the
deployed version of things).

Even if playbooks are perfectly idempotent, it is highly inefficient to
keep applying them again and again...

Not sure why this is inefficient. Do you have lots of hosts? Also, why would
you want to play them again and again? Did you consider using --limit
and/or --tags?

  Serge

A really good solution here is the provisioning callbacks feature in Ansible Tower.

Set up a crontab entry (or anacron, firstboot, whatever) that pings Tower; when the machine checks in, Ansible jobs will make sure that system gets the latest configuration.
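
The "ping" is just an HTTP POST to the job template's callback URL, carrying the host config key. Below is a minimal sketch of what a machine could run from cron or at boot. The Tower address, the template ID and the key are placeholders, and the /api/v2/ path prefix is an assumption that depends on your Tower version, so check the callback URL shown on your own job template.

```python
import urllib.parse
import urllib.request


def build_callback_request(tower_url, template_id, host_config_key, api_version="v2"):
    """Build the POST request a host sends to ask Tower to configure it."""
    # The path layout is an assumption; older Tower releases use a different prefix.
    url = (
        f"{tower_url.rstrip('/')}/api/{api_version}"
        f"/job_templates/{template_id}/callback/"
    )
    body = urllib.parse.urlencode({"host_config_key": host_config_key}).encode()
    return urllib.request.Request(url, data=body, method="POST")


def check_in(request, timeout=30):
    """Send the request and return the HTTP status code from Tower."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Placeholder values; substitute your own Tower host, template ID and key.
    req = build_callback_request("https://tower.example.com", 1, "CHANGE_ME")
    print(req.full_url)
```

Calling check_in(req) from the boot or cron script performs the actual check-in; Tower then matches the calling host against its inventory and launches the job for that host only.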

This is better than ansible-pull because you get the centralized history, and it’s way easier to set up as well.

http://www.ansible.com/tower