Ansible 1.5 is released!

er … no … you lost me here.

I’m saying this works for top level playbooks, and that facts are not erased between includes:

site.yml

  • hosts: all
    gather_facts: yes
    tasks:

  • include: webserver_plays.yml

  • include: dbserver_plays.yml

I'll recap for the purposes of the list:

The question is removal of an undocumented and unclear flag, that modelled
what was a "Tri-state" boolean.

In this case, if gather_facts was not explicitly set True, it was assumed
"None", which meant "gather facts for hosts you have not talked to yet".
This means that any host that has registered something will be skipped in
later plays, and can't have facts that are dynamic.

What you call "dynamic facts" is simply rerunning setup at the start of every play. But they are not dynamic because the next task won't have them either.

In most if not all cases facts are static, and in case you want to re-enforce facts gathering, you could simply enable that for a specific play. Or even better, run the setup module within a play. There is no need for re-running the setup module at the start of every play. The overhead is big and the opportunities are scarce, especially if you properly structure your playbooks with self-contained plays.

This was an untuitive behavior, was never documented, and was removed.

It was not unintuitive, everyone understood this well, nobody complained about it and could have been documented (like many other things that are not documented yet).

And if you wonder why I am frustrated, THAT is exactly it. You make
changes without even discussing it on this list, and this is not the first
time it happened and it won't be the last.

Yes, this is true.

Ansible is our project, and we're not going to spin a wheel every time we
want to change something and say "please may I". However, we are going to
use our combined experience of talking to thousands upon thousands of
users, and do that to best help everyone here.

When that is done, if we occasionally screw up -- and we will -- it's best
to ask how we can fix it, and that will result in a technical discussion.

I asked it on IRC, send in a fix and it was accepted and reverted.

BTW That technical discussion was not held, the decision was made when you committed it for v1.5 and released it as part of v1.5. Reverting now is harder because suddenly this is the default in v1.5.

What's even more, it was not even announced as a change in the ChangeLog.

So I come back to my remark, if the majority doesn't care and this is not
a democracy, the new way is works-as-designed. Have it your way !

So yes, I don't understand why you would want to undo the speed
improvements of 1.5.0 with a (default) mandatory facts gathering at each
play. Presumably because a task can change the facts. Well, if that's your
concern, you should force a facts gathering per task (because every task
can change facts and that has also not been documented). Or instead, leave
it up to the user to decide when to gather facts, but do it at least once.
(The way it always worked since Daniel Hokka Zakrisson (dhozac) implemented
it)

The gather_facts on each play is still not mandatory, as you can have a
explicit

gather_facts: "no" on each play.

What I didn't like was how gather_facts <previous implementation> broke the
ability to gather new facts, which is a serious concern, and how it relied
on SETUP_CACHE to make that decision, which is also sketchy.

Well, my solution would have looked very differently. To me it is a design fault if variables you register end up in the SETUP_CACHE (!). That's
very alarming to me (not knowing that part of the code).

But a simple boolean to keep track whether setup was run could have been sufficient. In the ansible-provisioning project we actually have different namespaces for different facts modules, and there is a special fact module_<name> which is set if that specific module has run once. Such a mechanism could have been sufficient to fix this corner case.

BTW In the discussion of pull request you never mentioned what exact problem was fixed by removing the old behaviour. It could help if commits would give a proper description to why it was needed, and what it changes. SOmething you expect from external pull requests anyway.

Simply to ensure you have facts for each host, and then disable facts
gathering on every subsequent play to avoid the additional overhead of
running the setup module.

I know I am not going to change your mind, but I care too much to not even
try to make the case (that you seem to ignoring).

See, this is what I hate here -- you're jumping to assumptions about me
already.

Like you didn't do about me, right ? :slight_smile:

I'm not against providing a *better* way to solve this problem, that's more
clear and explicit, but the existing hack (being default behavior) was
breaking things and was not a good way to go.

What about simply looking at this as "gather_facts: auto" being a new mode,
and also being able to set that default in ansible.cfg?

How is that any different as it was. You complained about gather_facts having three states before and it not being a boolean. And now you propose to make it a three-state entity again ?

That's the kind of technical discussion I like to have.

Sure, BEFORE you release something as ansible v1.5.

You are ignoring the fact that you committed it and released it in a new release, only then when people complain to give the notion you prefer to have a proper technical discussion about it.

Sure gather_facts: auto is fine by me. But it would have been a lot better if that was what shipped with Ansible v1.5 in the first place. Now it simply looks as nobody has real control over what's going on unless you give your final say. And then we have the technical discussion...

I understand that, it is not the problem.

Consider my site.yml which has 23 different plays. To run with the new behavior I gather facts 23 times.

If I change my included plays to gather_facts: false to avoid this, I cannot run them individually anymore if they need facts (most of them do).

What I have been doing until now as a personal best practice, is always
explicitly set gather_facts: true or false on every play, setting it on
true in plays where facts are used.
This way facts gathering would trigger at the very last when I need them.
Now this new behavior will trigger extra runs, burning unneeded cycles.

I should also agree with Brian, I never encountered the key problem that is
solved here though.

On a dangerous side node, I actually never understood why registered
variables and set_fact vars live in the same namespace as gathered facts.

  Serge

Is it not what “module_setup”: “true”, was for?

Exactly, this is what we do often as well. I didn't this implication.

But now that Michael proposes to have gather_facts: auto configurable from ansible.cfg, it would suffice for me. What he thinks is best by default is not my concern, but his customer's and support's concern.

That said, I would definitely fix the logic that validates from SETUP_CACHE whether setup was already run, and make a special boolean or fact to keep track of whether it has run. Which IMO is the real bug that needs fixing.

I'll recap for the purposes of the list:

The question is removal of an undocumented and unclear flag, that modelled
what was a "Tri-state" boolean.

In this case, if gather_facts was not explicitly set True, it was assumed
"None", which meant "gather facts for hosts you have not talked to yet".
This means that any host that has registered something will be skipped in
later plays, and can't have facts that are dynamic.

What you call "dynamic facts" is simply rerunning setup at the start of
every play. But they are not dynamic because the next task won't have them
either.

I was referring to things like facts.d

And that they aren't gathered between tasks is not so important, but the
fact that they would be intuitively re-gathered if you wrote another play.

Most people do not set "gather_facts" behavior on a play at all, because
gathering facts is not expensive to them.

Brian in particular has some massively overloaded machines -- which, should
there be an ASPCA for computer machines, would be recieving some industry
protection.

Most folks also run with sufficiently high forks values :slight_smile:

In most if not all cases facts are static, and in case you want to
re-enforce facts gathering, you could simply enable that for a specific
play. Or even better, run the setup module within a play. There is no need
for re-running the setup module at the start of every play. The overhead is
big and the opportunities are scarce, especially if you properly structure
your playbooks with self-contained plays.

This was an untuitive behavior, was never documented, and was removed.

It was not unintuitive, everyone understood this well, nobody complained
about it and could have been documented (like many other things that are
not documented yet).

I'd definitely disagree that everyone knew about it -- it wasn't clear even
to *ME*.

And if you wonder why I am frustrated, THAT is exactly it. You make

changes without even discussing it on this list, and this is not the
first
time it happened and it won't be the last.

Yes, this is true.

Ansible is our project, and we're not going to spin a wheel every time we
want to change something and say "please may I". However, we are going
to
use our combined experience of talking to thousands upon thousands of
users, and do that to best help everyone here.

When that is done, if we occasionally screw up -- and we will -- it's best
to ask how we can fix it, and that will result in a technical discussion.

I asked it on IRC, send in a fix and it was accepted and reverted.

To be clear, that was merged by James Tanner, and we wasn't aware of the
previous change.

This was a communication error between us and not his fault :slight_smile:

BTW That technical discussion was not held, the decision was made when you
committed it for v1.5 and released it as part of v1.5. Reverting now is
harder because suddenly this is the default in v1.5.

What's even more, it was not even announced as a change in the ChangeLog.

That was my error.

The changelog is frequently updated retroactively and will continue to be
on many occasions. Also due to the extreme pace at which this project
moves there are times where we're going to miss something in the changelog.

So I come back to my remark, if the majority doesn't care and this is not

a democracy, the new way is works-as-designed. Have it your way !

So yes, I don't understand why you would want to undo the speed
improvements of 1.5.0 with a (default) mandatory facts gathering at each
play. Presumably because a task can change the facts. Well, if that's
your
concern, you should force a facts gathering per task (because every task
can change facts and that has also not been documented). Or instead,
leave
it up to the user to decide when to gather facts, but do it at least
once.
(The way it always worked since Daniel Hokka Zakrisson (dhozac)
implemented
it)

The gather_facts on each play is still not mandatory, as you can have a
explicit

gather_facts: "no" on each play.

What I didn't like was how gather_facts <previous implementation> broke
the
ability to gather new facts, which is a serious concern, and how it relied
on SETUP_CACHE to make that decision, which is also sketchy.

Well, my solution would have looked very differently. To me it is a design
fault if variables you register end up in the SETUP_CACHE (!). That's
very alarming to me (not knowing that part of the code).

I'm not sure why this should be alarming to you.

But a simple boolean to keep track whether setup was run could have been
sufficient.

Perhaps, though the default state should be to gather facts. Like I've
mentioned below I'd accept a "gather_facts: auto" and a configurable
default in ansible.cfg

BTW In the discussion of pull request you never mentioned what exact
problem was fixed by removing the old behaviour. It could help if commits
would give a proper description to why it was needed, and what it changes.
SOmething you expect from external pull requests anyway.

Now this is just getting nitpicky about commit messages.

Simply to ensure you have facts for each host, and then disable facts

gathering on every subsequent play to avoid the additional overhead of
running the setup module.

I know I am not going to change your mind, but I care too much to not
even
try to make the case (that you seem to ignoring).

See, this is what I hate here -- you're jumping to assumptions about me
already.

Like you didn't do about me, right ? :slight_smile:

I didn't.

I'm not against providing a *better* way to solve this problem, that's

more
clear and explicit, but the existing hack (being default behavior) was
breaking things and was not a good way to go.

What about simply looking at this as "gather_facts: auto" being a new
mode,
and also being able to set that default in ansible.cfg?

How is that any different as it was. You complained about gather_facts
having three states before and it not being a boolean. And now you propose
to make it a three-state entity again ?

That's a valid point. Perhaps it should be a new setting altogether.

gather_facts: True/False

It may also be better addressed by fact caching.

That's the kind of technical discussion I like to have.

Sure, BEFORE you release something as ansible v1.5.

You are ignoring the fact that you committed it and released it in a new
release, only then when people complain to give the notion you prefer to
have a proper technical discussion about it.

I'm always open to a discussion. And I don't really view it as complaint
-- I see it as a need to have a discussion about what the project/code
needs to be doing.

This is what this list is for.

Sure gather_facts: auto is fine by me. But it would have been a lot better
if that was what shipped with Ansible v1.5 in the first place. Now it
simply looks as nobody has real control over what's going on unless you
give your final say. And then we have the technical discussion...

I'll say this, every time this kind of attitude comes up, it makes me want
to help you less and less.

And it takes time out of when we could be helping others.

I understand that, it is not the problem.

Consider my site.yml which has 23 different plays. To run with the new
behavior I gather facts 23 times.

If I change my included plays to gather_facts: false to avoid this, I
cannot run them individually anymore if they need facts (most of them do).

Exactly, this is what we do often as well. I didn't this implication.

But now that Michael proposes to have gather_facts: auto configurable from
ansible.cfg, it would suffice for me. What he thinks is best by default is
not my concern, but his customer's and support's concern.

I'd like to let this thread drop, but I don't want to leave this
assumption. Commerce has not even been a part of this discussion. What
we do on this list is try to maintain Ansible for thousands upon thousands
of users, and try to make sure that we design and test things to ensure we
create a good product for everyone.

We are not, by any means, responding to a customer ticket to fix behavior.

However, I *AM* going to use my expertise about how to make Ansible easier
to everyone from time to time, to try and target Ansible so that it helps
the most people.

Which if people like Ansible to date, I think they have to trust that is
what we have been doing, and are continuing to do.

It's not going to be the case that the most vocal always get to win their
arguments, as that is a path to disaster for everyone involved.

When people have concerns -- PLEASE -- express them. But also spare the
allegations of evil-doing.

The transmuto-ray in the basement isn't even ready yet.

@micheal, I'm not even making the case for my machines, my fact cache
plugin has solved that issue temporarily until I get them off life support
(1/2 way there).

About the boolean that shows setup has run, IT ALREADY EXISTS and we
ALREADY CHECK IT: "module_setup": "true"

which is set ONLY by the setup plugin to indicate that it ran, this
distinguishes it from registered vars so I'm not sure how the problem you
are trying to fix occurred.

Here was the code in question, which didn’t check it. It probably should have. In either case, I’m open to having a different flag to express this though we should be avoiding tri-state booleans in the code, which is a question of how it should be expressed interface wise.

Probably a policy variable in ansible.cfg could just control what “gather_facts: True” does, and there’s always the explicit call to setup if you want it. So the “None” thing can still go.

This is the heinous line in question:

-            host_list = [h for h in host_list if h not in self.SETUP_CACHE or 'module_setup' not in self.SETUP_CACHE[h]]

You read too much into what I say. I am simply saying that a change like this affects you much more than it does me. I can patch it myself, or customize the default behaviour, whereas you have to live with the support of it at a much larger scale.

Maybe I am also saying that I don't want to support the possible fallout of such changes in the community, but that's self-evident. Do leave any paranoid thoughts out of it please, not good for the heart :slight_smile:

I suggest we now Fedex beverages to each other.

note: may have been spending last several years building up immunity to iocaine powder :slight_smile:

Now I feel stupid :slight_smile:

https://github.com/ansible/ansible/commit/fedfd187749654105a22be20c27e0050bc722d0a

But yes, that's the logic we (I guess) copied however I didn't see the setup module exposing this. So what confused me here is that this is piggy-backed by the facts-gathering logic, rather than part of the setup module as I was used to.

So in my opinion, a call to setup (through an action) should expose this as well. My bad :slight_smile:

Ok, so this pull-request implements the setup module how it should have been from the start:

     https://github.com/ansible/ansible/pull/6278

No special magic from the playbook/__init__.py, and it behaves like other facts modules, especially when run as a task.

I hope this satisfies all parties

https://github.com/ansible/ansible/pull/6508

Perhaps some assignment of play.gather_facts to the effective boolean, if None, might simplify things?

That way any code in question that might refer to it can just check play.gather_facts and get the mode.

But yeah, this config setting is, I think, exactly the way to do it to make the system self documenting.

Perhaps some renames though?

always => implicit
never => explicit
smart => smart

never might tend to imply gather_facts might be ignored.  Some of this could also be dealt with in comments in the ansible.cfg though

(Also, green bikesheds!)







 

I'll make a resubmit with the changes ... but the font is going to be
yellow!

I changed the settings, but kept the trinary boolean (I know it bugs you) as I think it is the simplest implementation.

We’ll take a look.

It’s probably better if we can get this to have predicate methods like “play.should_gather_facts()” or something if we want to escape the need for having to repeat test-logic for it in the code every time this field is accessed. But that can wait.