I don't object to finding other solutions, but I felt I was stumbling trying to get the point across. The use case is being able to make task-by-contract guarantees:
A) the environment of the task is configured correctly and the task will always succeed,
B) the environment is misconfigured and the task fails before doing anything,
C) the environment is mostly configured, but the task knows how to fill in the blanks.
Each of these is checked/applied a priori, before task execution proper. "Always", in this context, means that if the task does fail, it won't have been because of configuration. In particular, some of the tasks require complex data structures, and those structures need to be recursively validated against the same three goals.
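To make the contract concrete, here is a minimal sketch of the shape of such a check (check_environment, the spec mapping, and the REQUIRED sentinel are my own illustrative names, not an existing Ansible interface):

    # Sentinel marking fields for which no sensible default exists.
    REQUIRED = object()

    def check_environment(params, spec):
        # Run a priori, before the task proper:
        #   A) everything supplied       -> params pass through unchanged
        #   B) a REQUIRED field missing  -> fail before doing anything
        #   C) an optional field missing -> fill in the blank from spec
        effective = dict(params)
        for field, default in spec.items():
            if effective.get(field) is None:
                if default is REQUIRED:
                    raise ValueError("misconfigured: %r has no value and no default" % field)
                effective[field] = default
        return effective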
At issue is goal C. For scalars, the existing precedence rules suffice: either the parameter has a user-supplied value, or it takes on the role default, if one is specified.
However, for complex data types, this is not the case.
If X is a dict specified by the user, having fields {x=a, y=b, z=None},
and
if Y is a template for our expectation of X, having fields {x=a', y=b', z=c'},
then
we would like the effective value of X, as seen by the play and its tasks, to be X' = {x=X.x, y=X.y, z=Y.z}, i.e. {x=a, y=b, z=c'},
thereby specifying piece-wise defaults for the complex data X.
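Concretely, the merge I have in mind looks like this (a minimal sketch; apply_defaults is my own name, not an Ansible API, and I treat None the same as a missing key):

    def apply_defaults(user, template):
        # Recursively fill missing/None fields of the user value from the
        # template; user-supplied values always take precedence.
        if user is None:
            return template
        if isinstance(user, dict) and isinstance(template, dict):
            merged = dict(user)
            for key, default in template.items():
                merged[key] = apply_defaults(user.get(key), default)
            return merged
        return user

    X = {'x': 'a', 'y': 'b', 'z': None}
    Y = {'x': "a'", 'y': "b'", 'z': "c'"}
    assert apply_defaults(X, Y) == {'x': 'a', 'y': 'b', 'z': "c'"}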
My method of achieving this is a custom module, 'requirements', which provides all three guarantees. It particularly addresses the corner case of C for complex data by modifying self.runner.vars_cache[host]['X'] (in keeping with our example). The alternative, returning {ansible_facts: {X: {x: a, y: b, z: c'}}}, would only update the setup cache, and subsequent tasks would not see the returned value, because the user-supplied value in the vars_cache would take precedence when the two are combined to create the environment for subsequent tasks. X would therefore retain its original value, and X.z would still be None when accessed.
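For reference, a condensed sketch of the relevant part of the action plugin (this assumes the Ansible 1.x action-plugin interface; error handling is omitted, and apply_defaults and the template Y are carried over from the sketch above):

    from ansible.runner.return_data import ReturnData

    class ActionModule(object):
        def __init__(self, runner):
            self.runner = runner

        def run(self, conn, tmp, module_name, module_args,
                inject, complex_args=None, **kwargs):
            host = inject['inventory_hostname']
            # apply_defaults and the template Y are as in the sketch above.
            effective = apply_defaults(inject.get('X'), Y)
            # Write the merged value into the vars_cache so that it wins when
            # the environment for subsequent tasks is assembled; returning it
            # as ansible_facts would only update the setup cache, which the
            # user-supplied value would still shadow.
            self.runner.vars_cache.setdefault(host, {})['X'] = effective
            return ReturnData(conn=conn, result=dict(changed=False))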
Regarding side effects: this largely depends on the lifetime of the vars_cache. For my particular purpose, the ideal lifetime for these modifications to the environment is until the end of the play, including any included tasks. If the vars_cache outlives the play, then there could be unintended consequences, but, so far, that doesn't appear to be the case. My rationale here is that any play invoking the 'requirements' module is imposing constraints for that play, and not for the playbook as a whole. Were the piece-wise defaults to live beyond the play, there might be issues, though the key to defending against that scenario is rational variable naming (e.g. prefixes), which I do.
It is certainly possible to restructure the tasks so that they do not require complex data, but it is generally impractical to do so when tasks may require lists of dozens to hundreds of items (think of manually unrolling a loop over 0-99: X_0_a, X_0_b, X_0_c, X_1_a, X_1_b, ... X_99_a, ...). The alternative, placing testing and defaulting logic throughout all of the various consumers of that data, is inherently a bad idea; it detracts from the cohesiveness of the design. For the occasional parameter, it's not a big deal, but structures that are used heavily in this mode will make the playbooks inherently unmaintainable, and tied in time to their origin.
As regards simplicity and doing too much: this feature actually allows for greater design simplicity in tasks, templates, etc. Without having to re-verify or re-default a variable at each invocation, the logic becomes cleaner and visually comprehensible. Not only that, it communicates expectations in a cogent and sensible way. By counterexample, having a key task fail during a rolling upgrade of 1000 machines because X.z was undefined when accessed in a template is not the optimal way to communicate misconfiguration to the end user, especially when it was known before machine 0 was touched that there would be a problem. More importantly, though, we may have something sensible to do to touch up X.z before execution (rule C). The important point is that, in order to hit all three bases, we need the ability to incrementally modify X such that X is neither the user-supplied value nor the default value, but a combination of the two. Insisting that X.z is defined up front is unnecessarily harsh when a sensible default exists; insisting that X is either entirely user-supplied or entirely default-supplied is unnecessarily inflexible. The middle ground lies squarely in the corner case of C.
I could also imagine a number of other scenarios where a task may want to override the user- (or inventory-, etc.) supplied values. Here is one:
In a large organization, administrative and non-administrative users use Ansible playbooks to accomplish certain tasks. The playbooks, designed by the administrator, include some auditing tasks in certain plays. These tasks, for whatever reason, depend on parameters that the administrator does not want to allow the users to modify (credentials, logtime, etc.). Said users modify them anyway, introducing errors into the audit trails. Admittedly, this example is contrived, and there are better ways to avoid such issues, but it nonetheless illustrates a viable scenario in which it may be desirable to mandate the content of the effective environment for subsequent tasks (i.e. don't allow user J to modify "audit_username").
So the question is not whether a particular case can be shoehorned into an Ansible-idiomatic way of doing things, because that is almost always possible, modulo some hoop-jumping. The question is how we can expand the Ansible-idiomatic way of doing things to reduce the hoop-jumping. If the answer is six lines of code, then I find it hard to argue against the position (of course, there may be side effects and other concerns, but someone familiar with the code base could illuminate those).
In any event, sorry for the wall of text, and I really do appreciate your help.
Regards,
Chris