YAML formatting best practices - tasks with many params

In the documentation, and for terseness, many task examples are given like:

  - apt: pkg=package_name state=installed update_cache=yes

However, for version control, legibility, etc., it’s usually preferred (at least as far as I’ve seen) to use a parameter-per-line approach, and there are two basic ways to do this with normal YAML syntax. First, with the YAML folded-scalar indicator (>) and the parameters formatted the same (key=value):

  - apt: >
      pkg=package_name
      state=installed
      update_cache=yes

And second, using a normal YAML collection:

  - apt:
      pkg: package_name
      state: installed
      update_cache: yes

To my eye, both are valid approaches—with the edge towards normal YAML collections, since syntax highlighting will allow for the keys to be highlighted as such, and values will be in the normal string/bool/constant mode. So visually, I lean towards the second option.

Going further, though, how would you handle tasks that use command or shell (or something similar), where the main portion of the task is not a key-value set, but just a given parameter?

  - command: >
      /usr/bin/executable --option 1
      creates=/some/path/here
      chdir=/home/johndoe

I know there’s the option of adding an ‘args’ parameter and splitting out creates/chdir/other options into another separate collection, but that seems like an anti-pattern to me. Additionally, you would still have the command itself, which I like having on its own line for clarity’s sake:

  - command: >
      /usr/bin/executable --option 1
      args:
        creates: /some/path/here
        chdir: /home/johndoe

How do you deal with key-value pairs in your tasks? What is the preferred and/or most used method? From what I’ve seen, it’s usually a bit of a hodgepodge, and there are still many playbooks, roles, and examples out there with one-line tasks which are impossible to read/maintain unless extremely simple.

In the documentation, and for terseness, many task examples are given like:

  - apt: pkg=package_name state=installed update_cache=yes

However, for version control, legibility, etc., it's usually preferred (at
least as far as I've seen) to use a parameter-per-line approach, and there
are two basic ways to do this with normal YAML syntax. First, with the YAML
folded-scalar indicator (>) and the parameters formatted the same
(key=value):

This has not been the case.

  - apt: >
      pkg=package_name
      state=installed
      update_cache=yes

I don't care for this form, as if you have to pass structured data, you
need to go to what you have next:

And second, using a normal YAML collection:

  - apt:
      pkg: package_name
      state: installed
      update_cache: yes

This is ideal for longer line things.

To my eye, both are valid approaches—with the edge towards normal YAML
collections, since syntax highlighting will allow for the keys to be
highlighted as such, and values will be in the normal string/bool/constant
mode. So visually, I lean towards the second option.

Syntax highlighting is not just a question of YAML, but also the variables,
if you want that.

This is why someone, for instance, wrote a Sublime Text plugin.

Going further, though, how would you handle tasks that use `command` or
`shell` (or something similar), where the main portion of the task is not a
key-value set, but just a given parameter?

  - command: >
      /usr/bin/executable --option 1
      creates=/some/path/here
      chdir=/home/johndoe

Keeping them on one line is generally common.
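
As a rough sketch only, the one-line form of the task quoted above would
look something like this. It reuses the command and options from that
example and assumes the free-form key=value parsing of the command module
in the Ansible versions discussed in this thread:

```yaml
# Command and its options on a single line (free-form key=value arguments).
- command: /usr/bin/executable --option 1 creates=/some/path/here chdir=/home/johndoe
```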

I know there's the option of adding an 'args' parameter and splitting out
creates/chdir/other options into another separate collection, but that
seems like an anti-pattern to me. Additionally, you would still have the
command itself, which I like having on its own line for clarity's sake:

  - command: >
      /usr/bin/executable --option 1
      args:
        creates: /some/path/here
        chdir: /home/johndoe

This is an imaginary non-syntax, because you're mixing a string argument
with a hash, but I think that's what you meant.

Args is only there for legacy support, and no longer used.

How do you deal with key-value pairs in your tasks? What is the preferred
and/or most used method? From what I've seen, it's usually a bit of a
hodgepodge, and there are still many playbooks, roles, and examples out
there with one-line tasks which are impossible to read/maintain unless
extremely simple.

All are valid, though the form you have with ">" is less desirable than
passing a dictionary when you are already breaking things up into multiple
lines.

In the documentation, and for terseness, many task examples are given like:

  - apt: pkg=package_name state=installed update_cache=yes

However, for version control, legibility, etc., it’s usually preferred (at least as far as I’ve seen) to use a parameter-per-line approach, and there are two basic ways to do this with normal YAML syntax. First, with the YAML folded-scalar indicator (>) and the parameters formatted the same (key=value):

This has not been the case.

To each his own :) - but I’ve seen multiline more often than not, especially when tasks have 3+ parameters. Usually there’s a mix, but it’s hard to digest a task with 6+ parameters on one line.

  - apt: >
      pkg=package_name
      state=installed
      update_cache=yes

I don’t care for this form, as if you have to pass structured data, you need to go to what you have next:

And second, using a normal YAML collection:

  - apt:
      pkg: package_name
      state: installed
      update_cache: yes

This is ideal for longer line things.

Makes sense, and I will be moving over most of my playbooks to this form (I used to use the > method more often, since I would type out the task on one line, then break it up by parameter afterwards… now I’m starting with parameter-per-line and it’s easier to use this method).

To my eye, both are valid approaches—with the edge towards normal YAML collections, since syntax highlighting will allow for the keys to be highlighted as such, and values will be in the normal string/bool/constant mode. So visually, I lean towards the second option.

Syntax highlighting is not just a question of YAML, but also the variables, if you want that.

This is why someone, for instance, wrote a Sublime Text plugin.

That’s the plugin I’m using, and it makes --syntax-check almost redundant, since it makes any YAML-specific errors pretty glaring. Especially for someone newer to YAML, I’d highly recommend finding a way to write it with syntax highlighting.

Going further, though, how would you handle tasks that use command or shell (or something similar), where the main portion of the task is not a key-value set, but just a given parameter?

  - command: >
      /usr/bin/executable --option 1
      creates=/some/path/here
      chdir=/home/johndoe

Keeping them on one line is generally common.

I know it’s best to find ways to not have to use extra options with commands/shell commands, but it seems I always have to drop down to that level once or twice for a given playbook, and it’s easier for me to digest what’s going on if it’s on multiple lines (not saying it’s best, or everyone should do it…). But for consistency’s sake, I can see just using one-line here.

I know there’s the option of adding an ‘args’ parameter and splitting out creates/chdir/other options into another separate collection, but that seems like an anti-pattern to me. Additionally, you would still have the command itself, which I like having on its own line for clarity’s sake:

  - command: >
      /usr/bin/executable --option 1
      args:
        creates: /some/path/here
        chdir: /home/johndoe

This is an imaginary non-syntax, because you’re mixing a string argument with a hash, but I think that’s what you meant.

Args is only there for legacy support, and no longer used.

Ah, didn’t know that—it’s currently displayed as one of the examples on the command module docs page (http://docs.ansible.com/command_module.html) — should I open a PR to remove that example?

How do you deal with key-value pairs in your tasks? What is the preferred and/or most used method? From what I’ve seen, it’s usually a bit of a hodgepodge, and there are still many playbooks, roles, and examples out there with one-line tasks which are impossible to read/maintain unless extremely simple.

All are valid, though the form you have with “>” is less desirable than passing a dictionary when you are already breaking things up into multiple lines.

True!

Thanks for the input,
Jeff Geerling

In the documentation, and for terseness, many task examples are given
like:

  - apt: pkg=package_name state=installed update_cache=yes

However, for version control, legibility, etc., it's usually preferred
(at least as far as I've seen) to use a parameter-per-line approach, and
there are two basic ways to do this with normal YAML syntax. First, with
the YAML folded-scalar indicator (>) and the parameters formatted the same
(key=value):

This has not been the case.

To each his own :) - but I've seen multiline more often than not,
especially when tasks have 3+ parameters. Usually there's a mix, but it's
hard to digest a task with 6+ parameters on one line.

My feeling is, in this day of widescreen monitors and laptops, there's
plenty of room in nearly all cases, and 79 character line wrap is obsolete.

Making more concise playbooks makes them easier to read and skim, rather
than things being several pages long.

I do believe in significant use of whitespace between lines, giving every
task a "name:" attribute, and things like that.

Ah, didn't know that—it's currently displayed as one of the examples on
the command module docs page (http://docs.ansible.com/command_module.html)
— should I open a PR to remove that example?

I think this is the confusion: your args is not indented at the right
level. Basically, if you move it back to the level of command, it would be
correct and ok.
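
For anyone following along, a minimal sketch of what that might look like,
with args moved out of the ">" block and back to the level of command. The
task name is made up for illustration; the command and paths are the ones
from the example earlier in the thread:

```yaml
# args is a sibling of command here, not part of the command string.
- name: run the executable unless its output already exists
  command: /usr/bin/executable --option 1
  args:
    creates: /some/path/here
    chdir: /home/johndoe
```

The command stays on its own line as a plain string, and creates/chdir are
passed as a normal YAML collection.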

To throw in a third option, this is often what I do:

- name: ensure .ssh directory present
  file: path={{ home_dir }}/.ssh state=directory mode=0700
        owner={{ file_owner }} group={{ file_owner }}

No ">" is needed if you indent like this. I like to hard wrap at 80 chars
for easier diff reading but one line per param is a bit overkill for my
taste.

There’s plenty of room for argument there, though… I work primarily on an 11" MacBook Air and a 9.7" iPad. At my desk, I will hook the Air up to a 24" monitor, but I still have 3-5 windows on the display, and like being able to stack at least two windows side-by-side, meaning I get a max of maybe 120 characters comfortably.

There’s that (anecdotal evidence, of course), and the fact that most languages discourage placing multiple statements on one line (Python’s own PEP 8 style guide states “Compound statements (multiple statements on the same line) are generally discouraged.”, and the Linux kernel coding style prohibits it).

There are other good reasons, too:

  • Easier to read, and less chance that future you/other developer would glance over an important variable when debugging.
  • Better for VCS, since each line diff is highlighted (and better support in diffing software for line-by-line diff than intra-line diff highlighting).
  • Less error-prone, and easier to maintain (need to nix a param just dd/Ctrl-K the line and that param is gone).

This argument is more philosophical than practical in some ways, but in my experience, splitting things to multiple lines and breaking up task lists into short playbooks (usually < 100 lines per playbook) makes it easier for me to jump back into something I haven’t touched in months and debug/rework it, and for me to be able to see differences more easily in GitHub PRs.

But Ansible playbooks are not Python, nor do we assume folks will need to know Python.

Further, these are not statements, but parameters, which are commonly if not always placed on one line.

But Ansible playbooks are not Python, nor do we assume folks will need
to know Python.

Sure; I think that was just an example of a coding style that recommends
short lines (along with the Linux kernel coding style).

FWIW, I think the diff argument is pretty compelling: I find it a lot
easier to read a diff of some parameter changes when there's one parameter
per line rather than one long line full of parameters. Maybe my archaic
line-oriented usage of diff is as archaic as my eighty-column terminal
windows, and all the cool kids these days are using color-highlighted
diffs that make it trivial to spot the one word that's changed between two
long lines of text, but I'm not, and suspect many aren't.

All that said, this is just a style question, right? In which case I think
it's more important that the Ansible-using folks on your team agree on a
style, than that all Ansible users everywhere agree on one.

                                      -Josh (jbs@care.com)


Yep, style is a team thing.

Part of this discussion was because Jeff is writing a book. I would prefer if we didn’t advocate that simple key=value on one line was a bad practice in a book when it conflicts with the core docs, or that alternatively we should encourage structured args versus splitting on a line, since there’s going to be a point where that’s needed.

The line-oriented argument is compelling if you want to see things in context, though I think in many cases people will be able to tell from the diff easily enough anyway.

Ansible itself was intended to be easily diffable because of not using explicit dependencies and being order based and things like that.

Ultimately it’s “what works for your team”, absolutely.

Exactly why I’m asking here :)

I want to present a style that I use and like, but make sure to not say “this is the Ansible way”, but rather, “here are the common ways tasks are written… I prefer this style (whatever it turns out to be… I’m still not completely sold on any one way as a general rule) for reasons X, Y, and Z, but if you find that one of these other methods works better for you or your team, please use it. The important thing is to use a consistent style that helps you be productive and write effective playbooks…”

Something along those lines, but basically, since there is no official style guide for Ansible (just examples in docs and the examples repos), and since I see about a hundred different styles in the wild, I want to be comprehensive but not lead people astray.

Also, one other thing I forgot to mention, LeanPub in particular, and most other publishers as well, has a hard limit of 78-81 characters per line for an 8x10 book, so parameter-per-line works out for that nicely. Otherwise I’d have arbitrary line breaks all over my code examples, and that makes the book incredibly difficult to read.

Until book layouts are fully fluid, that’s going to be an issue with any published work.

-Jeff

No 4px fonts?

Haha, believe me, I’ve tried a lot of tricks to see how much I could fit on a line :)

I used the second style presented by Jeff, which involved the apt: > format. At one time (I don’t know how long ago) this was in the Ansible docs as a coding style.

Even with the widest screens available it is very difficult to read everything on one line. For version control as mentioned it is also a lot easier to break it up.

Take for example this playbook of mine: https://github.com/protobox/protobox/blob/master/lib/ansible/applications/wordpress/tasks/application.yml

It would be difficult to read a lot of those on a single line. The drawback is that this approach recently broke in Ansible 1.6-1.7ish and took a week or more to be resolved. This required some hacking and backporting to an earlier version to get working again. Luckily this was fixed and is now unit tested, so it should not happen again. However, it is something to be aware of: there are more official “syntaxes” that are supported by Ansible.

Overall I wish the project had a coding style guide that helps show different usages for teams to decide. I for one did not know about the 3rd format which is apt: pkg:.

At this point (and likely for some time, maybe forever?) there is no formal specification/best practices listing for Ansible, and that’s both a blessing and a curse. It’s nice to say “this way is the best”, but it also means that there’s less freedom to use a style that fits your needs.

I’ll be writing up a blog post soon (and adding the info to Ansible for DevOps in a Best Practices appendix) summarizing these different styles, and showing why I use the one I use—but also mentioning why it is probably not for everyone.

I think the discussion in this thread’s been incredibly helpful, and I’m glad Michael gave some of his reasons for preferring the params-on-one-line approach. I think it also has to do with how much time an individual spends more on the ‘dev’ side or the ‘ops’ side of the devops equation—it seems to me people with more of an ops/get-it-done approach favor terseness, while people more on the dev/purist side prefer rigid syntax, multiline for VCS/diff ease, and structure rather than freedom. (All generalizations, I know… but that’s my experience :).

Now, how about we argue about ordering of parameters?

apt: pkg=name state=installed update_cache=yes

vs.

apt: update_cache=yes pkg=name state=installed

(Hopefully the sarcasm is obvious here… but people have spent a lot of time bikeshedding on things like alphabetical ordering of CSS properties, and we don’t need to bog down adoption of Ansible by being too pedantic about YAML syntax :).

-Jeff

My hope was not for more of a standard “ansible way”, but for a list of the available options for those of us who have only worked with YML because of ansible. Until this post, I did not know that some of the options existed. With the number of contributors to the project, I hope I am not alone in being newer to YML. Another benefit of this is we know what formats are supported by ansible. It took me a lot of guesswork and research to discover I could put them on multiple lines in the first place. It took me another 2 years to realize I don’t have to use apt: >.

These are all arguments for using full YAML instead of the k=v style.
Not only do you get the benefits of multiple lines, but you also
preserve the type of the arguments whereas in the k=v style everything
is a string.
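
To make the type point concrete, here is a small sketch. The package names
and the cache_valid_time value are assumptions picked for illustration, and
state: present is used in place of the installed value from the earlier
examples:

```yaml
# k=v style: every value reaches the module as a string.
- apt: pkg=git,curl state=present update_cache=yes cache_valid_time=3600

# Full YAML: the same arguments, with their types preserved.
- apt:
    pkg:                    # a real YAML list, not a comma-separated string
      - git
      - curl
    state: present          # string
    update_cache: yes       # boolean (YAML 1.1 treats yes/no as booleans)
    cache_valid_time: 3600  # integer
```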

I agree. I should say I have never learned full YAML and how that relates to the Ansible way of setting up the arguments. Everything I have picked up was from different playbooks and examples from around the web. That’s why it would be great to have a generic use case document that Ansible could point to for conventions and why it is good to do it that way. I imagine a blog post would suffice so I am looking forward to Jeff’s post.

Hopefully I can re-work all my playbooks soon.

I’ve posted the blog post: YAML best practices for Ansible playbooks - tasks

Only proofread once, so please let me know if you find any points of contention or outright lies!

-Jeff

Super easy to see this discussion bikeshed.

Our policy here is to show examples and let people learn by example, without having to teach YAML.

You can use YAML and not really know all of YAML – and that’s awesome.

For instance, we shy away from Anchors.
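
For anyone who has not run into them, this is roughly what anchors look
like. It is a plain YAML sketch rather than an Ansible task, and the key
names are made up for illustration:

```yaml
# "&" defines an anchor named apt_defaults on this mapping.
apt_defaults: &apt_defaults
  state: present
  update_cache: yes

# "*" refers back to the anchor; "<<" (a YAML 1.1 merge key) merges its keys in.
git_package:
  <<: *apt_defaults
  pkg: git
```

It is compact, but a reader has to know the anchor and merge syntax to
follow it, which is a good illustration of why examples here avoid it.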

I think that’s enough on this topic and would wish people stay away from declaring what a “Best Practice” is.

Jeff’s article seems to scare me away from Ansible by making it seem there are too many rules.

This is why I strongly believe in examples, and just showing what it is - exactly the way you learn to speak.

You can skip what a dangling participle is, etc.