What to do about PyYAML

What’s the Ansible project’s position wrt the upstream PyYAML project if it remains unreceptive to important issues (#234) and pull requests (#842)?

With the deprecation of the community.general.yaml callback, and with to_nice_yaml() and callback_result_format=yaml still hobbled by PyYAML’s inability to produce aesthetically acceptable results, the experience of using Ansible is suffering.

PyYAML’s output is the worst kind of “technically correct”: it’s valid while being an utterly undesirable aesthetic eyesore. In our shop, where Ansible is practically the reason for YAML’s existence, this disconnect is jarring, even a bit embarrassing.
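For readers who have not run into this, here is a minimal sketch of the kind of difference being discussed. It assumes the complaint concerns how block sequences are indented under mapping keys, which the thread itself does not spell out. The `IndentedDumper` class and the sample `data` are hypothetical names made up for illustration; the `increase_indent` override is a widely shared user-side workaround for the pure-Python emitter, not something PyYAML or Ansible offers as an option.

```python
import yaml  # PyYAML, assumed to be installed


class IndentedDumper(yaml.SafeDumper):
    """Hypothetical subclass that never emits 'indentless' block sequences."""

    def increase_indent(self, flow=False, indentless=False):
        # Ignore the emitter's request for an indentless sequence so that
        # list items are indented under their parent mapping key.
        return super().increase_indent(flow, False)


# Made-up sample data shaped loosely like a playbook fragment.
data = {"tasks": [{"name": "install", "tags": ["a", "b"]}]}

# Stock PyYAML behavior: "- " items start in the same column as the parent key.
default_text = yaml.safe_dump(data, default_flow_style=False, sort_keys=False)

# Workaround: same data, but items are indented beneath the parent key.
indented_text = yaml.dump(
    data, Dumper=IndentedDumper, default_flow_style=False, sort_keys=False
)

print(default_text)
print(indented_text)
```

Both strings load back to the same data; only the layout differs. The override applies to the pure-Python emitter only, so it does nothing for the libyaml-backed dumpers.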

As the author of the tickets you link and one of the ansible-core maintainers, I don’t think my position matters much; the position of the PyYAML maintainers does.

I would comment on the tickets, both as a reminder that they exist and to show how much this matters to users.

What’s the Ansible project’s position wrt the upstream PyYAML project if it remains unreceptive to important issues (#234) and pull requests (#842)?

They should all be taken out back and beaten severely. :upside_down_face:

The Ansible Core team effectively are the maintainers of PyYAML (more specifically, me). While I wish I had more time to put into it, PyYAML’s current place in the Python/YAML ecosystem seems to be “boring but stable”, and for what Ansible (and most other folks) need from it, that’s okay. It’s a creaky ~20-year-old codebase that’s not very friendly to drive-by contributions, and the swift appearance of pitchfork-wielding masses at the gates on the rare occasions we do have to break stuff in it doesn’t increase my desire to accept most change requests, since every change breaks someone’s workflow. Nearly every “simple” change like the ones you point to has unseen consequences or needs to be implemented in more than one place to maintain consistent behavior between the libyaml and pure-Python implementations, and tested accordingly. Even for someone who’s very familiar with the codebase, evaluating those changes and ensuring that they’re well-tested can be very expensive, and comes at the cost of other stuff we need to do in Ansible itself.

I already go to bed nearly every night feeling crappy about how much more I wanted to get done on Ansible, and I feel guilty that we can’t do more to improve PyYAML, but spinning up new maintainers on that codebase is a mammoth undertaking. :frowning:

Um, no. I see the emoji, but there’s a serious point to be made here. Regardless of how the project got to this point, any punishment (either externally administered or self-inflicted) is a pointless distraction.

Step zero: Stop feeling guilty. There is nothing personal about this. It’s just software. It does a thing; it doesn’t do some other thing. It’s hard to separate yourself from a project you’ve spent years on. But it isn’t you and you aren’t it. If software could care, it should be grateful for the hours you’ve spent on it. But software cannot care, and as painful as it is to admit, Bill Murray was right: It Just Doesn’t Matter. Of course it would be great if you could do all those things you imagine, but you can only see that horizon because you’ve gotten to where you are now, and that’s nothing to feel crappy or guilty about.

That’s why I started my initial post with the focus on the Ansible project’s position on a particular upstream project rather than that project itself. [Um, that is, before I got all snippy and too clever and sort of lost the thread a bit there]. I don’t particularly care whether PyYAML somehow starts doing what I want. What I really want to know is Ansible’s direction. I can see three ways forward:

  1. For the time being, “boring but stable” is good enough for Ansible. We’ll take whatever PyYAML can offer and otherwise get by without.
  2. Somebody steps up and figures out how to make PyYAML accept parameters to control the currently missing aspects of formatting without breaking current behavior. (Medium term I expect this is the most desirable option.)
  3. The current “boring but stable” PyYAML becomes feature frozen in place and a new PyYAML2 project comes along – either a major rewrite or from scratch – that’s easier to maintain and provides the features PyYAML can’t. Then Ansible switches to that as its upstream YAML monger.

What isn’t an option for the Ansible project is to ignore the issue; that’s choosing #1. Ansible is a major software project that currently can’t output YAML in the same format its own input canonically uses, and that its users expect to see. That ought to change.

Unless the output is syntactically invalid, I’m honestly not sure this is something we should care too much about. It sucks that it doesn’t look the way people may want it to and that it cannot be controlled, but this is quite far down the list of things I think we need to worry about.

There are a few other YAML libraries out there that plugins could be written to use, but the number one priority for YAML in Ansible is to ensure that it doesn’t break things and that it doesn’t slow things down.
