According to our currently published release schedule, we wanted to release Ansible 12.0.0rc1 this week, and Ansible 12.0.0 next week (if no blocker was found by the end of this week).
(Note that originally, we would also have gotten ansible-core 2.19.1 in two weeks, but that has now been moved back by one week. Lucky us, I guess…)
In any case, we now have to figure out how to proceed. There are basically two courses of action (correct me if I missed others):
1. Continue with 12.0.0rc1 next week and try to get 12.0.0 GA out soon, i.e. the week after. This means that ansible.netcommon might get fixed in time, but there might also only be a workaround, possibly one that users have to apply manually (“known issue” in the changelog/porting guide).
2. Wait with 12.0.0rc1 until ansible.netcommon is fixed. The downside is that this currently looks like it requires ansible-core 2.19.1, and thus a) it will take some more weeks, and b) it will bring Ansible 12 and ansible-core 2.19 out of sync (in sync means that ansible-core 2.19.x is followed a day later by Ansible 12.x.0).
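To make the "in sync" convention above concrete, here is a minimal Python sketch. The function name `synced_ansible_version` is made up for illustration and is not part of any release tooling; it only encodes the rule that ansible-core 2.19.x is followed by Ansible 12.x.0.

```python
def synced_ansible_version(core_version: str) -> str:
    """Return the Ansible 12 release that would follow ansible-core 2.19.x
    under the "in sync" convention (hypothetical helper, for illustration only)."""
    major, minor, patch = (int(part) for part in core_version.split("."))
    if (major, minor) != (2, 19):
        raise ValueError("this sketch only covers ansible-core 2.19.x")
    return f"12.{patch}.0"


# Option 2 would likely ship Ansible 12.0.0 together with ansible-core 2.19.1,
# which no longer matches the convention:
in_sync = synced_ansible_version("2.19.1") == "12.0.0"
```

Under these assumptions `in_sync` ends up `False` for option 2, which is the "out of sync" downside mentioned above.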
I’m personally in favor of 1. Many collections included in Ansible 12 work fine with ansible-core 2.19 (at least as far as we currently know), and delaying everything just because some collections didn’t test in time is unfair to the ones who did make sure that they work. Also, the faster Ansible 12 is out, the faster we’ll get more feedback for all collections, in case more problems are lurking somewhere that haven’t been found yet…
My vote goes to the first option: 12.0.0rc1 next week and 12.0.0 GA the week after. Releasing this way gives us a chance to accumulate feedback for collections, and it also keeps the core and community package versions in sync.
I really don’t want to release GA with multiple broken collections, especially a foundational one that other collections depend on. There’s a good chance that people who don’t pin dependencies will be broken by this. Also, is it possible that the release of netcommon that adds Core 2.19 support could also come along with breaking changes that would be problematic to ship after GA?
Having the version numbers out of sync seems to me like a much less significant issue than releasing Ansible 12 GA with major issues like this. Having the version numbers in sync is never something we’ve promised anyways, and at various points in the past, they have diverged.
Part of the value we provide with the community package is making sure that collections follow the Collection Requirements and that they are compatible with the Core version included in Ansible. Releasing a broken GA goes against this in my opinion.
I really don’t want to release GA with multiple broken collections, especially a foundational one that other collections depend on.
That’s a very good point. Ansible Community 12 will break almost every network automation setup out there until netcommon is fixed. Not just some, not over half, but nearly 100% of network automation workflows that use Ansible will break if they set up Ansible Community 12. Even those that use Jinja (which my testing showed is working) are affected: when we try to upload the generated configs via the network modules, they’ll almost certainly hit the netcommon issues.
Fortunately people aren’t likely to upgrade to 12 as soon as it’s released (network folks tend to stick with platforms until they’re kicked off), but network automation is a significant part of the Ansible use case.
And most networking people (AFAIK) install Ansible via pip3 install ansible, which currently installs 11. In my opinion it should stay that way until netcommon is fixed, or at least give it some more time to get fixed.
If we extend the Ansible 12.0.0 pre-release cycle by an indeterminate amount of time to give ansible.netcommon a chance to get fixed (with potentially new ansible-core releases as well), I think we should also allow new features for other collections and re-feature-freeze later. (I wouldn’t allow new major releases in general though, unless they are needed for some reason.)
Otherwise the collection versions included in 12.0.0 will be quite outdated for some collections which did release new features since July 22nd.
There are a lot (I don’t have the exact number) of collections that depend on ansible.netcommon. Since the release of Ansible Core 2.19, we’ve already had multiple bug reports about this broken collection.
So I vote we extend the Ansible 12 release.
I think we should also allow new features for other collections and re-feature-freeze later. (I wouldn’t allow new major releases in general though, unless they are needed for some reason.)
I don’t have a strong view for or against this. Though whatever we decide, let’s consider what this means for future releases.
I think that’s a good idea. I am also hesitant to hold back all the other collections while we’re waiting on netcommon. I guess we also might need to allow updates to other network collections that haven’t been able to test with Core 2.19 due to netcommon being broken to be able to pull in other fixes, so it’s probably easier to just remove --feature-freeze for now and allow semver-compatible updates from all collections.
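For anyone wondering what "semver-compatible updates" would mean in practice compared to the frozen state, here is a rough Python sketch. Both function names are hypothetical, the version numbers in the usage lines are made up, and I am assuming that feature freeze means "bugfix releases only"; the sketch also ignores pre-release versions.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a plain "X.Y.Z" version string into integers (no pre-release handling)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_semver_compatible_update(current: str, candidate: str) -> bool:
    """Allowed without feature freeze: same major version, not a downgrade."""
    cur, new = parse_version(current), parse_version(candidate)
    return new[0] == cur[0] and new >= cur


def is_allowed_under_feature_freeze(current: str, candidate: str) -> bool:
    """Assumption: under feature freeze only bugfix releases (same X.Y) are allowed."""
    cur, new = parse_version(current), parse_version(candidate)
    return new[:2] == cur[:2] and new >= cur


# Hypothetical collection currently included at 5.1.0:
minor_ok_when_unfrozen = is_semver_compatible_update("5.1.0", "5.2.0")
minor_ok_when_frozen = is_allowed_under_feature_freeze("5.1.0", "5.2.0")
major_ok_when_unfrozen = is_semver_compatible_update("5.1.0", "6.0.0")
```

With these assumptions, a new minor release gets in once the freeze is lifted, while a new major release stays out either way.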
All of the following collections use netcommon in some way or another, and most have one or two dozen modules. The most important modules, which almost all of them have, are the config (download/upload, replace configs) and command (issue arbitrary commands) modules.
CloudEngine
CNOS
Dell OS6/9/10
ENOS
Arista EOS
ERIC_ECCLI
EXOS
FRR
ICX
Cisco IOS
Cisco IOS XR
Cisco NXOS
Juniper Junos
NETVISOR
Extreme NOS
Extreme VOSS
RouterOS
SLX-OS
VyOS
Westermo WeOS 4 (never heard of that one)
Network administrators can use either the various modules to modify specific aspects of a configuration (such as arista.eos.eos_vlans, cisco.ios.ios_vlans, junipernetworks.junos.junos_ospf), or they render a configuration in the native configuration syntax (such as via Jinja) and then use a variation of the configs module that they almost all have to upload that config. Most network devices are “single configuration file” devices, which means there’s only one file that encapsulates the entire configuration state of the network device, which is really handy (as opposed to servers/hosts, where config state is scattered all over the file system).
Essentially, just about every use case for network administration is broken with 2.19. IMO, 2.19 shouldn’t have been released with netcommon still broken, and if “unreleasing” 2.19 was an option, speaking as a representative of the network world, that’s what I’d go with. But I’m gonna guess putting the toothpaste back in the tube isn’t an option.
For future releases, maybe we could add a more general “use case” set of tests that checks various common use cases for Ansible?
Another thing might be to give responsibility for the network modules to the vendors to maintain. Since Red Hat maintains most of them, it really went under the radar. A few of the network vendors I talked to were surprised about it. And Red Hat didn’t do anything with the netcommon issues that were brought up before 2.19’s release.
We could also think about bifurcating the collections. Network administrators are using the sectional modules (those that affect just VLANs, OSPF, BGP, etc.) less and less, and are instead relying on just two modules from each collection.
Config
Commands
Cisco, Arista, Juniper, etc., all have a variation of the config/commands modules. These I think are what are used the vast majority of the time.
I don’t know exactly what percentage of network administrators use just config and commands versus the other modules, but I’m guessing it’s high (as in, more use config/commands only).
That would make the maintenance and testing burden considerably less for whoever ends up with the responsibility.
@SteeringCommittee @release-managers we need more opinions on how to proceed (do rc1 next week, or continue with b3; and if b3, what about feature freeze?). We basically need to have a decision by Tuesday, which should be based on a large enough consensus. Thanks.
If the netcommon incompatibility with ansible-core 2.19 wreaks havoc with so many other collections, I think we should wait with rc1 for another week. Hopefully, the issues are fixed by then. If we do, I’m with you and think we should release another beta.
I’m not sure about feature freeze, though. Do you mean to allow new minor releases or even new major releases? I’d say minor would probably be fine, but I’m not so sure about major releases.
No worries about being AFK. It’s summer anyway (at least in the northern hemisphere), so quite a few will be away. (So it’s more important to get opinions from the ones who aren’t.)
I don’t think allowing major changes is a good idea (except for emergency reasons, which I hope won’t happen). I can think of three realistic scenarios:
1. Release 12.0.0rc1 on the upcoming Tuesday, and 12.0.0 GA the week after, no matter whether ansible.netcommon gets fixed for rc1. (A fix will likely require ansible-core 2.19.1, which won’t be in 12.0.0 GA anyway in this scenario.)
2. Continue with beta releases until ansible-core 2.19.1 is out and ansible.netcommon is fixed (hopefully in ~2 weeks), then release 12.0.0rc1, and hopefully 12.0.0 GA one week after. This will be in ~3 weeks then. (Two weeks later than the first scenario.) In this scenario, feature freeze will still be in effect, unless we explicitly allow certain collection releases to also contain minor changes (might be necessary for ansible.netcommon).
3. Basically unfreeze Ansible 12 for features (but not for breaking changes) until ansible-core 2.19.1 and a new ansible.netcommon release are out. Then freeze again with a new beta release, followed one week later by 12.0.0rc1, and hopefully one week after that by 12.0.0 GA. This will be at least three weeks after the first scenario.
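The timing of the three scenarios can be roughly sketched as week offsets. This is only a back-of-the-envelope illustration in Python: the week numbers are my reading of the estimates above (counted from the upcoming Tuesday as week 0, with ansible-core 2.19.1 assumed to land around week 2), and names like `SCENARIOS` and `ga_delay_weeks` are made up.

```python
# Approximate week offsets, counted from the upcoming Tuesday (week 0).
# Assumption: ansible-core 2.19.1 and the netcommon fix arrive around week 2.
SCENARIOS = {
    1: {"rc1": 0, "ga": 1},
    2: {"rc1": 2, "ga": 3},
    3: {"refreeze_beta": 2, "rc1": 3, "ga": 4},  # earliest case for scenario 3
}


def ga_delay_weeks(scenario: int, baseline: int = 1) -> int:
    """How many weeks later GA lands compared to the baseline scenario."""
    return SCENARIOS[scenario]["ga"] - SCENARIOS[baseline]["ga"]


delay_2 = ga_delay_weeks(2)  # scenario 2 versus scenario 1
delay_3 = ga_delay_weeks(3)  # scenario 3 versus scenario 1, lower bound
```

All of this shifts if ansible-core 2.19.1 or the ansible.netcommon fix takes longer than hoped.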
I personally tend to 1 or 3, since there are quite a few collections for which 2 means that 12.0.0 will contain a rather old version, missing both bugfixes and features.
I tend to 3. 1 sounds like Ansible 12.0.0 might be unusable, at least in (large?) parts, for the networking community. And your objections to 2 sound reasonable to me.
Would “unfreeze” also mean we can add new collections? I’ve opened “Add ravendb.ravendb to Ansible 11 and 12”, and if it does, we could un-draft and merge it.
@gundalow @anwesha since you expressed opinions here as well, does 3 sound good to you as well? Lacking other RM/SC opinions I’d guess it would be 3 then, assuming that’s also OK for you.
I was inclined towards and voted for option 1 before, because having 12.0.0 with ansible-core 2.19.0 (with a workaround for netcommon) would give us the opportunity to maintain our versioning norm for ansible-core and the ansible package. Here, the release manager in me was overpowering my voice as an end user of the Ansible community package.
I realized that the end user only cares about the quality of the package and if it is working correctly and serving their purposes.
Another reason favoring unfreezing is that it is only fair for all collection maintainers to have the same privilege of adding new features as netcommon.
Ansible 12.0.0b3 was released today with feature freeze disabled, as was discussed. We still need to update the roadmap and put out an announcement about that. For now, there is an extra bolded paragraph in the release announcement forum post about the change (thanks @felixfontein for adding that!).
We need to update the roadmap now that we’re off schedule for Ansible 12.0.0, but I’m not sure what date to put in for the Ansible 12.0.0 RC and GA releases. It would be good to hear from the netcommon and other network collection maintainers (I believe some of these are maintained by Partner Engineering?) on expected timelines, as I don’t think we should delay the release of Ansible 12 indefinitely. Am I correct that the main blocker at this point is Core updates that are expected to be part of 2.19.1?
Releasing 12.0.0 GA with Core 2.19.1 to give the network collections a little more time is fine with me, but I don’t think waiting any longer is a good idea, especially considering the lifecycle of a single Ansible release is already relatively short (6 months).