(sorry for the double reply, I accidentally added the second reply to something I started writing before but didn’t finish… I’ve edited it so it is more or less finished now )
Ok, maybe let’s sum up the current state of the discussion:
- There are some concerns about the size (both in absolute numbers, but also when comparing it to how many
ansible
package users can use these collections); there is currently no agreement on whether this is really a serious problem on its own. - It is hard (or impossible?) to run all sanity tests for these collections, since it takes a very, very long time to run them. (Especially the pylint test takes a huge amount of time, due to the large code size and high complexity of some of the Python files.) There is no agreement whether this is really a serious problem on its own.
- At least for fortinet.fortimanager, a lot of sanity tests fail (around 13,000). This is considered a problem, but it seems that most of the failures are of the same type and it should be easy to add ignore.txt entries for them. (It’s also one of the somewhat debatable sanity tests that fails, namely that all option names should be Python identifiers; I guess the reason they’re not is that the modules use the same names as the product itself uses.)
- My observation, when looking at some of the largest modules, is that they don’t have to be this large. A lot of the size seems to come from huge auto-generated structures that can be stored way more efficiently. Doing that would likely also improve the collection size and the time needed to run the sanity tests.
- Finally there were some comments on the documentation quality of the modules. A lot of the descriptions for at least some of the large modules seem to be auto-generated and not helpful at all. This is something we haven’t really talked about yet.
I agree, we should define the behaviors or outcomes that are problematic. Size may be a cause or contributing factor to some of those.
It’s a bit like alerting on CPU or memory usage: it’s not a very useful metric on its own, it’s usually a proxy for some effect like your application responding to requests too slowly (which is the actual metric).
Yeah that makes sense.
I’m a little less sure about this one. Very hard to measure usefulness, and hugely subject to bias. I agree there needs to be something along these lines, not sure what it looks like…
If the problem for us is about verifying that tests pass, then we need to first decide how we determine that (run tests independently of the collection’s CI, use the collection’s public CI results, have CI results be reported to somewhere the committee can check, something else, etc.). Not every solution requires that tests are run by us, at our cost (of time or money or whatever).
I agree that this is impossible to objectively measure. But seeing that collections for two network devices need roughly 3 x as much space as all community.*
collections included in ansible
taken together is a good indication for me that the network collections are a lot bigger than they should be. (Even the Amazon collections, which are probably used by a lot more ansible
users than the Fortinet collections, need rougly only 5-6% of the space used by the Fortinet collections.)
I like the idea that each collection is responsible for running CI tests vs trying to run them against the whole package. Is this a change in requirements that they must report (or make available a report) of sanity test results for each release?
I’m also not in favor of setting up any numbers (like size).
I’d keep the SC policy statements as general as possible in order to have some room for maneuver (as proposed in the PR).
IMO, if the committee has a collective agreement that a collection shouldn’t be in the package (for whatever reason based on common sense), the collection should be excluded/not added.
It’s not a disaster for the vendor anyway, it’ll still work after running a single command ansible-galaxy collection install ...
. If one collection out of ~120 with a very narrow use case makes the package twice heavier, it’s on its own a good reason to exclude it. Another justification could be: if they submitted it for inclusion now, would we include it?
Hi guys,
I am Xing from Fortinet. Thanks @felixfontein for bringing up this topic, and thank you all for the discussion. We take this issue very seriously and will do our best to improve it.
Sorry that we did not realize the size of our collections is too big, and it brings some issues to the community. We will reduce the size of our collections to an acceptable number ASAP.
As for the sanity test, FortiOS collection should be passed it. The major issue of FortiManager collection is about the variable name. We plan to have a major version change early this year, which will change all variables’ names to meet the sanity test requirement. Currently, we have the ignore.txt file as a temporary solution.
Please let me know if you have any questions or requests. You are very welcome to raise any issues of our collections.
Happy New Year!
Thanks,
Xing
I just saw there was a new fortinet.fortios release (2.3.5); the tarball size shrank from 2.7 MB to 1.5 MB, and the extracated size from 46 MB to 18 MB! I think that’s awesome news The collection release missed the Ansible 9.2.0 release by a few hours, so it’s not in there yet, but we’ll see the result of the reduction in the next Ansible relelease!
Thanks for @felixfontein 's update! Yes, we released fortinet.fortios v2.3.5 last week. And we will have a new release of fortinet.fortimanager this week. The extracted size will be less than 30 MB.
Thanks,
Xing
We released fortinet.fortimanager v2.4.0 today. This release also fixed most sanity test issues. Currently, the only issue we put in the ignore file is about the keyword message
. We will change this variable’s name on the major release of v3.0.0.
Please let me know if you have any other requests or concerns about our collections.
Thanks,
Xing
@SteeringCommittee where are we at with this? Is the problem now resolved, or do you see more issues here?
@felixfontein What do you think, do those collections now look OK in Ansible 9.3.0?
The size of the collections definitely improved! The fortinet
namespace is now slightly smaller than the community
namespace. It’s still large though, there are only two fortinet
collections, while there’s a longer list of community
collections.
Pylint is now running a lot faster! For fortinet.fortios it finished in < 1.5 minutes (the other sanity tests needed ~5 minutes, no errors were reported), for fortinet.fortimanager in ~1 minute (the other sanity tests needed ~6.5 minutes, there were some failures, related to not having an ignore.txt file for the ansible-core version I’m using).
pylint
remarked unidiomatic-typecheck: Use isinstance() rather than type() for a typecheck.
in both collections; I’m not sure whether this is related to using ansible-core devel (the fortimanager collection has ignore.txt files for up to 2.16; the fortios collection has no ignore.txt files at all).
Parts of the docs still don’t look very helpful (like fortinet.fortimanager.fmgr_fsp_vlan module – no description — Ansible Documentation), but due to the massive amount of modules and module options this is kind of hard to fix consistently.
So the situation right now is definitely much better than when I opened this thread! There’s still the open discussion on how large a collection should be, or how specialized collections included in Ansible should be - but these discussions shouldn’t be continued in here I guess.
One problem that is left (violation of inclusion rules) is that there is no visible CI in the collection repos. @lix-fortinet how about adding some basic CI (for example using GitHub Actions, like the sanity test part of collection_template/.github/workflows/ansible-test.yml at main · ansible-collections/collection_template · GitHub)?
(It also seems that development happens outside these repositories and that for a new release, the current version is simply copied over and commited in one commit. But this isn’t forbidden by our rules.)
Hi @felixfontein,
Thank you for your feedback. We will check and improve the issues you mentioned. As for the size, yes, our collections will be larger than other collections since we have more modules. We have developed more modules to offer users greater flexibility in configuring their devices. But we will keep working on reducing the collection size.
Thanks,
Xing