Galaxy importer: collection dependency handling

I noticed that galaxy-importer apparently ignores collection dependencies that were installed before running it, makes no attempt to install collection dependencies itself, and apparently also doesn’t do so when collections are published to https://galaxy.ansible.com/.

This has a serious drawback: some collections use docs fragments from collections they depend on! Since galaxy-importer does not use any collection except the one to import, this means that any module / plugin whose documentation depends on another collection has no documentation rendered in the UI at https://galaxy.ansible.com!
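To illustrate the failure mode, here is a minimal sketch of how merging docs fragments behaves when the fragment's collection is not available. This is a simplified model and not the actual ansible-doc or galaxy-importer code. The function names, the fragment name `dep.collection.common`, and the option names are all made up for illustration.

```python
# Simplified model of docs-fragment merging. This is NOT the real ansible-doc
# or galaxy-importer code; all names below are hypothetical.


class MissingFragmentError(Exception):
    """A referenced docs fragment is not available."""


def merge_fragments(doc, available_fragments):
    """Return a copy of doc with the options of all referenced fragments added."""
    options = dict(doc.get("options", {}))
    for name in doc.get("extends_documentation_fragment", []):
        if name not in available_fragments:
            raise MissingFragmentError(name)
        for opt, spec in available_fragments[name]["options"].items():
            # In this model, options defined by the module itself win.
            options.setdefault(opt, spec)
    merged = dict(doc)
    merged["options"] = options
    return merged


def render_or_none(doc, available_fragments):
    """Model of the observed result: no docs at all if a fragment is missing."""
    try:
        return merge_fragments(doc, available_fragments)
    except MissingFragmentError:
        return None


# Documentation of a module that pulls shared options from a dependency.
module_doc = {
    "module": "example_info",
    "options": {"name": {"description": "Name of the resource.", "type": "str"}},
    "extends_documentation_fragment": ["dep.collection.common"],
}

# Fragments that would be visible if the dependency were installed and used.
dependency_fragments = {
    "dep.collection.common": {
        "options": {"access_key": {"description": "Access key.", "type": "str"}},
    },
}

with_deps = render_or_none(module_doc, dependency_fragments)
without_deps = render_or_none(module_doc, {})

print(sorted(with_deps["options"]))  # ['access_key', 'name']
print(without_deps)                  # None
```

The second call corresponds to the situation described above: only the collection being imported is visible, so the fragment lookup fails and nothing is rendered.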

Examples:

I think this is something that should get fixed.

CC @tannerjc since you’ve looked at the docs loading code in galaxy-importer recently.

(And yes, I do understand that this isn’t easy to fix :slight_smile: I mainly noticed when I was trying to get the ansible-community/github-action-test-galaxy-import GitHub Action (for testing the Galaxy import of an Ansible collection) to actually consider dependencies. I tried to look into how this was done on Zuul for community.aws, only to realize that they simply disabled the ansible-doc part of the import. I then dug into the galaxy-importer code and found that it has no mechanism to use any collection dependencies, no matter whether they have been installed before or not.)

This also affects other tests run by the importer, like ansible-lint; see the thread “Ansible-galaxy lint unexpected warnings”.

Pinging @galaxy in case you didn’t see this so far. I think this is rather serious since it breaks documentation rendering for all collections that use docs fragments from other collections.

I’ve started to re-use code from vmware.vmware in community.vmware 6.0.0 and ran into this, too. The error itself isn’t great, but it looks like it also leads to missing documentation on Galaxy (see here for an example). And this is really problematic.

community.aws seems to have the same problem, see here and here for examples. But @felixfontein already mentioned this. However, there are possibly more collections affected that I don’t know about. Actually, I think this is not only possible but probable.

@galaxy Do you have any plans to fix this? And if not, why not? This looks like a real problem to me.

@AWS Since community.aws is also affected, I wonder what you think about this.

In the realm of Supported collections (with the capital S), ansible.netcommon also has two doc fragments that might be used by other Supported collections, whose docs will also not be shown correctly on Galaxy due to this bug.

(Edit: I did check the collections included in Ansible 12; none of them use these doc fragments except ansible.netcommon itself.)

Personally, I agree this is a bug that will hopefully get fixed.

I did some digging when the “galaxy-importer” tests were first added to Zuul, and it’s more frustrating than galaxy-importer just “not supporting” the installation of dependencies. It (at least at the time) explicitly runs in a self-contained context and ignores any dependencies that might already have been installed, so we couldn’t even work around the issue by explicitly installing the dependency as part of the setup for the test. Hence we just disabled the ansible-doc pieces and moved to the ansible-community/github-docs-build GitHub action.

Copying the fragments over from amazon.aws into community.aws would be problematic, since the actual behaviour (including the common parameters) comes from amazon.aws and not community.aws.

I’m not sure if I understand. If the actual behavior and common parameters come from amazon.aws, the docs fragments should also. This way, both would automatically be aligned.

Oh, wait! Let’s say c.a supports a.a X and later. If there’s a new a.a release Y with changes to common parameters, it’s not clear which docs fragments are the correct ones: those from a.a X or those from a.a Y? After all, it depends on the a.a version people have installed. Is this what you mean?

I had commented somewhere else on why this might not be a feasible task. I can understand why it wouldn’t look complicated from just the community Galaxy perspective, however…

I foresee both technical and potentially security-related hurdles to overcome. I’d have to defer to much more in-depth evaluations from the galaxy team to understand all of the implications. For one, galaxy-importer would have to write an ansible.cfg, for which it doesn’t have the context, so it would depend on being provided that context by galaxy_ng, creating some complex circular dependencies.

Then it would have to run ansible-galaxy to install the deps. The ansible-galaxy CLI is the owner of the dependency resolution mechanism, so this would create additional dependencies as well. As such, I don’t see a way that galaxy_ng could reasonably be expected to just pull in the appropriate tarball of the dep on its own. So galaxy_ng would end up invoking galaxy-importer, which invokes ansible-galaxy, which calls galaxy_ng. This approach is likely to run into hurdles with many common proxy configurations, in which the environment running ansible-galaxy may not be able to directly consume the galaxy_ng APIs due to network security complexities.

Also imagine the automation hub or on-prem hub experiences, where there can be an arbitrary number of “repos”. Which repo would we install the deps from? As I mentioned, galaxy-importer doesn’t have the context of the available repos, so galaxy_ng would have to provide that to the importer in some way. This also comes with RBAC considerations: galaxy_ng doesn’t have a way to bypass its own security restrictions to allow ansible-galaxy to make API calls.

@sivel Would it be possible to have better documentation on Galaxy? For example, this is just empty. It looks like there’s no documentation at all, but that is not true.

What I mean is: Show the documentation that is not from another collection, and mention that other parameters come from another collection and are therefore not shown? Or at least mention that the documentation cannot be shown / rendered because it depends on other collection(s)?

The correct fragments are those from the version of a.a you’re using.

The “common” code in amazon.aws is usually directly related to configuring connections to the backend API (for example passing AuthN/AuthZ tokens). We have some code which automatically adds the relevant common parameters like “access_key” (~username) and “secret_key” (~password) to the argument_spec before passing it down to AnsibleModule. In theory, if Amazon added a new authentication mechanism, and we added support for it in amazon.aws 20.0.0 through some new parameters, you could actually use community.aws 19.0.0 with amazon.aws 20.0.0 and it would work: the community.aws modules would happily accept the extra parameters…
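A rough sketch of that pattern, to make the version argument concrete. This is not the actual amazon.aws code: the helper names and the `new_auth_token` parameter are invented for illustration, and plain dicts stand in for what would eventually be handed to AnsibleModule.

```python
# Hypothetical model of "common parameters come from the dependency".
# None of these helpers exist under these names in amazon.aws.


def common_spec_v19():
    """Common parameters as shipped by the (imagined) dependency release 19."""
    return {
        "access_key": {"type": "str"},
        "secret_key": {"type": "str", "no_log": True},
    }


def common_spec_v20():
    """Release 20 adds a parameter for a new authentication mechanism."""
    spec = common_spec_v19()
    spec["new_auth_token"] = {"type": "str", "no_log": True}
    return spec


def build_argument_spec(common_spec, module_spec):
    """Merge the dependency's common parameters with the module's own."""
    spec = dict(common_spec)
    spec.update(module_spec)
    return spec


# The module in the dependent collection only declares its own parameters.
module_spec = {"name": {"type": "str", "required": True}}

with_old_dependency = build_argument_spec(common_spec_v19(), module_spec)
with_new_dependency = build_argument_spec(common_spec_v20(), module_spec)

# The same module code accepts the extra parameter once the dependency is newer.
print(sorted(set(with_new_dependency) - set(with_old_dependency)))
# ['new_auth_token']
```

Since the accepted parameters depend on the installed dependency version, a copy of the fragments frozen into the dependent collection could drift from what the module actually accepts.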

I definitely don’t think this is easy to fix; it needs a lot of architectural decisions (and insight) to come up with a good way to do this. I’m mostly annoyed that there has been zero feedback from anyone with more knowledge than the community until now. At least a “we see it is a problem and discussed it, but it’s far from trivial and we currently do not prioritize it” would have been nice.

Regarding dependency resolution: it might be a good idea to make the resolution algorithm of ansible-galaxy (the CLI tool) available to the galaxy_ng backend (even if only to that backend), so that galaxy_ng can use it together with its database (and potential outside sources, depending on how galaxy_ng is configured) to figure out which dependent collections are needed, and provide their tarballs to galaxy-importer. galaxy-importer can then extract all of them and run the import process as before. This way galaxy-importer would not need to resolve or download any dependencies itself, and it would be easy for galaxy_ng to figure out what it needs to provide to galaxy-importer.

This obviously is still not trivial to implement and needs more thought and design work, but it sounds doable.

IIRC Galaxy doesn’t allow uploading collections whose dependencies it cannot find. (Or has that requirement been lifted for galaxy_ng? I think the last time I accidentally tried this was in 2020…) So somehow it does know where dependencies should be available from.
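Very roughly, the division of work could look like the sketch below. It is only an illustration under heavy assumptions: the `index` dict stands in for whatever galaxy_ng knows about collections, the tarball filenames and `example.app` are made up, and version constraints (which the real resolver has to handle) are ignored completely.

```python
# Hypothetical sketch: the backend computes the dependency closure from its
# own index and hands the tarballs to the importer. Not galaxy_ng code.


def resolve_dependency_closure(root, index):
    """Return all direct and transitive dependencies of root, without root."""
    needed = []
    seen = {root}
    queue = list(index[root]["dependencies"])
    while queue:
        name = queue.pop(0)
        if name in seen:
            continue
        if name not in index:
            raise LookupError(f"dependency {name} is not known to the backend")
        seen.add(name)
        needed.append(name)
        queue.extend(index[name]["dependencies"])
    return needed


# Made-up backend index; filenames and dependency lists are illustrative only.
index = {
    "example.app": {"dependencies": ["community.aws"], "tarball": "example-app.tar.gz"},
    "community.aws": {"dependencies": ["amazon.aws"], "tarball": "community-aws.tar.gz"},
    "amazon.aws": {"dependencies": [], "tarball": "amazon-aws.tar.gz"},
}

dependencies = resolve_dependency_closure("example.app", index)
tarballs_for_importer = [index[name]["tarball"] for name in dependencies]

print(dependencies)           # ['community.aws', 'amazon.aws']
print(tarballs_for_importer)  # ['community-aws.tar.gz', 'amazon-aws.tar.gz']
```

The point of the split is that the importer only receives a list of tarballs to extract next to the collection being imported, and never has to talk to any API itself.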

I’ve never used AH/galaxy_ng on-prem and don’t know about the “repo” feature, so I don’t know what exactly this means (besides some guesstimates from what you wrote) and can’t say anything about whether this is a big problem or not…

But a solution that works in many cases would already be a lot better than the current one (which ignores dependencies completely).

(Also there has been the implication that thanks to galaxy_ng’s doc viewing capabilities, no separate docsite is needed anymore for collections. These claims likely didn’t come from folks who knew the problems this has, but considering that more and more users seem to use the Galaxy docs over docs.ansible.com this is becoming a bigger problem.)

Personally, I’ve started to point people who are looking for documentation to Galaxy. I wasn’t aware of those problems until I stumbled upon them a couple of days ago :-/

I might have seen those problems before and just forgot. But anyway, the current situation is somewhat unfortunate…