Problem statement
We on the Ansible community and partner engineering team would like to ask the community for advice on the following issue.
Currently, Red Hat partners who joined the program to get their collections certified and available on Automation Hub often have the collection tarballs they upload rejected because of errors reported in the Galaxy-importer log. This causes a lot of friction in the process. We’d like to minimize that friction by giving partners, and the community in general, a solution that helps them catch and fix those errors on their end before uploading collections to Automation Hub or Galaxy.
On Automation Hub, Galaxy-importer performs:
- Building the collection and running basic checks such as metadata validation.
- Running Ansible Lint with the production profile.
- Running Ansible sanity tests.
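To make the three checks above more concrete, here is a minimal sketch of how they could be approximated locally before uploading. It only assembles the commands, it does not run them. The function name, the collection path, and the tarball name are hypothetical placeholders, and the exact configuration used on Automation Hub may differ from what these commands do by default.

```python
import shlex
import sys


def precheck_commands(collection_dir: str, tarball: str) -> list[list[str]]:
    """Return local commands that roughly mirror the checks listed above.

    `collection_dir` and `tarball` are hypothetical placeholders supplied by
    the caller; nothing here is taken from the actual cert-tests workflow.
    """
    return [
        # Build the collection tarball from the collection root.
        ["ansible-galaxy", "collection", "build", collection_dir, "--force"],
        # Run galaxy-importer against the built tarball (metadata and basic checks).
        [sys.executable, "-m", "galaxy_importer.main", tarball],
        # Run Ansible Lint with the production profile.
        ["ansible-lint", "--profile", "production", collection_dir],
        # Run sanity tests; ansible-test expects to be started from inside
        # an ansible_collections/<namespace>/<name>/ directory.
        ["ansible-test", "sanity", "--docker"],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so this is safe to run anywhere.
    for cmd in precheck_commands(".", "my_namespace-my_collection-1.0.0.tar.gz"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch side-effect free; a maintainer could pass each list to `subprocess.run(cmd, check=True)` once the tools are installed.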
Our initial vision on how to solve this
We’ve created the ansible-collections/certification repository for collection certification onboarding which, among a few other things such as a README template, will contain a GitHub workflow we want to encourage partners to use in their repositories.
This solution has the following properties:
- The repository is public.
- It is a separate repository.
- It is kept minimalist: it contains only necessary items (such as jobs in the workflow) for the purpose of content certification.
Let’s now discuss the properties
- The repo is public under the ansible-collections org because the larger community can also benefit from it to improve the quality of their content before uploading it to Galaxy. Community contributions are welcome there. Many partners are also active community members and have their collections on Galaxy and included in the Ansible community package.
- It is a separate repository. However, we have a feeling that there might be some overlap with what we have in the collection_template repo used as a template for initializing new collection repos on GitHub. We also refer to it from our collection inclusion requirements as a source of templates for testing (it contains the ansible-test workflow), execution-environment.yml, README, LICENSE, etc. We have considered merging them, but I personally think that keeping the content from ansible-collections/certification/ in its dedicated repo will be less confusing.
- The content is intentionally kept as minimal as possible. There are a lot of good and useful GitHub workflows and actions we could recommend (e.g., for releasing, for running integration and unit tests), but we intentionally decided to have only one cert-tests.yml workflow that contains only the checks from Automation Hub. However, from there, we could refer to other community resources such as the collection_template repo if maintainers want to get a workflow for unit and integration tests or consult the community package inclusion requirements to learn community best practices.
On the implementation, see the cert-tests workflow:
- We decided to minimize potential points of failure and not to refer to any other reusable workflows/actions except the ansible-community/ansible-test-gh-action@release/v1 one. The Galaxy-importer and Lint checks are pretty straightforward, so we don’t want to depend, say, on any of the ansible/ansible-content-actions workflows, some of which, in turn, use other tooling such as tox-ansible under the hood. This approach could be reconsidered, though, if the responsible teams make a strong commitment to ensure their stability.
- We decided not to include unit, integration, or any other checks that are not needed for certification, to keep things simple.
- There’s also a test module we run the workflow against on a scheduled basis to make sure everything works.
What do you think about this effort and the implementation?
We’d love to hear from you in the comments!