The new Galaxy is completely broken

With a “normal” software release, the plug would be pulled, the old state restored, and the known errors fixed.

Why not with the Ansible Galaxy?

The problems that are visible in the forum alone would justify a rollback.

At the moment, you can’t publish a new version of a role. Issues on GitHub go unnoticed; or does only Jira count?

When I browse the Galaxy and look at roles (not collections!), I notice some unpleasant things … like version information that only partly matches reality.
Screenshot #1
Screenshot #2

What prevents those responsible from admitting a mistake, drawing the consequences and - as mentioned at the beginning - rolling back to eliminate the worst errors?

Am I frustrated?

YES!

I have some bug fixes in the pipeline that I would really like to release.

And I have noticed too many avoidable errors in the Ansible environment in the last few months that make me doubt that professional work is being done here.

Sorry for the rant!

10 Likes

While I can understand your frustration, I would like to point out that the new Galaxy is not completely broken. https://galaxy.ansible.com/ does work fine for many collections and roles, and for most of them it is a huge improvement over the old Galaxy. For example, consider the documentation viewer, and the fact that you can now have a role and a collection with the same name (something that community members asked for many times).

Reverting now (or even just a few days after the launch) to the old Galaxy would also be quite a disaster, since all roles and collections published since the switch would be gone. (And migrating these back isn’t always possible, for example in case someone decided to add a new role/collection with the same name as an existing collection/role.)

Regarding issues on GitHub: there is no issue tracker on GitHub for the new Galaxy. The issues in the GitHub repository ansible/galaxy (“Source code behind the Galaxy hub at http://galaxy.ansible.com”) are for the old codebase (despite what its README says). I’ve checked on Matrix: @rochacbruno told me that the Partner Engineering team is monitoring the issues in that repo and keeps the Galaxy team informed. So creating issues there probably adds some delay compared to, say, creating a forum post here (forum posts are also monitored by the Galaxy team directly). I also reported that the README really needs an update to point out that the repository is not for the new Galaxy codebase; I guess it will be updated on the next work-day(s).

(I fully agree that not being able to file proper issues on GH is not great. I don’t have an RH account so I cannot even try to create an issue in that Jira instance, but even if that worked, having to use Jira instead of GH issues is something I’m unhappy about. I guess the Galaxy team decided on this approach since they have multiple inter-connected repositories, and Jira makes it easier to track issues that affect several of these repositories. But :person_shrugging:)

4 Likes

As for the documentation viewer … I actually preferred to use the GitHub source, as I could also see there how actively the role is being worked on.
I can now have a role and a collection with the same name? Nice feature, but against that, for me, there are too many core functions that are simply broken.

The new fancy web UI also has some annoying issues outstanding.
The role version display is wrong, I have to search for my roles / collections to see them, the import status is missing, …

But the worst for me is … I can import a new role, but I can’t update one any more!

This is quite a problem when it comes to bug fixes… bad.

And don’t get me started on the insufficient documentation.

3 Likes

Lucky you, I cannot even find my own role :wink: and yes, it’s broken as hell. I am pretty sure most people using this work in IT, so the fact that I have a hard time connecting the dots says enough.

2 Likes

No ratings anymore to filter out badly-reviewed roles, no OS filtering to only see roles relevant to our platform of choice, no sort by download count or last update either (to avoid outdated or abandoned ones), …

Which means that if you search for, let’s say, an “nginx” role, you have to click on individual roles across 74 pages and look at each one’s details to find one that would be relevant.
And, of course, putting the distribution name in the search field doesn’t even use the tag list of the roles.

I’m sorry, but are the developers of the new website even Ansible users?

“the new Galaxy is not completely broken”?
Well, a teacup with no handle may not be broken but it would be impossible to hold while the water is too hot, which is a critical use case.

So no, it’s not broken, it’s just a UX disaster due to the lack of widely expected basic features.
The current state of the website should be expected from an alpha release, not a production release!

I don’t care about the documentation viewer if I can’t find a role for which I’d like to read its documentation.
Yeah the teacup has gold leaves on the side now, but I still can’t hold it while the water is hot.

5 Likes

I’m teaching noobs that the correct URL to use is https://old-galaxy.ansible.com, and I’d like to report an error in the banner at the top of that site:

Pardon the sarcasm, but the new site is really completely broken. The fact that not even a list of OS/versions is available on roles makes it all but useless.

6 Likes

Just in case there’s any doubt, all these things are still being read by the Galaxy Team, and they know there’s still work to do. We were discussing some of it just yesterday, and new patches are going live frequently. I’m also pushing for a retrospective writeup, as some of the things I’ve heard privately about why some of this happened are actually pretty interesting, and I’d love to share them if I can.

GitHub issues may return for Galaxy (the team are considering it), but part of the problem is that GalaxyNG is made of more parts and services now, so knowing where to open the issue isn’t always clear as there are multiple repos. I’m not saying you fine folks can’t work it out - but we definitely hear the confusion from many parts of the community, which is a big part of why we created this forum. Starting the discussion here and then moving it to GH/Jira as needed helps keep the initial troubleshooting in a consistent place where the next person can find it.

As Felix rightly pointed out, we can’t revert to the old (completely unmaintained, frequently down) Galaxy without even more pain, so we’re going to have to ride it out. If that means keeping on using the old server for now, great. I get that the transition is frustrating (perhaps an understatement), and a certain amount of snark and venting is fine - no one here has crossed the line yet. But please do try to keep our community civil - a community’s standards are de-facto the lowest level of behaviour that it doesn’t act against, and I believe we are a good community. There are ways to make your point and report issues without resorting to personal insults or unreasonable demands. Please don’t make me have to go and find links to the Code of Conduct …

10 Likes

But if I log in on the old Galaxy, it jumps to the new one… so back to where I started :man_facepalming:

2 Likes

Thanks for posting this, Greg. I see the communication here as passion (although there is understandable frustration), but I also honestly struggle with clear and respectful communication when I get upset about something. We are all human, so assume that folks have good intent on all sides of the argument. It is up to all of us to always be respectful and just be nice to each other. Greg shouldn’t even have to pipe up IMHO; it is sad he had to.

One of the things I do… you are gonna laugh… is use AI tools like ChatGPT: you can chat with them and have them make your text more respectful.

For example, I took a paragraph from above and redid it:

I’ve noticed some changes on the new website that have made the user experience challenging for me. Previously, there were features like notes filtering for roles based on reviews, OS filtering to narrow down roles by platform, and options to sort by download count or last update. These helped ensure relevance and timeliness.

Realize that there are tons of people using the forum, from different cultures and speaking different languages, so it is even more important to just be “nice” because we are all on the same team here :heart:

4 Likes

Ah… so it’s not just me :frowning:

As with the initial rollout of Collections, it seems like the majority of the work has ignored the (small but?) passionate group of users who still just use roles as the base unit of Ansible content (and either rely on the community Ansible distro or manage collections in a separate manner).

I have been patient the past few weeks trying to give things time to settle, but as this is affecting role users directly now (I try publishing bugfixes and updates… then everyone starts hitting problems), I hope that the voices in this thread can push towards maybe an “all hands on not-breaking a significant portion of the community’s previously-working tooling”.

Like, if everyone’s not focused on stabilizing Galaxy, they need to be.

I may sound like a village idiot, but Michael DeHaan just released an alpha build of JetPorch, and while it is far (VERY far) from what Ansible is today… it may solve my simpler use cases, and having to spend all my limited OSS dev time working on communicating to my users about Galaxy’s brokenness is a very heavy incentive towards looking towards greener pastures. Even if I have to step in a bit of :poop: getting through them :stuck_out_tongue:

I still use/love/promote Ansible, and love the team, but the rollout has not been fun to deal with as an end user and relatively popular role maintainer.

9 Likes

2 posts were split to a new topic: Improvements to new galaxy’s 404 and redirects

I join in the complaints.

I searched for a bug in my code for weeks, because `ansible-galaxy role import Gwerlas system` crashed with the “unknown field in galaxy_info” error. Of course, it has always worked well in the old Galaxy.

Today, without any changes, it works.

So,

I’m not able to find my roles in the new Galaxy, it’s frustrating.

Then, by rewriting the URL of another role, I finally found my role.

When I go to the documentation or versions tab of any of my roles, raw HTML code shows up in the notifications panel.

Screenshots :

Regards,

What I find truly insane is how this was greenlit to be released.

No one can tell me these glaring functional issues were unknown. You can’t even go past the home page without stumbling across either removed features or actual broken stuff.
And this doesn’t even cover the obviously messed up data migration.

I mean, sure, at this point it’s too late to pull the plug and revert. But you can’t tell me you couldn’t have done it at release, when all the issues were plainly obvious within seconds.
I work in IT. I’ve been through countless botched and rushed releases, but it is an impressive achievement to deploy something this broken and not immediately revert it.
To add to that, reverting would have been by far the better choice, even if it meant having to keep a system running that is falling apart at the seams. I say that as I’m desperately keeping an unmaintainable broken piece of crap alive while the replacement is being developed until it’s feature complete, stable and thoroughly tested (including the data migration).

Also I find this truly concerning from a reliability and stability standpoint. I mean I use Ansible to manage my infrastructure. My company uses Ansible to manage their infrastructure. And seeing that there’s no care taken to ensure people can continue to use this software in its entirety instantly killed all my trust in the stack. Looking at this I regret having put as much time into this technology as I have. Genuinely, if I could jump ship I would, as it’s clearly demonstrated that there’s no sense of responsibility among the people maintaining this project, and that shiny new things are more important than working things.

In summary I’m flabbergasted at how we got here. There were hundreds of opportunities to avert this disaster and none were taken. To say I’m frustrated would be an understatement.

4 Likes

Just do not use Ansible Galaxy anymore.
You can add git repositories to your requirements.yaml (or however you name it) directly:

  - name: userx.whatever
    src: https://github.com/userx/ansible-role-whatever
    version: 2f13355c336fccd0760b7dad5be21ccfec73ff25

Then just use ansible-galaxy install --role-file requirements.yaml as always; maybe you have to use --force.
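
As a rough sketch of what a slightly fuller file could look like (the role names, repository URLs and the tag below are made-up placeholders, not real projects), one entry can be pinned to a commit and another to a tag. `scm: git` is spelled out here only for clarity:

```yaml
# requirements.yaml: illustrative only, all names and URLs are hypothetical
- name: userx.whatever            # local name the role is installed under
  src: https://github.com/userx/ansible-role-whatever
  scm: git                        # version control system to use (git is the default)
  version: 2f13355c336fccd0760b7dad5be21ccfec73ff25   # pin to an exact commit

- name: usery.example
  src: https://github.com/usery/ansible-role-example
  scm: git
  version: v1.2.0                 # a tag or branch name works here as well
```

Pinning to a full commit hash keeps installs reproducible, while a tag is easier to read. `--force` is only needed when a role is already installed and should be overwritten with the pinned version.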

It’s sad, but we have to continue working.

3 Likes

Hey everyone,
First, I just want to say I’m sorry you all are dealing with issues and are running into problems with usage on the new galaxy site. As you mentioned @jpmens, we have the old versions of the UI running at https://old-galaxy.ansible.com/, so if you’d prefer to use the filtering and sorting there, please do so. Here is some info about old-galaxy.ansible.com:

Filtering and sorting are one of the highest priority items the team is working to improve. We worked to bring back role download count sorting, which has now landed. We are actively looking at ways to add a lot more improvements to search, and I’ll post our proposal there and what will be covered shortly. We’ve already gotten a good idea from the community on enhancements to our 404 redirection solution and are now working to investigate that further.

I can understand that there are issues and frustrations with using the site, and I also appreciate everyone who has taken the time to share that feedback here on the forum. I can tell you that the galaxy development team really wants to create a service that is awesome to use for our community and continually being improved. I can also say confidently this is our team’s highest priority and focus. We are currently working to ramp more people up so we can parallelize efforts and get both fixes and enhancements in at a faster rate.

I do feel like it needs to be said though–editorializing in a negative way and making unfair assumptions about the development team’s skills or motivations is unhelpful. The team is putting in a lot of effort to make improvements and fix the issues we know about. The negativity can be demotivating, which is hard when we have a lot of work to do to keep things moving in a positive direction. In short, keep the feedback coming, but please remember, at the end of the day, the development team is comprised of people who are working hard on Galaxy.

Also if you are having issues, please give us some context. Link us to a specific thing that is broken. Tell us specifically what you tried to do if there was an uploading issue you ran into. These help us a lot in coming up with reproduction scenarios and ultimately fixes. And continue to engage with us here because like I said, we want to make this an awesome service for our community to use.

6 Likes

Hey Devs,
thank you everyone for contributing to the new Galaxy and putting your time and effort into building the site, really appreciate it!

Things don’t completely work yet as people (me included) might expect, but as mentioned already, you are working hard to get features up and running. Please take your time and don’t let yourselves be pulled down by angry or annoyed posts. I’ll sit down and wait patiently. You rock! Thanks for your work.

6 Likes

Hello,

Having already posted one of the “frustrated feedbacks” I wanted to refrain from writing anything else, but this may be a good opportunity to explain my point of view.
I’d like it to be both constructive and a source of mutual understanding, even if it’s painful to read (it also may be twice as painful for many, as English isn’t my main language).

First, my message was written after wasting quite a bit of time on the new website, and before I knew there was a fallback: it didn’t help the tone, despite all the rewordings I’ve made.

As a side-note, the old version seemingly requires so much work to maintain that switching to the new version ASAP was absolutely necessary, but at the same time the old version is planned to be kept available for a while?
It seems like there’s a contradiction there, but I’m not “in” the project so I’m probably missing key elements (like, “read-only is OK”?).

But yeah, like BrainStone above I’m one of the people slowly creating their own library of roles for internal use at the companies they work for, while borrowing from time to time from the splendid library of public roles already available on Galaxy when it makes more sense, and before maybe becoming one of the public contributors later.
All this work is necessary to build up enough momentum around Ansible inside the company I work for to be able to “go to the next level” in the coming months and years (RHCE, Tower, …), and collections don’t matter to me at this stage (they’re more of a hindrance really).

On top of receiving an unexpected “stick in the wheel” from an entity proud to be “on the same side” with this update, some of us still have in mind the heated debates of last summer and felt solace in the fact that the Ansible team, at least, seemed unaffected by this shift in mindset.
But many users like me stayed on alert, and when this weird-tasting update happened… this didn’t call for a usual response either, and yes it comes from the same willingness to keep the project safe, working and usable.

As I don’t use Galaxy to find new roles very often, I don’t remember seeing a banner saying “please test the new version and tell us what you think, help us help you”: if it was there for the last few months, then maybe it wasn’t visible enough.
If I was just stupid and blanked it the same way I mentally ignore ads, then okay, I could admit that publicly.
I don’t remember either receiving an e-mail, as a registered user, telling me “beware, we’re gonna update the Galaxy website on TARGET_DATE, don’t panic and here’s what you should be expecting!”.
And right now, I don’t see a banner saying “while we’re hard at work on this new release, users looking for browsing features of the previous version can use (old URL) for the time being”. Why? Are you expecting everyone to look for the FAQ?
I only saw the “Share your feedback” banner and, guess what, that’s exactly what I did in the way the website’s UX intended me to do, as in “without reading anything else first”.

When I asked “are the developers of the new website even Ansible users?” in my previous post, it was a real question and not an assumption on the developers’ skills, as for me a user would have put filters and sorting options at the very top of the “beta release” checklist (not the “alpha” checklist, I’m not a monster).
I still can’t understand what happened here and, yes, a post-mortem of the launch could be interesting when you’re ready to talk about it.

But after my first message, seeing how both role users and roles developers have issues with -NG, one can only wonder how the release cycle got handled and if the release wasn’t rushed because of a sudden deadline from above: it happens, we’ve all been there, and if it’s the case we know you can’t admit it publicly (but it would still be concerning regarding the direction of the project).

If no rollback is easily doable for the time being, first I assume it means that -NG could be “fixed” rather quickly so it’s good news, but is it really better for you to have to (I suppose) manually fix the issues popping up in new role updates in the meantime?

I’m just a small fry here among giants, but as someone who spent a few years in web development, most of their professional life as a sysadmin, often used FOSS and sometimes contributed to it (code, translations and money) for more than 20 years now (…yeah), the final key points to understand my previous post, this one and maybe some future ones (I hope not):

  • I have no idea how much of the work and decisions regarding Galaxy are made by Red Hat employees having to follow decisions from someone at Product Management who isn’t a product user, but I suppose it’s the vast majority. I may be wrong. But I’m sure RH practices dogfooding well enough to see and report most Galaxy-NG issues internally.
  • I would never have allowed myself to publish that version if I was in charge of it (but on the other hand, maybe nothing would have come out at all due to too much perfectionism, I know that all too well).
  • Even if I was OK with it, most bosses I’ve had would’ve been very hard on me if I suggested to go live with it: they would’ve seen early on in the project that the deadline wasn’t realistic enough and acted accordingly.
  • I salute the work you’re currently doing, probably way ahead of schedule and maybe even to the detriment of your own free time, but it still should’ve preceded the public release.
  • I don’t think messing up something and using “please don’t be mean” as a shield is a responsible answer, both in paid and volunteer work, as harsh as it may sound.
  • I’m sure some of you are already harsh enough with yourselves and don’t need a clown like me to add anything, but if you’re in this situation then I believe you have a clean conscience already and I’d buy you a drink if I could.
  • I ultimately believe that someone, somewhere, took an unwise decision and thought that having the devs deal with the aftermath was the smartest solution. I may be wrong, again. But if not, it still calls for some form of (as polite as possible) protest to avoid it happening again, for the benefit of both the website users (role users and developers) and the website devs.
  • It’s not too late to communicate outside of this forum, but I know it would be like admitting that everything wasn’t perfect, and that any communication that’s not 100% positive may need to be approved/rewritten/respinned by many corporate layers (especially since last summer).

Sorry if it’s not the constructive feedback you were looking for, and cheers to everyone.

4 Likes

In the meantime, I’ve posted a blog post with a quick fix for the one completely broken aspect currently disrupting a bunch of my role users’ ability to install roles from Galaxy: Ansible Galaxy error ‘Unable to compare role versions’.

I do have faith these bugs will be worked on; many already are. I don’t necessarily agree that more warnings would’ve helped all that much (historically throwing more banners on sites just leads to notification fatigue…), and even invested users like me only scratched the surface and made sure the bare minimum was working (but even that wasn’t enough, heh! I never tested importing a role from GitHub into NG before launch… not sure if that was even possible? Maybe).

My hope is that the right lessons can be learned, the right fixes can be made, and then feature development can continue. I just fear that the AAP push for new feature development will not allow the tech debt created by shoehorning roles into Galaxy NG to be fully dealt with. But happy to be proven wrong!

9 Likes

There’s a separate breakage I’m getting flagged on now: Cannot install some roles anymore, tries to import from GitHub user `None` - #2 by felixfontein

Hate to say it but I believe this is about the only way forward. Galaxy is Not Good anymore…