[Update: Fixed] Ansible Galaxy & Automation Hub service degradation info thread (Jul 8, 2024)

Ansible Galaxy service degradation info

Hello everyone,
We are currently experiencing some issues with our S3 file storage that we are actively working to fix.

During the interruption, artifact upload and download will not work, and avatars will not be displayed, as they utilize this file storage tool.

Database access is not impacted, so browsing/searching still works as expected.

We do not currently have an ETA for when things will be fixed, but I will share in this thread as soon as we do.

  • Update 12:07 ET: This was fixed for galaxy.ansible.com. Avatars should show up and downloads/uploads should work now.
  • Update 2:20 ET: We made a similar change for Hybrid Cloud Console.
  • Update 2:24 ET: We made a similar change for old-galaxy.ansible.com. At this point, everything Galaxy-related should be fixed.

Currently all Galaxy publish attempts return a 500 with no error code or other details.
It took some time to find out that the service is degraded or malfunctioning and was not ‘intentionally tweaked into breakage’.

Please post this information on the main page.

I’ve made it a banner, so it’ll appear at the top of every forum page (you can dismiss it for yourself; it’ll save that in a cookie).

Thanks for posting @jlmitch5!

Could it be posted on the main Galaxy page (https://galaxy.ansible.com)?

Update: This should be fixed. Avatars should show up and downloads/uploads should work now. Let me know if anything is still not working.

Looks like this issue still persists in Red Hat’s enterprise Ansible Automation Hub. Curious that it’s the same exact issue. I’ve filed a case with Red Hat for this and am just leaving this comment here as a note.

Thanks for the info, let me look into this.

Based on the issue, it doesn’t seem possible this could affect either Hub on console.redhat.com or Private Automation Hub (aside from interaction with galaxy.ansible.com through syncing).

Actually, never mind. I see the same issue on console.redhat.com now, @isuftin_at_usgs.

I’ll make sure the fix is applied there and update you ASAP.

It looks like the issue has been resolved on the Red Hat end as well. The only difference was that the S3 error came from automation-hub-prd.s3.us-east-2.amazonaws.com instead of ansible-galaxy-ng.s3.dualstack.us-east-1.amazonaws.com. But again, it seems to be working fine now.

Quick edit: Looks like the issue persists. I was able to grab a tarball once but now the issue is back again on RH AH.

@isuftin_at_usgs Noted. We can see it not working on our end now too. I’ll let you know as soon as we get that fixed.

Might this also be causing an issue downloading galaxy collections in Ansible Automation Platform? We have users that pull the community.docker collection into playbooks who are receiving the following error:

ERROR! Failed to download collection tar from ‘server0’: HTTP Error 403: Forbidden

From the AAP job output, the file in question is:

Downloading https://galaxy.ansible.com/api/v3/plugin/ansible/content/published/collections/artifacts/community-docker-3.10.4.tar.gz to /var/lib/awx/projects/.__awx_cache/_271__grantssps_automated_jobs/stage/tmp/ansible-local-879kqtywl8/tmpa5szuoxt/community-docker-3.10.4-ogc3bdm0

Is that still having issues for you, @vguaglione? The link is not 403’ing anymore; I can click it and it downloads the collection tarball. If you try to rerun the job, does it still have issues? If so, we might have some sort of cache we need to clear. If the rerun works, you should be good to go.

Looks like we’re good now. Jobs just synced and completed successfully. Thanks!

We’re still seeing errors on old-galaxy.

12:06:36  Starting collection install process
12:06:36  Downloading https://old-galaxy.ansible.com/download/community-mysql-3.6.0.tar.gz to /home/jenkins/.ansible/tmp/ansible-local-327gfhsx_gr/tmp4096k0dr/community-mysql-3.6.0-q6fyc0td
12:06:36  ERROR! Failed to download collection tar from 'default': HTTP Error 403: Forbidden
$ curl -L https://old-galaxy.ansible.com/download/community-mysql-3.6.0.tar.gz
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InvalidAccessKeyId</Code><Message>The AWS Access Key Id you provided does not exist in our records.</Message><AWSAccessKeyId>AKIA5DPYWLYOGHQ73CV2</AWSAccessKeyId><RequestId>17V8H5VYAVDDD0H9</RequestId><HostId>kUDkxSDJMAoScLAAX7K54xVPEhfEpudzh90vOjuoAQKZVLYmkBtdPWHwzL3OAiz2FBiWPwPg3sG9dhLLtvs8BRHSmOeus60CexBHbtS7mA4=</HostId></Error>
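
For anyone scripting around failures like the one above, here is a minimal Python sketch that pulls the error code and message out of an S3-style XML error body, so a 403 caused by a bad access key can be told apart from other failures. The function name `parse_s3_error` and the sample values (key ID, request ID, host ID) are placeholders for illustration, not anything Galaxy or AWS provides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: same shape as the curl response above, placeholder values.
SAMPLE_BODY = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<Error><Code>InvalidAccessKeyId</Code>"
    "<Message>The AWS Access Key Id you provided does not exist in our records.</Message>"
    "<AWSAccessKeyId>AKIAEXAMPLEKEYID</AWSAccessKeyId>"
    "<RequestId>EXAMPLEREQUESTID</RequestId>"
    "<HostId>EXAMPLEHOSTID</HostId></Error>"
)


def parse_s3_error(body):
    """Return the child elements of an S3-style <Error> document as a dict."""
    # Parse from bytes so the XML encoding declaration is honoured.
    if isinstance(body, str):
        body = body.encode("utf-8")
    root = ET.fromstring(body)
    if root.tag != "Error":
        raise ValueError(f"not an S3 error document: <{root.tag}>")
    return {child.tag: (child.text or "") for child in root}


if __name__ == "__main__":
    err = parse_s3_error(SAMPLE_BODY)
    print(f"{err['Code']}: {err['Message']}")
```

A code of `InvalidAccessKeyId` points at the credentials used to sign the download link rather than at the client, so retrying from the client side is unlikely to help until the service is fixed.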

Our on-prem AAP hub is having issues. I’m able to pull from galaxy.ansible.com but not from console.redhat.com/api/automation-hub/content/published

403, message='Forbidden', url=URL('https://automation-hub-prd.s3.us-east-2.amazonaws.com/artifact/f4/621c3c44c6443d9f5e0a34dcc88e769670fc678311e48759afb5ac2e81275d?response-content-disposition=attachment%3Bfilename%3Dansible-windows-2.4.0.tar.gz&response-content-type=application%2Fgzip&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5DPYWLYOPMNU3JFQ%2F20240708%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20240708T170933Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=865f7ea756a4a648387c756b533b2fc7949f18a844f1b5349395c620d781a40c')
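
The URL in that error is a presigned S3 link, and its query string says which access key signed it and when it expires. Below is a rough Python sketch for reading those fields, which can help rule out a simple expired link. The helper name `presigned_url_info`, the bucket host, the key ID, and the signature in the sample are made up for illustration; only the `X-Amz-*` parameter names and the date/expiry values mirror the URL above.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL with the same query parameters as the one in the error above.
SAMPLE_URL = (
    "https://example-bucket.s3.us-east-2.amazonaws.com/artifact/example"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Credential=AKIAEXAMPLEKEYID%2F20240708%2Fus-east-2%2Fs3%2Faws4_request"
    "&X-Amz-Date=20240708T170933Z&X-Amz-Expires=3600"
    "&X-Amz-SignedHeaders=host&X-Amz-Signature=examplesignature"
)


def presigned_url_info(url):
    """Extract the signing key ID, region, and validity window from a presigned URL."""
    query = parse_qs(urlsplit(url).query)  # parse_qs also percent-decodes values
    # The credential scope has the form <key id>/<date>/<region>/<service>/aws4_request.
    key_id, _date, region, service, _terminator = query["X-Amz-Credential"][0].split("/")
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    expires_at = signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return {
        "access_key_id": key_id,
        "region": region,
        "service": service,
        "signed_at": signed_at,
        "expires_at": expires_at,
    }


if __name__ == "__main__":
    for name, value in presigned_url_info(SAMPLE_URL).items():
        print(f"{name}: {value}")
```

If the request was made well inside the signed window and still returned 403, the link had not merely expired, and the signing credentials are the more likely suspect.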

Is there a way a user can re-pin a banner after having dismissed it? Asking for a friend…

Seems to be working at this point on our end. Can you try again?

Hey @utoddl, the goal of the pin is to bring awareness to this thread, so you don’t need to re-pin it. You can subscribe to the topic (which you did by replying) or just visit the News & Announcements category, where we moved it for greater visibility, and it will be there!

Yes, everything is working as expected. Thank you.
