[AWX 20+] Cannot extend base image anymore, as it seems network is unreachable since the migration to CentOS 9

mbutton77 · March 23, 2023, 8:31am

Hi,

I recently opened a bug (which is not really a bug, more a question) and I was advised to post it here as it seems more relevant.

Here is the summary:

I’ve encountered a weird behavior while migrating from AWX 17 to 21.11.0.

We are not directly using the AWX base image in the Operator, but we are first extending it with some packages (to suit my client’s needs).
In 17, we used to define our own YUM repo and install those packages. Everything was smooth and we could generate our own image without any problem.

We decided to migrate to the latest version (21.11.0 at the time of writing), and then our build pipeline failed. At first, we thought it was because the new internal mirrors of the CentOS 9 repos were not up, but that’s not it: no external URL is reachable.

I tried to diagnose the problem by logging into the image and performing some network investigations, but as the image does not have any network tools (ping, ip, host, dig, tracepath …), it’s very hard to tell what’s wrong. I checked resolv.conf, the hosts file, the SELinux config, access.conf, etc., and nothing obvious came out of it.
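Since the image lacks ping, dig and similar tools, one option is to probe from Python's standard library instead. The sketch below assumes a `python3` interpreter is available inside the image; `check_host` is a hypothetical helper name, not part of AWX. It separates the DNS step from the TCP connection step, so a failure can be attributed to one or the other.

```python
import socket


def check_host(host, port=443, timeout=5.0):
    """Resolve `host`, then try a TCP connection to each resolved address."""
    result = {
        "host": host,
        "addresses": [],
        "dns_error": None,
        "connect_error": None,
        "connected": False,
    }
    try:
        # Step 1: name resolution only (no packets to the target yet).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["dns_error"] = str(exc)
        return result

    result["addresses"] = sorted({info[4][0] for info in infos})

    # Step 2: plain TCP connect, stopping at the first address that answers.
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            result["connected"] = True
            result["connect_error"] = None
            break
        except OSError as exc:
            result["connect_error"] = str(exc)
    return result
```

Calling `check_host("artifactory.internal.com")` from a Python prompt in the container would then show whether the name resolves at all (`dns_error`) or resolves but cannot be reached (`connect_error`).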

I checked all the versions and the problem seems to appear in version 20 (with the switch to CentOS 9).
Again, I may be missing something obvious, but I carefully read the docs (AWX + CentOS), browsed the current issues and couldn’t find the slightest clue.

For now, as I’m at an early stage, I just dropped the installation of the additional packages, but as they are security related, I won’t be able to go to production without them.

Everything is detailed here:
https://github.com/ansible/awx/issues/13543

Thanks in advance for any piece of information, advice or experience on that.

kurokobo1 · March 23, 2023, 12:42pm

Hi,

> getaddrinfo() thread failed to start

What version of Docker are you using?

Sometimes an old Docker version causes a similar issue, since the newer glibc shipped in CentOS 9 does not work on old Docker.
I’d recommend trying again with a newer Docker.

If your issue still exists with the latest Docker, you should start your investigation with a plain CentOS Stream 9 image instead of AWX.

mbutton77 · March 23, 2023, 3:59pm

Hi,

Nice suggestion, I will try that right away and give you an updated status.

Thanks!

mbutton77 · March 23, 2023, 5:06pm

Alas, I have reached the same conclusion.

I was initially working with a RHEL 7 machine with Docker 18.03.
My latest test was on a RHEL 8, with Podman 2.0.5.

Ok, I will investigate directly with a raw CentOS 9 image and see what I can do with that.

Thanks for the reply anyway.

kurokobo1 · March 23, 2023, 6:35pm

Hi,

Both Docker 18.03 and Podman 2.0.5 are too old.
I don’t think such old versions of Docker or Podman can handle the security-hardened syscall wrappers implemented in glibc 2.34.
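One way to test whether the container runtime is blocking thread creation, as opposed to a DNS problem, is to try starting a thread inside the container. The sketch below is only a probe under assumptions: `probe_thread_start` is a hypothetical helper, and the usual explanation for this symptom (glibc 2.34 creating threads through the `clone3` syscall, which older default seccomp profiles reject) comes from commonly reported cases, not from anything confirmed in this thread.

```python
import platform
import threading


def probe_thread_start():
    """Try to start one thread and report the C library version with the outcome."""
    # libc_ver() returns empty strings when the C library cannot be identified.
    libc_name, libc_version = platform.libc_ver()
    report = {
        "libc": libc_name,
        "libc_version": libc_version,
        "thread_started": False,
        "error": None,
    }
    worker = threading.Thread(target=lambda: None)
    try:
        # On an affected runtime this is expected to raise RuntimeError
        # ("can't start new thread"); that expectation is an assumption.
        worker.start()
    except RuntimeError as exc:
        report["error"] = str(exc)
        return report
    worker.join()
    report["thread_started"] = True
    return report
```

If `thread_started` comes back `False` on the old runtime and `True` on a newer one with the same image, that would be consistent with the runtime, not the image, being the cause of the `getaddrinfo() thread failed to start` error.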

mbutton77 · March 24, 2023, 10:12am

Hey,

Thanks for your input. I will see what my options are here, as my client has a fixed path for package upgrades.
But you are right, it has nothing to do with AWX, it’s more a matter of CentOS 9 and the container runtime.

I guess we can park the thread for now, but I will post the results of my future tests here and in the “bug” I opened.

Thanks again for your valuable input @kurokobo.

mbutton77 · March 24, 2023, 2:00pm

Hi again,

Ok, I found an alternate repo to get a more recent Podman package (4.2.0), but unfortunately the result remains unchanged:

[root@max-rhel8 ~]# podman --version
podman version 4.2.0
[root@max-rhel8 ~]# podman run --rm -it --entrypoint=bash d7456a00e6af
bash-5.1$ curl -kv https://artifactory.internal.com/artifactory

Could not resolve host: artifactory.internal.com
Closing connection 0
curl: (6) Could not resolve host: artifactory.internal.com
bash-5.1$

But you’re right, I’ll dig deeper with a base CentOS 9 image.

Anyway, thanks again for your suggestions.

kurokobo1 · March 24, 2023, 4:30pm

Hi,

> the result remains unchanged

I see your error has changed.

On old Docker / Podman:

curl: (6) getaddrinfo() thread failed to start

On newer Podman:

curl: (6) Could not resolve host: artifactory.internal.com

So I think your initial issue has been solved by the newer Podman, but there is a different issue now.
I guess it’s a DNS-related issue. Try double-checking the DNS settings inside the container and around Podman.
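As a starting point for that DNS check, the sketch below lists what the resolver inside the container is actually configured to use. It assumes `python3` is available in the image; `parse_resolv_conf` is a hypothetical helper that only reads the common `nameserver`, `search`/`domain` and `options` keywords.

```python
def parse_resolv_conf(text):
    """Extract nameservers, search domains and options from resolv.conf content."""
    config = {"nameservers": [], "search": [], "options": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (starting with '#' or ';').
        if not line or line[0] in "#;":
            continue
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            config["nameservers"].append(values[0])
        elif keyword in ("search", "domain"):
            # Later search/domain lines replace earlier ones.
            config["search"] = values
        elif keyword == "options":
            config["options"].extend(values)
    return config
```

Running `parse_resolv_conf(open("/etc/resolv.conf").read())` both inside the container and on the host makes it easier to compare the two: if the container lists a nameserver that cannot resolve internal names such as artifactory.internal.com, that would explain the `Could not resolve host` error.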
