I want to use a custom pod specification with a container group.
The AWX install is in namespace1
I have created a second namespace namespace2 where I want to run container group jobs.
Both namespaces are on the same Kubernetes cluster.
I created a service account with a secret in namespace2. It has the rights to create secrets and run pods in namespace2.
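In case it helps, this is roughly what I set up in namespace2 (resource names are illustrative; the Role verbs are deliberately broad because I was also trying to rule out permissions, so the exact set AWX actually needs may be narrower):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: serviceaccountname
  namespace: namespace2
---
# Long-lived token secret for the service account (on Kubernetes 1.24+ these
# are no longer auto-created, so it has to be created explicitly).
apiVersion: v1
kind: Secret
metadata:
  name: serviceaccountname-token
  namespace: namespace2
  annotations:
    kubernetes.io/service-account.name: serviceaccountname
type: kubernetes.io/service-account-token
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: awx-container-group
  namespace: namespace2
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log", "pods/attach", "secrets"]
    verbs: ["get", "list", "watch", "create", "delete", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: awx-container-group
  namespace: namespace2
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: awx-container-group
subjects:
  - kind: ServiceAccount
    name: serviceaccountname
    namespace: namespace2
```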
I created a credential of type Kubernetes Bearer Token using the certificate data and token of the service account from namespace2.
I also created a container group that is linked to this credential.
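For reference, the custom pod spec on the container group looks roughly like this (trimmed down; the container section is just what was pre-filled, and the image is a placeholder):

```yaml
apiVersion: v1
kind: Pod
metadata:
  namespace: namespace2
spec:
  serviceAccountName: serviceaccountname
  automountServiceAccountToken: true
  containers:
    - image: quay.io/ansible/awx-ee:latest   # placeholder; left at the default
      name: worker
      args:
        - ansible-runner
        - worker
        - '--private-data-dir=/runner'
```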
My expectation is that when the EE image runs, it uses this service account credential and creates a pod in namespace2.
However, when a job runs I get this error:
Error creating pod: pods is forbidden: User "system:serviceaccount:namespace2:serviceaccountname" cannot create resource "pods" in API group "" in the namespace "namespace1"
Why is it trying to create a pod in namespace1 instead of namespace2???
Bit frustrated. Any guidance is most appreciated!!
I have tried the following variations without success:
serviceAccountName: serviceaccountname and also serviceAccountName: default
automountServiceAccountToken: true and also automountServiceAccountToken: false
I also granted serviceaccountname pretty much all rights just to rule that out as the issue.
On my side, I have a credential of type “OpenShift or Kubernetes API Bearer Token” attached to the container group, with my Kubernetes API endpoint defined in that cred.
This works as expected in my own AWX deployment, so all I can think of is to double-check that your AWX job template is configured to use the correct instance group for the namespace you want to use.
@mcen1 - I have the same credential (k8s API bearer token) configured on both the container group definition and the template definition. Should I be specifying it only on the template and not on the container group, or vice versa? Either way, right now I do have the credential connected to both the template and the container group.
You don’t need the credential on the job template, just associated with the instance group. Since you are not seeing the right namespace, I suspect your job template isn’t configured to use that instance group and is just defaulting to the default instance group.
If you had an incorrect cred or one with insufficient permissions, you’d see an error about being unable to connect, being unable to create a pod, or missing some other necessary permission. The fact that you’re seeing pods in the wrong namespace seems to me like something simple is missing right now.
Try posting your job template with any sensitive info redacted, and the Kubernetes cred you’re using as well.
Didn’t realize there was a verbosity switch on the Template. Tried that, but the output message is exactly the same; no more details than in Normal mode of logging.
“Instance Groups” is set to Mycontainergroup01, which is the name of my container group. I just have a dummy host called localhost with this variable YAML.
Gotcha. I’m out of ideas on things to check; your setup looks correct to me. Maybe there’s some deep Kubernetes/AWX lore I’m missing out on that made mine work.
Still no success. What I am concluding is that the container group set on the template is not getting used. Although the completed jobs show up with the correct container group name, I don’t think it’s really using the container group, which would explain why it is not using the right namespace. Also, other changes in my customized pod specification do not seem to be getting applied. Any thoughts on why it might be ignoring my container group?
Don’t know why it would be happening the way you’re describing. Just for giggles, in the AWX UI, go to Instance Groups and click on the “default” container group, then click on the Jobs tab. See if your jobs are landing in the default container group instead of your desired container group.
Good suggestion. I checked that also. The executed jobs all appear under the right container group instance, but I don’t feel it’s using the custom pod specification. The reason I suspect this is:
There is a custom namespace that it is ignoring
There is a volume mount I am trying to add, but when I query for mounts inside the Ansible playbook, it doesn’t show up (simplified check below)
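Roughly what I mean by querying for mounts (stripped down; the mount path is illustrative):

```yaml
- hosts: localhost
  gather_facts: true
  tasks:
    - name: Show what is mounted inside the EE pod
      ansible.builtin.debug:
        var: ansible_mounts

    - name: Check the expected mount path (path is illustrative)
      ansible.builtin.stat:
        path: /mnt/mydata
      register: mount_dir

    - name: Report whether the path exists
      ansible.builtin.debug:
        var: mount_dir.stat.exists
```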
So I feel it’s either picking the right container group but then ignoring the custom pod spec, OR it’s not using the container group at all. Since the jobs show up on the right container group, I am thinking it’s more the former, where it’s not applying the custom pod spec.
In this link it states: “Note that the image used by a job running in a container group is always overridden by the Execution Environment associated with the job.”
Does that mean it will also ignore the custom pod spec? If I blank out the Execution Environment on the template, it still picks “AWX EE (24.6.1)” as the default.
Also, if you look at a recent job run from your job template’s “Job” history tab and click on it, do you see that the job run indeed used the container group you expected in the “job details” tab (see pic below)? If so, can you click on the container group name’s link from that part of the UI and validate that the container group is the one with your desired pod spec?
And I really appreciate all the interest and troubleshooting advice you have provided! Otherwise, it’s so easy to lose heart and walk away if community support is lacking. So thanks again!
Version is core 2.18.7
Yes, the container group it’s showing on the job is the right one.