I have a working configuration from a previous admin that successfully updates the krb5.conf file on the Web node, the Task node, and the EE node, but it does not update krb5.conf on the ephemeral worker nodes. I received advice to create a Container Group to build this into the worker nodes, but I have so far failed to understand that process and have fallen back to wondering whether I'm already doing that. I don't know.
This configuration works for the 3 permanent nodes:
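Roughly, the AWX spec wires the ConfigMap in through extra_volumes and the per-node *_extra_volume_mounts fields, along the lines of the sketch below; the resource name and paths here are illustrative, and the ConfigMap name matches the one described further down.

```yaml
# Sketch only: the relevant fields of the AWX custom resource that mount a
# krb5.conf ConfigMap into the web, task, and EE containers.
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: it-myapp-awx
spec:
  extra_volumes: |
    - name: krb5-conf
      configMap:
        name: awx-krb5-conf-cm
        items:
          - key: krb5.conf
            path: krb5.conf
  web_extra_volume_mounts: |
    - name: krb5-conf
      mountPath: /etc/krb5.conf
      subPath: krb5.conf
  task_extra_volume_mounts: |
    - name: krb5-conf
      mountPath: /etc/krb5.conf
      subPath: krb5.conf
  ee_extra_volume_mounts: |
    - name: krb5-conf
      mountPath: /etc/krb5.conf
      subPath: krb5.conf
```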
I think I have an answer, but I am having trouble applying it in OpenShift. I have created a new ConfigMap with the value I want, and I tested it in a sandbox. I'm able to make my new ConfigMap show up on the 3 persistent nodes. I've since found that this config should also apply to the ephemeral worker nodes (Application nodes, I believe they are called).
So, I made the change listed here and applied it to my OpenShift cluster. It says the config was changed, but the YAML has not picked up my changes. To make my sandbox pick it up, I needed to delete my entire namespace/project. That's not appealing now that I have a ton of work invested in the namespace. I could take a backup, but then the backup is stored in the namespace and I cannot delete the namespace. I seem to be monkey-trapped, and I cannot even test that the Application node picks up the new ConfigMap without doing this.
Is there anyone with OpenShift experience who knows how to make the platform actually pick up the reconfiguration the oc command says it published?
OK, I'm confused. I have this set up exactly the way I think it should be set up, but what I expect to happen is not happening. I have set it up so that krb5.conf is mounted on all the persistent nodes, and the Application nodes have the exact same configuration, but the mount is not there.
To be specific:
web_extra_volume_mounts, task_extra_volume_mounts, and ee_extra_volume_mounts all receive the ConfigMap specified as awx-krb5-conf-cm, which I am calling krb5instance.
Per my understanding of the documentation, extra_volume_mounts should cause the same mount to appear in every Application node when it spins up. Instead, I get the Linux default.
Added the 2 entries below to resolve the "KDC has no support for encryption type while getting initial credentials" error when connecting to a Windows IIS server.
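For this particular error, the usual fix is a pair of enctype settings in the [libdefaults] section of krb5.conf, along the lines of the sketch below; the exact enctype list is an assumption and depends on what the KDC actually allows.

```ini
# Illustrative only: broaden the encryption types the client will request.
# rc4-hmac is legacy and should be kept only if the domain controllers still require it.
[libdefaults]
    default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96 rc4-hmac
    default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96 rc4-hmac
```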
F.Y.I., if Kerberos authentication is only required to run the Playbook, then the *_extra_volume_mounts are not required, because Kerberos authentication is done in the ephemeral "automation job pod" (perhaps this is what you are describing as a "worker node").
Mounting customized files in the automation job pod is outside the scope of the AWX Operator. Using Container Groups in AWX (as described in my guide) is my recommendation.
You pointed this out to me in January. I worked on it for a few hours and was then dragged away from the problem for a month. When I came back, I was able to make the Pod spin up successfully, but I was not able to make the job run in the Pod; the job always spun up in the default automation job pod. I went back to the documentation and found the document I reference above, which explicitly states that the extra_volume_mounts are for the automation jobs. I allowed myself to be seduced by that happy promise, but I guess that was a blunder on my part. I will return to trying to make the Container Groups do the magic when I get back to work on Tuesday.
I think there’s a chance the gap in my configuration is addressed in your documentation here:
Specify Instance Group that has custom pod spec that mounts your ConfigMap as /etc/krb5.conf.
I still don’t know how to do that. I don’t know what that looks like in a Job Template, but I guess it’s something I can find and set. If I’m right, then I’m about 95% of the way home because I have the Pod up and running. I just need to shift the workload to it.
Disappointment, @kurokobo. Sorry to bother again. I was pretty close to sure I’d had the Pod running before. I cannot get it running now. I definitely installed it before, but I cannot remember for sure whether it was merely there or actually running. Idiot.
OK. Here’s where I am. When I run this config, I get a parsing error:
Error:
{"status": "error", "job_explanation": "Failed to JSON parse a line from transmit stream."} {"eof": true}
Where is this error displayed? I can’t reproduce your issue on my side with your configuration.
With the configuration you've provided, the automation job pod cannot be created in the first place, so your job should have failed with a different error ("at least one container must be named worker") before the error you saw occurs.
Anyway, the pod spec for a Container Group is not an actual pod spec that is used as-is to create the pod; it is just a template for the pod spec.
So, some fields in the custom pod spec will be ignored or overridden by AWX.
Here are some considerations:
name: kerberosworker
The name of the container where Ansible Runner runs MUST be "worker". This is a restriction from Receptor and AWX.
So changing this name causes the "at least one container must be named worker" error. You cannot modify this; you have to specify "worker".
name: kerberos-pod
The name of the pod (metadata.name) will be overridden as "automation-job-*****" by AWX, so it's okay if there is no pod name in the custom pod spec.
namespace: it-myapp-awx
Ensure this is the namespace where your AWX is running.
If this is a different namespace from the one where your AWX is running, you have to consider the ServiceAccount, Role, and RoleBinding, since the SA for AWX has no permission to create pods in a different namespace.
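If you really do need a separate namespace for the job pods, the RBAC you would have to add there looks roughly like the sketch below; the namespace and ServiceAccount names are placeholders, and the verb list is only a guess at what Receptor and AWX need.

```yaml
# Sketch only: allow the AWX ServiceAccount to manage job pods in another namespace.
# "automation-jobs" and "awx" are placeholder names; adjust to your installation.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: awx-job-pods
  namespace: automation-jobs
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["pods/log", "pods/attach"]
    verbs: ["get", "create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: awx-job-pods
  namespace: automation-jobs
subjects:
  - kind: ServiceAccount
    name: awx            # the SA your AWX task pod actually runs as
    namespace: it-myapp-awx
roleRef:
  kind: Role
  name: awx-job-pods
  apiGroup: rbac.authorization.k8s.io
```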
I have posted 2 messages, and neither is showing now. At the bottom of the screen I see the message, “Message has been deleted”. Can anyone explain what that means?
I attached a file to each of the 2 messages that were deleted. I can only assume attaching that text file is what caused the messages to be deleted. That was a lot of typing, and it’s lost now. Sigh. Shame on me for not keeping a copy.
Regroup and try again.
@kurokobo, thank you for explaining the configurations to me. Sadly, they do not work the way you explain them. If I leave out the metadata.name, I get a Pod creation error (Error "Required value: name or generateName is required" for field "metadata.name"). If I leave off the image tag, I get a similar error (Error "Required value" for field "spec.containers[0].image"). And, yes, the namespace is correct for my project.
If I add the generateName and image tag, then I get a stream of events around creating the Pod. It pulls the image, creates the container, starts the container, then goes directly to back-off (Back-off restarting failed container). If I look in the container's current or previous logs, there are only the 2 lines you have already seen ({"status": "error", "job_explanation": "Failed to JSON parse a line from transmit stream."} {"eof": true}). I do not know how to see the logs behind that log entry.
My assessment so far, given that I’m both a newbie and not showing many signs of cleverness here, is that there’s something wrong with the way I’m installing the controller, the deployment, or the AWX Pod itself. I don’t yet know where to start in tearing that apart to see what I’ve done wrong. I will try some random hacking and see where it gets me.
Okay, maybe now I understand: you are trying to create the pod manually by saving the spec as a YAML file and invoking "kubectl create (or apply) -f your-file.yaml".
Open AWX UI and open “Instance Groups” under “Administration”, then press “Add” > “Add container group”.
Enter “Name” as you like (e.g. “kerberos”) and toggle “Customize pod specification”.
Put your custom pod spec YAML (see the sketch after these steps) into "Custom pod spec" and press "Save".
Then specify your Container Group in the Job Template.
There is no need to save the pod spec to a file or run the kubectl command with that file. Just input the spec into the field in the Web UI.
AWX will then create the pod for you when it runs the job.
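For reference, a minimal custom pod spec for this use case looks something like the sketch below, assuming your ConfigMap is named awx-krb5-conf-cm with a krb5.conf key and your AWX runs in the it-myapp-awx namespace; adjust these names to your environment.

```yaml
# Sketch of a Container Group custom pod spec that mounts a ConfigMap as /etc/krb5.conf.
apiVersion: v1
kind: Pod
metadata:
  namespace: it-myapp-awx
spec:
  serviceAccountName: default
  automountServiceAccountToken: false
  containers:
    - image: quay.io/ansible/awx-ee:latest  # replaced by the job's execution environment image at run time
      name: worker                          # must be exactly "worker"
      args:
        - ansible-runner
        - worker
        - '--private-data-dir=/runner'
      resources:
        requests:
          cpu: 250m
          memory: 100Mi
      volumeMounts:
        - name: krb5-conf
          mountPath: /etc/krb5.conf
          subPath: krb5.conf
  volumes:
    - name: krb5-conf
      configMap:
        name: awx-krb5-conf-cm
        items:
          - key: krb5.conf
            path: krb5.conf
```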
Speechless. Yes. That worked the first time and was brilliantly easy. Thank you, @kurokobo.
I don't suppose I need to explain how I was interpreting everything in light of things I'd already figured out the hard way, and was therefore just blind to the words I was reading. Thank you for showing me the way.