We are trying to deploy the new "termination_grace_period_seconds" feature, which is described in the documentation (https://github.com/ansible/awx-operator/blob/devel/docs/user-guide/advanced-configuration/pods-termination-grace-period.md) and was announced in AWX-Operator version 1.3.0, via a Helm values file, but we are receiving an error. What is wrong in our Helm values file, and how can we use this feature via Helm values? Thanks in advance.
Error: UPGRADE FAILED: error validating "": error validating data: ValidationError(AWX.spec): unknown field "termination_grace_period_seconds" in com.ansible.awx.v1beta1.AWX.spec
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
awx-dev-awx-operator awx 46 2023-09-28 09:13:25.71129122 +0000 UTC deployed awx-operator-2.5.3 2.5.3
- name: custom-awx-ee
I don’t have any explanation, but what if you uninstall your release and then deploy from scratch?
You probably need to update the operator before adding new fields.
I can try, but the usage of this option in the values file is correct, and it is possible to use it via Helm templates, right?
The current version of AWX-Operator is 2.5.3, and I’ve tried upgrading to the latest version, 2.7.1, with the new field, without success: the error is the same.
You need to upgrade the operator without the termination_grace_period_seconds spec; then, once the operator is upgraded, reapply with the termination_grace_period_seconds field.
This is something we have already done, because we’ve updated the operator multiple times in the past.
Unfortunately, uninstalling and reinstalling the operator via Helm didn’t help, and the error is the same. On a new test cluster it is possible to install the operator with the "termination_grace_period_seconds" value. We are using the latest Helm version. Any suggestions?
Error: UPGRADE FAILED: error validating "": error validating data: ValidationError(AWX.spec): unknown field "termination_grace_period_seconds" in com.ansible.awx.v1beta1.AWX.spec
helm.go:84: [debug] error validating "": error validating data: ValidationError(AWX.spec): unknown field "termination_grace_period_seconds" in com.ansible.awx.v1beta1.AWX.spec
It seems the CRD hasn’t been updated with the newer schema that contains termination_grace_period_seconds.
You can verify this by running:
kubectl get crd awxs.awx.ansible.com -o yaml | grep termination_grace_period_seconds
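Note that grep prints nothing when the field is absent, which is easy to misread as a failed command. A minimal sketch of a wrapper that makes the result explicit; crd_has_field is a hypothetical helper name, not part of kubectl or the AWX Operator, and it only inspects whatever YAML is piped into it:

```shell
# Hypothetical helper (an assumption for illustration, not an existing tool):
# reads CRD YAML on stdin and reports whether the given field name appears in it.
crd_has_field() {
  # -q: no output, exit status only; -F: match the name as a fixed string
  if grep -qF -- "$1"; then
    echo "present"
  else
    echo "missing"
  fi
}

# Intended use against a live cluster (requires kubectl access):
#   kubectl get crd awxs.awx.ansible.com -o yaml | crd_has_field termination_grace_period_seconds
```

If this reports "missing", the cluster is still serving the old CRD schema, which would explain the "unknown field" validation error above.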
You are totally right, and that was the missing part. Deleting the resource and reinstalling the operator did the trick. Thank you very much for your help.
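For anyone who cannot delete the CRD (deleting a CRD also deletes every custom resource of that kind), refreshing the CRD in place may be an alternative. A minimal sketch, assuming the operator repository exposes its CRDs as a kustomize directory at config/crd and that the Git tag matches the chart version; both are assumptions to check against the operator's Helm documentation. crd_update_cmd is a hypothetical helper that only prints the command so it can be reviewed before running:

```shell
# Hypothetical helper (an assumption for illustration): prints, but does not run,
# a kubectl command that would re-apply the AWX CRDs for a given operator version.
# The config/crd path and the ref=<version> tag are assumptions, not confirmed in this thread.
crd_update_cmd() {
  version="$1"
  printf 'kubectl apply --server-side -k "github.com/ansible/awx-operator/config/crd?ref=%s"\n' "$version"
}

# Print the command for the version mentioned in this thread
crd_update_cmd 2.7.1
```

After the CRD is refreshed, the helm upgrade with termination_grace_period_seconds in the values file can be retried.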
I spent some time looking at the termination grace period bits that were recently added, but I’m unable to find out whether it provides the same grace period to workflow jobs.
What I mean is: if you define a grace period (essentially a guess at the time a given pod needs) and the pods terminate safely on a given node, will the workflow itself (stored in the DB) survive when that node goes down for maintenance after the grace period, provided its child workflow node jobs are running on other nodes?
I haven’t had a chance to specifically test this as it would require a lot of scaffolding to ensure I have visibility into what is terminated and when (if expected or not).
It’s something I’ve been digging through for the past couple of weeks, and everything seems to relate to workflow jobs for me.
Sorry for the noise.