Error: UPGRADE FAILED: error validating "": error validating data: ValidationError(AWX.spec): unknown field "termination_grace_period_seconds" in com.ansible.awx.v1beta1.AWX.spec
NAME                  NAMESPACE  REVISION  UPDATED                                 STATUS    CHART               APP VERSION
awx-dev-awx-operator  awx        46        2023-09-28 09:13:25.71129122 +0000 UTC  deployed  awx-operator-2.5.3  2.5.3
I mean, you need to upgrade the operator first without termination_grace_period_seconds in the spec; then, once the operator has been upgraded, reapply with the termination_grace_period_seconds field.
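A minimal sketch of that two-step approach, assuming the AWX resource is created through the chart's `AWX.spec` values. This is an assumption, since the values file is not shown in this thread; the file name, instance name, `service_type` line, and the 600-second value are placeholders for illustration only.

```yaml
# values.yaml (hypothetical file name; all values below are placeholders)
AWX:
  enabled: true
  name: awx-dev            # assumed instance name
  spec:
    service_type: ClusterIP  # stand-in for whatever spec you already have
    # Step 1: upgrade the operator with the field below left commented out.
    # Step 2: after the operator upgrade has completed, uncomment it and apply again.
    # termination_grace_period_seconds: 600
```

The idea is simply that the first upgrade must not contain a field the currently installed AWX definition does not know about yet.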
Unfortunately, uninstalling and reinstalling the operator via helm didn’t help; the error is the same. On a new test cluster it is possible to install the operator with the “termination_grace_period_seconds” value. We are using the latest helm version. Any suggestions?
You are totally right, and that was the missing part. Deleting the resource and reinstalling the operator did the trick. Thank you very much for your help.
I spent some time looking at the termination grace period bits that were recently added, but I’m unable to find out whether they will provide the same grace period to workflow jobs.
What I mean is: say you define a grace period (it’s essentially a guess at how long a given pod needs) and the pods terminate safely on a given node. If that node then goes down for maintenance after the grace period, will the workflow itself (stored in the DB) survive, provided its child workflow node jobs are running on other nodes?
I haven’t had a chance to test this specifically, as it would require a lot of scaffolding to give me visibility into what is terminated and when (and whether that was expected).
It’s something I’ve been digging through for the past couple of weeks, and everything seems to come back to workflow jobs for me.