
nginx breaking new cluster creations for 1.24 and 1.25 due to pre-hook manifests

From https://cern.service-now.com/nav_to.do?uri=%2Fincident.do%3Fsys_id%3D2c544e6e87242910094a65770cbb358e

A user reports that the cluster template does not install the cluster correctly. On the affected cluster, it looks like ingress-nginx is breaking the helm bring-up of the cluster because of the pre-hook problem (also verified with kube-prometheus-stack).

$ sk8s 7d57fd86-114f-4dfb-ad7a-84dd4d0b2875
$ ka get po
NAMESPACE       NAME                                               READY   STATUS              RESTARTS   AGE
kube-system     cern-magnum-ingress-nginx-admission-create-kj59g   0/1     Pending             0          54m
kube-system     k8s-keystone-auth-x2k59                            1/1     Running             0          54m
magnum-tiller   install-cern-magnum-job-8zvhq                      0/1     Error               0          48m
magnum-tiller   install-cern-magnum-job-fg5k4                      0/1     Error               0          48m
magnum-tiller   install-cern-magnum-job-gl5gr                      0/1     Error               0          54m
magnum-tiller   install-cern-magnum-job-jlkhg                      0/1     Error               0          48m
magnum-tiller   install-cern-magnum-job-jpg5m                      0/1     Error               0          48m
magnum-tiller   install-cern-magnum-job-sbhsb                      0/1     Error               0          48m
magnum-tiller   tiller-deploy-5bc784966d-hzjqs                     0/1     ContainerCreating   0          54m
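To pull only the unhealthy pods out of a listing like the one above, a small filter can help. This is a minimal sketch: `unhealthy_pods` is a hypothetical helper name, and it assumes the `ka` alias used above expands to something like `kubectl --all-namespaces`, so that the input has the same six columns with STATUS in the fourth.

```shell
# Hypothetical helper: print "namespace/name status" for every pod that is
# neither Running nor Completed. Reads `kubectl get pods -A` style output on
# stdin and skips the header line. Assumes STATUS is the fourth column.
unhealthy_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Usage would be along the lines of `kubectl get pods -A | unhealthy_pods`.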

Checking the helm release that installs ingress-nginx:

$ helm list -A
NAME       	NAMESPACE  	REVISION	UPDATED                                	STATUS	CHART             	APP VERSION
cern-magnum	kube-system	1       	2023-01-25 12:24:46.885866308 +0000 UTC	failed	cern-magnum-0.12.0

We know the ingress-nginx chart is at version 4.0.6, so pull that chart and look for its hook annotations:

$ cd /tmp
$ helm pull stable/ingress-nginx --version 4.0.6 --untar=true
$ cd ingress-nginx
$ grep -R "helm.sh/hook"
templates/admission-webhooks/job-patch/role.yaml:    "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
templates/admission-webhooks/job-patch/role.yaml:    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
templates/admission-webhooks/job-patch/psp.yaml:    "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
templates/admission-webhooks/job-patch/psp.yaml:    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
templates/admission-webhooks/job-patch/clusterrole.yaml:    "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
templates/admission-webhooks/job-patch/clusterrole.yaml:    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
templates/admission-webhooks/job-patch/rolebinding.yaml:    "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
templates/admission-webhooks/job-patch/rolebinding.yaml:    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
templates/admission-webhooks/job-patch/clusterrolebinding.yaml:    "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
templates/admission-webhooks/job-patch/clusterrolebinding.yaml:    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
templates/admission-webhooks/job-patch/job-patchWebhook.yaml:    "helm.sh/hook": post-install,post-upgrade
templates/admission-webhooks/job-patch/job-patchWebhook.yaml:    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
templates/admission-webhooks/job-patch/job-createSecret.yaml:    "helm.sh/hook": pre-install,pre-upgrade
templates/admission-webhooks/job-patch/job-createSecret.yaml:    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
templates/admission-webhooks/job-patch/serviceaccount.yaml:    "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
templates/admission-webhooks/job-patch/serviceaccount.yaml:    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
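The grep output above mixes hook lines with delete-policy lines. To see only which templates carry a given hook (for example `pre-install`), something like the following could be used. It is a sketch, not part of helm: `list_hook_templates` is a hypothetical helper, and it assumes the annotations are written in the quoted `"helm.sh/hook": ...` form shown above.

```shell
# Hypothetical helper: list files under a chart directory whose
# "helm.sh/hook" annotation mentions the given hook (default: pre-install).
# The pattern includes the closing quote and colon, so it matches only the
# "helm.sh/hook" key and not "helm.sh/hook-delete-policy".
list_hook_templates() {
  chart_dir="$1"
  hook="${2:-pre-install}"
  grep -R -l -E "\"helm\.sh/hook\": .*${hook}" "$chart_dir" | sort
}
```

For the chart pulled above, this would be called as `list_hook_templates /tmp/ingress-nginx/templates`, or with `post-install` as a second argument to list the post-install hooks instead.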