Mount failures in backup, code hardening required
Got a weird error in the last CI run: the backup job failed with
mounting pvc-96e51aa8-4ba1-4a92-ab2b-ab2d6be181bc in /mnt JOB_UID: backup-volumes-cephfs-test-11647251 ...
mount: wrong fs type, bad option, bad superblock on ,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
I suspect something failed when retrieving information from Manila to mount the PV, resulting in an invalid mount
command because some variable was empty.
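If that suspicion is right, a guard just before the mount step would turn an empty variable into an explicit error instead of a malformed `mount` invocation. A minimal sketch, assuming bash; `require_vars`, `MOUNT_SOURCE` and `MOUNT_OPTIONS` are hypothetical names, not taken from `backup_pvs.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed only if every named variable is set and
# non-empty. Takes variable NAMES (not values) as arguments.
require_vars() {
    local name missing=0
    for name in "$@"; do
        # ${!name:-} is bash indirect expansion; the ":-" keeps "set -u" happy
        # when the variable is unset.
        if [ -z "${!name:-}" ]; then
            echo "Required variable '${name}' is empty or unset" 1>&2
            missing=1
        fi
    done
    return "$missing"
}

# Illustrative usage (variable names are assumptions):
#   require_vars MOUNT_SOURCE MOUNT_OPTIONS || exit 1
#   mount -t ceph "$MOUNT_SOURCE" /mnt -o "$MOUNT_OPTIONS"
```

Reporting every missing variable before returning, rather than stopping at the first, should make the CI log show exactly which lookup came back empty.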
Looking at https://gitlab.cern.ch/paas-tools/okd4-deployment/backup-cephfs-volumes/-/blob/master/backup_pvs.sh, some hardening is required:
- Data piped to `jq` is not properly quoted. `echo $ITEM | jq ...` will NOT properly pass the contents of `$ITEM` to `jq`. `echo "$ITEM" | jq ...` is the proper form.
- There is no validation of any of the variables obtained via `jq`. Every `NAMESPACE_CSI_DRIVER=$(echo ...)` should be followed by something like `if [ -z "${NAMESPACE_CSI_DRIVER}" ]; then echo "Failed to retrieve CSI driver namespace" 1>&2; exit 1; fi`
- When retrieving data from the OpenStack service (https://gitlab.cern.ch/paas-tools/okd4-deployment/backup-cephfs-volumes/-/blob/6cefc2f6/backup_pvs.sh#L29-36), use `jq` to improve resiliency of output parsing. Use the same solution as https://gitlab.cern.ch/paas-tools/okd4-install/-/merge_requests/222#note_4094204 (code in https://gitlab.cern.ch/paas-tools/okd4-install/-/merge_requests/222/diffs?commit_id=61d58023012ea20e47a84c65993223e3361a79f1)
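
The quoting and validation points above could be combined in one small wrapper. This is only a sketch under assumptions: `json_field` is a hypothetical helper, the sample JSON and its field names are made up, and it is not the code from the linked merge request. The same helper could presumably be fed OpenStack client output, assuming the client is invoked with its `-f json` formatter.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: extract one value from a JSON document with jq and
# fail loudly if the value is missing, null, false, or an empty string.
json_field() {
    local json="$1" filter="$2" value
    # jq -e exits non-zero when the result is null or false; -r prints raw
    # strings without surrounding quotes.
    if ! value=$(echo "$json" | jq -er "$filter"); then
        echo "Failed to retrieve '${filter}'" 1>&2
        return 1
    fi
    if [ -z "$value" ]; then
        echo "Empty value for '${filter}'" 1>&2
        return 1
    fi
    echo "$value"
}

# Sample input; the field names are invented for illustration.
ITEM='{"description": "two  spaces", "export": null}'

# Unquoted: word splitting collapses the double space before jq sees it.
echo $ITEM | jq -r '.description'      # prints: two spaces
# Quoted: the JSON reaches jq unchanged.
echo "$ITEM" | jq -r '.description'    # prints: two  spaces

# Validated retrieval; aborts the script if the field cannot be read.
DESCRIPTION=$(json_field "$ITEM" '.description') || exit 1
```

With this sample, `json_field "$ITEM" '.export'` returns non-zero and prints an error on stderr, so a null field stops the job before any mount command is built.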
Edited by Alexandre Lossent