Resolve "Make CI pipeline run for CTA with PostgreSQL Scheduler DB"

Summary

The PostgresSched DB fails to start in the GitLab CI [1], although the same scripts work locally without problems. Julien pointed out that the main difference between my local setup and the GitLab CI is the way images are handled and cleaned up. We need to wait for the image to start up and run before running the init function.

[1] https://gitlab.cern.ch/cta/CTA/-/jobs/35427849

$ cd continuousintegration/orchestration/
$ ./run_systemtest.sh -n ${NAMESPACE} -p ${CI_PIPELINE_ID} -s ${TEST_SCRIPT} ${EXTENDED_OPTIONS}
Cleaning up old namespaces:
DONE
Mon Jan 22 09:40:20 AM CET 2024: Launching ./create_instance.sh -n cta-admin-6766181gitf3b1c260-662g -D -O -p 6766181 -d internal_postgres.yaml -o internal_pgsched.yaml 2>&1
Creating instance for image built on commit f3b1c260 with gitlab pipeline ID 6766181
Creating instance using docker image with tag: 6766181gitf3b1c260
DB content will be wiped
schedule data store content will be wiped
Creating cta-admin-6766181gitf3b1c260-662g instance namespace/cta-admin-6766181gitf3b1c260-662g created
Copying ctaregsecret secret in cta-admin-6766181gitf3b1c260-662g namespace
secret/ctaregsecret created
configmap/init created
creating configmaps in instance
service/postgres-sched created
configmap/objectstore-config created
pod/postgres-sched created
service/postgres created
configmap/database-config created
pod/postgres created
configmap/eos-config created
configmap/eoscta-config created
Requesting an unused mhvtl libraryWarning: metadata.annotations[volume.beta.kubernetes.io/storage-class]: deprecated since v1.8; use "storageClassName" attribute instead
persistentvolumeclaim/claimlibrary created
.OK
configmap/library-config created
Got library: sg0
Requesting an unused log volume
persistentvolumeclaim/claimlogs created
Requesting an unused stg volume
persistentvolumeclaim/claimstg created
Creating services in instance
service/ctacli created
service/ctaeos created
service/ctafrontend created
service/kdc created
Creating pods in instance
pod/init created
Waiting for init...........................................................init pod in Error status here are its last log lines:
database.postgres.database
database.postgres.password
database.postgres.path
database.postgres.server
database.postgres.username
database.type
Wiping objectstore
Aborting: create failed: PostgresConn connection failed: could not translate host name "postgres-sched" to address: Name or service not known
ERROR: Could not create scheduler schema. cta-scheduler-schema-create /etc/cta/cta-scheduler.conf FAILED
ERROR: init pod in ErERROR: init pod in Error state. Initialization failed.
67
FAILURE: cleaning up environment
Error from server (NotFound): pods "ctacli" not found
Collecting stdout logs of pods to /tmp/cta-admin-6766181gitf3b1c260-662g_delete_VbOm
Error from server (NotFound): pods "client" not found
Error from server (NotFound): pods "ctacli" not found
Error from server (NotFound): pods "ctaeos" not found
Error from server (NotFound): pods "ctafrontend" not found
Error from server (NotFound): pods "kdc" not found
Error from server (NotFound): pods "tpsrv01" not found
Error from server (NotFound): pods "tpsrv01" not found
Error from server (NotFound): pods "tpsrv02" not found
Error from server (NotFound): pods "tpsrv02" not found
Error from server (NotFound): pods "ctacli" not found
Saving logs as artifacts
Deleting cta-admin-6766181gitf3b1c260-662g instance
namespace "cta-admin-6766181gitf3b1c260-662g" deleted
.OK
Status of library pool after test:
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS    REASON   AGE
log00   100Gi      RWX            Recycle          Available                                    47d
sg0     1Mi        RWO            Recycle          Available           librarydevice            31d
stg00   2Gi        RWX            Recycle          Available                                    47d
Uploading artifacts for failed job
Uploading artifacts...
Runtime platform                                    arch=amd64 os=linux pid=1971267 revision=f5da3c5a version=16.6.1
pod_logs: found 14 matching artifact files and directories 
Uploading artifacts as "archive" to coordinator... 201 Created  id=35427849 responseStatus=201 Created token=64_J334v
Cleaning up project directory and file based variables
ERROR: Job failed: exit status 1
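
In the log above, the init pod aborts because it cannot reach "postgres-sched" at the moment it tries to create the scheduler schema. A minimal sketch of the "wait before init" idea from the summary follows. It is not the actual change made in create_instance.sh: the wait_for helper, the attempt count, the delay, and the ${instance} variable name are all hypothetical, and the kubectl lines are left as comments because they assume that pg_isready is available inside the postgres-sched container.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or the
# attempt budget is exhausted.
# Usage: wait_for <max_attempts> <delay_seconds> <command> [args...]
wait_for() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # Run the probe command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if ((attempt < max_attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Possible usage before the init pod is created (assumptions: the
# namespace is held in ${instance}, and the container ships pg_isready):
#   kubectl wait --for=condition=Ready pod/postgres-sched -n "${instance}" --timeout=120s
#   wait_for 30 2 kubectl exec -n "${instance}" postgres-sched -- pg_isready
```

Polling with a bounded number of attempts keeps the CI job from hanging indefinitely if the database pod never becomes ready; the job then fails with a clear timeout rather than an error from inside the init pod.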

Requires manual tests in pre-production

NO

References

Closes #383 (closed)
