
CI refactoring

Ryunosuke O'Neil requested to merge roneil/ci-matrix-refactoring into master

Related issue: #235 (closed)

  • Jobs are defined using the parallel:matrix: keyword.
  • Scripts which run the tests are to be placed in scripts/ci/jobs/TEST_NAME.sh
  • CI configuration is split across YAML files
    • main.yaml is included in .gitlab-ci.yml - entry point for the workflow
      • it should be clear what is running, and on what.
    • common*.yaml - do not actually contain any jobs, just re-usable keys
      • common-build.yaml - parallel:matrix: configurations for builds
      • common-run.yaml - parallel:matrix: configurations for run jobs, to be run on each device
      • common.yaml - keys reusable across all jobs
    • devices.yaml - contains keys to set the correct tags: and variables:
  • there is also a README.md at scripts/ci/config/README.md to give a brief idea of how to add tests + devices
  • extends: is now used instead of YAML anchors - anchors have been removed since they do not work across files pulled in with include:
  • build.sh restarts the build from the last target if it failed due to an OOM error
    • fewer job failures & avoids rerunning builds from the beginning
    • waits between retries - wait time is randomly picked and scales with number of tries
    • if the job times out after 1h30m, it is marked as failed and retried as before
    • example: https://gitlab.cern.ch/lhcb/Allen/-/jobs/13019643#L459
  • Catch2 executable is called directly in run_built_tests
    • Calling the executable allows a junit XML report to be generated and passed to the GitLab CI unit test report feature.
    • The unit tests are run on each device with RelWithDebInfo + Debug clang10 builds.
    • if more Catch2 executable targets become available then they should also be added to the run_built_tests.sh script
  • post_telegraf.py is commented out
  • allowed to fail:
    • "full test" physics-efficiency - scifi_v6 efficiency comparisons sometimes don't match #232 (closed)
    • "full test" run-changes - flaky
    • "test" run-changes - flaky
  • I experimented with 'metrics', but these are only available on the Premium tier and above (so they can be ignored)
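
The retry behaviour described for build.sh above could look roughly like the sketch below. This is not the actual build.sh: the names retry_build and pick_wait, the MAX_TRIES and BASE_WAIT defaults, and the choice to retry on any failure (rather than detecting an OOM error specifically) are all assumptions made for illustration. It relies on the build tool being incremental, so that re-invoking the same command continues from the targets that are not yet built.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a retry loop with a randomized, growing wait.
# MAX_TRIES and BASE_WAIT are assumed defaults, not values taken from build.sh.

MAX_TRIES="${MAX_TRIES:-5}"
BASE_WAIT="${BASE_WAIT:-30}"   # seconds; the upper bound of the wait grows with the try count

# Print a random wait in [1, BASE_WAIT * try] seconds.
pick_wait() {
  local try="$1"
  local max=$(( BASE_WAIT * try ))
  echo $(( (RANDOM % max) + 1 ))
}

# Run the given command until it succeeds or MAX_TRIES is reached.
# Assumption: any non-zero exit is retried; a real script would first
# check that the failure was caused by running out of memory.
retry_build() {
  local try=1
  local wait_s
  while true; do
    "$@" && return 0
    if (( try >= MAX_TRIES )); then
      echo "giving up after ${try} tries" >&2
      return 1
    fi
    wait_s="$(pick_wait "$try")"
    echo "try ${try} failed; waiting ${wait_s}s before resuming" >&2
    sleep "$wait_s"
    try=$(( try + 1 ))
  done
}
```

A job script could then wrap its build step, for example `retry_build cmake --build build`, where the build directory name is again only a placeholder. Randomizing the wait is meant to keep several jobs that ran out of memory on the same runner from all resuming at the same moment.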

At the moment there is a 'minimal' pipeline (merge requests, master, web, schedules) and a 'full' pipeline (master, web, schedules).


Todo:

  • "full" pipeline in MR can be triggered manually, auto-runs for master, web, schedules
  • build (partially complete)
  • throughput
  • run changes with / w/o
  • efficiencies
  • publishing
  • parallel:matrix: configurations match all combinations from the old configuration (correct me if something is missing)
  • address discussions in !553 (closed)
Edited by Ryunosuke O'Neil
