Add HIP tests to CI

Merged Daniel Hugo Campora Perez requested to merge dcampora_hip_tests into master

This MR adds CI tests that run on AMD hardware.

  • Use ROCm version 4.0.0 from cvmfs to compile Allen (thanks to @bcouturi).
  • Use mi100 runners, which correspond to the recently installed AMD MI100s in LHCb Online (thanks to @ksawczuk).
  • Use the run configuration -n 8000 --events-per-slice 8000 -r 100 -c 0 -t 6 -m 5000, which performs sufficiently well on this hardware. A first investigation suggests that the kernel invocation code has a high overhead, which would explain why configurations with a higher number of events are needed.
  • Use the CUDA kernel calling convention in HIP as well, since HIP supports it.
  • Set HIP_ARCH to be gfx908 by default (required by MI100).
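
The launch-overhead explanation in the run-configuration item can be illustrated with a toy model. This is only a sketch: it assumes each batch pays a fixed host-side cost per kernel launch, regardless of how many events it contains, plus a GPU cost proportional to the number of events. The function name and all numbers are hypothetical and chosen for illustration; they are not measurements from the MI100.

```python
def throughput(n_events, n_launches, launch_overhead_s, per_event_s):
    """Events per second for one batch under a fixed-cost-per-launch model."""
    total_time_s = n_launches * launch_overhead_s + n_events * per_event_s
    return n_events / total_time_s


# Hypothetical values for illustration only (not measured on any hardware).
N_LAUNCHES = 200            # kernel launches per batch, assumed independent of batch size
LAUNCH_OVERHEAD_S = 50e-6   # assumed fixed host-side cost per launch
PER_EVENT_S = 5e-6          # assumed GPU time per event

for n_events in (500, 1000, 8000):
    rate = throughput(n_events, N_LAUNCHES, LAUNCH_OVERHEAD_S, PER_EVENT_S)
    print(f"{n_events:5d} events per batch: {rate:,.0f} events/s")
```

Under these assumptions, throughput grows with batch size and approaches 1 / PER_EVENT_S, because the fixed launch cost is spread over more events. That matches the observation that larger event counts are needed when launch overhead is high.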
Edited by Daniel Hugo Campora Perez
