Non-deterministic lumi counters on 2024 data
@sandrese updated the lumi test in MooreOnline (MooreOnline!395) so that it sorts events by odinEventNumber to allow a proper comparison (events did not always appear in the same order), runs with a single slice, and uses 2024 data. This revealed a few issues when running with Allen on 2024-patches:
- The n_invalid_chanid counter is non-deterministic on both CPU and GPU, fluctuating by about ±1. It also differs substantially between CPU and GPU (the CPU value is of order twice the GPU value). (@ahennequ)
- The SciFi lumi counters are non-deterministic on both CPU and GPU, fluctuating by about ±1 (maybe related to the first issue).
- The PV lumi counters are non-deterministic on GPU only and differ between CPU and GPU. (@dcraik)
  - In tests with the old samples they were all zero; the suspicion is that the VELO was not included in the old MEPs used for the test.
- On the old data (2022), the plume counters show very large differences between CPU and GPU. This is no longer the case with 2024 data, which seems to point to an issue with the old decoding. (@espedica)
All tests were performed locally on the same machine (n4050101), using x86_64_v3-el9-gcc12-opt+g for the CPU builds and x86_64_v3-el9-gcc12+cuda12_1-opt+g for the GPU builds.
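For anyone wanting to reproduce the comparison outside the test, the idea of sorting by odinEventNumber before comparing counters can be sketched roughly as below. This is only an illustration, not the code from MooreOnline!395: the per-event dictionary layout, the function name `compare_lumi_counters`, the counter name `scifi_counter` and all the numbers are made up for the example.

```python
from collections import defaultdict


def compare_lumi_counters(run_a, run_b):
    """Compare per-event counters of two runs after sorting by event number.

    Each run is a list of dicts of the (hypothetical) form
    {"odinEventNumber": int, "counters": {name: value}}.
    Returns {counter name: [(event number, value in run_a, value in run_b)]}
    for every counter that differs.
    """
    def key(event):
        return event["odinEventNumber"]

    # Event order is not guaranteed to be the same between runs, so sort first.
    a_sorted = sorted(run_a, key=key)
    b_sorted = sorted(run_b, key=key)

    if [key(e) for e in a_sorted] != [key(e) for e in b_sorted]:
        raise ValueError("the two runs do not contain the same events")

    diffs = defaultdict(list)
    for ev_a, ev_b in zip(a_sorted, b_sorted):
        names = sorted(set(ev_a["counters"]) | set(ev_b["counters"]))
        for name in names:
            val_a = ev_a["counters"].get(name)
            val_b = ev_b["counters"].get(name)
            if val_a != val_b:
                diffs[name].append((key(ev_a), val_a, val_b))
    return dict(diffs)


# Toy input with invented values: same events in a different order,
# with one counter off by one in a single event.
run_1 = [
    {"odinEventNumber": 101, "counters": {"n_invalid_chanid": 3, "scifi_counter": 40}},
    {"odinEventNumber": 102, "counters": {"n_invalid_chanid": 5, "scifi_counter": 42}},
]
run_2 = [
    {"odinEventNumber": 102, "counters": {"n_invalid_chanid": 6, "scifi_counter": 42}},
    {"odinEventNumber": 101, "counters": {"n_invalid_chanid": 3, "scifi_counter": 40}},
]

differences = compare_lumi_counters(run_1, run_2)
print(differences)
```

Running the same build twice through such a comparison would flag non-determinism, while comparing a CPU run against a GPU run would flag the CPU/GPU differences.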
Logs coming soon.