Some Sequences are Causing Fluctuations

Fluctuating Sequences

Some of the validation sequences are fluctuating. The fluctuations are hard to observe on the 500-event Bs2PhiPhi MC sample we use in the Allen CI, but they are easily reproduced with a 9482-event minBias sample. I have tested the following sequences on both 2080Ti and A5000 GPUs in the Online farm (50 runs on each GPU, 100 runs in total per sequence):

| Sequence | Fluctuating? | Frequency |
| --- | --- | --- |
| `hlt1_pp_validation` (forward) | No | - |
| `hlt1_pp_matching_no_ut_validation` | No | - |
| `hlt1_pp_matching_validation` (matching with UT) | No | - |
| `hlt1_pp_forward_then_matching_validation` | Yes | 13/100 |
| `hlt1_pp_forward_then_matching_and_downstream_validation` | Yes | 36/100 |
| `downstream_validation` | Yes | 21/100 |

The fluctuations are happening on both 2080Ti and A5000.

Details of fluctuations

The logs are attached: A5000_fluctuations.zip and 2080Ti_fluctuations.zip.

What I have observed so far:

1. `hlt1_pp_forward_then_matching_validation`

The fluctuation in this sequence appears only in the seed_xz_validation part, which sometimes has one fewer reconstructed seed. When it is not fluctuating, the reconstructed counters in seed_xz_validation are exactly the same as in hlt1_pp_matching_no_ut_validation and hlt1_pp_matching_validation.

Given that the numbers in seed_xz_validation match those of the matching-only sequences, it appears that we do not remove SciFi hits for seed_xz during validation, so fluctuations in the forward tracking should not affect it. But if this is the case, why does seed_xz_validation fluctuate here and not in hlt1_pp_matching_no_ut_validation and hlt1_pp_matching_validation?

2. `downstream_validation`

The number of reconstructed downstream tracks fluctuates: it decreases by 1 when the fluctuation occurs. This sequence also has long_validator, seed_validator and unmached_seed_validator, none of which fluctuate.

What's very weird is that the fluctuation happens in the denominator of the ghost counter (the number of reconstructed tracks), while the numerator of the ghost counter does not fluctuate. This should cause a fluctuation in the numerators of the efficiency counters, since one more or one fewer track is being reconstructed, but this does not happen at all. The only explanation I can think of is that the one fluctuating track is actually a long track?
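The counting argument above can be written out explicitly. This is only a sketch: it assumes that the ghost counter classifies every reconstructed track as either a ghost or matched to an MC particle, which is not confirmed by the logs, and the symbols are introduced here purely for illustration.

```latex
N_{\text{reco}} = N_{\text{ghost}} + N_{\text{matched}}
\quad\Longrightarrow\quad
\Delta N_{\text{matched}}
  = \Delta N_{\text{reco}} - \Delta N_{\text{ghost}}
  = (\pm 1) - 0
  = \pm 1
```

Under that assumption the fluctuating track is matched to an MC particle, yet it does not show up in any efficiency numerator, so the particle would have to lie outside the categories those counters cover. That is consistent with the long-track explanation, but does not prove it.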

3. `hlt1_pp_forward_then_matching_and_downstream_validation`

This is just a combination of the same effects from hlt1_pp_forward_then_matching_validation and downstream_validation.

Reproduce the Fluctuations

To reproduce these results, first edit the following files to have options.evt_max = -1:

Moore/Hlt/Moore/tests/options/default_input_and_conds_hlt1.py
Moore/Hlt/RecoConf/options/mdf_for_standalone_Allen.py

then run a detdesc stack build of Moore with the following command:

./Moore/run gaudirun.py Moore/Hlt/Moore/tests/options/default_input_and_conds_hlt1_retinacluster.py Moore/Hlt/RecoConf/options/mdf_for_standalone_Allen.py

You should then have a 9482-event minBias MDF sample in a folder called dump.

Finally, run Allen standalone multiple times, where Allen is compiled with the /cvmfs/lhcb.cern.ch/lib/lhcb/lcg-toolchains/LCG_105a/x86_64_v3-el9-gcc12+cuda12_1-opt+g.cmake toolchain (what we are using for data-taking):

# Create the output directory used by the redirection below.
mkdir -p fluctuations

# Run each sequence 50 times and keep one log file per run.
for sequence in downstream_validation hlt1_pp_forward_then_matching_validation hlt1_pp_forward_then_matching_and_downstream_validation hlt1_pp_validation hlt1_pp_matching_validation hlt1_pp_matching_no_ut_validation
do
  for i in {0..49}
  do
    ./toolchain/wrapper ./Allen --sequence ${sequence} -n 10000 -t 5 --events-per-slice 1000 -m 1400 --host-memory 600 --mdf <path_to_mdf_file> -g <path_to_geometry_dump> > fluctuations/<GPU_card>_2024-patches_${sequence}_${i}.txt
  done
done
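One possible way to turn the resulting logs into fluctuation frequencies like those quoted above is to count how many distinct outputs each sequence produced. The helper below is a hypothetical sketch, not part of Allen: `count_variants` is a name made up here, and the pattern of run-dependent lines to ignore (timing, throughput and similar) is an assumption that has to be adapted to the actual log contents.

```shell
# count_variants PATTERN FILE...
# Prints one line per distinct log content as "<number of files> <checksum>",
# most frequent content first.
# PATTERN is an extended regular expression matching lines that are expected
# to differ between runs (an assumption; adjust it to the real logs).
count_variants() {
  pattern=$1
  shift
  for f in "$@"; do
    # Drop the volatile lines, then reduce the rest to a single checksum.
    grep -Ev "$pattern" "$f" | cksum | awk '{print $1}'
  done | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}
```

For example, `count_variants 'events/s' fluctuations/<GPU_card>_2024-patches_downstream_validation_*.txt` would print a single line for a stable sequence and several lines for a fluctuating one; the `events/s` pattern is only a guess at which lines vary from run to run.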

Edited Jun 25, 2024 by Da Yu Tou