Some Sequences are Causing Fluctuations
Fluctuating Sequences
Some of the validation sequences are fluctuating. The fluctuations are hard to observe on the 500-event Bs2PhiPhi MC sample we use in the Allen CI, but they can be easily reproduced with a 9482-event minBias sample. I have tested the following sequences on both the 2080Ti and the A5000 in the Online farm (50 times on each GPU):
Sequence | Fluctuating? | Frequency |
---|---|---|
hlt1_pp_validation (forward) | No | - |
hlt1_pp_matching_no_ut_validation | No | - |
hlt1_pp_matching_validation (matching with UT) | No | - |
hlt1_pp_forward_then_matching_validation | Yes | 13/100 |
hlt1_pp_forward_then_matching_and_downstream_validation | Yes | 36/100 |
downstream_validation | Yes | 21/100 |
The fluctuations are happening on both 2080Ti and A5000.
Details of fluctuations
The logs are attached. A5000_fluctuations.zip 2080Ti_fluctuations.zip
What I have observed so far:
hlt1_pp_forward_then_matching_validation
1. The fluctuation in this sequence is only in the seed_xz_validation part, which sometimes has one fewer reconstructed seed. When it is not fluctuating, the reconstructed counters in seed_xz_validation are exactly the same as in hlt1_pp_matching_no_ut_validation and hlt1_pp_matching_validation.
Given that the numbers in seed_xz_validation match those of the matching-only sequences, this implies that we do not remove SciFi hits for seed_xz during validation, so fluctuations in the forward tracking should not affect it. But if this is the case, the fluctuation would have to come from the seeding itself, so why does seed_xz_validation not fluctuate in hlt1_pp_matching_no_ut_validation and hlt1_pp_matching_validation?
downstream_validation
2. Fluctuating number of reconstructed downstream tracks: the number of reconstructed tracks decreases by 1 when fluctuating. This sequence also has long_validator, seed_validator and unmached_seed_validator, none of which fluctuate.
What is very odd is that the fluctuation happens in the denominator of the ghost counter (the number of reconstructed tracks), while the numerator of the ghost counter does not fluctuate. This should cause a fluctuation in the numerators of the efficiency counters, since one more or one fewer track is being reconstructed, but this does not happen at all. The only explanation I can think of is that the one fluctuating track is actually a long track?
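The bookkeeping behind this puzzle can be sketched as follows. This is only a minimal illustration: it assumes every reconstructed track is either a ghost or matched to an MC particle, the function name is hypothetical, and the counter values are made up rather than taken from the Allen checker or the attached logs.

```python
def matched_tracks(n_reconstructed: int, n_ghosts: int) -> int:
    """Tracks matched to an MC particle, ASSUMING reconstructed = ghosts + matched."""
    return n_reconstructed - n_ghosts


# Hypothetical counter values (not from the logs): the ghost-counter
# denominator drops by one while the ghost-counter numerator stays fixed.
stable = {"reconstructed": 1000, "ghosts": 150}
fluctuating = {"reconstructed": 999, "ghosts": 150}

delta_matched = (
    matched_tracks(stable["reconstructed"], stable["ghosts"])
    - matched_tracks(fluctuating["reconstructed"], fluctuating["ghosts"])
)
print(delta_matched)
```

Under these assumptions the track that comes and goes is a matched one, not a ghost. If no efficiency numerator moves, it would have to be matched to a particle that none of the downstream efficiency categories counts, which is compatible with the long-track guess above but does not prove it.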
hlt1_pp_forward_then_matching_and_downstream_validation
3. This is just a combination of the same effects from hlt1_pp_forward_then_matching_validation and downstream_validation.
Reproduce the Fluctuations
To reproduce these results, first edit the following files to have options.evt_max = -1:
Moore/Hlt/Moore/tests/options/default_input_and_conds_hlt1.py
Moore/Hlt/RecoConf/options/mdf_for_standalone_Allen.py
Then run a detdesc stack build of Moore with the following command:
./Moore/run gaudirun.py Moore/Hlt/Moore/tests/options/default_input_and_conds_hlt1_retinacluster.py Moore/Hlt/RecoConf/options/mdf_for_standalone_Allen.py
You should now have a 9482-event minBias MDF sample in a folder called dump.
Finally, run Allen standalone multiple times, with Allen compiled using the /cvmfs/lhcb.cern.ch/lib/lhcb/lcg-toolchains/LCG_105a/x86_64_v3-el9-gcc12+cuda12_1-opt+g.cmake toolchain (the one we are using for data-taking):
for sequence in downstream_validation hlt1_pp_forward_then_matching_validation hlt1_pp_forward_then_matching_and_downstream_validation hlt1_pp_validation hlt1_pp_matching_validation hlt1_pp_matching_no_ut_validation
do
  for i in {0..49}
  do
    ./toolchain/wrapper ./Allen --sequence ${sequence} -n 10000 -t 5 --events-per-slice 1000 -m 1400 --host-memory 600 --mdf <path_to_mdf_file> -g <path_to_geometry_dump> > fluctuations/<GPU_card>_2024-patches_${sequence}_${i}.txt
  done
done
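Once the runs have finished, the fluctuating runs can be counted by comparing the outputs for each sequence. The sketch below is one possible way to do this and is not necessarily how the numbers in the table were obtained: it treats the most common output as the reference and counts the runs that differ from it. All helper names are hypothetical. The file pattern follows the naming used in the loop above, with the GPU name passed in explicitly. Lines that legitimately differ between runs (timing or throughput, for instance) would need to be filtered out through `keep_line`, whose exact form depends on the log contents.

```python
from collections import Counter
from pathlib import Path
from typing import Callable, Iterable


def normalise(text: str, keep_line: Callable[[str], bool] = lambda line: True) -> str:
    """Keep only the lines that are expected to be identical between runs."""
    return "\n".join(line.rstrip() for line in text.splitlines() if keep_line(line))


def count_fluctuations(outputs: Iterable[str]) -> tuple[int, int]:
    """Return (runs differing from the most common output, total runs)."""
    counts = Counter(outputs)
    total = sum(counts.values())
    if total == 0:
        return 0, 0
    _, majority = counts.most_common(1)[0]
    return total - majority, total


def count_for_sequence(log_dir: Path, gpu: str, sequence: str) -> tuple[int, int]:
    """Compare all logs named <gpu>_2024-patches_<sequence>_<i>.txt in log_dir."""
    # The full prefix is spelled out so that downstream_validation does not
    # also pick up the logs of ..._and_downstream_validation.
    logs = sorted(log_dir.glob(f"{gpu}_2024-patches_{sequence}_[0-9]*.txt"))
    return count_fluctuations(normalise(p.read_text()) for p in logs)
```

For example, `count_for_sequence(Path("fluctuations"), "A5000", "downstream_validation")` would return the number of differing runs and the total number of runs for that GPU and sequence. Note that if the output fluctuates between more than two states, every minority state is counted as a fluctuation.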