Parametrized Kalman fills all memory resources
We benchmarked the HLT1 sequence + Parametrized Kalman Fit on the TDR branch. The outcome is attached, as a function of the VeloUT and forward PT thresholds and of the IP cut in VeloUT. Looking at the timing of the individual algorithms for no PV cuts and loose tracking efficiencies, we get the timing distribution shown in the second attachment.
Clearly, the algorithm itself is not slow; the slow-down is driven by the excessive memory consumption of the sequence. The test machine has 128 GB of RAM, and the maximum resident memory in the slowest configuration is 138 GB, i.e. more than the installed RAM. When the throughput returns to its normal value (11.400 evt/s/node), the maximum resident memory is 44 GB. Running without the Kalman, it is 13 GB.
Therefore, there is probably a massive slow-down caused by the memory consumption of the Parametrized Kalman.
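
The comparison between peak resident memory and installed RAM could be scripted so that it is checked for every configuration. Below is a minimal sketch, assuming a Linux node and Python 3. The names `headroom_gb` and `peak_child_rss_gb` are hypothetical helpers and not part of the HLT1 software. The numbers are the ones quoted above.

```python
import resource
import subprocess
import sys

RAM_GB = 128  # installed memory of the test machine, as quoted above

# Peak resident memory per configuration in GB, as quoted above.
PEAK_RSS_GB = {
    "slowest configuration": 138,
    "normal throughput": 44,
    "without Kalman": 13,
}


def headroom_gb(peak_gb, ram_gb=RAM_GB):
    """Return the RAM left at the peak; negative means the peak exceeds RAM."""
    return ram_gb - peak_gb


def peak_child_rss_gb(cmd):
    """Run cmd, then return the largest peak RSS among waited-for children, in GB.

    Hypothetical helper. ru_maxrss is reported in kilobytes on Linux
    and in bytes on macOS, hence the platform-dependent scale factor.
    """
    subprocess.run(cmd, check=True)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    scale = 1 if sys.platform == "darwin" else 1024
    return peak * scale / 1e9


if __name__ == "__main__":
    for name, peak in PEAK_RSS_GB.items():
        left = headroom_gb(peak)
        status = "exceeds RAM" if left < 0 else "fits"
        print(f"{name}: peak {peak} GB, headroom {left} GB ({status})")
```

With the quoted figures, only the slowest configuration has negative headroom. To measure a real job instead of using the table, `peak_child_rss_gb` can be given the command line of the benchmark as a list of arguments.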