Optimisation of some HLT1 algorithms

Arthur Marius Hennequin requested to merge ahennequ_optim into master

The commits in this MR are cherry-picked from !742 (merged)

Pipelines:

master: https://mattermost.web.cern.ch/lhcb/pl/qtg9f1ddk38smna988fqmxcqty

NVIDIA GeForce RTX 3090    │███████████████████████████████████████████████  198.62 kHz (0.92x)
NVIDIA RTX A6000           │██████████████████████████████████████████████   192.52 kHz (0.92x)
NVIDIA RTX A5000           │█████████████████████████████████████            156.63 kHz (0.93x)
NVIDIA GeForce RTX 2080 Ti │██████████████████████████████                   126.94 kHz (0.91x)
MI100                      │████████████████████████                         102.81 kHz (0.89x)
AMD EPYC 7502 32-Core      │████                                             16.82 kHz (0.97x)
                           ┼─────┴─────┼─────┴─────┼─────┴─────┼─────┴─────┼
                           0           50         100         150         200 

this branch: https://mattermost.web.cern.ch/lhcb/pl/9ohkanksijfa3jgseowkhaw9se

NVIDIA GeForce RTX 3090    │██████████████████████████████████████████         214.11 kHz (0.99x)
NVIDIA RTX A6000           │█████████████████████████████████████████          206.76 kHz (0.99x)
NVIDIA RTX A5000           │█████████████████████████████████                  169.45 kHz (1.00x)
NVIDIA GeForce RTX 2080 Ti │███████████████████████████                        136.83 kHz (0.98x)
MI100                      │████████████████████                               100.12 kHz (0.87x)
AMD EPYC 7502 32-Core      │███                                                17.21 kHz (0.99x)
                           ┼────┴────┼────┴────┼────┴────┼────┴────┼────┴────┼
                           0         50       100       150       200       250    

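Throughput gain of this branch over master:

A5000: +8.2% (from 156.63 to 169.45 kHz)

2080 Ti: +7.8% (from 126.94 to 136.83 kHz)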

Private tests (2080 Ti), throughput after each successive commit, with the cumulative gain relative to master:

master
125309.630236 events/s +0%

Optimize small algorithms
126428.685902 events/s +0.9%

Take advantage of unused threads in pv_beamline_peak
128121.802201 events/s +2.2%

Fix shape of sv fitter kernel
131485.748047 events/s +4.9%

Slight improvements to is_muon
133981.343451 events/s +6.9%

Remove unnecessary sort in pv finder
136317.593650 events/s +8.8%

Fix velo consolidate tracks block_dim
137101.771522 events/s +9.4%
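
The percentages in the private tests can be recomputed from the raw throughput figures. The following is a minimal sketch, assuming each percentage is the cumulative gain over the master measurement (which is what the numbers suggest, not something the MR states). The helper names `cumulative_gains` and `incremental_gains` are hypothetical and not part of the codebase.

```python
# Throughput in events/s on the 2080 Ti, copied from the private tests above.
# Each row is the state of the branch after the named commit.
measurements = [
    ("master", 125309.630236),
    ("Optimize small algorithms", 126428.685902),
    ("Take advantage of unused threads in pv_beamline_peak", 128121.802201),
    ("Fix shape of sv fitter kernel", 131485.748047),
    ("Slight improvements to is_muon", 133981.343451),
    ("Remove unnecessary sort in pv finder", 136317.593650),
    ("Fix velo consolidate tracks block_dim", 137101.771522),
]


def cumulative_gains(rows):
    """Return (label, percent gain over the first row) for every row."""
    baseline = rows[0][1]
    return [(label, 100.0 * (rate / baseline - 1.0)) for label, rate in rows]


def incremental_gains(rows):
    """Return (label, percent gain over the previous row) for rows after the first."""
    return [
        (cur_label, 100.0 * (cur_rate / prev_rate - 1.0))
        for (_, prev_rate), (cur_label, cur_rate) in zip(rows, rows[1:])
    ]


if __name__ == "__main__":
    print("Cumulative gain relative to master:")
    for label, gain in cumulative_gains(measurements):
        print(f"  {label:55s} {gain:+.1f}%")
    print("Gain contributed by each commit on its own:")
    for label, gain in incremental_gains(measurements):
        print(f"  {label:55s} {gain:+.1f}%")
```

The incremental view is useful for judging which individual commit contributes most, since the cumulative column alone hides the per-commit steps.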

@dcampora @gligorov @cagapopo @lohenry

Edited by Arthur Marius Hennequin
