Improvements to prefix sum, sorting algorithms and VELO clustering by Arthur. (!1679) · Merge requests · LHCb / Allen

Da Yu Tou requested to merge rebase_ahennequ_scan into 2024-patches Jun 14, 2024

This MR is a rebased version of !1509 (closed).

Description copied from !1509 (closed).

Introduce a new test and benchmark to compare different implementations of the prefix_sum:

cpu1: default implementation of host_prefix_sum
cuda1: Blelloch's scan implementation using 1 element per thread
cuda2: Blelloch's scan implementation using 4 elements per thread
cuda3: Blelloch's scan implementation using a single kernel, sliding over the array
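
For readers unfamiliar with Blelloch's scan, the sketch below emulates its up-sweep and down-sweep phases sequentially on the CPU and compares the result against a plain running-sum loop. It is only an illustration of the technique, not the code in this MR: the function names `blelloch_exclusive_scan` and `sequential_exclusive_scan` are hypothetical, the input is padded to a power of two for simplicity, and each inner-loop iteration stands in for the work that one GPU thread would do in the 1-element-per-thread variant.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference: plain sequential exclusive prefix sum (hypothetical helper,
// not Allen's host_prefix_sum).
std::vector<unsigned> sequential_exclusive_scan(const std::vector<unsigned>& input)
{
  std::vector<unsigned> output(input.size(), 0u);
  unsigned running = 0;
  for (std::size_t i = 0; i < input.size(); ++i) {
    output[i] = running; // sum of all elements before index i
    running += input[i];
  }
  return output;
}

// Exclusive prefix sum with the up-sweep / down-sweep structure of Blelloch's scan.
// Assumption: sequential CPU emulation; on a GPU each inner-loop iteration
// would be executed by a separate thread, with a barrier between strides.
std::vector<unsigned> blelloch_exclusive_scan(const std::vector<unsigned>& input)
{
  // Pad to the next power of two so the implicit binary tree is complete.
  std::size_t n = 1;
  while (n < input.size()) n *= 2;
  std::vector<unsigned> data(n, 0u);
  for (std::size_t i = 0; i < input.size(); ++i) data[i] = input[i];

  // Up-sweep (reduce): build partial sums towards the last element.
  for (std::size_t stride = 1; stride < n; stride *= 2) {
    for (std::size_t i = 2 * stride - 1; i < n; i += 2 * stride) {
      data[i] += data[i - stride];
    }
  }

  // Clear the root, then down-sweep: push prefix sums back down the tree.
  data[n - 1] = 0;
  for (std::size_t stride = n / 2; stride >= 1; stride /= 2) {
    for (std::size_t i = 2 * stride - 1; i < n; i += 2 * stride) {
      const unsigned left = data[i - stride];
      data[i - stride] = data[i]; // left child receives the parent's prefix
      data[i] += left;            // right child receives prefix + left subtree sum
    }
  }

  data.resize(input.size()); // drop the padding
  return data;
}
```

A test in the spirit of the comparison above would run both functions on the same input and require identical output, for example on `{3, 1, 7, 0, 4, 1, 6, 3}` and on an input whose length is not a power of two.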

Closes #500 (closed)

Implements a new sorting algorithm.

Implements a new velo clustering algorithm.

Details in https://indico.cern.ch/event/1370609/contributions/5928258/attachments/2847533/4979150/SumSortClustHLT1.pdf

Throughput Test on Real Data

On real data at varying levels of mu, throughput increases by 10-15% for the HLT1 matching sequences without and with UT (hlt1_pp_matching_no_ut and hlt1_pp_matching).

Checks of D0 and D+ Reconstruction

This MR and 2024-patches have been tested on real data. The dataset consists of the MEP dumps of Run 297083 with mu=4, which contain a total of 29.5M events. The sequence used was hlt1_pp_matching_and_downstream. The numbers of D0 and D+ reconstructed and triggered by Hlt1OneTrackMVA || Hlt1TwoTrackMVA are identical between this MR and 2024-patches.

Edited Jun 17, 2024 by Da Yu Tou
