UT sorted decoding

Daniel Hugo Campora Perez requested to merge dcampora_ut_decoding_in_place into master

Closes https://gitlab.cern.ch/lhcb-parallelization/cuda_hlt/issues/42

A faster UT decoder has been implemented.

  • It is composed of: ut_calculate_number_of_hits, prefix_sum_reduce_ut_hits, prefix_sum_single_block_ut_hits, prefix_sum_scan_ut_hits, ut_pre_decode, ut_find_permutation and ut_decode_raw_banks_in_order.
  • Everything before ut_pre_decode already existed.
  • ut_pre_decode is a lightweight decoding pass that decodes only the yBegin parameter of each hit, and stores the hit's raw_bank and hit_index packed into a single int.
  • ut_find_permutation finds the sorting permutation of the hits based on the newly created yBegin array.
  • ut_decode_raw_banks_in_order decodes all raw banks according to the permutation already established in the previous step.
  • Moved UTDefinitions to UT/common.
  • Created UTDecoding namespace for several static constexpr parameters.
  • Performance of the entire UT decoding + sorting sequence has been improved by about 33% (it now runs at about 334.7 kHz).
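
The three new steps can be illustrated with a minimal host-side C++ sketch. It is not the code in this merge request: the 16/16 bit split in `pack`, the `PreDecoded` struct, the use of `std::stable_sort` and the `decode` callback are all assumptions made for illustration, and the real kernels run in parallel on the GPU.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical packing of (raw_bank, hit_index) into one 32-bit value.
// Assumption: upper 16 bits hold the raw bank, lower 16 bits the hit index.
constexpr std::uint32_t pack(std::uint32_t raw_bank, std::uint32_t hit_index) {
  return (raw_bank << 16) | (hit_index & 0xFFFFu);
}
constexpr std::uint32_t raw_bank_of(std::uint32_t packed) { return packed >> 16; }
constexpr std::uint32_t hit_index_of(std::uint32_t packed) { return packed & 0xFFFFu; }

// Output of the pre-decode step: only yBegin plus the packed location of each hit.
struct PreDecoded {
  std::vector<float> y_begin;
  std::vector<std::uint32_t> packed;
};

// Permutation step: perm[i] is the index of the hit that belongs in sorted slot i.
// A stable sort stands in for whatever sorting scheme the GPU kernel uses.
std::vector<std::uint32_t> find_permutation(const std::vector<float>& y_begin) {
  std::vector<std::uint32_t> perm(y_begin.size());
  std::iota(perm.begin(), perm.end(), 0u);
  std::stable_sort(perm.begin(), perm.end(), [&](std::uint32_t a, std::uint32_t b) {
    return y_begin[a] < y_begin[b];
  });
  return perm;
}

// In-order decode step: each hit is fully decoded exactly once and written
// directly to its sorted position, so no full hits are moved afterwards.
// `decode` is a caller-supplied function (raw_bank, hit_index) -> Hit.
template <typename Hit, typename DecodeFn>
std::vector<Hit> decode_in_order(const PreDecoded& pre,
                                 const std::vector<std::uint32_t>& perm,
                                 DecodeFn decode) {
  std::vector<Hit> hits;
  hits.reserve(perm.size());
  for (const std::uint32_t source : perm) {
    const std::uint32_t packed = pre.packed[source];
    hits.push_back(decode(raw_bank_of(packed), hit_index_of(packed)));
  }
  return hits;
}
```

In this sketch only the small yBegin keys and packed indices take part in the sort, and the full decoding happens afterwards, once per hit, straight into the sorted position.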
Edited by Daniel Hugo Campora Perez