
GPU Calorimeter Reconstruction - Now With Splitter and Moments

Latest version of GPU calorimeter reconstruction.

Previous merge request here: !56059 (merged)

Overview of the package and some implementation details.

This version of the code implements topological cluster growing, topological cluster splitting and cluster moments calculation on the GPU. Growing and splitting can agree exactly with the equivalent CPU algorithms, provided that cells with the same energy or signal-to-noise ratio are ordered by their respective indices and that the same criterion is used for eliminating secondary local maxima. Moments that are simple sums over cell properties agree within floating point noise. Those related to the shower axis are more problematic, because accuracy issues are compounded by strict cut-offs in angle and similar quantities. This will be the object of future investigation, but the present version is otherwise functional.
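
The tie-breaking requirement above can be illustrated with a minimal sketch. The `Cell` struct, the function name and the choice of ascending index for ties are all assumptions made for illustration; they are not taken from the package.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified cell record (not a type from the package).
struct Cell {
  int index;   // position of the cell in its container
  float snr;   // signal-to-noise ratio
};

// Sort by descending signal-to-noise ratio. Cells with equal ratios are
// ordered by ascending index (an assumed convention), so the resulting
// order does not depend on the order in which the cells were visited.
inline void sort_deterministically(std::vector<Cell>& cells) {
  std::sort(cells.begin(), cells.end(), [](const Cell& a, const Cell& b) {
    if (a.snr != b.snr) {
      return a.snr > b.snr;
    }
    return a.index < b.index;
  });
}
```

Without an explicit tie-break of this kind, two implementations that process cells in different orders (as a sequential CPU loop and a parallel GPU kernel typically do) may resolve equal values differently and produce different clusters.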

Another significant change is that the comparison between the CPU and GPU implementations can now be done entirely within Athena, which makes the former standalone plotter essentially superfluous. Since the GPU implementation of the algorithms is only loosely coupled to Athena, adapting it to a standalone environment is also straightforward, so the previously available tools to run the algorithms outside Athena are no longer relevant for development, all the more so since the relevant information and CPU/GPU comparisons can be plotted or output from within Athena. The standalone tools have therefore been removed from the package. Their latest (though unpolished) version remains available in a separate repository, in case future developments require their re-introduction.

Major additions:

  • Topo-Automaton Cluster Splitting (see TopoAutomatonSplitting.h, TopoAutomatonSplitting.cxx, TopoAutomatonSplittingImpl.h and TopoAutomatonSplittingImpl.cu).
  • Moments Calculation (see GPUClusterInfoAndMomentsCalculator.h, GPUClusterInfoAndMomentsCalculator.cxx, TGPUClusterInfoAndMomentsCalculatorImpl.h, GPUClusterInfoAndMomentsCalculatorImpl.cu, GPUToAthenaImporterWithMoments.h and GPUToAthenaImporterWithMoments.cxx).
  • Plotting through monitored variables (see CaloGPUClusterAndCellDataMonitor.h, CaloGPUClusterAndCellDataMonitorOptions.h and CaloGPUClusterAndCellDataMonitor.cxx).

Major changes:

  • Fixed all remaining discrepancies between the GPU and CPU implementations of cluster growing.
  • Overhauled GPU data structures to support cluster splitting and moments calculation.
  • Added full support for neighbour options on the GPU side (all possible neighbours are exported, together with the offsets for each option; NeighOffsets and NeighArr then provide a more convenient way to access them).
  • Implemented a faster, more optimized conversion between CPU and GPU data structures in the cases where the CaloCellContainer is ordered and complete.
  • Simplified configuration in CaloRecGPUConfigurator.py to reduce the surface area of the API.
  • Significantly expanded the available tests/examples (especially taking into account the cluster splitter and moments calculator).
  • Removed standalone tools (plotter, optimizer, method checker).
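
As a rough illustration of the "all neighbours plus per-option offsets" idea mentioned in the list above, the sketch below stores one flat neighbour list per cell, grouped by option, with cumulative offsets marking where each group starts. The struct, the function and the number of options are hypothetical; this is not the layout or interface of NeighOffsets and NeighArr.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical number of neighbour options, for illustration only.
constexpr std::size_t kNumOptions = 3;

// Hypothetical per-cell storage: neighbours grouped by option in one flat
// list; option k occupies the range [offsets[k], offsets[k + 1]).
struct CellNeighbours {
  std::vector<int> all;
  std::array<std::size_t, kNumOptions + 1> offsets;
};

// Collect the neighbours that belong to any option enabled in option_mask
// (bit k set means option k is enabled).
inline std::vector<int> select_neighbours(const CellNeighbours& n,
                                          unsigned option_mask) {
  std::vector<int> selected;
  for (std::size_t k = 0; k < kNumOptions; ++k) {
    if (!(option_mask & (1u << k))) {
      continue;
    }
    for (std::size_t i = n.offsets[k]; i < n.offsets[k + 1]; ++i) {
      selected.push_back(n.all[i]);
    }
  }
  return selected;
}
```

With a layout of this kind, the full neighbour list is exported once and any combination of options can be selected afterwards without repeating the export.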

Minor additions:

  • CaloMomentsDumper for testing and/or validating the moments calculation.
  • ExtraTagDefinitions.h holds several different ways of partitioning the 64 bits of the cell tags that have been used during the development process (which might still be relevant for future implementation efforts, if different algorithmic approaches are desired).
  • FPHelpers.h encapsulates some operations useful for dealing with floating point numbers in an IEEE-754 compatible format, including conversions between formats, software-emulated addition/subtraction in non-standard formats, and use of the native hardware implementation in the relevant cases: float, double and the 16-bit floating point formats with CUDA support, namely __half (commonly known as the float16 format) and __nv_bfloat16 (the bfloat16 format, which differs from float16 in the choice of exponent and mantissa sizes). Currently, only the abstractions that map floating point bit patterns to a total ordering (flipped sign bit, plus all other bits flipped for originally negative numbers) are used, but this more extensive support for arbitrary floating point formats might be useful for adapting the precision in potential alternative implementations.
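
The total-ordering transformation described for FPHelpers.h can be sketched for 32-bit floats as follows. The function names are hypothetical and this is not the FPHelpers.h interface, only a minimal illustration of the bit manipulation it describes.

```cpp
#include <cstdint>
#include <cstring>

// Map a float to an unsigned integer whose ordering matches the ordering of
// the original values: non-negative numbers get their sign bit flipped,
// negative numbers get all of their bits flipped.
inline std::uint32_t to_total_ordering(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  const std::uint32_t mask = (bits & 0x80000000u) ? 0xFFFFFFFFu : 0x80000000u;
  return bits ^ mask;
}

// Inverse mapping: a key with the top bit set came from a non-negative
// number (only the sign bit was flipped), otherwise all bits were flipped.
inline float from_total_ordering(std::uint32_t key) {
  const std::uint32_t mask = (key & 0x80000000u) ? 0x80000000u : 0xFFFFFFFFu;
  const std::uint32_t bits = key ^ mask;
  float x;
  std::memcpy(&x, &bits, sizeof(x));
  return x;
}
```

Keys produced this way compare as plain unsigned integers, which is convenient wherever only integer comparisons or integer atomic operations are available. Note that -0.0 maps to a key just below that of +0.0.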

Minor changes:

  • Added support for using CUDA pinned host memory as a potential MemoryContext in Helpers.h.
  • Overhauled CaloCellsCounterCPU and CaloCellsCounterGPU to have a more readily understandable and consistent format (with a view to text-based regression testing).
  • Added a reindex method to CaloClusterCellLink::iterator so that the first cell in each cluster can be made to always correspond to the seed cell (which is necessary to be consistent with factors of 2π in calculations related to φ).
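
To see why the choice of first cell can matter for φ, consider a minimal sketch in which every cell's φ is shifted by a multiple of 2π so that it lies within π of a reference value, taken here to be the first cell's φ. This is one plausible reading of the 2π remark above, not code from the package, and the function names are hypothetical.

```cpp
#include <cmath>
#include <vector>

// Shift phi by a multiple of 2*pi so that it lies within pi of phi_ref.
inline double wrap_near(double phi, double phi_ref) {
  const double two_pi = 2.0 * 3.14159265358979323846;
  return phi - two_pi * std::round((phi - phi_ref) / two_pi);
}

// Unweighted mean of phi, using the first entry as the reference
// (assumption: the first entry plays the role of the seed cell).
inline double mean_phi(const std::vector<double>& phis) {
  if (phis.empty()) {
    return 0.0;
  }
  const double phi_ref = phis.front();
  double sum = 0.0;
  for (double p : phis) {
    sum += wrap_near(p, phi_ref);
  }
  return sum / static_cast<double>(phis.size());
}
```

For two cells at φ = 3.1 and φ = -3.1, the mean comes out near +π if the first cell is the one at 3.1 and near -π if it is the one at -3.1. The two results describe the same direction but differ by 2π, so a consistent choice of first cell is needed for CPU and GPU values to match.
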
Edited by Nuno Dos Santos Fernandes
