
Implement bulk copying of PlacedVolumes in CudaManager

Stephan Hageboeck requested to merge shageboe/PlacedVolumeBulkCopy into master
  • Implement bulk copying of PlacedVolume and Transformation3D instances.
  • Add a test to validate a GPU geometry.
  • Small fixes for CUDA function attributes and const correctness.

Using an AdePT simulation example, I ran timing tests on a Tesla V100 with the trackML geometry. Copying the PlacedVolume and Transformation3D instances each gets about 10x faster (10.5x and 8.5x respectively in the log below), and copying the whole geometry gets about 5x faster (0.174 s down to 0.037 s). Everything else is strictly identical, and the geometry validation test added here confirms that.
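
To illustrate why batching helps, here is a minimal host-only sketch. It does not reproduce the CudaManager code in this merge request: `VolumeRecord` and `FakeDevice` are hypothetical stand-ins, and `FakeDevice::Copy` uses `std::memcpy` in place of a real device transfer such as `cudaMemcpy`. It only counts how many transfer calls each strategy issues, on the assumption that each real transfer carries a fixed per-call overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical, trivially copyable stand-in for per-volume data.
// This is NOT VecGeom's PlacedVolume class.
struct VolumeRecord {
  int id;
  int logicalVolumeId;
  int transformationId;
};

// Host-only stand-in for a device transfer such as cudaMemcpy.
// It counts calls because each real transfer carries a fixed overhead.
struct FakeDevice {
  std::size_t transfers = 0;

  void Copy(void *dst, const void *src, std::size_t bytes) {
    std::memcpy(dst, src, bytes);
    ++transfers;
  }
};

// One transfer per instance: N records cost N calls.
void CopyOneByOne(FakeDevice &dev, const std::vector<VolumeRecord> &host,
                  std::vector<VolumeRecord> &device) {
  device.resize(host.size());
  for (std::size_t i = 0; i < host.size(); ++i) {
    dev.Copy(&device[i], &host[i], sizeof(VolumeRecord));
  }
}

// Bulk copy: the records are contiguous, so one call moves all of them.
void CopyInBulk(FakeDevice &dev, const std::vector<VolumeRecord> &host,
                std::vector<VolumeRecord> &device) {
  device.resize(host.size());
  assert(device.size() == host.size());
  if (host.empty()) return;
  dev.Copy(device.data(), host.data(), host.size() * sizeof(VolumeRecord));
}
```

For 18789 records (the number of placed volumes in the trackML log), `CopyOneByOne` issues 18789 transfer calls and `CopyInBulk` issues one, while both leave identical data in the destination buffer.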

Side-by-side diff of the output of AdePT's example11 when synchronising trackML.gdml (`|` marks lines that differ):

    New:                                                            Old:
    INFO: using default trackML.gdml for option -gdml_name          INFO: using default trackML.gdml for option -gdml_name
    INFO: using default 0 for option -cache_depth                   INFO: using default 0 for option -cache_depth
    INFO: using default 1 for option -particles                     INFO: using default 1 for option -particles
    INFO: using default 100 for option -energy                      INFO: using default 100 for option -energy
    (II) vgdml::Frontend::Load: VecGeom millimeter is 1             (II) vgdml::Frontend::Load: VecGeom millimeter is 1
    Starting synchronization to GPU.                                Starting synchronization to GPU.
    Allocating geometry on GPU...Allocating logical volumes... OK   Allocating geometry on GPU...Allocating logical volumes... OK
    Allocating unplaced volumes... OK: #elems in alloc_mem=2, mem   Allocating unplaced volumes... OK: #elems in alloc_mem=2, mem
    Allocating placed volumes... OK                                 Allocating placed volumes... OK
    Allocating navigation index table... OK                         Allocating navigation index table... OK
    Allocating transformations... OK: #elems in alloc_mem=5, mem_   Allocating transformations... OK: #elems in alloc_mem=5, mem_
    Allocating daughter lists... OK                                 Allocating daughter lists... OK
     geometry OK: #elems in alloc_mem=7, mem_map=38013, dau_gpu_c    geometry OK: #elems in alloc_mem=7, mem_map=38013, dau_gpu_c
    NUMBER OF PLACED VOLUMES 18789                                  NUMBER OF PLACED VOLUMES 18789
    NUMBER OF UNPLACED VOLUMES 145                                  NUMBER OF UNPLACED VOLUMES 145
    Copying geometry to GPU...                                      Copying geometry to GPU...
    
    Copying logical volumes... OK;  TIME NEEDED 0.000615441s      | Copying logical volumes... OK;  TIME NEEDED 0.000695619s
    Copying unplaced volumes... OK; TIME NEEDED 0.000503695s      | Copying unplaced volumes... OK; TIME NEEDED 0.000411785s
    Copying transformations_... OK; TIME NEEDED 0.00657207s       | Copying transformations_... OK; TIME NEEDED 0.0558993s
    Copying placed volumes... OK;   TIME NEEDED 0.00866507s       | Copying placed volumes... OK;   TIME NEEDED 0.0913828s
    Copying daughter arrays... OK;  TIME NEEDED 0.00384211s       | Copying daughter arrays... OK;  TIME NEEDED 0.00892739s
    Geometry synchronized to GPU in 0.036591 s.                   | Geometry synchronized to GPU in 0.173805 s.
         ---  InitElectronData ...                                       ---  InitElectronData ...
         ---  BuildELossTables ...                                       ---  BuildELossTables ...
    ...                                                             ...
    iter  221 -- tracks in flight:     2 energy deposition:    77   iter  221 -- tracks in flight:     2 energy deposition:    77
    iter  222 -- tracks in flight:     0 energy deposition:    77   iter  222 -- tracks in flight:     0 energy deposition:    77
    Run time: 0.0552                                              | Run time: 0.0532
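
The speedups quoted above can be recomputed from the timings in the log. The sketch below is only that arithmetic; the struct and function names are made up for illustration, and the numbers are copied from the "Old" and "New" columns.

```cpp
#include <cstdio>

// Timings in seconds, copied from the "Old" and "New" columns of the log.
struct Timing {
  const char *step;
  double oldSeconds;
  double newSeconds;
};

constexpr Timing kTimings[] = {
    {"transformations", 0.0558993, 0.00657207},
    {"placed volumes", 0.0913828, 0.00866507},
    {"daughter arrays", 0.00892739, 0.00384211},
    {"total synchronisation", 0.173805, 0.036591},
};

// Speedup factor: how many times faster the new code is for one step.
constexpr double Speedup(const Timing &t) { return t.oldSeconds / t.newSeconds; }

// Print one line per step with the speedup rounded to one decimal.
void PrintSpeedups() {
  for (const Timing &t : kTimings) {
    std::printf("%-22s %.1fx\n", t.step, Speedup(t));
  }
}
```

This gives roughly 8.5x for transformations, 10.5x for placed volumes, 2.3x for daughter arrays, and 4.7x for the total synchronisation time.
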
Edited by Stephan Hageboeck
