
Add bulk copy functions for UnplacedPolyhedron and UnplacedPolycone

Stephan Hageboeck requested to merge hageboeck_UnplacedBulkCopy into master

Polyhedra and polycones account for most of the setup time because memory is managed inefficiently when they are transferred to the GPU. With two bulk-copy functions, all constructor arguments needed to recreate the volumes on the device are transferred in one go, and the volumes are then constructed in parallel. This speeds up the transfers by more than 10x: in the cms2018 runs shown in this description, polyhedra drop from 6.1s to 0.51s and polycones from 3.4s to 0.065s.
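To illustrate the idea only, here is a minimal host-side C++ sketch of the bulk-copy pattern. It is not the code in this merge request and does not use VecGeom's API: `ToyPolycone`, `ToyPolyconeArgs` and `BulkConstruct` are hypothetical names, the single `std::memcpy` stands in for one host-to-device transfer, and the loop stands in for a kernel in which each thread would construct one volume.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <vector>

// Hypothetical stand-in for a volume that is rebuilt from constructor arguments.
struct ToyPolycone {
  double phiStart;
  double phiDelta;
  int nPlanes;
  ToyPolycone(double start, double delta, int planes)
      : phiStart(start), phiDelta(delta), nPlanes(planes) {}
};

// Plain-data record with the constructor arguments of one volume (hypothetical).
struct ToyPolyconeArgs {
  double phiStart;
  double phiDelta;
  int nPlanes;
};

// Copies all argument records in one contiguous transfer, then constructs every
// object in pre-allocated storage. Returns a pointer to the first object.
ToyPolycone *BulkConstruct(const std::vector<ToyPolyconeArgs> &hostArgs,
                           unsigned char *stagingBuffer, void *objectStorage)
{
  assert(stagingBuffer != nullptr && objectStorage != nullptr);

  // One bulk copy instead of one small copy per volume
  // (stands in for a single host-to-device transfer).
  std::memcpy(stagingBuffer, hostArgs.data(), hostArgs.size() * sizeof(ToyPolyconeArgs));

  // Each iteration is independent of the others, so on a device every
  // index could be handled by its own thread.
  unsigned char *storage = static_cast<unsigned char *>(objectStorage);
  for (std::size_t i = 0; i < hostArgs.size(); ++i) {
    ToyPolyconeArgs args;
    std::memcpy(&args, stagingBuffer + i * sizeof(ToyPolyconeArgs), sizeof(args));
    new (storage + i * sizeof(ToyPolycone)) ToyPolycone(args.phiStart, args.phiDelta, args.nPlanes);
  }
  return static_cast<ToyPolycone *>(objectStorage);
}
```

The caller is assumed to provide a staging buffer of `hostArgs.size() * sizeof(ToyPolyconeArgs)` bytes and suitably aligned object storage of `hostArgs.size() * sizeof(ToyPolycone)` bytes, for example from `::operator new`. Real polyhedra and polycones carry variable-length data per volume, which this fixed-size sketch leaves out.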

One thing I don't like about this solution is that the CudaManager now needs to include those two volumes. It already includes other volumes for some special handling, but this couples together things that are not necessarily related ...

Here are two runs with the cms2018 geometry:

New                                                                                        Old
------------------------------------------------------------------------------------       ----------------------------------------------------------------------------------
Test command: test/GeometryTest persistency/gdml/gdmls/cms2018.gdml                        Test command: test/GeometryTest persistency/gdml/gdmls/cms2018.gdml
(II) vgdml::Frontend::Load: VecGeom millimeter is 0.1                                      (II) vgdml::Frontend::Load: VecGeom millimeter is 0.1
     4e-05s     5 N7vecgeom3cxx21UnplacedBooleanVolumeILNS_16BooleanOperationE1EEE            3.9e-05s     5 N7vecgeom3cxx21UnplacedBooleanVolumeILNS_16BooleanOperationE1EEE
   1.3e-05s     4 N7vecgeom3cxx12SUnplacedTrdINS0_8TrdTypes12UniversalTrdEEE                  1.4e-05s     4 N7vecgeom3cxx12SUnplacedTrdINS0_8TrdTypes12UniversalTrdEEE
      0.51s    81 N7vecgeom3cxx18UnplacedPolyhedronE                                 <-->         6.1s    81 N7vecgeom3cxx18UnplacedPolyhedronE
   0.00015s    42 N7vecgeom3cxx14UnplacedTorus2E                                              0.00013s    42 N7vecgeom3cxx14UnplacedTorus2E
   0.00061s   189 N7vecgeom3cxx21UnplacedBooleanVolumeILNS_16BooleanOperationE2EEE            0.00062s   189 N7vecgeom3cxx21UnplacedBooleanVolumeILNS_16BooleanOperationE2EEE
    0.0044s  1669 N7vecgeom3cxx11UnplacedBoxE                                                  0.0044s  1669 N7vecgeom3cxx11UnplacedBoxE
    0.0023s   852 N7vecgeom3cxx13SUnplacedTubeINS0_9TubeTypes13UniversalTubeEEE                0.0022s   852 N7vecgeom3cxx13SUnplacedTubeINS0_9TubeTypes13UniversalTubeEEE
    0.0011s   359 N7vecgeom3cxx21UnplacedBooleanVolumeILNS_16BooleanOperationE0EEE             0.0011s   359 N7vecgeom3cxx21UnplacedBooleanVolumeILNS_16BooleanOperationE0EEE
    0.0001s    35 N7vecgeom3cxx13SUnplacedConeINS0_9ConeTypes13UniversalConeEEE               0.00011s    35 N7vecgeom3cxx13SUnplacedConeINS0_9ConeTypes13UniversalConeEEE
     0.065s   121 N7vecgeom3cxx17SUnplacedPolyconeINS0_9ConeTypes13UniversalConeEEE  <-->         3.4s   121 N7vecgeom3cxx17SUnplacedPolyconeINS0_9ConeTypes13UniversalConeEEE
    0.0029s   144 N7vecgeom3cxx15UnplacedCutTubeE                                              0.0035s   144 N7vecgeom3cxx15UnplacedCutTubeE
    0.0036s  1107 N7vecgeom3cxx17UnplacedTrapezoidE                                            0.0038s  1107 N7vecgeom3cxx17UnplacedTrapezoidE
Visiting device geometry ...  2104795 visited.                                             Visiting device geometry ...  2104795 visited.
Comparing to host geometry ... 2104795 volumes. Done.                                      Comparing to host geometry ... 2104795 volumes. Done.
Edited by Stephan Hageboeck
