Add bulk copy functions for UnplacedPolyhedron and UnplacedPolycone
Polyhedra and polycones account for most of the setup time because memory is managed inefficiently when they are transferred to the GPU. With two bulk-copy functions, all constructor arguments needed to recreate the volumes on the device are transferred in one go, and the volumes are then constructed in parallel. For the cms2018 geometry, this speeds up the transfer by about 12x for polyhedra (6.1 s to 0.51 s) and by about 50x for polycones (3.4 s to 0.065 s).
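The pattern can be illustrated with a minimal, host-only C++ sketch. This is not the VecGeom code: `ToySolid`, `SolidArgs`, `PackedArgs`, `Pack`, `ConstructAll` and `TotalLength` are hypothetical names made up for the illustration, the buffers never leave the host, and a plain loop stands in for the CUDA kernel. It only shows the two ingredients described above: flattening all constructor arguments into contiguous buffers (so that a few large copies can replace many small ones), and constructing every object independently with placement new (so that each index could be handled by its own GPU thread).

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical stand-in for a volume with variable-size data (not a VecGeom class).
struct ToySolid {
  std::size_t nPlanes;
  const double *z; // view into the shared buffer; not owned
  ToySolid(std::size_t n, const double *zPlanes) : nPlanes(n), z(zPlanes) {}
  double Length() const { return z[nPlanes - 1] - z[0]; }
};

// Fixed-size constructor arguments for one solid.
struct SolidArgs {
  std::size_t nPlanes;
  std::size_t zOffset; // start of this solid's planes in the shared array
};

// All arguments for all solids, flattened into two contiguous buffers.
struct PackedArgs {
  std::vector<SolidArgs> args;
  std::vector<double> zData;
};

// Host side: gather the per-solid arrays into the two bulk buffers.
PackedArgs Pack(const std::vector<std::vector<double>> &planesPerSolid)
{
  PackedArgs packed;
  for (const auto &planes : planesPerSolid) {
    assert(!planes.empty()); // assumption: every solid has at least one plane
    packed.args.push_back({planes.size(), packed.zData.size()});
    packed.zData.insert(packed.zData.end(), planes.begin(), planes.end());
  }
  return packed;
}

// Stand-in for the construction kernel: iterations are independent, so on a
// GPU every index i could be handled by its own thread.
void ConstructAll(const SolidArgs *args, const double *zData, ToySolid *out, std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i) {
    new (&out[i]) ToySolid(args[i].nPlanes, zData + args[i].zOffset);
  }
}

// Usage: pack, "transfer" (a no-op here), construct in bulk, then use the solids.
double TotalLength(const std::vector<std::vector<double>> &planesPerSolid)
{
  const PackedArgs packed = Pack(planesPerSolid);
  const std::size_t n     = packed.args.size();

  // Raw, uninitialised storage playing the role of the device allocation.
  void *raw        = ::operator new(n * sizeof(ToySolid));
  ToySolid *solids = static_cast<ToySolid *>(raw);
  ConstructAll(packed.args.data(), packed.zData.data(), solids, n);

  double total = 0.;
  for (std::size_t i = 0; i < n; ++i) total += solids[i].Length();

  // ToySolid is trivially destructible, so releasing the storage is enough.
  ::operator delete(raw);
  return total;
}
```

In a real device version, the two vectors in `PackedArgs` would each be copied with a single transfer, and the loop in `ConstructAll` would become the body of a kernel indexed by thread ID. Storing offsets instead of pointers in `SolidArgs` keeps the packed arguments valid after they are copied to a different address space.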
One thing I don't like about this solution is that the CudaManager now needs to include those two volumes. It already includes other volumes for special handling, but this couples together things that are not necessarily related.
Here are two runs with the cms2018 geometry:
Test command (both runs): test/GeometryTest persistency/gdml/gdmls/cms2018.gdml
(II) vgdml::Frontend::Load: VecGeom millimeter is 0.1

New              Old        Count  Type
---------------  ---------  -----  ----------------------------------------------------------------------
4e-05s           3.9e-05s       5  N7vecgeom3cxx21UnplacedBooleanVolumeILNS_16BooleanOperationE1EEE
1.3e-05s         1.4e-05s       4  N7vecgeom3cxx12SUnplacedTrdINS0_8TrdTypes12UniversalTrdEEE
0.51s      <-->  6.1s          81  N7vecgeom3cxx18UnplacedPolyhedronE
0.00015s         0.00013s      42  N7vecgeom3cxx14UnplacedTorus2E
0.00061s         0.00062s     189  N7vecgeom3cxx21UnplacedBooleanVolumeILNS_16BooleanOperationE2EEE
0.0044s          0.0044s     1669  N7vecgeom3cxx11UnplacedBoxE
0.0023s          0.0022s      852  N7vecgeom3cxx13SUnplacedTubeINS0_9TubeTypes13UniversalTubeEEE
0.0011s          0.0011s      359  N7vecgeom3cxx21UnplacedBooleanVolumeILNS_16BooleanOperationE0EEE
0.0001s          0.00011s      35  N7vecgeom3cxx13SUnplacedConeINS0_9ConeTypes13UniversalConeEEE
0.065s     <-->  3.4s         121  N7vecgeom3cxx17SUnplacedPolyconeINS0_9ConeTypes13UniversalConeEEE
0.0029s          0.0035s      144  N7vecgeom3cxx15UnplacedCutTubeE
0.0036s          0.0038s     1107  N7vecgeom3cxx17UnplacedTrapezoidE

Both runs end with:
Visiting device geometry ... 2104795 visited.
Comparing to host geometry ... 2104795 volumes. Done.
Edited by Stephan Hageboeck