Stephan Hageboeck authored
Copying host instances with bulk transfers invokes fewer kernels, which reduces launch overhead, and lets construction on the GPU run in parallel.

With the trackML geometry, bulk copying of placed volumes and transformations is roughly 10x faster on a Tesla V100 GPU. The total transfer time drops by about 5x (0.174 s to 0.037 s).

Diff of AdePT's example11 synchronising trackML.gdml:
```
New:                                                            Old:
INFO: using default trackML.gdml for option -gdml_name          INFO: using default trackML.gdml for option -gdml_name
INFO: using default 0 for option -cache_depth                   INFO: using default 0 for option -cache_depth
INFO: using default 1 for option -particles                     INFO: using default 1 for option -particles
INFO: using default 100 for option -energy                      INFO: using default 100 for option -energy
(II) vgdml::Frontend::Load: VecGeom millimeter is 1             (II) vgdml::Frontend::Load: VecGeom millimeter is 1
Starting synchronization to GPU.                                Starting synchronization to GPU.
Allocating geometry on GPU...Allocating logical volumes... OK   Allocating geometry on GPU...Allocating logical volumes... OK
Allocating unplaced volumes... OK: #elems in alloc_mem=2, mem   Allocating unplaced volumes... OK: #elems in alloc_mem=2, mem
Allocating placed volumes... OK                                 Allocating placed volumes... OK
Allocating navigation index table... OK                         Allocating navigation index table... OK
Allocating transformations... OK: #elems in alloc_mem=5, mem_   Allocating transformations... OK: #elems in alloc_mem=5, mem_
Allocating daughter lists... OK                                 Allocating daughter lists... OK
geometry OK: #elems in alloc_mem=7, mem_map=38013, dau_gpu_c    geometry OK: #elems in alloc_mem=7, mem_map=38013, dau_gpu_c
NUMBER OF PLACED VOLUMES 18789                                  NUMBER OF PLACED VOLUMES 18789
NUMBER OF UNPLACED VOLUMES 145                                  NUMBER OF UNPLACED VOLUMES 145
Copying geometry to GPU...                                      Copying geometry to GPU...
Copying logical volumes... OK; TIME NEEDED 0.000615441s       | Copying logical volumes... OK; TIME NEEDED 0.000695619s
Copying unplaced volumes... OK; TIME NEEDED 0.000503695s      | Copying unplaced volumes... OK; TIME NEEDED 0.000411785s
Copying transformations_... OK; TIME NEEDED 0.00657207s       | Copying transformations_... OK; TIME NEEDED 0.0558993s
Copying placed volumes... OK; TIME NEEDED 0.00866507s         | Copying placed volumes... OK; TIME NEEDED 0.0913828s
Copying daughter arrays... OK; TIME NEEDED 0.00384211s        | Copying daughter arrays... OK; TIME NEEDED 0.00892739s
Geometry synchronized to GPU in 0.036591 s.                   | Geometry synchronized to GPU in 0.173805 s.
--- InitElectronData ...                                        --- InitElectronData ...
--- BuildELossTables ...                                        --- BuildELossTables ...
...                                                             ...
iter 221 -- tracks in flight: 2 energy deposition: 77           iter 221 -- tracks in flight: 2 energy deposition: 77
iter 222 -- tracks in flight: 0 energy deposition: 77           iter 222 -- tracks in flight: 0 energy deposition: 77
Run time: 0.0552                                              | Run time: 0.0532
```
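For illustration only, the bulk-copy idea can be modelled on the host in plain C++. This is a sketch under stated assumptions, not the VecGeom implementation: `Transformation`, `TransformationArgs`, `Stats`, `ConstructOneByOne` and `ConstructInBulk` are hypothetical names, `std::memcpy` stands in for a host-to-device copy, and the counted loop stands in for a kernel launch whose iterations would each be one GPU thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <vector>

// Hypothetical stand-in for a geometry object that has to be constructed in device memory.
struct Transformation {
  double tx, ty, tz;
  Transformation(double x, double y, double z) : tx(x), ty(y), tz(z) {}
};

// Constructor arguments as plain data: trivially copyable, so one copy can move all of them.
struct TransformationArgs {
  double tx, ty, tz;
};

// Counters standing in for the fixed cost of each transfer and each kernel launch.
struct Stats {
  std::size_t transfers = 0;
  std::size_t launches  = 0;
};

// Per-instance scheme: one small copy and one "launch" for every object.
inline Stats ConstructOneByOne(const std::vector<TransformationArgs> &host, Transformation *device)
{
  assert(device != nullptr || host.empty());
  Stats stats;
  for (std::size_t i = 0; i < host.size(); ++i) {
    TransformationArgs arg;
    std::memcpy(&arg, &host[i], sizeof(arg)); // models a small host-to-device copy
    ++stats.transfers;
    new (&device[i]) Transformation(arg.tx, arg.ty, arg.tz); // models a single-thread kernel
    ++stats.launches;
  }
  return stats;
}

// Bulk scheme: copy all arguments in one transfer, then construct everything in one "launch".
inline Stats ConstructInBulk(const std::vector<TransformationArgs> &host, Transformation *device)
{
  assert(device != nullptr || host.empty());
  Stats stats;
  std::vector<TransformationArgs> staged(host.size()); // models the device-side argument buffer
  if (!host.empty()) {
    std::memcpy(staged.data(), host.data(), host.size() * sizeof(TransformationArgs));
  }
  ++stats.transfers;
  // Iterations are independent, so on a GPU each index could be handled by its own thread.
  for (std::size_t i = 0; i < staged.size(); ++i) {
    new (&device[i]) Transformation(staged[i].tx, staged[i].ty, staged[i].tz);
  }
  ++stats.launches;
  return stats;
}
```

In this model the per-instance scheme pays the fixed transfer and launch cost once per object, while the bulk scheme pays it once per object type, which is the effect the timings above attribute to bulk copying.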
5d8be486