This project is mirrored from https://gitlab.cern.ch/VecGeom/VecGeom.
Pull mirroring failed.
Repository mirroring has been paused due to too many failed attempts. It can be resumed by a project maintainer or owner.
- 11 Sep, 2021 2 commits
-
-
All accesses and the related variables are already unsigned, make the underlying data type match for consistency.
-
They were doing exactly the same, only the returned data was slightly different.
-
- 03 Sep, 2021 1 commit
-
-
Gabriele Cosmo authored
-
- 02 Sep, 2021 2 commits
-
-
Andrei Gheata authored
-
-
- 01 Sep, 2021 1 commit
-
-
Andrei Gheata authored
-
- 25 Aug, 2021 10 commits
-
-
- 20 Aug, 2021 5 commits
-
-
Jonas Hahnfeld authored
The default constructor of Planes was marked for the device, but the constructor of Quadrilaterals was also marked for the host (despite having a different implementation there). Get rid of this problem by properly initializing the arrays in Quadrilaterals.
-
Jonas Hahnfeld authored
See previous commit; the compiler warning was a bit misleading, which is why I missed this before.
-
Jonas Hahnfeld authored
CUDA support in Clang requires that all definitions and declarations specify a consistent set of attributes.
-
Jonas Hahnfeld authored
On CUDA, this macro expands to "static __constant__ const" and Clang (correctly) complains if kTolerance is used to compute the value of a constexpr.
-
Jonas Hahnfeld authored
Clang confuses the host function and the kernel if they have the same name. I don't want to argue with the compiler if it can be easily avoided.
-
- 11 Aug, 2021 2 commits
-
-
Andrei Gheata authored
UnplacedTrd now exposes the normal interface on GPU. The normal is computed even if the point is not exactly on the surface. Previously, the normal was computed only if the point was within kHalfTolerance of the surface. Now the normal is still computed for points further away, but it is returned with valid=false.
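The "always compute, flag validity" behaviour described above can be illustrated with a minimal sketch. It uses an axis-aligned box rather than a Trd to keep the geometry short, and `NormalResult`, `BoxNormalSketch` and `kHalfToleranceSketch` are hypothetical names with an assumed tolerance value; none of this is VecGeom's actual interface.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Assumed value for illustration only; the commit does not state the real kHalfTolerance.
constexpr double kHalfToleranceSketch = 0.5e-9;

struct NormalResult {
  std::array<double, 3> normal; // unit normal of the closest face
  bool valid;                   // true only if the point is on the surface within tolerance
};

// Closest-face normal of an axis-aligned box with half-lengths h, centred at the origin.
// The normal is always computed; 'valid' reports whether the point was close enough
// to the surface for the result to be trusted.
inline NormalResult BoxNormalSketch(const std::array<double, 3> &p, const std::array<double, 3> &h)
{
  assert(h[0] > 0. && h[1] > 0. && h[2] > 0.);
  int axis    = 0;
  double best = std::abs(std::abs(p[0]) - h[0]);
  for (int i = 1; i < 3; ++i) {
    const double d = std::abs(std::abs(p[i]) - h[i]);
    if (d < best) {
      best = d;
      axis = i;
    }
  }
  NormalResult r{{0., 0., 0.}, best <= kHalfToleranceSketch};
  r.normal[axis] = (p[axis] < 0.) ? -1. : 1.;
  return r;
}
```

With half-lengths (1, 2, 3), a point at (1, 0, 0) yields the +x normal with valid set, while a point at (0.9, 0, 0) yields the same normal with valid cleared.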
-
Andrei Gheata authored
-
- 05 Aug, 2021 2 commits
-
-
-
Andrei Gheata authored
-
- 04 Aug, 2021 1 commit
-
-
Guilherme Lima authored
-
- 03 Aug, 2021 1 commit
-
-
Andrei Gheata authored
-
- 26 Jul, 2021 3 commits
-
-
Martin Kostelnik authored
-
Andrei Gheata authored
-
Andrei Gheata authored
-
- 22 Jul, 2021 2 commits
-
-
Martin Kostelnik authored
-
Martin Kostelnik authored
-
- 15 Jul, 2021 2 commits
-
-
Jonas Hahnfeld authored
-
Jonas Hahnfeld authored
-
- 07 Jul, 2021 6 commits
-
-
Stephan Hageboeck authored
When copying host instances with bulk transfers, fewer kernels are invoked, which reduces the overhead, and construction on the GPU runs in parallel. With the trackML geometry, bulk copying placed volumes and transformations is roughly 10x faster on a Tesla V100 GPU. The total transfer time is reduced by about 5x. Diff of AdePT's example11 synchronising trackML.gdml:
```
New:                                                            Old:
INFO: using default trackML.gdml for option -gdml_name          INFO: using default trackML.gdml for option -gdml_name
INFO: using default 0 for option -cache_depth                   INFO: using default 0 for option -cache_depth
INFO: using default 1 for option -particles                     INFO: using default 1 for option -particles
INFO: using default 100 for option -energy                      INFO: using default 100 for option -energy
(II) vgdml::Frontend::Load: VecGeom millimeter is 1             (II) vgdml::Frontend::Load: VecGeom millimeter is 1
Starting synchronization to GPU.                                Starting synchronization to GPU.
Allocating geometry on GPU...Allocating logical volumes... OK   Allocating geometry on GPU...Allocating logical volumes... OK
Allocating unplaced volumes... OK: #elems in alloc_mem=2, mem   Allocating unplaced volumes... OK: #elems in alloc_mem=2, mem
Allocating placed volumes... OK                                 Allocating placed volumes... OK
Allocating navigation index table... OK                         Allocating navigation index table... OK
Allocating transformations... OK: #elems in alloc_mem=5, mem_   Allocating transformations... OK: #elems in alloc_mem=5, mem_
Allocating daughter lists... OK                                 Allocating daughter lists... OK
geometry OK: #elems in alloc_mem=7, mem_map=38013, dau_gpu_c    geometry OK: #elems in alloc_mem=7, mem_map=38013, dau_gpu_c
NUMBER OF PLACED VOLUMES 18789                                  NUMBER OF PLACED VOLUMES 18789
NUMBER OF UNPLACED VOLUMES 145                                  NUMBER OF UNPLACED VOLUMES 145
Copying geometry to GPU...                                      Copying geometry to GPU...
Copying logical volumes... OK; TIME NEEDED 0.000615441s       | Copying logical volumes... OK; TIME NEEDED 0.000695619s
Copying unplaced volumes... OK; TIME NEEDED 0.000503695s      | Copying unplaced volumes... OK; TIME NEEDED 0.000411785s
Copying transformations_... OK; TIME NEEDED 0.00657207s       | Copying transformations_... OK; TIME NEEDED 0.0558993s
Copying placed volumes... OK; TIME NEEDED 0.00866507s         | Copying placed volumes... OK; TIME NEEDED 0.0913828s
Copying daughter arrays... OK; TIME NEEDED 0.00384211s        | Copying daughter arrays... OK; TIME NEEDED 0.00892739s
Geometry synchronized to GPU in 0.036591 s.                   | Geometry synchronized to GPU in 0.173805 s.
--- InitElectronData ...                                        --- InitElectronData ...
--- BuildELossTables ...                                        --- BuildELossTables ...
...                                                             ...
iter 221 -- tracks in flight: 2 energy deposition: 77           iter 221 -- tracks in flight: 2 energy deposition: 77
iter 222 -- tracks in flight: 0 energy deposition: 77           iter 222 -- tracks in flight: 0 energy deposition: 77
Run time: 0.0552                                              | Run time: 0.0532
```
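The speedups quoted in this commit message can be checked against the timings in the log. The sketch below recomputes the old/new ratios; the struct and function names are hypothetical, and the numbers are copied from the log. The ratios come out at roughly 8.5x for transformations, 10.5x for placed volumes, 2.3x for daughter arrays, and 4.7x for the total synchronization.

```cpp
#include <cassert>
#include <cstdio>

// One row of the timing comparison; values are taken from the log in the commit message.
struct TimingSketch {
  const char *step;
  double newSeconds; // with bulk transfers
  double oldSeconds; // one kernel per object
};

const TimingSketch kTimingsSketch[] = {
    {"transformations", 0.00657207, 0.0558993},
    {"placed volumes", 0.00866507, 0.0913828},
    {"daughter arrays", 0.00384211, 0.00892739},
    {"total synchronization", 0.036591, 0.173805},
};

// Speedup factor: how many times faster the new code is for this step.
inline double SpeedupSketch(const TimingSketch &t)
{
  assert(t.newSeconds > 0.);
  return t.oldSeconds / t.newSeconds;
}

// Prints one line per step with the computed speedup.
inline void PrintSpeedupsSketch()
{
  for (const TimingSketch &t : kTimingsSketch) {
    std::printf("%-22s %.1fx\n", t.step, SpeedupSketch(t));
  }
}
```

The logical and unplaced volume steps are left out because their timings are of the same order in both runs.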
-
Stephan Hageboeck authored
-
Stephan Hageboeck authored
Instead of starting one kernel to construct one placed volume on the GPU, one can collect all instances of the same type, and construct these in a single kernel call. This drastically reduces the number of kernel calls for larger geometries. This required defining template functions that
- Collect all constructor arguments in arrays
- Copy those to the GPU
- Run all constructors in parallel
- Free the memory occupied by the constructor arguments.
For each type of placed volumes, the helper ConstructManyOnGPU<Type> must be instantiated explicitly in the cxx namespace, as implicit instantiation doesn't reach it automatically. Most instantiations happen via the macros in PlacedVolume.h, but PlacedAssembly, UnplacedExtruded, UnplacedMultiUnion and UnplacedTesselated needed explicit dummy instantiations to fix linker problems.
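The batching idea in this commit message can be sketched on the host in plain C++. This is only an analogue under stated assumptions: `PlacedBoxSketch` and `ConstructManySketch` are hypothetical names, the signature of the real ConstructManyOnGPU helper is not given here, and the loop stands in for what would be one thread per object inside a single kernel launch.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for a placed volume type with two constructor arguments.
struct PlacedBoxSketch {
  int logicalId;
  int transformId;
  PlacedBoxSketch(int l, int t) : logicalId(l), transformId(t) {}
};

// Constructs 'count' objects in place at out[0..count). Each constructor argument
// is supplied as one array holding that argument for every object, mirroring the
// "collect all constructor arguments in arrays" step. On a GPU the loop body would
// be the work of one thread; here it is an ordinary host loop.
template <typename T, typename... Args>
void ConstructManySketch(T *out, std::size_t count, const Args *...args)
{
  assert(out != nullptr || count == 0);
  for (std::size_t i = 0; i < count; ++i) {
    new (out + i) T(args[i]...); // placement new into preallocated storage
  }
}

// The commit notes that some types needed explicit instantiation to satisfy the
// linker; in this sketch that would look like the following line.
template void ConstructManySketch<PlacedBoxSketch, int, int>(PlacedBoxSketch *, std::size_t, const int *,
                                                             const int *);
```

A caller allocates raw storage for all objects once, fills one array per constructor argument, and calls `ConstructManySketch` a single time per type instead of once per object.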
-
Stephan Hageboeck authored
-
Stephan Hageboeck authored
CudaManager can print details about synchronising a geometry to the GPU. For some reason, parts of this output go to cerr while the rest goes to cout. This commit directs all of it to cout to allow for better logging. It also improves the formatting of the timing printouts while synchronising geometry.
-
Stephan Hageboeck authored
-