
Implement canonical CUDA build with native CMake support

Benjamin Morgan requested to merge bmorgan/cmake-native-cuda into master

This is the final step in splitting !866 (closed); the non-CUDA build parts have already been merged through !867 (merged) and !868 (merged). The primary changes are cherry-picks of @shageboe's work from !866 (closed) to use CMake's native support for CUDA instead of the deprecated FindCUDA module (of which VecGeom ships its own patched copy, which is broken for anything other than building vecgeomcuda), resulting in a simple, clean CUDA build.

A few patches are made on top of this to tidy some of the CMake commands; they do not change the functionality.

The major change is to take a step back and build only a single, static vecgeomcuda library that is not device linked. This provides an implementation of NVIDIA's canonical way of compiling, linking, and using a library of device code. The key points are:

  • A device link step is needed for any code using device functions/kernels

  • The device link step can only take objects and static libraries as input

    • Device links cannot cross shared library boundaries
    • Any given device link must be self-contained, i.e. if we have A->B->C, we have to device link all three in one step, not in separate A,B and A,C steps (AFAIK).
  • A CUDA shared library is really a C/C++ interface with a device/kernel "backend", e.g.

    // accelerated.h
    // Plain C/C++ interface visible to clients; no CUDA needed.
    void the_answer();
    
    // accelerated.cu
    __global__ void the_answer_kernel() { /* device code */ }
    
    // Host-side entry point that launches the kernel.
    void the_answer() {
      the_answer_kernel<<<1,1>>>();
    }

VecGeom's CUDA library seems to be a mix of the two(?): some of its CUDA functionality is true device code/kernels, while other parts are C/C++ frontends to kernel calls. Either way, these considerations mean a shared library shouldn't be provided without some thought about which interfaces we're actually providing, as well as the downstream client use cases.

Given the existing work by @shageboe in !866 (closed) and @pcanal in !822 (closed), this MR is primarily intended as a starting point for merging these efforts and moving towards a coherent, maintainable CUDA build. It should also help inform the design and build work for the VecGeom evolution and R&D efforts.

To these ends, it's only a draft at this stage to start that process. It can then be merged with fixes as an intermediate point in development, or we can try to make it full-featured if needed (the branch is shared, so all can contribute if required). It does break how downstream clients use/link to vecgeom/cuda, but all solutions will at some point!

