CUDA Example Package, master branch (2020.01.14.)
Created an example package showing how to use CUDA in Athena. This is to help with ongoing R&D efforts for GPU programming in the ATLAS offline software.
The code itself does the simplest thing that one can do with a GPU: it multiplies all elements of an array by some amount. More importantly, it highlights a couple of aspects of how to build CUDA code with CMake on top of an Athena nightly.
- Since the code can only be compiled when CUDA is available in the build environment, the package's configuration uses FindCUDA.cmake to check if CUDA is available.
- If not, it returns right away, and doesn't set up anything from the package.
- If it is available, it uses enable_language(...) to turn on "first class language support" for CUDA in the build.
- After this it declares a component library with all the *.h, *.cxx and *.cu files of the package. Note that at this point the .cu files will be handled by CMake "correctly".
Another important point that the code demonstrates is that .cu files must not be exposed to practically any Athena / Gaudi headers. (Gaudi uses some C++17 formalism in some of its most basic headers these days, which nvcc really doesn't like.) So in order to execute CUDA code from an algorithm (for instance), one must go through (at least) one extra (pure C++) interface.
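As a rough illustration of that separation, here is a minimal sketch of the three layers collapsed into a single file. All names in it (the Sketch namespace, linearTransform, runExample, the file names in the comments) are hypothetical and not taken from the package. A plain CPU loop stands in for the kernel launch so that the sketch builds without CUDA.

```cpp
// Sketch only: every name below is hypothetical, not the package's actual API.
#include <cstddef>
#include <vector>

// --- Layer 1: would live in a plain C++ header (say "LinearTransform.h").
// It includes neither CUDA nor Athena / Gaudi headers, so both the host
// compiler and nvcc can parse it.
namespace Sketch {
   /// Multiply every element of "data" by "multiplier"; returns false on failure.
   bool linearTransform( std::vector< float >& data, float multiplier );
}

// --- Layer 2: would live in a .cu file compiled by nvcc, which only sees the
// header above. A real implementation would copy the data to the device and
// launch a kernel; a CPU loop stands in for that here.
namespace Sketch {
   bool linearTransform( std::vector< float >& data, float multiplier ) {
      for( std::size_t i = 0; i < data.size(); ++i ) {
         data[ i ] *= multiplier;
      }
      return true;
   }
}

// --- Layer 3: would live in a .cxx file that is free to include Athena /
// Gaudi headers, for instance inside the execute() function of an algorithm.
namespace Sketch {
   std::vector< float > runExample() {
      std::vector< float > data = { 1.0f, 2.0f, 3.0f };
      if( ! linearTransform( data, 2.0f ) ) {
         data.clear(); // Signal the failure to the caller with an empty vector.
      }
      return data;
   }
}
```

The point of the layout is that the only thing shared between the nvcc-compiled code and the Athena-facing code is a header written in plain, conservative C++. With the CPU stand-in above, Sketch::runExample() returns { 2, 4, 6 }.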
Note that for testing I compiled the code like:
[bash][pcadp02]:build > asetup Athena,master,latest
Using Athena/22.0.9 [cmake] with platform x86_64-centos7-gcc8-opt
at /cvmfs/atlas-nightlies.cern.ch/repo/sw/master/2020-01-13T2133
Illegal instruction (core dumped)
Making workaround-alias for expr on this *OLD* machine
[bash][pcadp02]:build > source /afs/cern.ch/work/k/krasznaa/public/cuda/10.1.243/x86_64-centos7/setup.sh
Configured CUDA from: /afs/cern.ch/work/k/krasznaa/public/cuda/10.1.243/x86_64-centos7
[bash][pcadp02]:build > cmake -DATLAS_PACKAGE_FILTER_FILE=../package_filters.txt ../athena/Projects/WorkDir/
-- The C compiler identification is GNU 8.3.0
-- The CXX compiler identification is GNU 8.3.0
-- Check for working C compiler: /cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/bin/gcc
-- Check for working C compiler: /cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/bin/gcc -- works
...
-- Configuring the build of package: AthCUDAExample
-- Found CUDA: /afs/cern.ch/work/k/krasznaa/public/cuda/10.1.243/x86_64-centos7 (found version "10.1")
-- The CUDA compiler identification is NVIDIA 10.1.243
-- Check for working CUDA compiler: /afs/cern.ch/work/k/krasznaa/public/cuda/10.1.243/x86_64-centos7/bin/nvcc
-- Check for working CUDA compiler: /afs/cern.ch/work/k/krasznaa/public/cuda/10.1.243/x86_64-centos7/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Considering package 108 / 2156
...
Building the package without CUDA being set up looks like:
[bash][pcadp02]:build > asetup Athena,master,latest
Using Athena/22.0.9 [cmake] with platform x86_64-centos7-gcc8-opt
at /cvmfs/atlas-nightlies.cern.ch/repo/sw/master/2020-01-13T2133
Illegal instruction (core dumped)
Making workaround-alias for expr on this *OLD* machine
[bash][pcadp02]:build > cmake -DATLAS_PACKAGE_FILTER_FILE=../package_filters.txt ../athena/Projects/WorkDir/
-- The C compiler identification is GNU 8.3.0
-- The CXX compiler identification is GNU 8.3.0
-- Check for working C compiler: /cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/bin/gcc
-- Check for working C compiler: /cvmfs/sft.cern.ch/lcg/releases/gcc/8.3.0-cebb0/x86_64-centos7/bin/gcc -- works
...
-- Configuring the build of package: AthCUDAExample
CUDA_TOOLKIT_ROOT_DIR not found or specified
-- Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_NVCC_EXECUTABLE CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY)
-- CUDA not found, AthCUDAExample is not built
-- Considering package 108 / 2156
...
Unfortunately I don't currently have access to a machine that has a GPU and runs CentOS 7 natively, so for now the test job fails at runtime like this:
...
AthenaEventLoopMgr INFO ===>>> start of run 1 <<<===
AthenaEventLoopMgr INFO ===>>> start processing event #1, run #1 0 events processed so far <<<===
Failed to execute: cudaGetDeviceCount( &nCudaDevices )
Reason: CUDA driver version is insufficient for CUDA runtime version
AthCUDA::Linear... ERROR The CUDA transformation failed to run
AthCUDA::Linear... ERROR Maximum number of errors ( 'ErrorMax':1) reached.
AthAlgSeq INFO execute of [AthCUDA::LinearTransformExampleAlg] did NOT succeed
AthAlgSeq ERROR Maximum number of errors ( 'ErrorMax':1) reached.
AthAllAlgSeq INFO execute of [AthAlgSeq] did NOT succeed
AthAllAlgSeq ERROR Maximum number of errors ( 'ErrorMax':1) reached.
AthAlgEvtSeq INFO execute of [AthAllAlgSeq] did NOT succeed
AthAlgEvtSeq ERROR Maximum number of errors ( 'ErrorMax':1) reached.
AthMasterSeq INFO execute of [AthAlgEvtSeq] did NOT succeed
AthMasterSeq ERROR Maximum number of errors ( 'ErrorMax':1) reached.
AthenaEventLoopMgr INFO Execution of algorithm AthMasterSeq failed with StatusCode::FAILURE
AthenaEventLoopMgr INFO ===>>> done processing event #1, run #1 1 events processed so far <<<===
AthenaEventLoopMgr ERROR Terminating event processing loop due to errors
ApplicationMgr INFO Application Manager Stopped successfully
...
I'll work out a way of using Singularity or Docker correctly on the SPOT machine (which currently runs Ubuntu 18.04), and will report back with the results.