
CUDA ARM Fix, main branch (2024.04.04.)

Made the AthExCUDA example code build successfully on ARM64, with CUDA.

As it turns out, Eigen 3.4.0 doesn't work correctly with CUDA on aarch64 out of the box. 🤔 Some of its NEON-vectorized code is exposed to nvcc in a way that confuses the compiler. See for instance:

https://github.com/InsightSoftwareConsortium/ITK/issues/1959

Turning off vectorization for the example code works around this issue, which was itself introduced with !70268 (merged), when switching to LCG_104d_ATLAS_11.
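For illustration only, here is a minimal sketch of what turning off Eigen's vectorization can look like at the source level. It assumes the workaround relies on Eigen's documented `EIGEN_DONT_VECTORIZE` macro; the exact mechanism used in this merge request (for example a compile definition set in the build configuration) is not shown here. The `eigenVectorizationDisabled()` helper is hypothetical, and the include guard exists only so that the sketch compiles without Eigen installed.

```cpp
// Sketch: disable Eigen's explicit SIMD (e.g. NEON) code paths for one
// translation unit. EIGEN_DONT_VECTORIZE is a documented Eigen macro; it
// has to be defined before the first Eigen header is included.
#define EIGEN_DONT_VECTORIZE

// Assumption: the guard below is only here so that the sketch also
// compiles on a machine without Eigen. Real code would include the
// header unconditionally.
#if defined(__has_include)
#  if __has_include(<Eigen/Core>)
#    include <Eigen/Core>
#  endif
#endif

// Hypothetical helper (not part of Eigen or Athena): reports whether
// Eigen's vectorized code paths are switched off in this translation unit.
// Eigen defines EIGEN_VECTORIZE only when explicit vectorization is on.
constexpr bool eigenVectorizationDisabled() {
#ifdef EIGEN_VECTORIZE
  return false;
#else
  return true;
#endif
}

// Fail the build early if vectorization is unexpectedly still enabled.
static_assert(eigenVectorizationDisabled(),
              "Eigen vectorization should be disabled in this file");
```

Defining the macro on the compiler command line for just the affected target would have the same effect, and would keep the change local to the example code rather than affecting every Eigen user in the build.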

See ATLINFR-5284 for some additional information.
