CPUDispatch.h - Limit CPU dispatching to the levels supported by the compilation level
Currently LHCbMath uses runtime detection of CPU capabilities to pick the Generic/SSE4/AVX etc. implementations. This was fine in Run II, when we were not making explicit builds for AVX2 etc. In Run III, however, the model is to run complete AVX2+FMA builds on machines that support them, to gain across the whole code base.
I have been looking into reducing differences in the nightlies, in particular to get the clang builds as clean as the gcc builds. One issue I have just discovered is that the AVX implementations of the methods in LHCbMath give ever-so-slightly different results between the clang and gcc builds. I have some ideas as to why this might be (perhaps related to gaudi/Gaudi!1022 (merged)) that I will investigate, but this MR proposes we simply do not allow the CPU dispatching to pick the AVX/AVX2/AVX512 methods in builds where these instructions were not explicitly enabled. In practice, this means the SSE4.2 version is used in the reference SSE4.2 builds and the AVX2 version in the AVX2 builds, just like in the rest of the stack, which I think makes sense going forward.
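To illustrate the idea, here is a minimal sketch of capping the runtime-detected level at the level the build was compiled for. It is not the actual CPUDispatch.h code: the `Level` enum and the functions `compiledLevel`, `runtimeLevel`, `limitToCompiled` and `dispatchLevel` are hypothetical names made up for this example. It assumes gcc or clang on x86, where the compiler predefines macros such as `__AVX2__` when the matching `-m` flag is on, and where `__builtin_cpu_supports` is available.

```cpp
// Hypothetical ISA levels, ordered from least to most capable.
enum class Level { Generic = 0, SSE4 = 1, AVX = 2, AVX2 = 3, AVX512 = 4 };

// Highest level the compiler was allowed to emit for this build, taken
// from the macros gcc/clang predefine for flags such as -msse4.2 or -mavx2.
constexpr Level compiledLevel() {
#if defined(__AVX512F__)
  return Level::AVX512;
#elif defined(__AVX2__)
  return Level::AVX2;
#elif defined(__AVX__)
  return Level::AVX;
#elif defined(__SSE4_2__)
  return Level::SSE4;
#else
  return Level::Generic;
#endif
}

// Highest level the running CPU reports. Falls back to Generic when the
// gcc/clang x86 builtin is not available (assumption: no other detection).
inline Level runtimeLevel() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  if (__builtin_cpu_supports("avx512f")) return Level::AVX512;
  if (__builtin_cpu_supports("avx2"))    return Level::AVX2;
  if (__builtin_cpu_supports("avx"))     return Level::AVX;
  if (__builtin_cpu_supports("sse4.2"))  return Level::SSE4;
#endif
  return Level::Generic;
}

// Never return a level above what this build was compiled for.
constexpr Level limitToCompiled(Level requested) {
  return requested < compiledLevel() ? requested : compiledLevel();
}

// Level the dispatcher would actually use: runtime capability, capped
// by the compilation level.
inline Level dispatchLevel() { return limitToCompiled(runtimeLevel()); }
```

With something like this, an SSE4.2 build running on an AVX2-capable machine would still select the SSE4 implementation, while an AVX2 build on the same machine would select AVX2.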
This MR will probably change the refs slightly, perhaps in many places, but my tests indicate we should then have much more consistent results across the different build architectures.
Goes together with Brunel!928 (merged).