Support SIMD/POD containers and vectorised selections in new functors
This is a sketch of how vectorised selections on the new SIMD-friendly POD track containers (e.g. `::LHCb::Pr::Velo::Tracks`) could work with the new functors introduced in !1541 (merged). This MR:
- Introduces "proxy" iteration over the POD track containers (cf. LHCb#37 (closed)), although (unlike the sketch there) here a proxy represents "a vector-unit-sized chunk of tracks", not "a track". This is done with a wrapper around the `::LHCb::Pr::Velo::Tracks` type. Integrating it directly into `::LHCb::Pr::Velo::Tracks` is also an option...
- Allows functors to filter `::LHCb::Pr::Velo::Tracks` into a new `::LHCb::Pr::Velo::Tracks` (like `PrFilterIPSoA`).
- Makes the `MINIP`, `MINIPCUT` and `ETA` functors work when filtering these containers, preserving @ahennequ's vectorised calculation in `PrFilterIPSoA`. The definitions of these functors are not explicitly specialised; rather, the same function body is reworked to be valid for both scalar and vector types.
- Makes the implementation choice that different [chunk-of-]track-like objects/proxies should provide accessors named `closestToBeamStatePos` and `closestToBeamStateDir`. The rationale here is that these return basic mathematical objects (i.e. 3-vectors), rather than higher-level concepts like track states, so the number of types and operations that need to be implemented (and consistently named...) for the concrete types (here @ahennequ's `Vec3<T>` and `Gaudi::XYZVector`) is limited. See also: this vision page on track interfaces.
- Instantiates `Pr::Filter<T>` as `PrFilter__PrVeloTracks`, to be used as a replacement for `PrFilterIPSoA`. My tests were not exhaustive, but I did not see a significant change in speed -- this is essentially the same algorithm, but hidden behind an extra abstraction layer (👎) and with a short-circuiting "optimisation" (maybe 👍, not conclusive but inherited from the scalar code).
There are still some problems:
- This works when run from the functor cache, but when the functors are just-in-time compiled then Cling does not set the right preprocessor macros, so the `SIMDWrapper` types are just-in-time compiled as if `avx2` is not available (I use `x86_64+avx2+fma-centos7-gcc8-opt+g`). cc: @clemenci.
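
The failure mode can be illustrated with a small sketch of macro-driven width selection. This is an assumption about the general mechanism, not a copy of `SIMDWrapper`: `__AVX2__` is the macro GCC and Clang predefine when AVX2 code generation is enabled, while `float_lanes`, `FloatChunk`, `compiled_lanes` and `same_width_as` are hypothetical names. If one compiler sees the macro and another does not, the two disagree on the layout of the same type.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {
// __AVX2__ is predefined by GCC and Clang when AVX2 is enabled (e.g. -mavx2).
// Everything else in this namespace is hypothetical.
#if defined(__AVX2__)
constexpr std::size_t float_lanes = 8;  // 256-bit register / 32-bit float
#else
constexpr std::size_t float_lanes = 1;  // scalar fallback
#endif

// A chunk type whose size depends on the macro above.
struct FloatChunk {
  float lane[float_lanes];
};

// Width baked in by whichever compiler processed this translation unit.
constexpr std::size_t compiled_lanes() { return float_lanes; }

// Hypothetical guard: the ahead-of-time compiled host passes its own width,
// and the just-in-time compiled code compares it with what it was built with.
constexpr bool same_width_as(std::size_t host_lanes) { return host_lanes == float_lanes; }
}  // namespace sketch
```

A check along the lines of `same_width_as` would at least turn a silent layout mismatch into a detectable error; whether that is appropriate here is an open question.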
And some things that are not implemented:
- Proxies for upstream and forward tracks
- Short-circuiting for combinations of vectorised cuts (I think...)
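
For the short-circuiting item, one possible shape is sketched below, purely as an illustration of the idea and not as something this MR implements. With vector masks, "short-circuit" would mean skipping the second cut only when every lane in the chunk has already failed the first. `Chunk`, `Mask`, `GreaterThan` and `and_short_circuit` are hypothetical names, and the `calls` counter exists only to make the skipping observable.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template <std::size_t N>
using Chunk = std::array<float, N>;
template <std::size_t N>
using Mask = std::array<bool, N>;

template <std::size_t N>
bool none(Mask<N> const& m) {
  for (bool b : m)
    if (b) return false;
  return true;
}

// Hypothetical vectorised cut: value > threshold, lane by lane.
struct GreaterThan {
  float threshold;
  mutable std::size_t calls = 0;  // instrumentation for this sketch only
  template <std::size_t N>
  Mask<N> operator()(Chunk<N> const& c) const {
    ++calls;
    Mask<N> m{};
    for (std::size_t i = 0; i < N; ++i) m[i] = c[i] > threshold;
    return m;
  }
};

// Logical AND of two cuts; f2 is evaluated only if at least one lane passed f1.
template <typename F1, typename F2, std::size_t N>
Mask<N> and_short_circuit(F1 const& f1, F2 const& f2, Chunk<N> const& c) {
  Mask<N> m = f1(c);
  if (none(m)) return m;  // every lane already failed: skip f2 entirely
  Mask<N> const m2 = f2(c);
  for (std::size_t i = 0; i < N; ++i) m[i] = m[i] && m2[i];
  return m;
}

// Helper reporting how often the second cut ran for one chunk.
std::size_t second_cut_calls(Chunk<4> const& c) {
  GreaterThan a{1.0f}, b{2.0f};
  and_short_circuit(a, b, c);
  return b.calls;
}
```

The benefit depends on how often whole chunks fail the first cut, so, as with the scalar short-circuiting mentioned above, it would need measuring before being called an optimisation.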
Goes with LHCb!2004 (merged) and Brunel!830 (merged).
Edited by Marco Cattaneo