The new selection machinery (ThOr functors, ThOr::Combiner etc.) is explicitly vectorised and uses SOA-layout data structures. There are various examples of these, for example Composites, and some outstanding ones, for example #97 (closed).
So far there are no compatible structures for neutral particles. These should be added. This could either be a single container ("NeutralBasics"), or perhaps more likely a logical "zip" of information produced by different neutral reconstruction algorithms. However this is defined, an API compatible with the relevant ThOr functors should be exposed.
@mramospe as a starting point, the neutral ProtoParticles are currently created in FutureNeutralProtoPAlg from the CaloHypos. Two main comments come directly to my mind:
the new container should directly use the new LHCb::Event::Calo::Hypos now produced in the reconstruction, instead of the old CaloHypos still used by FutureNeutralProtoPAlg.
some of the information added in FutureNeutralProtoPAlg is obsolete. A first clean-up is being done by @calvom in !3166 (merged) so you can start from the new set of variables directly
OK, I will start by creating a container and a converter. I presume that in the future the CALO reconstruction sequence would create instances of the new container instead.
yes, the global reconstruction should create the new containers instead of ProtoParticles, so we will need an algorithm similar to FutureNeutralProtoPAlg that creates SOA neutral particles directly out of LHCb::Event::Calo::Hypos. Ideally, we shouldn't need any converters, but I imagine those are needed for the time being until the Thor functors and combiners work on the new neutral particle classes?
After some discussions with @cmarinbe, I wonder what is the actual goal of this. It seems that there is already a class that can be used with the zip machinery in order to make NeutralBasics (i.e. similar to ChargedBasics but with no track information). It is called Hypotheses and lives in https://gitlab.cern.ch/lhcb/LHCb/blob/master/Event/RecEvent/include/Event/CaloHypos_v2.h. The only thing that is needed is to define an appropriate container for neutral particles based on it (containing also the PID, assigned mass, ...). Creating a new pure SOA object (required to use SIMD types) will be quite complicated given the amount of information that is needed for the ECAL tools. Changing the data structure would probably imply changing the interfaces for all those tools and algorithms too. Is the goal of this issue to implement the container with SIMD types? From the title it seems it is not. My idea is to start by simply create a container based on Hypotheses that works similarly to ChargedBasics and can be used with the ThOr functors. If it turns to be too slow we can worry about implementing it fully on a SOA structure and/or use SIMD types.
also -- one small nit-pick/clarification: CaloHypotheses (defined in CaloHypos_v2.h) is a container -- so when you write "create a container based on Hypotheses" I would suggest checking whether CaloHypotheses can just be that container, or that it could be 'tweaked' or adapted to become that container.
In the end, the individual elements of a CaloHypotheses container are just 'clusters with an assigned hypothesis' (eg. electron, photon, (merged)pi0) with corrections/calibrations (to energy and position) specific to the type applied to them, and an 'id' (cell-id, which can be used to identify the hypo, and do eg. truth matching). The 'real' information (i.e. position, energy, covariance, spread, digits) is in the associated clusters (which are embedded inside the same container object).
The idea of having an extra container is to zip the CaloHypotheses to extra PID information that is (mostly) computed by downstream code, after the CaloHypotheses have already been written to TES. Currently, we have around 15 variables added to the ProtoParticles on top of the CaloHypos (see FutureNeutralProtoPAlg). Some of this information is already contained or can be derived from the Hypos themselves (eg ClusterE, Spread) and we don't need to store it again*, but some is really additional (eg. ClusterMatch, isNotH, isPhoton, isPhotonXGB) and has to be saved somewhere. So following the structure of the ChargedBasics, we thought we could have a NeutralBasics object zipping CaloHypotheses and PID information (various columns of floats or a more complicated object storing them, whatever we think is best).
@jonrob I'm assuming here that we will persist the whole CaloHypo object, including the clusters and digits embedded inside (not sure it's even possible to not do it, but I prefer to make sure my assumption is correct)
@graven I am struggling to create this kind of container out of LHCb::Event::Calo::v2::Hypotheses objects. It seems I can not use the zip machinery straight away
stack/LHCb/InstallArea/x86_64_v2-centos7-gcc10-opt/include/Event/Zip.h:644:5: error: no type named 'type' in 'struct std::enable_if<false, int>' 644 | Zip( ContainerTypes&... containers ) : m_containers{&containers...} {
I am trying to figure out what are the requirements that are needed to be able to do such thing, and also whether it can be combined with SIMD-based SOA objects straight away.
is there a MR with what you're trying to do that I could check out? I'm not surprised that LHCb::Event::Calo::v2::Hypotheses and zip don't interoperate (yet), as this is the first time someone tries -- but I would hope that there is no fundamental reason that a container that has an operator[] (i.e. provides random access), and which is compatible with range-based loops (i.e. it offers a begin and an end which return iterators which can be incremented, dereferenced and compared) would not work with zip.
Now, it looks like the above enable_if is checking for is_zippable which makes assumptions which are maybe a bit too constrained, and eg. requires SIMD columns, which is nice if possible, but the code should 'fall back' to something more generic if that is not possible. This also comes back to the request that in general we would like 'columns' which are more than just SIMD types, but eg. for a State we may want to SOAOS structure (as one would never use eg. tx without also using ty). So it looks like we need a bit of generalization here... (which would be useful beyond just supporting zipping Hypotheses)
I managed to make the new NeutralBasics be zippable objects. I am now trying to make the composite generation work with these, because the new ParticleVertexFitter does different things depending whether what is being combined is a composite or a track-like object.
@nnolte@ahennequ I think I have come to a point here where I might be limited by the current tools available to do conversions between standard arithmetic types and SIMD types. The current implementation of CALO objects is not strictly SOA, and making all the classes SOA would imply an enormous amount of work. Creating a new SOA class that is directly built from the objects provided after the CALO reconstruction is, in principle, not an option, because we want to persist this objects and possibly do additional corrections afterwards (using the same tools as in the reconstruction). As a consequence, I am trying to define a proxy in such a way that is able to build the SIMD types needed by the ThOr framework with the appropriate information on the run. Right now, I am aware that one can read containers of standard arithmetic types (e.g. std::vector<int>) transforming it to int_v with a simple interface. What happens if we have two vectors of objects std::vector<XYZ> and we want to make the proxy return, for example, the distance between two points? I was expecting to find a way to populate an int_v/float_v with the information from the calls to a function, but I don't find any. Do you have any ideas?
so as long as you tell your proxy how to store and retrieve your object (by splitting it up internally into int and float) you can in the front-end always deal with the object, and not with the underlying type.
However, as mentioned in the meeting, I am not sure this machinery works with the "old" proxy and the functors.
Thanks for the suggestion @decianm, but I think they are two different cases. Adding data from scalar types does not constitute a problem, the problem comes when reading data from non-SIMD types calculated on the fly and trying to return SIMD types. From what I see in your Vector example, the data is internally stored using SIMD types in 3 columns, one for each dimension. My particular case is dealing with a structure similar to AOS, where data is not stored in columns, and we need to pack quantities that do not live in an array (they are calculated on the fly using a member function, a lambda function or similar) in SIMD types. The most basic case would be to transform the magnitude calculated from an object with a vector of x, y and z values (as a struct, std::vector<xyz>) to SIMD types. I would need to call the function as many times as elements the SIMD type can hold, and mask it to account for out-of-range values (avoiding the call to the function returning the result). In order to write elements into the AOS object one can rely on the Scalar instruction set.
This could either be a single container ("NeutralBasics"), or perhaps more likely a logical "zip" of information produced by different neutral reconstruction algorithms
Please be aware of the changes in !3440 (closed) -- in order to allow access to 'decayProducts' from Composites the 'targets' of the 'pointers/references/...' can for technical reasons not be a zip (if you want more information, please check with @ahennequ) and it would also make subsequent persistency a lot harder (if not impossible). This is the reason that !3440 (closed) changes ChargedBasics to be a SOACollection and no longer a zip (*). And thus the same constraint applies to NeutralBasics... But the good news is that, just like ChargedBasics 'points' to individual v3::Tracks elements, so can NeutralBasics point back to individual elements in other SOA collections...
(*) this implies that several accessor methods that used to 'come for free' because the zip proxy inherited all the methods from the individual proxies of the zip elements have to be implemented as 'forwarding' functions in order to keep the same functionality.