WIP: iterable PVs
Main goal: allow PV MC checking with matching by tracks.
links:
- new data layout LHCb!2257 (closed)
- updates to brunel Brunel!923 (closed) (for reference as discussed in the discussion there)
- updates to moore Moore!446 (closed)
This MR updates the TrackBeamLineVertexFinderSOA (TBL) to store to output PVs references from which PVs they were built, for forward and backward tracks. The internal data structure of the TBL already holds a signed integer for reference keeping. The so-far unused range of strictly negative numbers is (ab)used for that and references are restored to normal indices when preparing the output data.
This MR furthermore introduces a converter from these new PVs to v1 RecVertices, assuming the tracks from which the PVs have been reconstructed have already been converted to Track v1s. This means that the PV MC checking algorithm can continue using the v1 rec vertex design. (presumably not a hot code path).
In my local tests this MR does introduce a small slowdown. (TODO: copy and paste the numbers together)
NB: seems some clang-format managed to sneak some whitespace changes in … be sure to view with whitespace changes disabled and/or unfold the changes to TBL.
Merge request reports
Activity
mentioned in merge request LHCb!2257 (closed)
- Resolved by Paul Seyfert
766 786 recvertex.addToTracks( nullptr, dau.second ); // TODO: make a PoD SoA RecVertex that contains indexes 767 787 } 768 788 } 769 m_nbPVsCounter += recvertexcontainer.size(); 770 return recvertexcontainer; 789 790 return retval; 791 } 792 793 template <typename tmptype, typename float_t, typename zip> 794 LHCb::Rec::PV::PVs TrackBeamLineVertexFinderSoA::create_new_vertices( tmptype& vertices, float_t const maxVertexRho2, 795 const zip fwd, const zip bkw, 796 const Beamline_t& beamline ) const { would be nice to optimize this function, it seemed compelling to run
std::partition
onvertex.tracks
before pushing into the m_fwd and m_bkw vectors (and usem_fwd…reserve
andm_bkw…reserve
properly and fill them vectorized), but the benchmark said that's slower than the lousy scalar loop below.changed this line in version 6 of the diff
286 293 }; 287 294 // 288 295 struct Vertex { 289 Gaudi::XYZPoint position; 290 Gaudi::SymMatrix3x3 poscov; 291 std::vector<std::pair<unsigned, float>> tracks; // index to track + weight in vertex fit 292 double chi2; 296 Gaudi::XYZPoint position; 297 Gaudi::SymMatrix3x3 poscov; 298 std::vector<std::pair<int, float>> tracks; // index to track + weight in vertex fit also not sure what it is used for tbh, its just pushed into the output container
but i think it's intended to mark the "not to be recorded ones" with -1 (or nl::max in this case). how do your changes affect anything there?Edited by Niklas Stefan Noltechanged this line in version 12 of the diff
-1 was used for the backward tracks. one shortcoming of the implementation was that index alone didn't specify which of the track containers they were meant for. so information about backward tracks entering a vertex was just lost and the tracks container of the pv would only contain part of the story.
mentioned in merge request Moore!333 (closed)
mentioned in merge request Brunel!923 (closed)
cc @adudziak
So for the performance. I benchmarked creating old and new PVs in separate functions and gambling on debug flags and vtune's ability to correctly attribute timing to either new or old PV creation. This was running on a quanta node in 40 threads. We spend apparently in this test 65s in
operator()
, of which 7.1s is creation of new PVs, 2.5s creation of old PVs (so roughly slowdown of a factor 3 in the creation alone, overall roughly 10% of the PV reco itself), that kinda hurts.looking into where vtune claims the time is spent, it is the storage of the track links (no complete surprise, this is extra work: the old pv creation pushes into only one vector without branch, it pushes
nullptr
for the track pointer. that it is such a big contribution is surprising to me, while this is the often blamed vector in a vector structure there shouldn't be many heap allocations going on during the loops as the inner vectors are smallbuffer optimized …)would be appreciated if @olupton is available to take a look at the functor and iterable related changes.
127 135 } 128 136 129 137 /** Default type of the TES container of PVs. */ 130 using DefaultPVContainer_t = std::vector<LHCb::Event::v2::RecVertex>; 138 using DefaultPVContainer_t = decltype( LHCb::Pr::make_zip( std::declval<LHCb::Rec::PV::PVs>() ) ); changed this line in version 7 of the diff
10 10 \*****************************************************************************/ 11 11 #include "Event/PrIterableFittedForwardTracks.h" 12 12 #include "Event/PrZip.h" 13 #include "Event/RecVertex_v2.h" 13 #include "Event/RecVertices.h" 14 14 #include "GaudiAlg/Transformer.h" 15 15 #include "SelKernel/VertexRelation.h" 16 16 17 17 namespace { 18 18 using Output = BestVertexRelations; 19 using InputVertices = std::vector<LHCb::Event::v2::RecVertex> const&; 19 using InputVertices = decltype( LHCb::Pr::make_zip( std::declval<LHCb::Rec::PV::PVs const&>() ) ) const&; changed this line in version 7 of the diff
57 59 Vec3<F> u = tracks.stateDir<F>( i, 0 ); // of the state 58 60 59 61 F min_d = 10e3; 60 for ( int j = 0; j < (int)vertices.size(); j++ ) { 61 auto PV = vertices[j].position(); 62 for ( auto pv : vertices.unwrap() ) { 81 81 template <typename Vertex1, typename Vertex2> 82 82 auto flightDistanceChi2( Vertex1 const& v1, Vertex2 const& v2 ) { 83 83 // Lifted from LoKi::DistanceCalculator 84 Gaudi::SymMatrix3x3 cov{v1.posCovMatrix() + v2.posCovMatrix()}; @nnolte could you have a look into the changes in TBLVSOA?