There has been a discussion on mattermost, on how to store trigger, TISTOS and event info with functors and FunTuple.
@graven mention's that there is currently a functor brewing that returns the decision of HLT1/2 line taking DecReport as input (see !2598 (merged)). However @sstahl points that in FunTuple the functors take candidate as input and not DecReport. Gerhard suggests in this comment that we have "wrapper" functor with a DataHandle as input, that gets configured with a set of functors and which forwards the information it grabs from the DataHandle to the embedded functors -- effectively turning a collection of functors that take say T as input from the TES into a collection that takes no (external) input -- and this can be passed to FunTuple without templating on what it has to grab from the TES, as FunTuple would no longer itself (directly) access the TES.
However, I feel this is quite a radical solution (given functor JIT compilation times) and since in FunTuple we always handle candidate level info, why not map a candidate -> DecReport, passing this map to a new functor that outputs std::map<std::string, int> with decisions on user-defined lines.
@gligorovadds that for TISTOS, one should also need SelReport but are redundant in HLT2, if we choose to always persist the candidates which caused the trigger to fire.
I think the space is multi-dimensional and I would not claim I have made my mind just yet. Let's think a bit and discuss viva voce on Friday, as it may be more effective. (Does not prevent comments and lists of use-cases to be added here, obviously.)
Let me spell out one point in a bit more (probably tedious and pedantic) detail: "which trigger lines fired" is as a question completely decoupled from knowledge about signal candidates but TISTOS information isn't. So any functor which uses the SelReports to TISTOS by construction needs to know
What is your offline signal candidate
Where can it find the SelReports
A functor which stores trigger decisions on the other hand only needs to know where to find DecReports. This is logically the same as any global event info functor (event number, run number, etc).
TISTOS studies have always been a special beast. I'm now wondering off the top of my head if such studies wouldn't be easier to "prepare" in Sprucing jobs, and the ntupling in DaVinci would then be a simpler job based on what gets stored (and related) from that Sprucing "prep job". But I admit I need to massage this fresh thought properly ... I thought I would share it anyway, just in case ...
Moving TISTOS to sprucing does mean that there needs to be thought on the flexibility of the TOS definitions still available downstream; in Run 1/2 we could, and sometimes did regularly did*, change the TOS detector fraction in the offline processing. I'm not sure you can get a solution easier than reading the reports.
@ldufour, if you are worried about flexibility or missing info then the problem is only less or equal in Sprucing compared to DaVinci. In other words, having aside the work one would be doing in a special Sprucing job (work has to be somewhere, anyway), things can only be easier than trying to do various things when tupling in DaVinci.
I don't think it will be easier: these settings typically are developed over time in an analysis and really benefit from the very fast turnaround in the user processing. For the sake of offline analysis and physics output, I would avoid having more reasons for a resprucing and holding up analyses.
I don't see why this is an either/or situation? Having the ability to centrally prepare it doesn't stop analysts who need to do something special from doing it later, it just means there is a safe and well tested baseline for everyone else.
My earlier comment is exactly that: if one moves the preparation of data needed for TISTOS to the central production, this preparation must keep the functionality for "fractional TISTOS information" at the DaVinci level still available (and the functionality there, as it's expected to be used).
Exactly. And do not forget that we have signed for fast turn-around with analysis productions as well (if relevant - Castelao jobs have been run IIRC) ... I'm not saying I have the full picture of things (I don't .. but then who does ;-)?) but I often feel too much thinking in terms of what we did in the past decade. While we did loads of things really well, our base paradigms have been evolving, and we have to think outside the box as much as possible.
Anyway, the discussions just prove that we need some live chats at our joint meetings as there's a lot to benefit from those :).
... this preparation must keep the functionality for "fractional TISTOS information" at the DaVinci level still available
Fine. Just do it :-). There is no technical show-stopper and the prep job should be what experts needs. That's obvious, no? Am not saying the opposite in any way. My point is that we should consider a different workflow and then decide what is best based on concrete on-paper designs.
Indeed, TisTos is s special beast. Let's concentrate here on candidate-indepent functors. @gligorov mentioned elsewhere event and run number, which should always be there, but there is also gpsTime, detector multiplicity, polarity (actually coming from the conditions database), TCK, etc.
In the DecayTreeTuple suite we had three kinds of TupleTools: IParticleTupleTool, IMCPartcleTupleTool and IEventTupleTool, which does not take any input (TES locations were passed by options). So here we are discussing a replacement for that.
It's not only a FunTuple issue. Whatever can be plotted can also be cut on and will be used in selections. Typically detector multiplicities.
I did a small exercise yesterday and implemented the use of IEventTupleTool in FunTuple. It was relatively easy as the interface only requires a pointer to the tuple as you say and it was quite obvious where to fill the tuple.
Even if that is not the direction of travel, I think it highlighted already some questions and features which have to be addressed in the configuration. E.g. how are TES inputs provided to FunctorCollections and should the user take care of that. There basically need to be functions to get things like ODIN or functions which return the TES locations of DecReports are now in DaVinci but collections are defined in Analysis.
Speaking for myself I reckon it would be interesting and information to see that proof-of-concept exercise. Either share here or link to a branch or even prepare 1-2 tech slides for Friday or the following joint meeting?
On the question regarding functorcollections. The data dependency on functor would natural introduce data dependency on functorcollections. I think @gunther introduced such a thing for truth functors (Analysis!846 (merged)).
Remember we wanted to move the algorithm (in this instance mctruth) into functorcollections, but functorcollections would have to output both mc_functors and mctruth, where the latter gets appended to the "sequence", but now we can exactly do the suggested thanks to you and Niklas's recent fix to PyConf (LHCb!3391 (merged)).
Indeed, once the design/workflow of how we tuple the trigger info is fixed, we can think how the corresponding collection can built.
Thanks @sstahl for the links. Yes, not a production-style change but an exercice.
Some of these things are not possible because some functions are in DaVinci and can therefore not be imported in Analysis.
Great that more people are seeing this. You may remember that DPA has temporarily various functions and standard particle definitions copied over from Moore. But we always said that's not meant to stay as it will be a nightmare for bookkeeping. Recently we created Moore#372 (closed) to move in the direction of less duplication ...
Yes, agreed that (at least IMO) plenty of filter/combiner helpers and alike should be moved upstream, most likely to Analysis.
because it is explicit on what one needs, and remains fully flexible on how a user does MC association - standard or dedicated, "cooked up" by the user. With your suggestion
one totally looses the possibility to specify to the functor collection how the association is to be set. It effectively couples the variables that get tupled to an input location, with no control over how to MC-associate. This buys one line of code (instantiation of mctruth) but is a price to pay for more obscure happenings in the back, since a functor collection would have to set up and run associators internally. I like functional and declarative programming but this seems like a step too far. (But point taken that it looks attractive.)
one totally looses the possibility to specify to the functor collection how the association is to be set.
For the few(?) use cases where a different MC association is needed, one could still use @configurable and bind to change the behaviour. I just wanted to point it out so that you have something to think about.
Sorry I really wanted to discuss this issue in today's meeting as its important to get it in!
So what is the conclusion? Build an upfront tool that builds a map or linker table e.g. of candidate -> ODIN, then pass F.MAP_INPUT(F.ODIN, “P2ODIN”) functor into FunTuple? (straight-foward option perhaps?)
Or a different tool for event info and then join offline event level root file and candidate level root file?
Thanks...The run1/2 way I think is equivalent to Gerard’s suggestion to have a “wrapper” functor e.g. F.EventInfo (calling different functors inside) and returning std::map (takes a candidate as input but does nothing with it).
Some way to switch on or off internally the functors would be good since there are a lot of event-level info.
probably the most trivial (hachky?) way to implement this is to have a functor that given a candidate, or actually 'anything' just uses an embedded data handle to the ODIN bank and returns a reference to it (i.e. a relation that relates anything given to it to the one ODIN bank ;-). And then write a few trivial functors that, given an ODIN bank return eg. run number or event number or tCK or .. -- those latter ones are probably 3 lines of C++, eg:
struct RunNumber : public Function { auto operator()( const ODIN& odin ) const { return odin.runNumber(); }};
-- which already exists actually, in Phys/FunctorCore/include/Functors/TES.h --
and then one should be able to do
An alternative would be to make these functors special with their own property and call them once for each candidate. To which branch would you add them anyhow? Then they would take nothing as input. It's basically what I did when I added the IEventTupleTool.
As this piece of the code is intended to be used by all analysts in LHCb, is it not an idea idea to follow what is done normally in software development, and first decide on the user-interface design? After that the technical implementation can follow.
The reason I say this is because it became hard to follow for me what parts are intended to be "user interface" and what parts are "technical implementation details"; as things such as "DataDepWrappers" are probably intended to be wrapped in another function such that the analyst does not have to deal with it directly? (short version of this comment: What is the intended concept option file which has this event tuple information?)
struct JustReturnOdinNoMatterWhatIAmGiven { // some magic that is constructed from a string representing a TES location, // and on 'prepare' returns a lambda which when invoked with anything returns // LHCb::ODIN const& of the current event};
actually, F.MAP_INPUT isn't quite the right thing, as it assumes a relations table and not another functor -- instead, this is just simple composition:
I think I'm thinking a bit more basic I mean, what would one add in the functor collection here to have access to eventNumber, runNumber and gpsTime? (I don't want polarity)
In the second case you could add the event variables when looping over the candidates in the event, e.g. here. Then you would not need the fake dependency on particles. But I would understand if special things should be avoided.
@amathad remind me, is the key of a FunctorCollection a prefix for the branch name? So if one does variables['Jpsi'] = {"RUN_NUMBER": F.CANDIDATE_RUN_NUMBER, , the branch is called Jpsi_RUN_NUMBER?
The event info functors are themselves defined to be DataDepWrapper, which will cause problems for composition (I think). And also I am not sure when the composition MR will be in. So have gone for this: !2737 (closed) (can discuss there).
@sstahl: Suggestion of a special branch “EVENT” is a good one. Achieved it in this MR (DaVinci!654 (merged)), but probably want to hide it from the users (i.e. make it a special branch like "ALL"). Lets discuss there with others.