Post-processing workflow
Disclaimer: I am fairly sure this reinvents a wheel that has been invented many times already, but since no concrete implementation has been proposed over the last few weeks, let me open the discussion about the elephant in the room. If the discussion turns up an existing implementation that we can easily adapt to our use case, it will be very welcome.
Context: We are looking for a way to run over the EasyJet tuples to compute some extra variables. Potential use cases include computing the sum of the event weights from the CutBookKeeper histograms, retrieving cross-sections with the PMG tool (#114 (closed)), and computing extra input variables for MVA studies.
Proposed implementation: The implementation should be flexible enough to accommodate any type of extra variable. A modular approach, based on configurable tools that can be added to a main steering algorithm, is probably reasonable. The main steering algorithm could then have:
- initialize method: access the TTree in the input file, load the branches for the input variables, initialise all the tools, and initialise a new TTree to store the extra variables plus a selection of the input variables to be copied
- execute method: loop over the events to compute all of the extra variables using the tools, and apply any tighter selections
- finalize method: write the new TTree and close everything cleanly
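To make the three steps above concrete, here is a minimal sketch of how the steering and the tools could fit together. It is only an illustration under stated assumptions: it deliberately avoids ROOT so that it is self-contained, with a `std::vector` of name-to-value maps standing in for the input and output TTrees, and every name in it (`IPostProcessTool`, `SumPtTool`, `PostProcessor`, `runExample`, the branch names) is hypothetical rather than existing EasyJet code.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// One event: branch name -> value. Stand-in for one entry of a TTree.
using Event = std::map<std::string, float>;

// Hypothetical tool interface: a tool adds extra variables to the output event.
class IPostProcessTool {
public:
  virtual ~IPostProcessTool() = default;
  virtual void initialize() {}
  virtual void compute(const Event& in, Event& out) const = 0;
};

// Illustrative tool: sum of two jet pT branches (assumed branch names).
class SumPtTool : public IPostProcessTool {
public:
  void compute(const Event& in, Event& out) const override {
    out["sum_pt"] = in.at("jet1_pt") + in.at("jet2_pt");
  }
};

class PostProcessor {
public:
  void addTool(std::unique_ptr<IPostProcessTool> tool) {
    m_tools.push_back(std::move(tool));
  }
  void copyBranch(const std::string& name) { m_copied.push_back(name); }
  void setSelection(std::function<bool(const Event&)> sel) {
    m_selection = std::move(sel);
  }

  // Take the input events, initialise the tools, reset the output.
  void initialize(std::vector<Event> input) {
    m_input = std::move(input);
    m_output.clear();
    for (auto& tool : m_tools) tool->initialize();
    m_initialized = true;
  }

  // Event loop: copy selected inputs, run the tools, apply the selection.
  void execute() {
    assert(m_initialized && "initialize() must be called before execute()");
    for (const Event& in : m_input) {
      Event out;
      for (const auto& name : m_copied) out[name] = in.at(name);
      for (const auto& tool : m_tools) tool->compute(in, out);
      if (m_selection && !m_selection(out)) continue;  // tighter selection
      m_output.push_back(std::move(out));
    }
  }

  // A real version would write the new TTree and close the files here.
  const std::vector<Event>& finalize() {
    m_initialized = false;
    return m_output;
  }

private:
  std::vector<std::unique_ptr<IPostProcessTool>> m_tools;
  std::vector<std::string> m_copied;
  std::function<bool(const Event&)> m_selection;
  std::vector<Event> m_input;
  std::vector<Event> m_output;
  bool m_initialized = false;
};

// Usage: two toy events, one tool, one copied branch, one selection.
std::vector<Event> runExample() {
  PostProcessor proc;
  proc.addTool(std::make_unique<SumPtTool>());
  proc.copyBranch("jet1_pt");
  proc.setSelection([](const Event& e) { return e.at("sum_pt") > 20.f; });
  proc.initialize({{{"jet1_pt", 30.f}, {"jet2_pt", 20.f}},
                   {{"jet1_pt", 10.f}, {"jet2_pt", 5.f}}});
  proc.execute();
  return proc.finalize();
}
```

In this sketch the selection is applied to the output event, so it can cut on variables that the tools have just computed. Whether tools should also be able to veto events themselves is a design choice left open here.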
To access the input and output variables, we can consider using a map<AnalysisEnum, map<VariablesEnum, float>> structure for m_inputVariables and m_outputVariables. A base class for the steering could be developed for the variables needed in all analysis streams (sum of event weights, cross-sections...), while derived classes could specialise it for the variables specific to each analysis stream.
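As a rough illustration of the nested map and of the base/derived split, the sketch below uses placeholder enum values and a made-up analysis stream. `StreamA`, `LeadingJetPt`, `SubleadingJetPt`, `JetPtSum`, `BaseSteering` and `StreamASteering` are all assumptions introduced for the example; only the `map<AnalysisEnum, map<VariablesEnum, float>>` layout comes from the proposal itself.

```cpp
#include <cassert>
#include <cmath>
#include <map>

// Placeholder enums: the real lists of streams and variables are to be defined.
enum class AnalysisEnum { Common, StreamA };
enum class VariablesEnum { SumOfWeights, LeadingJetPt, SubleadingJetPt, JetPtSum };

using VariableMap = std::map<AnalysisEnum, std::map<VariablesEnum, float>>;

// Base class: handles the variables shared by all analysis streams.
class BaseSteering {
public:
  virtual ~BaseSteering() = default;

  void setInput(AnalysisEnum a, VariablesEnum v, float value) {
    m_inputVariables[a][v] = value;
  }

  // Process one event: common variables first, then stream-specific ones.
  void process(float eventWeight) {
    assert(std::isfinite(eventWeight));
    // operator[] value-initialises a missing entry to 0 before accumulating.
    m_outputVariables[AnalysisEnum::Common][VariablesEnum::SumOfWeights] += eventWeight;
    computeSpecific();
  }

  const VariableMap& outputVariables() const { return m_outputVariables; }

protected:
  // Hook for derived classes; by default no stream-specific variables.
  virtual void computeSpecific() {}

  VariableMap m_inputVariables;
  VariableMap m_outputVariables;
};

// Derived class: adds one variable specific to a hypothetical stream.
class StreamASteering : public BaseSteering {
protected:
  void computeSpecific() override {
    // at() throws std::out_of_range if a required input was never set.
    const auto& in = m_inputVariables.at(AnalysisEnum::StreamA);
    m_outputVariables[AnalysisEnum::StreamA][VariablesEnum::JetPtSum] =
        in.at(VariablesEnum::LeadingJetPt) + in.at(VariablesEnum::SubleadingJetPt);
  }
};
```

One trade-off worth discussing: enum keys catch typos at compile time, but every new variable then requires touching the enum, whereas string keys would be more flexible and less safe. Storing only `float` also means integer or vector branches would need a separate treatment.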
Feedback is very welcome on this.
We can start some preliminary investigations with @mfujimot in the coming days and after the Christmas break.