Allow multi ntuple definitions per output file

Benjamin Rottler requested to merge allow-multi-ntuple-defs-per-file into master

Release notes

Add support for using multiple ntuple declarations for one output file

Details

This MR enables us to define ntuples in a more modular way. See the following example ntuple definition file:

common:
    float varA << $(ObservableA),
    float varB << $(ObservableB);

region1:
    float varC << $(ObservableC)

region2:
    float varD << $(ObservableD)

@CutRegion1: common >> ntuples/$(jobID)_region1.root:ntuples
@CutRegion1: region1 >> ntuples/$(jobID)_region1.root:ntuples
@CutRegion2: common >> ntuples/$(jobID)_region2.root:ntuples
@CutRegion2: region2 >> ntuples/$(jobID)_region2.root:ntuples

Previously, this produced ntuples like the following (example for the region1 file, with dummy values):

************************************************************
*    Row   *      varA *      varB *    weight *      varC *
************************************************************
*        0 *        10 *        20 *         1 *         0 *
*        1 *         0 *         0 *         0 *       100 *
*        2 *        50 *        60 *         1 *         0 *
*        3 *         0 *         0 *         0 *       200 *

That is, the values from the two ntuple definitions ended up in different rows of the ntuple, and subsequent processing steps treat each row as a separate event.

This MR fixes this behavior, resulting in the following ntuple structure:

************************************************************
*    Row   *      varA *      varB *      varC *    weight *
************************************************************
*        0 *        10 *        20 *       100 *         1 *
*        1 *        50 *        60 *       200 *         1 *

This is done by caching all created TQNTupleDumperAnalysisJobs. Before creating a new job, we now check whether a job with the same filename, treename, and cut definition already exists. If so, the additional branches are added to the existing job and no new job is created.
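
The lookup described above can be illustrated with a minimal Python sketch. It is not the actual implementation: `NtupleJob` and `JobCache` are hypothetical stand-ins for TQNTupleDumperAnalysisJob and its cache, and the definitions list simply mirrors the example file above.

```python
from dataclasses import dataclass, field


@dataclass
class NtupleJob:
    """Hypothetical stand-in for a dumper job; not the real class."""
    filename: str
    treename: str
    cut: str
    branches: list = field(default_factory=list)


class JobCache:
    """Caches jobs keyed by (filename, treename, cut)."""

    def __init__(self):
        self._jobs = {}

    def get_or_create(self, filename, treename, cut):
        key = (filename, treename, cut)
        job = self._jobs.get(key)
        if job is None:
            # No matching job yet: create one and remember it.
            job = NtupleJob(filename, treename, cut)
            self._jobs[key] = job
        return job

    def __len__(self):
        return len(self._jobs)


# (cut, branches, filename, treename), mirroring the example definition file.
definitions = [
    ("CutRegion1", ["varA", "varB"], "ntuples/$(jobID)_region1.root", "ntuples"),
    ("CutRegion1", ["varC"], "ntuples/$(jobID)_region1.root", "ntuples"),
    ("CutRegion2", ["varA", "varB"], "ntuples/$(jobID)_region2.root", "ntuples"),
    ("CutRegion2", ["varD"], "ntuples/$(jobID)_region2.root", "ntuples"),
]

cache = JobCache()
for cut, branches, filename, treename in definitions:
    # Reuse the existing job if the key matches; otherwise create a new one.
    cache.get_or_create(filename, treename, cut).branches.extend(branches)

region1 = cache.get_or_create("ntuples/$(jobID)_region1.root", "ntuples", "CutRegion1")
print(len(cache), region1.branches)
```

In this sketch the four definition lines collapse into two jobs, one per output file, and each job carries the branches of both the common and the region-specific definition, so every event fills a single row.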
