New phase-space region scribbler, flags for controlling underflow / overflow
This MR adds two new features:
- Flags to disable / enable underflow and overflow bins as requested on issue #17 (closed)
- A new phase-space region scribbler based largely on the existing cut-flow scribbler, fixing issue #8 (closed)
Underflow and overflow bins
By default both bins are enabled. To control them from the config, add disable_underflow and / or disable_overflow to the bins description in the binned dataframe config, with a boolean value that is true if you want to turn off the corresponding bin, e.g.:
DiMuonMass:
  dataset_col: true
  binning:
    - {in: DiMuon_Mass, out: dimu_mass, bins: {low: 60, high: 120, nbins: 60, disable_underflow: true}}
  weights: {weighted: EventWeight}
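To illustrate what the two flags mean for the binning, here is a minimal pure-Python sketch. It is not the code in this MR: the helper names make_edges and count_per_bin are hypothetical, and it assumes that the underflow and overflow bins are modelled as extra bins with infinite outer edges and that values falling outside every bin are simply dropped.

```python
import bisect
import math


def make_edges(low, high, nbins, disable_underflow=False, disable_overflow=False):
    """Return the edges of nbins equal-width bins, padded with infinite edges unless disabled."""
    width = (high - low) / nbins
    edges = [low + i * width for i in range(nbins + 1)]
    if not disable_underflow:
        edges.insert(0, -math.inf)  # catch-all bin below `low`
    if not disable_overflow:
        edges.append(math.inf)  # catch-all bin at or above `high`
    return edges


def count_per_bin(values, edges):
    """Count values in each [edges[i], edges[i+1]) bin; values outside all bins are dropped."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        i = bisect.bisect_right(edges, value) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts


# Same binning as the config above: 60 bins from 60 to 120, underflow disabled.
masses = [45.0, 75.0, 91.2, 130.0]
edges = make_edges(60, 120, 60, disable_underflow=True)
counts = count_per_bin(masses, edges)
print(len(counts), sum(counts))  # 61 bins (60 regular + overflow); the 45.0 entry is not counted
```

With both flags left at their defaults the same input would fill 62 bins and all four values would be counted.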
Phase-space region scribbler
This new scribbler is very similar to the CutFlow stage, which removes events from the processing chain. The config is almost exactly the same, and it also produces an output table showing how many events pass each cut. The difference is that, rather than removing events, it attaches a new variable to the tree which is true or false depending on whether or not the event passed the selection. This variable can then be binned on in a binned dataframe or used in a subsequent cut-flow / phase-space definition. The primary use case is to allow multiple parallel regions to be defined, for example using the following stages config:
stages:
- baseline-cuts: fast_carpenter.CutFlow
- signal_region: fast_carpenter.SelectPhaseSpace
- dimuon_control: fast_carpenter.SelectPhaseSpace
- single-mu_control: fast_carpenter.SelectPhaseSpace
This would remove events failing a baseline selection, then add three variables stating whether or not an event is in the signal region or in one of the two control regions. There is no built-in check that a region is orthogonal to another, but this can easily be achieved by checking the variable produced by a preceding region, e.g. for dimuon_control we would check that the events are not already in the signal_region.
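The logic of that chain can be sketched in plain Python as below. This is only an illustration of the behaviour described, not how the scribbler is implemented: the event variables (n_muons, dimu_mass), the cut values and the output variable names are all hypothetical.

```python
# Toy events; the variable names and values are made up for illustration.
events = [
    {"n_muons": 0, "dimu_mass": 0.0},
    {"n_muons": 2, "dimu_mass": 91.0},
    {"n_muons": 2, "dimu_mass": 70.0},
    {"n_muons": 1, "dimu_mass": 0.0},
]

# CutFlow-like step: events failing the baseline selection are removed.
selected = [dict(ev) for ev in events if ev["n_muons"] >= 1]

# SelectPhaseSpace-like steps: every event is kept and gains one boolean per region.
for ev in selected:
    ev["signal_region"] = ev["n_muons"] == 2 and 80 <= ev["dimu_mass"] <= 100
    # Orthogonality is not automatic: explicitly veto events already in the signal region.
    ev["dimuon_control"] = ev["n_muons"] == 2 and not ev["signal_region"]
    ev["single_mu_control"] = ev["n_muons"] == 1

# Number of events passing each region, analogous to the output table.
regions = ("signal_region", "dimuon_control", "single_mu_control")
region_counts = {name: sum(ev[name] for ev in selected) for name in regions}
print(len(selected), region_counts)
```

Because the flags are ordinary per-event variables, a later stage can bin on them or combine them in a further selection.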
The only other difference between the SelectPhaseSpace scribbler and the CutFlow stage is that the config for SelectPhaseSpace must also provide the name of the output variable.