Make bin dropping great again
Executive summary of changes:
- Introduced a new histogram to be used for blinded bins (suffixed with `_dropBin`). Blinded bins will include dropped bins. This now disentangles regular binning histograms (`_regBin`) from histograms with blinding applied (`_dropBin`) (feature)
- Added `AutomaticDropBins` to the `Region::HasDropBinRegions` function (bug fix)
- Updated info/debug/warning messages to give the user better information on what is happening
- Added a `BlindBins` option, for visually blinding bins manually
- `DropBins` option no longer blinds a bin visually (only drops it from the fit); `BlindBins` is needed to visually blind the corresponding bin
- Added option `BlindBins` to documentation
- 2 `Common::` methods are implemented and should slowly be propagated to the main code: `CombineVectors`, which concatenates 2 vectors, with an option to treat the output as a `set` with unique elements, and `CombineHistosFromHistosVec`, which takes a vector of histograms and combines them into one. The latter is meant to replace the many instances in the code where we create a total signal/backgrounds histogram. I did one replacement that is relevant to this MR.
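
As a rough illustration of the `CombineVectors` behaviour described above, here is a minimal sketch. The signature and the `asSet` parameter name are assumptions made for this example; the actual `Common::CombineVectors` in the MR may differ.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sketch of a CombineVectors-style helper.
// The real Common::CombineVectors signature may differ.
template <typename T>
std::vector<T> CombineVectors(const std::vector<T>& a,
                              const std::vector<T>& b,
                              bool asSet = false) {
    // Plain concatenation: all of a, followed by all of b.
    std::vector<T> out;
    out.reserve(a.size() + b.size());
    out.insert(out.end(), a.begin(), a.end());
    out.insert(out.end(), b.begin(), b.end());
    if (asSet) {
        // Sort, then erase adjacent duplicates so each element appears once.
        std::sort(out.begin(), out.end());
        out.erase(std::unique(out.begin(), out.end()), out.end());
    }
    return out;
}
```

Note that in this sketch the set-like mode also sorts the output; an implementation that needs to preserve insertion order would have to deduplicate differently.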
Associated issue
#91 (closed) #84 (closed) #92 (closed) #99 (closed)
More details:
_dropBin Histograms
Adding a new type of histogram, suffixed with `_dropBin`, to handle bin dropping removes a lot of ambiguity from the code, simplifies the logic, and makes the user experience smoother when bin dropping is used.
Histograms
- Users may set `DropBin` options in their config and run the `n`/`h`/`b` steps to enforce them. This will create a `_dropBin` histogram for every sample/systematic histogram, without the dropped bins. The following is a summary of the different histograms users get:
| Suffix | Meaning | Default when? |
|---|---|---|
| no suffix | Smoothed/rebinned | Plotting |
| `_regBin` | RegularBinning / same as no suffix | Fitting without Dropped Bins |
| `_dropBin` | Dropped Bins (blinded) | Fitting with Dropped Bins |
| `_orig` | Unsmoothed/original bins | Never |
With the above logic, once users have created the `_dropBin` histograms by running `b`/`h`/`n`, they can turn bin dropping on and off at their convenience without re-running those steps.
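
The suffix table can be read as a small lookup. The sketch below only illustrates that reading; `Purpose` and `DefaultHistoSuffix` are hypothetical names for this example and are not part of TRExFitter.

```cpp
#include <string>

// Hypothetical names, used only to illustrate the suffix table.
enum class Purpose { Plotting, Fitting };

// Returns the suffix of the histogram that is used by default.
std::string DefaultHistoSuffix(Purpose purpose, bool dropBinsEnabled) {
    if (purpose == Purpose::Plotting) {
        return "";  // smoothed/rebinned histogram, no suffix
    }
    // Fitting: the drop-bin switch alone decides which histogram is read,
    // so toggling it does not require rebuilding the histograms.
    return dropBinsEnabled ? "_dropBin" : "_regBin";
}
```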
Workspace
- Running the `w` step will produce either 1 or 2 workspaces, never more, depending on the setup:
| Workspace Type | VRs Exist? | Drop Bins Exist? | Suffix of Workspaces |
|---|---|---|---|
| FitOnly | - | No | `_combined` |
| FitOnly | - | Yes | `_combined` & `_allBinsFitRegions_combined` |
| FitPlusPlots | No | Yes | `_combined` & `_allBinsFitRegions_combined` |
| FitPlusPlots | No | No | `_combined` |
| FitPlusPlots | Yes | - | `_combined` & `_allRegions_combined` |
| PlotsOnly | Yes | - | `_allRegions_combined` |
| PlotsOnly | No | - | ERROR |
- The histograms used in the above workspaces are as follows:

| Workspace | Drop Bins? | Histogram Used |
|---|---|---|
| `_combined` | No | `_regBin` |
| `_combined` | Yes | `_dropBin` |
| `_allBinsFitRegions_combined` | - | no suffix |
| `_allRegions_combined` | - | no suffix |
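
The first workspace table can be expressed as a single decision function. This is a sketch under assumptions: `WorkspaceType` and `WorkspaceSuffixes` are hypothetical names, the error case is modelled as an exception, and the `_allBinsFitRegions_combined` spelling (with leading underscore) follows the string built in the executable.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical enum mirroring the three workspace types in the table.
enum class WorkspaceType { FitOnly, FitPlusPlots, PlotsOnly };

// Returns the suffixes of the workspaces the w step would produce.
std::vector<std::string> WorkspaceSuffixes(WorkspaceType type,
                                           bool hasVRs,
                                           bool hasDropBins) {
    switch (type) {
    case WorkspaceType::FitOnly:
        // Validation regions are irrelevant when only fitting.
        if (hasDropBins) return {"_combined", "_allBinsFitRegions_combined"};
        return {"_combined"};
    case WorkspaceType::FitPlusPlots:
        // Validation regions take precedence over drop bins.
        if (hasVRs) return {"_combined", "_allRegions_combined"};
        if (hasDropBins) return {"_combined", "_allBinsFitRegions_combined"};
        return {"_combined"};
    case WorkspaceType::PlotsOnly:
        if (hasVRs) return {"_allRegions_combined"};
        throw std::runtime_error("PlotsOnly requires validation regions");
    }
    throw std::logic_error("unknown workspace type");
}
```

In every branch the result has at most two entries, matching the "never more" statement above.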
Fitting
- Fits will always use the `_combined` workspace (see `TRExFit::PerformWorkspaceCombination` -> `TRExFit::PrepareMixedDataset` -> `TRExFit::Fit`)
Plotting
- Plotting is done in a few functions
  - `TRExFit::DrawAndSaveAll` - WS dependence - will handle pre- and post-fit plots
  - `TRExFit::DrawSummary` - WS dependence - will handle the summary plots ...
  - `TRExFit::DrawMergedPlot` - no WS dependence - will handle ...
  - `TRExFit::BuildYieldTable` - WS dependence - will handle the yield tables
  - `TRExFit::PrintSystTables` - WS dependence - will handle the syst tables
  - `TRExFit::DrawSignalRegionsPlot` - no WS dependence - will handle the signal region purity plots (pre-fit only)
  - `TRExFit::DrawPieChartPlot` - no WS dependence - will handle the region composition pie charts
- All plotting functions (pre- and post-fit) that depend on the WS use the following logic to determine which workspace to use (implemented in the executable):
```cpp
std::string resultFileAddition("");
if (myFit->fHasValidationRegions) {
    resultFileAddition = "_allRegions";
}
else if (myFit->fHasDropBinRegions) {
    resultFileAddition = "_allBinsFitRegions";
}
const std::string wsPath = myFit->fName + "/RooStats/" + myFit->fInputName + resultFileAddition
                         + "_combined_" + myFit->fInputName + myFit->fSuffix + "_model.root";
```
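
To make the precedence explicit (validation regions win over drop bins), the same logic can be wrapped in a standalone function. `FitInfo` and `PlottingWorkspacePath` are hypothetical stand-ins for this sketch; only the member names are taken from the snippet above.

```cpp
#include <string>

// Hypothetical stand-in for the few TRExFit members the snippet reads.
struct FitInfo {
    std::string fName;
    std::string fInputName;
    std::string fSuffix;
    bool fHasValidationRegions;
    bool fHasDropBinRegions;
};

// Builds the path of the workspace used by WS-dependent plotting functions.
std::string PlottingWorkspacePath(const FitInfo& fit) {
    std::string resultFileAddition;
    if (fit.fHasValidationRegions) {
        resultFileAddition = "_allRegions";         // checked first
    } else if (fit.fHasDropBinRegions) {
        resultFileAddition = "_allBinsFitRegions";  // only without VRs
    }
    return fit.fName + "/RooStats/" + fit.fInputName + resultFileAddition
         + "_combined_" + fit.fInputName + fit.fSuffix + "_model.root";
}
```

For example, with the made-up values `fName = fInputName = "MyFit"`, an empty `fSuffix`, no validation regions and drop bins present, this yields `MyFit/RooStats/MyFit_allBinsFitRegions_combined_MyFit_model.root`.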
Blinding
The blinding default behaviour when bins are dropped is changed.
| Option | Bin in Fit? | Bin visually Blinded? |
|---|---|---|
| `BlindingThreshold` | Yes | Yes |
| `AutomaticDropBin` | No (where `BlindingThreshold` exceeded) | Yes |
| `DropBin` | No | No |
| `BlindBin` | Yes | Yes |
The visual blinding is propagated to the following TRExFitter outputs (where data enters):
- Pre/Post fit plots: Blinded bins will be shaded
- Summary plots: If a region has a bin that is blinded or dropped, the region will be blinded by a shading
- Yield Tables: If a region has a bin that is blinded or dropped, the `Data` sample row will show `-1`
The set of blinded bins for a region (both pre-fit and post-fit) is set before any plotting/tabulating takes place, in the `trex-fitter.cc` executable. This replaces the repeated computation of blinded bins inside the plotting functions.
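
The blinding table and the region-level rule for summary plots and yield tables can be sketched as a one-off computation per region. All type, member and function names below are hypothetical and chosen for this example only; the sketch assumes the bins exceeding `BlindingThreshold` have already been determined.

```cpp
#include <set>
#include <vector>

// Hypothetical per-region inputs; names are illustrative only.
struct RegionBlindingInputs {
    std::vector<int> thresholdBins;  // bins where BlindingThreshold is exceeded
    std::vector<int> blindBins;      // bins listed manually via BlindBin
    std::vector<int> dropBins;       // bins listed manually via DropBin
    bool automaticDropBins = false;  // AutomaticDropBin switched on
};

struct RegionBlinding {
    std::set<int> shaded;   // bins blinded visually
    std::set<int> dropped;  // bins removed from the fit
    // Summary plots and yield tables blind the whole region
    // if any bin is blinded or dropped.
    bool blindInSummary() const { return !shaded.empty() || !dropped.empty(); }
};

// Computed once per region, then reused by every plotting/tabulating step.
RegionBlinding ComputeBlinding(const RegionBlindingInputs& in) {
    RegionBlinding out;
    out.shaded.insert(in.thresholdBins.begin(), in.thresholdBins.end());
    out.shaded.insert(in.blindBins.begin(), in.blindBins.end());
    // DropBin removes bins from the fit without shading them.
    out.dropped.insert(in.dropBins.begin(), in.dropBins.end());
    if (in.automaticDropBins) {
        // Threshold bins are shaded and additionally dropped from the fit.
        out.dropped.insert(in.thresholdBins.begin(), in.thresholdBins.end());
    }
    return out;
}
```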
To-do list before merging:
- Changelog has been updated