Make bin dropping great again

Executive summary of changes:

  • Introduced a new histogram to be used for blinded bins (suffixed with _dropBin). Blinded bins will include dropped bins. This now disentangles regular binning histograms (_regBin) from histograms with blinding applied (_dropBin) (feature)
  • Added AutomaticDropBins to the Region::HasDropBinRegions function (bug fix)
  • Updated info/debug/warning messages to give users clearer information about what is happening
  • Added a BlindBins option for visually blinding bins manually
  • The DropBins option no longer blinds a bin visually (it only drops the bin from the fit). BlindBins is needed to visually blind the corresponding bin
  • Added the BlindBins option to the documentation
  • Implemented two Common:: methods that should gradually be propagated to the main code. CombineVectors concatenates two vectors, with an option to treat the output as a set of unique elements. CombineHistosFromHistosVec takes a vector of histograms and combines them into one; it is meant to replace the many instances in the code where a total signal/background histogram is created. One such replacement, relevant to this MR, is included.
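As a rough illustration of what the two helpers are described as doing, here is a minimal sketch. The signatures, the `asSet` flag name, and the function name `CombineHistos` are assumptions made for this example, and a histogram is modelled as a plain vector of bin contents rather than a ROOT histogram, so this is not the code added in the MR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Common::CombineVectors: concatenate two vectors and,
// if asSet is true, keep only the first occurrence of each element.
template <typename T>
std::vector<T> CombineVectors(const std::vector<T>& a,
                              const std::vector<T>& b,
                              bool asSet = false) {
    std::vector<T> out;
    out.reserve(a.size() + b.size());
    out.insert(out.end(), a.begin(), a.end());
    out.insert(out.end(), b.begin(), b.end());
    if (!asSet) return out;

    std::vector<T> unique;
    for (const T& x : out) {
        if (std::find(unique.begin(), unique.end(), x) == unique.end()) {
            unique.push_back(x);
        }
    }
    return unique;
}

// Hypothetical stand-in for Common::CombineHistosFromHistosVec. Each histogram
// is a vector of bin contents; the result is the bin-by-bin sum.
std::vector<double> CombineHistos(const std::vector<std::vector<double>>& histos) {
    if (histos.empty()) return {};
    std::vector<double> total(histos.front().size(), 0.0);
    for (const auto& h : histos) {
        assert(h.size() == total.size());  // all inputs must share the binning
        for (std::size_t i = 0; i < h.size(); ++i) total[i] += h[i];
    }
    return total;
}
```

With these definitions, combining `{1, 2, 3}` and `{3, 4, 1}` as a set keeps the order of first appearance, and summing a list of background histograms gives the total background in one call.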

Associated issue

#91 (closed) #84 (closed) #92 (closed) #99 (closed)

More details:

_dropBin Histograms

Adding a new type of histogram, suffixed with _dropBin, to handle bin dropping removes a lot of ambiguity from the code, simplifies the logic, and makes the user experience smoother when bin dropping is used.

Histograms

  • Users may set DropBin options in their config and run the n/h/b steps to apply them. This creates a _dropBin histogram for every sample/systematic histogram that excludes the bins being dropped. The histograms available to users are summarized below:
| Suffix | Meaning | Default when? |
|---|---|---|
| no suffix | Smoothed/rebinned | Plotting |
| _regBin | RegularBinning / same as no suffix | Fitting without dropped bins |
| _dropBin | Dropped bins (blinded) | Fitting with dropped bins |
| _orig | Unsmoothed/original bins | Never |

With the above logic, once users have dropped bins by running b/h/n (i.e. created the _dropBin histograms), they can turn bin dropping on and off at their convenience without re-running those steps.
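The suffix summary above amounts to a small lookup. The following is only a sketch of that mapping: the `HistoUse` enum and the `HistoSuffix` function are hypothetical names introduced for illustration and are not part of TRExFitter.

```cpp
#include <string>

// Hypothetical: what the histogram is about to be used for.
enum class HistoUse { Plotting, Fitting };

// Returns the histogram suffix that is the default for the given use,
// following the suffix summary in this MR description.
std::string HistoSuffix(HistoUse use, bool dropBinsEnabled) {
    if (use == HistoUse::Plotting) return "";         // smoothed/rebinned histogram
    return dropBinsEnabled ? "_dropBin" : "_regBin";  // histogram fed to the fit
}
```

Because both _regBin and _dropBin histograms exist once the n/h/b steps have run, toggling bin dropping only changes which suffix is looked up.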

Workspace

  • Running the w-step will produce either 1 workspace or 2 workspaces, never more, depending on the setup:
| Workspace Type | VRs Exist? | Drop Bins Exist? | Suffix of Workspaces |
|---|---|---|---|
| FitOnly | - | No | _combined |
| FitOnly | - | Yes | _combined & _allBinsFitRegions_combined |
| FitPlusPlots | No | Yes | _combined & _allBinsFitRegions_combined |
| FitPlusPlots | No | No | _combined |
| FitPlusPlots | Yes | - | _combined & _allRegions_combined |
| PlotsOnly | Yes | - | _allRegions_combined |
| PlotsOnly | No | - | ERROR |
  • The histograms used in the above workspaces are as follows
| Workspace | Drop Bins? | Histogram Used |
|---|---|---|
| _combined | No | _regBin |
| _combined | Yes | _dropBin |
| _allBinsFitRegions_combined | - | no suffix |
| _allRegions_combined | - | no suffix |
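The two workspace summaries above can be read as a decision function. The sketch below encodes them under stated assumptions: the enum, both function names, and the choice to throw an exception for the ERROR row are hypothetical, and the `_allBinsFitRegions` spelling follows the string used in the executable. It does not reproduce how the w-step is actually implemented.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Workspace types as named in the summary above.
enum class WorkspaceType { FitOnly, FitPlusPlots, PlotsOnly };

// Hypothetical helper: suffixes of the workspaces produced by the w-step.
std::vector<std::string> WorkspaceSuffixes(WorkspaceType type, bool hasVRs, bool hasDropBins) {
    switch (type) {
        case WorkspaceType::FitOnly:  // validation regions are irrelevant here
            if (hasDropBins) return {"_combined", "_allBinsFitRegions_combined"};
            return {"_combined"};
        case WorkspaceType::FitPlusPlots:
            if (hasVRs) return {"_combined", "_allRegions_combined"};
            if (hasDropBins) return {"_combined", "_allBinsFitRegions_combined"};
            return {"_combined"};
        case WorkspaceType::PlotsOnly:
            if (hasVRs) return {"_allRegions_combined"};
            throw std::invalid_argument("PlotsOnly requires validation regions");
    }
    throw std::logic_error("unknown workspace type");
}

// Hypothetical helper: histogram suffix used to build a given workspace.
std::string HistoSuffixForWorkspace(const std::string& workspaceSuffix, bool hasDropBins) {
    if (workspaceSuffix == "_combined") return hasDropBins ? "_dropBin" : "_regBin";
    return "";  // plotting workspaces use the histograms without suffix
}
```

Every branch returns at most two suffixes, which matches the statement that the w-step never produces more than two workspaces.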

Fitting

  • Fits always use the _combined workspace (see TRExFit::PerformWorkspaceCombination -> TRExFit::PrepareMixedDataset -> TRExFit::Fit)

Plotting

  • Plotting is done in a few functions

    • TRExFit::DrawAndSaveAll - WS dependence - will handle pre- and post-fit plots
    • TRExFit::DrawSummary - WS dependence - will handle the summary plots ...
    • TRExFit::DrawMergedPlot - no WS dependence - will handle ...
    • TRExFit::BuildYieldTable - WS dependence - will handle the yield tables
    • TRExFit::PrintSystTables - WS dependence - will handle the syst tables
    • TRExFit::DrawSignalRegionsPlot - no WS dependence - will handle the signal region purity plots (pre-fit only)
    • TRExFit::DrawPieChartPlot - no WS dependence - will handle the region composition pie charts
  • All plotting functions (pre- and post-fit) that depend on the workspace (WS) use the following logic to determine which workspace to read (implemented in the executable):

        // Pick the workspace that covers everything the plots need
        std::string resultFileAddition("");
        if (myFit->fHasValidationRegions) {
            resultFileAddition = "_allRegions";         // validation regions present
        }
        else if (myFit->fHasDropBinRegions) {
            resultFileAddition = "_allBinsFitRegions";  // no VRs, but some bins are dropped
        }

        const std::string wsPath = myFit->fName + "/RooStats/" + myFit->fInputName + resultFileAddition
                                 + "_combined_" + myFit->fInputName + myFit->fSuffix + "_model.root";

Blinding

The default blinding behaviour when bins are dropped has changed.

| Option | Bin in Fit? | Bin visually blinded? |
|---|---|---|
| BlindingThreshold | Yes | Yes |
| AutomaticDropBin | No (where BlindingThreshold exceeded) | Yes |
| DropBin | No | No |
| BlindBin | Yes | Yes |
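To make the option summary above concrete, the sketch below resolves, for a single region, which bins are dropped from the fit and which are visually blinded. Everything here is illustrative: the struct and function names are hypothetical, bins are numbered from 1, and the threshold is assumed to be compared against a per-bin S/B value, which this description does not specify.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical container for the per-region options discussed above.
struct BinOptions {
    double blindingThreshold = -1.0;  // negative means "not set" (assumption)
    bool automaticDropBins = false;   // drop bins that exceed the threshold
    std::set<int> dropBins;           // manually dropped bins
    std::set<int> blindBins;          // manually blinded bins
};

struct BinSets {
    std::set<int> droppedFromFit;
    std::set<int> visuallyBlinded;
};

// sOverB holds one value per bin; bin numbers start at 1.
BinSets ResolveBins(const std::vector<double>& sOverB, const BinOptions& opt) {
    BinSets out;
    for (std::size_t i = 0; i < sOverB.size(); ++i) {
        const int bin = static_cast<int>(i) + 1;
        const bool overThreshold =
            opt.blindingThreshold >= 0.0 && sOverB[i] > opt.blindingThreshold;
        if (!overThreshold) continue;
        out.visuallyBlinded.insert(bin);                             // BlindingThreshold row
        if (opt.automaticDropBins) out.droppedFromFit.insert(bin);   // AutomaticDropBin row
    }
    // DropBin row: removed from the fit, not shaded.
    out.droppedFromFit.insert(opt.dropBins.begin(), opt.dropBins.end());
    // BlindBin row: shaded, still in the fit.
    out.visuallyBlinded.insert(opt.blindBins.begin(), opt.blindBins.end());
    return out;
}
```

Keeping the two sets separate mirrors the change in this MR: a bin can be dropped without being shaded, and shaded without being dropped.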

The visual blinding is propagated to the following TRExFitter outputs (where data enters):

  • Pre/Post fit plots: Blinded bins will be shaded
  • Summary plots: If a region has a bin that is blinded or dropped, the whole region will be shaded
  • Yield Tables: If a region has a bin that is blinded or dropped, the Data sample row will show -1
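The region-level rule used for the summary plots and yield tables can be sketched as follows. The function names are hypothetical, and the bin sets are assumed to have been computed beforehand.

```cpp
#include <set>

// A region counts as blinded as soon as any of its bins is blinded or dropped.
bool RegionIsBlinded(const std::set<int>& blindedBins, const std::set<int>& droppedBins) {
    return !blindedBins.empty() || !droppedBins.empty();
}

// Value shown in the Data row of the yield table: -1 for a blinded region.
double DataYieldForTable(double dataYield,
                         const std::set<int>& blindedBins,
                         const std::set<int>& droppedBins) {
    return RegionIsBlinded(blindedBins, droppedBins) ? -1.0 : dataYield;
}
```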

The set of blinded bins for a region (both post-fit and pre-fit) is set before any plotting/tabulating takes place, in the trex-fitter.cc executable. This replaces the repeated computation of blinded bins inside the plotting functions.

To-do list before merging:

Edited by Mohamed Aly
