Histogram-specific plotting at visualize.py
I wanted to address a personal annoyance (almost certainly shared by others) that comes up when making many plots at the visualize.py
step. I think the fix ultimately improves the out-of-the-box plotting functionality.
The issue I faced was that I often wanted to plot histograms in very different styles. Fortunately, the tags supported by the TQDefaultPlotter class can be fine-tuned through plotter.<key>: <val> lines in the master visualize config. But even in the simplest plot-tweaking scenario, e.g. the pT spectrum in log-scale and eta in lin-scale, the config ends up looking something like this:
# part of visualize.py master config
makePlots: */leadLep* # producing leadLepPt & leadLepEta plots
makeLinPlots: true # default, produces hist-lin.pdf
makeLogPlots: true # produces hist-log.pdf
# pT-specific settings
plotter.style.logScaleX: true
plotter.style.logMin: 1.0 # log plots can't go down to 0.0
plotter.style.max.scale: 100.0 # push histogram yield down in log-scale.
# eta-specific settings
# commented out when producing pT plots and vice versa.
#plotter.style.logScaleX: false
#plotter.style.linMin: 0.0 # lin plots can go down to 0.0
#plotter.style.max.scale: 1.5 # push histogram yield down in lin-scale.
Notice the config "blocks" that are each meant for plotting a specific observable. The issues that come up with this workflow are:
- Each block is fine-tuned for a specific observable, but one typically produces many other plots all at once during visualize.py, so any block of plotter.<key>: <val> configs enabled in a given run is not suitable for the others.
- As a corollary, the user must run visualize.py N times to get N plots in their respectively desired styles, all the while commenting and un-commenting from one config block to the next.
- Once run, the previously styled plots are "lost" (the files are overwritten with the wrong settings by the current run).
- The master config becomes very crowded when one plotter.<key>: <val> line can only change one tag at a time.
This really hinders the otherwise wonderful "all the plots you ever want!" machinery that CAF provides.
The solution I want to propose in this MR is to apply these tags on a per-histogram basis, specifically for each CutX/HistY listed by makePlots. A working example in this branch uses a separate config file brought in via histogramPlotFiles in the master config:
# example of proposed 'plot-histograms.cfg' file to specify plotter options on a per-cut/hist basis
# pT spectrum
.name="*/leadLepPt", style.logScale=true, style.logScaleX=true, style.logMin=1., style.max.scale=100.
# eta spectrum
.name="*/leadLepEta", style.logScale=false, style.logScaleX=false, style.linMin=0., style.max.scale=1.5
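To make the matching idea concrete, here is a minimal Python sketch of how such a file could be read and applied. It is not the code in this branch: the function names (parse_value, parse_line, load_rules, tags_for) are hypothetical, the wildcard matching is assumed to behave like shell-style globbing via fnmatch, and the "later lines override earlier ones" merge order is likewise an assumption. The naive comma split also assumes tag values never contain commas.

```python
from fnmatch import fnmatchcase


def parse_value(text):
    """Convert a raw tag value to str, bool, or float (illustrative only)."""
    text = text.strip()
    if len(text) >= 2 and text[0] == text[-1] == '"':
        return text[1:-1]  # quoted string, e.g. "*/leadLepPt"
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    try:
        return float(text)  # handles "1.", "100.", "1.5"
    except ValueError:
        return text


def parse_line(line):
    """Split one '.name="...", key=val, ...' line into (pattern, tags)."""
    tags = {}
    for item in line.split(","):  # assumption: no commas inside values
        key, _, value = item.partition("=")
        tags[key.strip()] = parse_value(value)
    pattern = tags.pop(".name")
    return pattern, tags


def load_rules(text):
    """Parse the whole file content, skipping blank lines and # comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            rules.append(parse_line(line))
    return rules


def tags_for(histogram, rules):
    """Collect the tags of every rule whose pattern matches 'Cut/Hist'."""
    merged = {}
    for pattern, tags in rules:
        if fnmatchcase(histogram, pattern):
            merged.update(tags)  # assumption: later lines win on conflicts
    return merged


CONFIG = '''
# pT spectrum
.name="*/leadLepPt", style.logScale=true, style.logScaleX=true, style.logMin=1., style.max.scale=100.
# eta spectrum
.name="*/leadLepEta", style.logScale=false, style.logScaleX=false, style.linMin=0., style.max.scale=1.5
'''

rules = load_rules(CONFIG)
for hist in ("CutA/leadLepPt", "CutA/leadLepEta"):  # hypothetical cut name
    print(hist, tags_for(hist, rules))
```

Each histogram only picks up the tags from the lines whose .name pattern matches it, which is what allows a single run to style every plot differently.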
The main benefits of this approach are:
- The syntax structure is much more concise than having blocks of plotter.<key>: <val> lines, as multiple tags can be applied at once to a plot in a single line.
- Only the tags matching the Cut/Hist being processed are applied to the plot, i.e. run visualize.py once, and fine-tune each histogram plot separately.
Here is a README.md that I wrote to explain in more detail how this config file works; it should probably be added to CAFExample
if this MR is merged. I would love to receive feedback on this, as I think this is a promising way to improve plotting in CAF!