Rewrite DQM histogram merging

Walter Lampl requested to merge wlampl/athena:histMergeRewrite into 24.0

Rewrite the code for Histogram merging (DataQualityUtils/MonitoringFile.cxx/.h)

Advantages:

  • Reduced memory consumption (~3 Gbytes instead of 6 for the final merging step).
  • Fewer lines of code
  • More modern C++ style: systematic use of std::unique_ptr, std::optional and move semantics for memory management
  • Unify output-level handling
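
To illustrate the ownership idioms named above, here is a minimal sketch. It is not taken from MonitoringFile.cxx: `Histo` and `HistoStore` are hypothetical stand-ins (a real implementation would work with ROOT histogram objects), and the sketch only shows how std::unique_ptr, std::move and std::optional can replace manual new/delete and sentinel return values.

```cpp
#include <cstddef>
#include <memory>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a histogram; NOT the ROOT TH1 interface.
struct Histo {
  std::string name;
  std::vector<double> bins;

  // Bin-wise addition, growing this histogram if the other one is longer.
  void add(const Histo& other) {
    if (bins.size() < other.bins.size()) bins.resize(other.bins.size(), 0.0);
    for (std::size_t i = 0; i < other.bins.size(); ++i) bins[i] += other.bins[i];
  }
};

// Hypothetical accumulator that owns every merged histogram.
class HistoStore {
 public:
  // Takes ownership of `h`. The first histogram with a given name is moved
  // into the store; later ones are added to it and freed automatically
  // when `h` goes out of scope.
  void merge(std::unique_ptr<Histo> h) {
    auto it = m_store.find(h->name);
    if (it == m_store.end()) {
      std::string key = h->name;  // copy the key before moving `h`
      m_store.emplace(std::move(key), std::move(h));
    } else {
      it->second->add(*h);
    }
  }

  // Returns the summed bin contents, or std::nullopt for an unknown name.
  std::optional<double> integral(const std::string& name) const {
    auto it = m_store.find(name);
    if (it == m_store.end()) return std::nullopt;
    double sum = 0.0;
    for (double b : it->second->bins) sum += b;
    return sum;
  }

 private:
  std::unordered_map<std::string, std::unique_ptr<Histo>> m_store;
};
```

Because each input histogram is released as soon as it has been added to the accumulated one, only one copy per histogram name stays in memory at a time.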

The execution time is roughly the same; it is dominated by the underlying ROOT code.

Dropped functionality:

  • No longer allow multiple runs per histogram file; exactly one run_NNNNNN directory is expected
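
The "exactly one run directory" rule could be checked along these lines. This is a sketch under assumptions, not the code in this MR: `findSingleRunDir` is a hypothetical helper, and it works on a plain list of top-level directory names rather than on a ROOT file.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the run directory name if exactly one
// top-level name matches run_<digits>; returns std::nullopt when there is
// no such directory or more than one.
std::optional<std::string> findSingleRunDir(
    const std::vector<std::string>& topLevelNames) {
  static const std::regex runPattern("run_[0-9]+");
  std::optional<std::string> found;
  for (const std::string& name : topLevelNames) {
    if (!std::regex_match(name, runPattern)) continue;
    if (found) return std::nullopt;  // second run directory: reject the file
    found = name;
  }
  return found;
}
```

A caller would treat an empty result as an error and refuse to merge that input file.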

There is more code in MonitoringFile.cxx that looks obsolete to me, but this is not cleaned up in this MR.

Checks done so far: I tried using the rootcomp.py utility to do bin-by-bin comparisons. Doing that on all >100k histograms turned out to be too slow (I killed the job after 48h), so I used it on a few (hopefully representative) subsets of the directories in the monitoring file. I also verified, using a purpose-written script, that the number of entries in every single histogram matches.
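
The entry-count check amounts to the comparison sketched below. This is not the script that was actually used: `EntryCounts` and `mismatchedHistograms` are hypothetical names, and the sketch assumes the per-histogram entry counts have already been extracted from the two files into maps keyed by the full histogram path.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical representation: full histogram path -> number of entries.
using EntryCounts = std::map<std::string, long long>;

// Returns the paths whose entry count differs between the two files,
// followed by the paths that exist only in `test`.
std::vector<std::string> mismatchedHistograms(const EntryCounts& ref,
                                              const EntryCounts& test) {
  std::vector<std::string> bad;
  for (const auto& kv : ref) {
    auto it = test.find(kv.first);
    // Missing in `test` or different count: report it.
    if (it == test.end() || it->second != kv.second) bad.push_back(kv.first);
  }
  for (const auto& kv : test) {
    if (ref.find(kv.first) == ref.end()) bad.push_back(kv.first);
  }
  return bad;
}
```

An empty result means both files contain the same set of histograms with identical entry counts. It says nothing about how the entries are distributed over bins, which is why the bin-by-bin comparison on subsets is still useful.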

Next steps: The way we merge the lb_nnnn into lowStat_LBnnnn-mmmm and finally into run_NNNNNN looks somewhat error-prone to me. I see the risk that we merge already-merged files and thus double-count entries. I suggest doing these steps only once, on the final, super-merged histogram file, together with the post-processing. To obtain histograms of partially processed runs, I suggest copying an intermediate file and running the lumiblock and lowStat merging plus post-processing on the copy.

cc @ponyisi @ebergeas @sawyer @nairz
