
AthenaMonitoringKernel: Performance improvement for 1D histograms

Merged Frank Winklmeier requested to merge fwinkl/athena:mon_perf into master
All threads resolved!

A first set of fixes to improve the performance of the monitoring framework. There is currently significant overhead (a factor of 10 or more, depending on the use case). See ATR-21202 for an example reported by @rbielski.

This MR mainly reduces the overhead on 1D histograms for the case where no cut mask and/or weight is used. It uses lambdas instead of function pointers wherever possible to take advantage of compiler optimizations. However, much more fundamental changes will be required to reduce the number of memory allocations.
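The lambda-versus-function-pointer point can be illustrated with a minimal sketch. This is not the code from this MR: `Hist1D`, `fillViaPointers`, `fillViaCallables` and `fillUnweighted` are hypothetical names invented for illustration. The idea is that a callable passed as a template parameter has a type known at compile time, so the compiler can typically inline a trivial "no cut, unit weight" lambda, whereas a call through a function pointer is often left as an indirect call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 1D histogram; NOT the ROOT or Athena class.
struct Hist1D {
  double lo, hi;
  std::vector<double> bins;
  Hist1D(std::size_t n, double lo_, double hi_) : lo(lo_), hi(hi_), bins(n, 0.0) {
    assert(n > 0 && hi_ > lo_);
  }
  void fill(double x, double w = 1.0) {
    if (x < lo || x >= hi) return;  // this sketch ignores under/overflow
    auto i = static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
    if (i >= bins.size()) i = bins.size() - 1;  // guard against rounding at the edge
    bins[i] += w;
  }
};

using CutFn = bool (*)(std::size_t);       // per-entry cut mask
using WeightFn = double (*)(std::size_t);  // per-entry weight

// Variant 1: cut and weight arrive as function pointers (indirect calls).
void fillViaPointers(Hist1D& h, const std::vector<double>& v, CutFn cut, WeightFn weight) {
  for (std::size_t i = 0; i < v.size(); ++i)
    if (cut(i)) h.fill(v[i], weight(i));
}

// Variant 2: cut and weight are template parameters, so their bodies are
// visible to the compiler at the call site and can be inlined.
template <typename Cut, typename Weight>
void fillViaCallables(Hist1D& h, const std::vector<double>& v, Cut cut, Weight weight) {
  for (std::size_t i = 0; i < v.size(); ++i)
    if (cut(i)) h.fill(v[i], weight(i));
}

// Fast path for the common case: no cut mask and no weight.
void fillUnweighted(Hist1D& h, const std::vector<double>& v) {
  fillViaCallables(h, v,
                   [](std::size_t) { return true; },
                   [](std::size_t) { return 1.0; });
}
```

Both variants produce identical histogram contents; any speed difference depends on the compiler and optimization level and should be measured rather than assumed.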

In addition:

  • Add GenericMonPerf_test.exe test executable that can be used to compare basic filling operations with ROOT (see --help for options). It can also be used to profile the code with valgrind/callgrind. See the header of the source file for details.
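For readers without access to the test executable, the kind of comparison it performs can be approximated by a small timing harness. The sketch below is an assumption-laden illustration, not the source of `GenericMonPerf_test.exe`: `Hist1D`, `fillDirect`, `fillWithCopy` and `timeIt` are hypothetical, and the extra vector copy is only a stand-in for wrapper overhead in general.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical minimal histogram, used only to have something to fill.
struct Hist1D {
  double lo, hi;
  std::vector<double> bins;
  Hist1D(std::size_t n, double lo_, double hi_) : lo(lo_), hi(hi_), bins(n, 0.0) {
    assert(n > 0 && hi_ > lo_);
  }
  void fill(double x) {
    if (x < lo || x >= hi) return;  // this sketch ignores under/overflow
    auto i = static_cast<std::size_t>((x - lo) / (hi - lo) * bins.size());
    if (i >= bins.size()) i = bins.size() - 1;  // guard against rounding at the edge
    bins[i] += 1.0;
  }
};

// Baseline: fill directly from the input values.
void fillDirect(Hist1D& h, const std::vector<double>& v) {
  for (double x : v) h.fill(x);
}

// Stand-in for a wrapper layer: copies the values into a temporary vector
// first, which costs one extra allocation per call.
void fillWithCopy(Hist1D& h, const std::vector<double>& v) {
  std::vector<double> tmp(v.begin(), v.end());
  for (double x : tmp) h.fill(x);
}

// Run f() `reps` times and return the elapsed wall-clock time in seconds.
template <typename F>
double timeIt(F&& f, int reps) {
  assert(reps > 0);
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < reps; ++r) f();
  const std::chrono::duration<double> dt = std::chrono::steady_clock::now() - start;
  return dt.count();
}
```

A caller would time each variant with something like `timeIt([&] { fillDirect(h, values); }, 100000)` and compare the two results. Building with optimizations enabled and running the binary under `valgrind --tool=callgrind` gives a per-function cost breakdown instead of a single wall-clock number.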

cc @rbielski @tbold @ponyisi @cburton

Activity

  • Peter Onyisi resolved all threads


  • Hi Frank,

    Thanks for taking a look at this. I spent a fair amount of time on optimization using VTune a while ago in the context of offline monitoring. In the end I stopped because the runtime was completely dominated by other things (e.g. T/P conversion), not by histogram fills, but for your case it may matter. I also agree that it would be better to use the backing data directly instead of making additional vectors, though it wasn't obvious from my tests that this was very important; it seemed that we were more limited by branch prediction thrashing. Do you have profiling results that you can share?

    Best, Peter

  • Author Maintainer

    Indeed, the hotspots might be quite different in trigger and offline DQ. I attached a callgrind profile of the micro-benchmark in this MR to ATR-21210 (we can continue the discussion there). This micro-benchmark is certainly not representative of a DQ application, but it comes close to the trivial monitoring that Rafal was trying to do (see the description of this MR), which is what started this work.

  • Joerg Stelzer
  • Changing label back to 'user-action-required' until all discussions are resolved.

    Xiaozhong (L1)

  • Frank Winklmeier resolved all threads


  • Pushing to L2 since there are a lot of changes.

    Xiaozhong (L1)

  • added review-approved label and removed review-pending-level-2 label

  • Looks good. Approving & accepting.

  • merged

  • Walter Lampl mentioned in commit 1fcded18

