
Add SchedulerMonSvc to monitor Scheduler status in online histograms

Rafal Bielski requested to merge rbielski/athena:scheduler-mon into master

Add SchedulerMonSvc, which produces online histograms showing statistics of algorithm states inside the scheduler at fixed timestamps (snapshots). It uses the newly implemented Scheduler API, gaudi/Gaudi!1072 (merged), thanks to @bwynne. The histograms show algorithm state counts and the number of free slots, both as a function of snapshot number and as a function of time in seconds. A timing histogram for the monitoring function itself is also added.
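To illustrate the snapshot idea only, here is a minimal Python sketch. It is not the SchedulerMonSvc implementation (which is C++ and uses the Gaudi Scheduler API). The class `SnapshotMonitor`, the `sample_states` callback and the state names are all hypothetical. It records, for each snapshot, the snapshot number, the elapsed time in seconds, the per-state counts, and the time spent in the monitoring call.

```python
import threading
import time
from collections import Counter


class SnapshotMonitor:
    """Hypothetical stand-in for a scheduler monitor; not the Athena/Gaudi API."""

    def __init__(self, sample_states, interval_s=0.01):
        # sample_states: callable returning an iterable of algorithm state names
        self._sample_states = sample_states
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = None
        # Each entry: (snapshot number, seconds since start, state counts, sampling time)
        self.snapshots = []

    def _run(self):
        start = time.perf_counter()
        number = 0
        while not self._stop.is_set():
            t0 = time.perf_counter()
            counts = Counter(self._sample_states())
            t1 = time.perf_counter()
            self.snapshots.append((number, t0 - start, counts, t1 - t0))
            number += 1
            # Wait for the next snapshot, waking up early if stop() is called
            self._stop.wait(self._interval_s)

    def start(self):
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
            self._thread = None
```

The recorded tuples can be plotted against either the first element (snapshot number) or the second (time in seconds), which mirrors the two x-axes described above.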

The solution is lock-free and appears to work correctly; I verified this with numerous print-outs, which have since been removed. The histograms look reasonable, as presented below.

Right now the monitoring is hooked directly into the online HLT event loop manager and, if enabled, starts just before the event loop begins and stops just after the event loop ends. In principle it is also possible to hook this into the offline AthenaHiveEventLoopMgr or make the service start/stop monitoring based on some incidents, but that's not done here. Tagging @amete in case this is interesting for offline.
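The start/stop hook around the event loop can be pictured with the following Python sketch. It is an illustration only: the real hook lives in the C++ HLT event loop manager, and the names `scheduler_monitoring`, `FakeMonSvc`, `start_monitoring` and `stop_monitoring` are hypothetical. Using `try`/`finally` so that monitoring also stops when the loop raises is an assumption of this sketch, not something stated in this merge request.

```python
from contextlib import contextmanager


@contextmanager
def scheduler_monitoring(service, enabled):
    """Start monitoring just before the wrapped block and stop just after it."""
    if enabled:
        service.start_monitoring()
    try:
        yield
    finally:
        if enabled:
            service.stop_monitoring()


class FakeMonSvc:
    """Hypothetical service that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def start_monitoring(self):
        self.calls.append("start")

    def stop_monitoring(self):
        self.calls.append("stop")


def run_event_loop(service, events, enabled):
    """Toy event loop: 'processes' events while monitoring is active."""
    processed = []
    with scheduler_monitoring(service, enabled):
        for event in events:
            processed.append(event)
    return processed
```

The same wrapper could in principle be placed around a different event loop, or replaced by incident handlers that call the start and stop methods.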

A "Modifier" is added to enable this easily with runHLT_standalone.py by adding -c "enableSchedulerMon=True".
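As a rough picture of how a `-c` flag can switch on a modifier, the sketch below executes the command string as Python and then applies every known modifier whose flag was set to `True`. This is a simplified assumption about the mechanism, not the actual code in runHLT_standalone.py. The `apply_modifiers` function, the `postSetup` method as used here, and the `SchedulerMonEnabled` key are hypothetical.

```python
class enableSchedulerMon:
    """Hypothetical modifier: switches on scheduler monitoring in a config dict."""

    def postSetup(self, config):
        config["SchedulerMonEnabled"] = True  # hypothetical setting name


def apply_modifiers(command, available):
    """Run the -c style command string, then apply each modifier whose flag is True."""
    namespace = {}
    exec(command, namespace)  # the command string is plain Python assignments
    config = {}
    for name, modifier_cls in available.items():
        if namespace.get(name) is True:
            modifier_cls().postSetup(config)
    return config
```

For example, `apply_modifiers("enableSchedulerMon=True;", {"enableSchedulerMon": enableSchedulerMon})` returns a dictionary with the hypothetical `SchedulerMonEnabled` key set, while omitting the flag leaves the dictionary empty.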

Can be tested with:

athenaHLT.py \
-c "setMenu='PhysicsP1_pp_run3_v1';doBeamspotSlice=False;doMonitorSlice=False;doStreamingSlice=False;enableSchedulerMon=True;" \
--threads=12 --concurrent-events=6 --nprocs=1 \
--file=/afs/cern.ch/work/r/rbielski/public/smallMenuPerfStudy/data18_13TeV.00360026.physics_EnhancedBias.mergedBS._0001.data \
TriggerJobOpts/runHLT_standalone.py

The following plots are created with the above command:



Tagging @bwynne, @tbold, @fwinkl, @tamartin, @smh

Edited by Rafal Bielski
