Use prmon in Trigger ART tests

Rafal Bielski requested to merge rbielski/athena:tvs-prmon into master

As discussed in ATR-19462, adding prmon to monitor resource usage of all "exec" steps of Trigger ART tests.

Advantages wrt PerfMon:

  • separate process, doesn't interfere with the athena job
  • able to deal with multi-process programs like athenaHLT and provides a reasonable metric (PSS) for them

Disadvantages wrt PerfMon:

  • doesn't know which part of the process is event loop, so cannot easily search for memory leaks

To overcome this disadvantage, a simple analysis is implemented which fits the memory usage distribution with two straight lines, one corresponding to initialisation and one to the event loop. For now, both prmon and PerfMon are kept in the tests to evaluate whether prmon can give reasonable estimates on its own. If this turns out to work stably enough, we will be able to retire PerfMon from at least some of the Trigger ART tests, particularly those with athenaHLT.
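
To illustrate the idea of the two-line fit, here is a minimal standalone sketch. It is not the code added in this merge request: the function names (`fit_line`, `fit_two_lines`), the `min_points` parameter and the brute-force scan over split points are all assumptions made for illustration, and the input data is synthetic.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b, sum of squared residuals)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse


def fit_two_lines(times, mem, min_points=3):
    """Try every split point and keep the one with the smallest total squared residual.

    The first segment stands for initialisation, the second for the event loop.
    """
    if len(times) != len(mem) or len(times) < 2 * min_points:
        raise ValueError("need two equally long series with at least 2*min_points samples")
    best = None
    for k in range(min_points, len(times) - min_points + 1):
        init = fit_line(times[:k], mem[:k])
        loop = fit_line(times[k:], mem[k:])
        total = init[2] + loop[2]
        if best is None or total < best["sse"]:
            best = {
                "split_time": times[k],
                "init_slope": init[1],
                "loop_slope": loop[1],
                "sse": total,
            }
    return best


# Synthetic example: fast growth of 100 MB/s for 10 s, then a slow 2 MB/s rise.
times = list(range(30))
mem = [100.0 * t if t < 10 else 1000.0 + 2.0 * (t - 10) for t in times]
result = fit_two_lines(times, mem)
print(result["init_slope"], result["loop_slope"])
```

The slope of the second segment is the quantity of interest here, since it plays the role of the event-loop memory growth that PerfMon would otherwise report.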

Even if the absolute values of the slopes don't correspond exactly to the event loop memory leak, what matters most to us is how the slopes change day-to-day, and hence their potential to detect new leaks.
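
A day-to-day check of this kind could look roughly like the sketch below. This is purely illustrative: the function `slope_increased`, the tolerances and the slope values are all made up and are not part of this merge request or of ART.

```python
from statistics import median


def slope_increased(history, today, rel_tolerance=0.5, abs_tolerance=0.05):
    """Return True if today's slope exceeds the median of earlier slopes by more than the tolerance.

    history: event-loop slopes from previous days (hypothetical values)
    today:   slope from the latest test run
    """
    if not history:
        return False  # nothing to compare against yet
    baseline = median(history)
    allowed = abs(baseline) * rel_tolerance + abs_tolerance
    return today - baseline > allowed


# Made-up slopes from five previous nightly runs
previous = [0.21, 0.19, 0.20, 0.22, 0.20]
stable = slope_increased(previous, 0.21)
leaky = slope_increased(previous, 0.60)
print(stable, leaky)
```

Comparing against the median rather than yesterday's value alone is one way to keep a single noisy run from raising or hiding an alarm.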

Example of the fit:
[image: prmon_memfit_pss]

Example summary of the test (extra-results.json):

{
    "num-errors": "0", 
    "num-warnings": "18684", 
    "num-histograms": "2816", 
    "prmon": {
        "vmem": "5849.578", 
        "rss": "4460.379", 
        "pss": "4449.083", 
        "delta-vmem": "312.486", 
        "delta-rss": "313.697", 
        "delta-pss": "313.889"
    }, 
    "memory-usage": {
        "vmem": "5535.552", 
        "delta-vmem": "36.151", 
        "rss": "4280.662", 
        "delta-rss": "37.330"
    }
}
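
For evaluating prmon against the existing numbers, the two blocks of the summary can be compared directly. The sketch below assumes that the "memory-usage" block holds the PerfMon-based values and that all values are stored as strings, as in the example above; the helper `compare_blocks` is hypothetical.

```python
import json

# Summary copied from the example above (with the closing brace added)
summary_text = """
{
    "num-errors": "0",
    "num-warnings": "18684",
    "num-histograms": "2816",
    "prmon": {
        "vmem": "5849.578",
        "rss": "4460.379",
        "pss": "4449.083",
        "delta-vmem": "312.486",
        "delta-rss": "313.697",
        "delta-pss": "313.889"
    },
    "memory-usage": {
        "vmem": "5535.552",
        "delta-vmem": "36.151",
        "rss": "4280.662",
        "delta-rss": "37.330"
    }
}
"""


def compare_blocks(summary, first="prmon", second="memory-usage"):
    """Return first-minus-second for every key present in both blocks."""
    a, b = summary[first], summary[second]
    # Values are stored as strings, so convert before subtracting
    return {key: float(a[key]) - float(b[key]) for key in a if key in b}


summary = json.loads(summary_text)
diff = compare_blocks(summary)
for key in sorted(diff):
    print(f"{key}: {diff[key]:+.3f}")
```

Keys that exist in only one block, such as "pss", are skipped because there is nothing to compare them with.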
