Skip to content

Improve on a major performance issue with jTowerContainer::findTower()

Rafal Bielski requested to merge rbielski/athena:jtower-find into master

We are currently suffering from very bad performance of Trigger tests, leading to multiple problems including occasional timeouts of Trigger CI tests. As discovered in ATR-24078 half of the problem is LVL1 simulation. It was also reported by @smh in the ticket that the most expensive function in the tests is LVL1::jTowerContainer::findTower.

There were three major performance issues with this function:

  • jTowerContainer used std::map for an internal map where the more efficient std::unordered_map could be used
  • the findTower method called std::map::find on its internal map twice on every call for no reason
  • the clients excessively called this expensive method in nested loops even where the result didn't change inside the loop or where the output was only used for debug messages

Change the internal map type, clean up the findTower implementation and the unnecessary excessive calls in the clients.

This speeds up the entire LVL1 simulation by ~670 ms per event. It's still really slow, but it's a very significant and very needed improvement.

TotalTime

Tested by running a job with just L1Sim on data:

athenaHLT.py -c "setMenu='LS2_v1';doL1Sim=True;doEmptyMenu=True;BFieldAutoConfig=False;doWriteBS=False;" --dump-config-reload --threads=1 --nprocs=10 -n 1000 -f/cvmfs/atlas-nightlies.cern.ch/repo/data/data-art/TrigP1Test/data18_13TeV.00360026.physics_EnhancedBias.merge.RAW._lb0151._SFO-1._0001.1 TriggerJobOpts/runHLT_standalone.py

Note: findTower had a bug where it was relying on undefined behaviour and effectively returning the first element for out-of-range index (at least as far as I checked in a simple limited test, but it is really undefined behaviour). I keep the returning of the first element here because the client code is unprotected and crashes if I correctly return nullptr for out-of-range index. This is irrelevant to the performance issues I want to urgently fix, so I just leave a FIXME comment in the code and leave it to @afaulkne to organise proper fixes for this bug and the clients.

FYI also @wiedenma, @mark

Merge request reports