TrigConfHLTUtils: performance improvement to string2hash
Background
We use the string2hash
function to calculate a 32-bit hash for each HLT identifier (e.g. chain names) that is stored in the data instead of the plain string. This hash function has been unchanged since Run-1. And while it would be much safer and faster to use a vectorized 64-bit hash (e.g. CxxUtils::crc64) that will have to be post-poned for Phase-II. In principle, the hash values should only be calculated and stored once during the job startup. However, it's easy to mistakenly calculate and/or lookup the hashes during execution (see ATR-27765).
Status quo
On each invocation of string2hash
, the hash value is calculated, a collision detection is performed and if not available yet, the hash->name mapping is stored in a tbb::concurrent_hash_map
.
Improvement 1
In addition to the hash->name mapping also store the reverse name->hash mapping. This avoids having to re-calculate the hash and perform the collision detection each time. This is labeled as "TBB (with lookup)" in the following.
Improvement 2
Replace the TBB hash map with the concurrent maps from CxxUtils (by @ssnyder) that are optimized for the write-once, read-often use-case. This is labeled as "CxxUtils (with lookup)" in the following.
Implementation details
-
CxxUtils::ConcurrentStrMap
is used for the name->hash mapping -
CxxUtils::ConcurrentMap
is used for the reverse hash->name mapping. Note that this map only supports storage of 64-bit (pointer) values. So we cannot store the string directly but have to store a pointer to a string. - An additional wrapper class (
HashStore
) is needed to trigger the memory cleanup at the end of the job (this is mostly cosmetic to avoid one-time leaks in valgrind, etc.)
Performance comparison
The following plot shows the results of a synthetic benchmark. The ~10k unique identifiers (L1, HLT, leg names) were taken from the v1Dev menu. The hashes for all strings are calculated once and then read 1M times from 1-8 threads. The results show:
- The additional lookup speeds up the TBB implementation by a factor 2.7 with 1 thread
- The CxxUtils and improved TBB implementations perform the same with 1 thread
- For multiple threads the CxxUtils implementation vastly outperforms the (improved) TBB implementation (factor 4-7)
Performance impact
Despite these nice results, the impact on a real trigger job is however entirely negligible. Running an HLT job with 4 threads in vtune I don't see a measurable difference between the two implementations. This is probably because unlike for the synthetic benchmark there is almost no concurrent access for the hash map in a real trigger job. So the TBB maps perform just fine.
Merge request reports
Activity
- Resolved by Frank Winklmeier
Thanks @fwinkl - I'm surprised how poorly the TBB is scaling here, it was sold as non-locking for
const_accessor
access modes!The two-way map is a nice refactor too.
This merge request affects 1 package:
- Trigger/TrigConfiguration/TrigConfHLTUtils
This merge request affects 4 files:
- Trigger/TrigConfiguration/TrigConfHLTUtils/CMakeLists.txt
- Trigger/TrigConfiguration/TrigConfHLTUtils/Root/HLTUtils.cxx
- Trigger/TrigConfiguration/TrigConfHLTUtils/TrigConfHLTUtils/HLTUtils.h
- Trigger/TrigConfiguration/TrigConfHLTUtils/test/test_string2hash.cxx
Adding @jmharris ,@oliveirg ,@orlando ,@cmorenom ,@mark ,@tamartin ,@paulama ,@asonay as watchers
added 23.0 Trigger analysis-review-required labels
CI Result SUCCESS (hash a3c04316)Athena AnalysisBase AthAnalysis externals cmake make tests Full details available on this CI monitor view. Check the JIRA CI status board for known problems
Athena: number of compilation errors 0, warnings 0
AnalysisBase: number of compilation errors 0, warnings 0
AthAnalysis: number of compilation errors 0, warnings 0
For experts only: Jenkins output [CI-MERGE-REQUEST-CC7 73501]Even if it doesn't have real world benefits. I am still in favour of making the change as it can then be discounted as a reason for bad performance in scaling experiments.
Edited by Adam Edward Bartonadded review-pending-level-1 label
This merge request affects 1 package:
- Trigger/TrigConfiguration/TrigConfHLTUtils
This merge request affects 4 files:
- Trigger/TrigConfiguration/TrigConfHLTUtils/CMakeLists.txt
- Trigger/TrigConfiguration/TrigConfHLTUtils/Root/HLTUtils.cxx
- Trigger/TrigConfiguration/TrigConfHLTUtils/TrigConfHLTUtils/HLTUtils.h
- Trigger/TrigConfiguration/TrigConfHLTUtils/test/test_string2hash.cxx
Adding @jmharris ,@oliveirg ,@orlando ,@cmorenom ,@mark ,@tamartin ,@paulama ,@asonay as watchers
CI Result SUCCESS (hash a3c04316)Athena AnalysisBase AthAnalysis externals cmake make tests Full details available on this CI monitor view. Check the JIRA CI status board for known problems
Athena: number of compilation errors 0, warnings 0
AnalysisBase: number of compilation errors 0, warnings 0
AthAnalysis: number of compilation errors 0, warnings 0
For experts only: Jenkins output [CI-MERGE-REQUEST-CC7 73639]added review-approved label and removed review-pending-level-1 label
added review-approved-point1 label
added analysis-review-approved label and removed analysis-review-required label
added review-approved-tier0 label
mentioned in commit fc282277
- Resolved by Frank Winklmeier
Hi @fwinkl, this one causes issues in
TrigCostMonitor_AlgorithmIdentifier_test_ctest
as you changed logging.
mentioned in merge request !64130 (merged)
mentioned in merge request !64153 (merged)