Draft: Use mutexes instead of atomic ops on shared_ptr in CachedParticlePtr
This is a study of using mutexes instead of shared_ptr atomics in CachedParticlePtr, to reduce the high CPU usage caused by libstdc++'s very suboptimal implementation of those atomic operations.
Using TBB spin_rw_mutex means the additional memory required is 8 bytes per CachedParticlePtr (as opposed to 40 bytes with std::mutex). The mutex implementation shows much lower CPU usage than the atomics implementation, though still higher than I would expect.
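As a rough illustration of the approach (not the actual CachedParticlePtr code in Athena), a shared_ptr member whose reads and writes are guarded by a tbb::spin_rw_mutex instead of std::atomic_load/std::atomic_store might look like the sketch below; the names CachedPtr, get and set are placeholders.

```cpp
// Minimal sketch only: a cached shared_ptr protected by tbb::spin_rw_mutex
// rather than by atomic operations on the shared_ptr itself.
#include <memory>
#include <tbb/spin_rw_mutex.h>

template <typename T>
class CachedPtr {  // hypothetical stand-in for CachedParticlePtr
public:
  // Reader path: take a shared (read) lock while copying the shared_ptr.
  std::shared_ptr<T> get() const {
    tbb::spin_rw_mutex::scoped_lock lock(m_mutex, /*is_writer=*/false);
    return m_ptr;
  }

  // Writer path: take the exclusive (write) lock while replacing the pointer.
  void set(std::shared_ptr<T> p) {
    tbb::spin_rw_mutex::scoped_lock lock(m_mutex, /*is_writer=*/true);
    m_ptr = std::move(p);
  }

private:
  mutable tbb::spin_rw_mutex m_mutex;  // roughly 8 bytes, vs ~40 for std::mutex
  std::shared_ptr<T> m_ptr;
};
```

The atomics-based version would instead use std::atomic_load(&m_ptr) and std::atomic_store(&m_ptr, p), which libstdc++ implements through a small internal pool of locks shared across all shared_ptr objects, hence the contention this MR is trying to avoid.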
Surprisingly, about 13% of CPU time during the execute phase of MT pileup digitization is spent waiting on these mutexes. This is made up almost entirely of about 21.4% of the time in the pixel digitization tool and 14.5% of the time in the ITk strips digitization tool.
Pinging @jchapman @ssnyder @tadej
Related JIRA ticket: ATLASSIM-4814