An error occurred while fetching this tab.
Fix thread-safety problem in GaudiHandle
- Jun 11, 2020
-
-
scott snyder authored
-
scott snyder authored
-
scott snyder authored
a rare crash in Athena. The component retrieval code in GaudiHandle can lead to operator->() returning a null pointer for a valid handle, if it is called simultaneously from multiple threads. The relevant code in GaudiHandle is: ``` T* operator->() { assertObject(); return m_pObject; } void assertObject() const { if ( UNLIKELY( !isValid() ) ) { throw GaudiException ... } } bool isValid() const { // not really const, because it may update m_pObject return m_pObject || retrieve().isSuccess(); } StatusCode retrieve() const { // not really const, because it updates m_pObject StatusCode sc = StatusCode::SUCCESS; if ( m_pObject && release().isFailure() ) { sc = StatusCode::FAILURE; } if ( sc && retrieve( m_pObject ).isFailure() ) { m_pObject = nullptr; sc = StatusCode::FAILURE; } return sc; } ``` Note that if the pointer is already set when retrieve() is called, it clears it and redoes the lookup. So we could have: ``` thread 1 calls operator->. m_pObject is not set, so it executes through to retrieve() after the test on m_pObject. thread 2 calls operator->. m_pObject is not set, so it executes through to the start of retrieve(). thread 1 continues, does the lookup, sets m_pObject and returns. thread 2 continues, tests m_pObject, finds it set, and calls reset() to clear it. thread 1 continues, returning through operator->. m_pObject is now null, so it returns null. thread 2 continues with the retrieve, and returns a valid pointer from operator->. ``` I've tentatively tried to fix this by making m_pObject atomic, and changing retrieve() so that it will not change m_pObject if it is already set. I think that's consistent with what's implied by retrieve() being const. ATLAS software seems fine with this change, but this is, however, a potential change of behavior, so i point it out here.
-