Explicitly enable TGeoManager multi-threading for DD4hepSvc to stop seg-faults
Required to make FTHitEfficiencyMonitor work when multi-threaded, which is urgently needed to add FTHitEfficiencyMonitor to CalibMon.
The ROOT geometry manager in Detector/DD4hep requires explicit configuration for multi-threading. If SetMaxThreads is not called, gGeoManager assumes it is running single-threaded and calls geometry navigation functions without any thread safety, so these calls can cause a seg-fault when several threads are running.
TGeoManager::SetMaxThreads(n_threads) must be called once, in the main thread, with the number of threads passed in the application options.
After this is set, a TGeoNavigator can be added per thread, and function calls to the TGeoManager will provide the thread-local navigator.
See also Detector!608, which adds a check that there is a TGeoNavigator for the thread before calling the problematic function.
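The "call it once, before any worker thread starts" requirement can be sketched with std::call_once. This is a minimal illustration, not the code in this MR: GeometryThreadConfig and configureGeometryThreadsOnce are hypothetical names, and the struct only records the value that would be handed to TGeoManager::SetMaxThreads so that the sketch builds without ROOT.

```cpp
#include <mutex>

// Hypothetical stand-in for the real call TGeoManager::SetMaxThreads(n);
// it only records the value so the sketch builds without ROOT.
struct GeometryThreadConfig {
  int maxThreads   = 0; // value that would be passed to SetMaxThreads
  int timesApplied = 0; // how many times the setup actually ran
};

// Applies the thread configuration at most once, however often it is requested.
// Returns true only for the call that performed the setup.
inline bool configureGeometryThreadsOnce( GeometryThreadConfig& cfg, std::once_flag& flag, int nThreads ) {
  bool ranNow = false;
  std::call_once( flag, [&] {
    cfg.maxThreads = nThreads; // real code: manager.SetMaxThreads( nThreads );
    ++cfg.timesApplied;
    ranNow = true;
  } );
  return ranNow;
}
```

A later request with a different thread count is ignored here, which matches the intent that the value is fixed from the application options at start-up.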
To do:
- What value should max threads be set to?
- Is it possible to run some code once per thread?
What TGeoManager::SetMaxThreads(N) is doing:
Inside the function SetMaxThreads(N), ROOT::EnableThreadSafety() is called, and the navigators of the calling thread (if any exist) are re-registered under the current thread id. It then sets the flag fMultiThread = kTRUE and loops over the detector volumes to create thread-local data for N threads. If there is existing thread-local data (i.e. if SetMaxThreads was previously called), the existing thread-local data is removed first. For this reason SetMaxThreads(N) should be called once in the main thread, before any multi-threaded code runs. It will not work if called from within the multi-threaded code, since the thread<->data maps would be repeatedly deleted and not shared between threads.
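The reset behaviour described above can be illustrated with a toy model. This is not ROOT code: ToyGeoManager is a hypothetical class that copies only the bookkeeping of the SetMaxThreads source quoted further down, with an integer standing in for the per-thread data.

```cpp
#include <cstddef>
#include <map>
#include <thread>

// Toy model, NOT ROOT code: mirrors only the bookkeeping of
// TGeoManager::SetMaxThreads as quoted in this description.
class ToyGeoManager {
public:
  // Registers per-thread data for the calling thread (stand-in for the thread<->data maps).
  void addThreadData() { m_threadData[std::this_thread::get_id()] = 1; }

  // Like SetMaxThreads: existing per-thread data is dropped before new data is set up.
  void setMaxThreads( int nThreads ) {
    if ( m_maxThreads != 0 ) m_threadData.clear(); // ClearThreadsMap / ClearThreadData
    m_maxThreads = nThreads + 1;                   // fMaxThreads = nthreads + 1
    if ( m_maxThreads > 0 ) m_multiThread = true;  // fMultiThread = kTRUE
  }

  bool        multiThread() const { return m_multiThread; }
  int         maxThreads() const { return m_maxThreads; }
  std::size_t nThreadData() const { return m_threadData.size(); }

private:
  std::map<std::thread::id, int> m_threadData;
  int                            m_maxThreads  = 0;
  bool                           m_multiThread = false;
};
```

In this model a second setMaxThreads call empties the per-thread map, which is the effect that makes calling it from worker threads destructive.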
SetMaxThreads(N) does not create N TGeoNavigators. Therefore, before calling functions that use geometry navigation, make sure that a thread-local TGeoNavigator exists, e.g. with:
auto& manager = dd4hep::Detector::getInstance().manager();
TGeoNavigator* nav = manager.GetCurrentNavigator();
if ( !nav ) nav = manager.AddNavigator();
myFunctionThatUsesGeometryNavigation() ;
These functions get a thread-local TGeoNavigator only if fMultiThread = kTRUE. If fMultiThread = kFALSE, i.e. if SetMaxThreads was not previously called, then TGeoManager assumes there is only a single thread and does not search the thread<->navigator map. This causes a segfault if the application is actually being run with multiple threads, since there is only a TGeoNavigator for the main thread. If fMultiThread = kTRUE and there is no TGeoNavigator for the current thread, manager.GetCurrentNavigator() returns a null pointer, which is why you have to check for this and add a navigator if needed.
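The check-and-add pattern can be wrapped in a small helper, sketched here under stated assumptions. ensureNavigator is a hypothetical name; it relies only on GetCurrentNavigator() and AddNavigator(), the two TGeoManager functions quoted in this description. ToyManager and ToyNavigator are toy stand-ins (not ROOT code) so the sketch builds without ROOT. Unlike the real AddNavigator, the toy keeps at most one navigator per thread.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <thread>

// Returns the navigator for the calling thread, creating one if none exists yet.
// Works with any manager type offering GetCurrentNavigator() and AddNavigator().
template <typename Manager>
auto* ensureNavigator( Manager& manager ) {
  auto* nav = manager.GetCurrentNavigator();
  if ( !nav ) nav = manager.AddNavigator();
  return nav;
}

// Toy stand-in for a manager in multi-threaded mode (NOT ROOT code):
// one navigator per thread id, a null pointer when the thread has none.
struct ToyNavigator {};

class ToyManager {
public:
  ToyNavigator* GetCurrentNavigator() const {
    std::lock_guard<std::mutex> lock( m_mutex );
    auto it = m_navigators.find( std::this_thread::get_id() );
    return it == m_navigators.end() ? nullptr : it->second.get();
  }

  ToyNavigator* AddNavigator() {
    std::lock_guard<std::mutex> lock( m_mutex );
    auto& slot = m_navigators[std::this_thread::get_id()];
    if ( !slot ) slot = std::make_unique<ToyNavigator>();
    return slot.get();
  }

private:
  mutable std::mutex                                       m_mutex;
  std::map<std::thread::id, std::unique_ptr<ToyNavigator>> m_navigators;
};
```

With the real classes the call would presumably read ensureNavigator( dd4hep::Detector::getInstance().manager() ), placed before any function that uses geometry navigation.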
See the functions below
https://root.cern.ch/doc/master/classTGeoManager.html#ab5cfc0292200e4d941676d353e7308d5
void TGeoManager::SetMaxThreads(Int_t nthreads)
{
   if (!fClosed) {
      Error("SetMaxThreads", "Cannot set maximum number of threads before closing the geometry");
      return;
   }
   if (!fMultiThread) {
      ROOT::EnableThreadSafety();
      std::thread::id threadId = std::this_thread::get_id();
      NavigatorsMap_t::const_iterator it = fNavigators.find(threadId);
      if (it != fNavigators.end()) {
         TGeoNavigatorArray *array = it->second;
         fNavigators.erase(it);
         fNavigators.insert(NavigatorsMap_t::value_type(threadId, array));
      }
   }
   if (fMaxThreads) {
      ClearThreadsMap();
      ClearThreadData();
   }
   fMaxThreads = nthreads + 1;
   if (fMaxThreads > 0) {
      fMultiThread = kTRUE;
      CreateThreadData();
   }
}
https://root.cern.ch/doc/master/classTGeoManager.html#a4f37cb2eb0cdfb67ce89c5fd783c33c3
TGeoNavigator *TGeoManager::GetCurrentNavigator() const
{
   TTHREAD_TLS(TGeoNavigator *) tnav = nullptr;
   if (!fMultiThread)
      return fCurrentNavigator;
   TGeoNavigator *nav = tnav; // TTHREAD_TLS_GET(TGeoNavigator*,tnav);
   if (nav)
      return nav;
   std::thread::id threadId = std::this_thread::get_id();
   NavigatorsMap_t::const_iterator it = fNavigators.find(threadId);
   if (it == fNavigators.end())
      return nullptr;
   TGeoNavigatorArray *array = it->second;
   nav = array->GetCurrentNavigator();
   tnav = nav; // TTHREAD_TLS_SET(TGeoNavigator*,tnav,nav);
   return nav;
}
https://root.cern.ch/doc/master/classTGeoManager.html#a052ff41b02b7962e6edce8f4f0ab33a5
TGeoNavigator *TGeoManager::AddNavigator()
{
   if (fMultiThread) {
      TGeoManager::ThreadId();
      fgMutex.lock();
   }
   std::thread::id threadId = std::this_thread::get_id();
   NavigatorsMap_t::const_iterator it = fNavigators.find(threadId);
   TGeoNavigatorArray *array = nullptr;
   if (it != fNavigators.end())
      array = it->second;
   else {
      array = new TGeoNavigatorArray(this);
      fNavigators.insert(NavigatorsMap_t::value_type(threadId, array));
   }
   TGeoNavigator *nav = array->AddNavigator();
   if (fClosed)
      nav->GetCache()->BuildInfoBranch();
   if (fMultiThread)
      fgMutex.unlock();
   return nav;
}
Then finally the problematic Contains function, which is only thread-safe if all of the above have been called:
https://root.cern/doc/master/classTGeoShapeAssembly.html#acc50f9347ef358a13e5161c2e92f97eb
Bool_t TGeoShapeAssembly::Contains(const Double_t *point) const
{
   if (!fBBoxOK)
      ((TGeoShapeAssembly *)this)->ComputeBBox();
   if (!TGeoBBox::Contains(point))
      return kFALSE;
   TGeoVoxelFinder *voxels = fVolume->GetVoxels();
   TGeoNode *node;
   TGeoShape *shape;
   Int_t *check_list = nullptr;
   Int_t ncheck, id;
   Double_t local[3];
   if (voxels) {
      // get the list of nodes passing thorough the current voxel
      TGeoNavigator *nav = gGeoManager->GetCurrentNavigator();
      TGeoStateInfo &td = *nav->GetCache()->GetInfo();
      check_list = voxels->GetCheckList(point, ncheck, td);
      if (!check_list) {
         nav->GetCache()->ReleaseInfo();
         return kFALSE;
      }
      for (id = 0; id < ncheck; id++) {
         node = fVolume->GetNode(check_list[id]);
         shape = node->GetVolume()->GetShape();
         node->MasterToLocal(point, local);
         if (shape->Contains(local)) {
            fVolume->SetCurrentNodeIndex(check_list[id]);
            fVolume->SetNextNodeIndex(check_list[id]);
            nav->GetCache()->ReleaseInfo();
            return kTRUE;
         }
      }
      nav->GetCache()->ReleaseInfo();
      return kFALSE;
   }
   Int_t nd = fVolume->GetNdaughters();
   for (id = 0; id < nd; id++) {
      node = fVolume->GetNode(id);
      shape = node->GetVolume()->GetShape();
      node->MasterToLocal(point, local);
      if (shape->Contains(local)) {
         fVolume->SetCurrentNodeIndex(id);
         fVolume->SetNextNodeIndex(id);
         return kTRUE;
      }
   }
   return kFALSE;
}